From cmay at phys.ethz.ch Wed Jul 1 04:07:09 2009 From: cmay at phys.ethz.ch (Christian May) Date: Wed, 1 Jul 2009 11:07:09 +0200 (CEST) Subject: petsc with parmetis configuration problem Message-ID: Hi all, I want to configure petsc with parmetis invoking ./configure --prefix=/cluster/work/phys/cmay/petsc/ --with-c++-support --with-precision=double --with-shared=0 --download-superlu_dist=1 --with-superlu_dist --with-parmetis-lib=/cluster/work/phys/cmay/petsc/petsc-3.0.0-p3/externalpackages/ParMetis-3.1.1/libparmetis.a --with-parmetis-include=/cluster/work/phys/cmay/petsc/petsc-3.0.0-p3/externalpackages/ParMetis-3.1.1/ According to configure.log the error is: Possible ERROR while running linker: /cluster/work/phys/cmay/petsc/petsc-3.0.0-p3/externalpackages/ParMetis-3.1.1/libparmetis.a(kmetis.o): In function `ParMETIS_V3_PartKway': kmetis.c:(.text+0xec5): undefined reference to `METIS_mCPartGraphRecursive2' kmetis.c:(.text+0xf45): undefined reference to `METIS_WPartGraphKway' These symbols are defined in libmetis.a, however there is no configure option for petsc to include metis. What would be the clean solution for this? Thanks Christian From knepley at gmail.com Wed Jul 1 08:26:37 2009 From: knepley at gmail.com (Matt Knepley) Date: Wed, 1 Jul 2009 08:26:37 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: <4A4AA1B5.9030501@59A2.org> References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> Message-ID: <2A14059B-1CA8-49AF-B247-648AB63DDE62@gmail.com> I cannot see the logic in getting the array when we already allow getting the aij structures. I cannot condone a function which changes it's meaning for every matrix type. Matt From the phone On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: > Matthew Knepley wrote: >> I thought the idea was that MatGetArray() never applies to a sparse >> matrix. No other sparse format supports this, does it? > > That's not true at all, but the result is implementation-dependent. > For > example, the array for AIJ is different from the array for BAIJ. For > this reason, you shouldn't be calling MatGetArray unless you know the > matrix type, but of course the F90 interface should agree with the C > interface. > > Jed > From bsmith at mcs.anl.gov Wed Jul 1 13:36:01 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 1 Jul 2009 13:36:01 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: <4A4AA1B5.9030501@59A2.org> References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> Message-ID: <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Satish, Could you please convert these (both the stub code and the F90 interface) to use a 1d array? In both 3.0.0 and petsc-dev. Thanks Barry On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: > Matthew Knepley wrote: >> I thought the idea was that MatGetArray() never applies to a sparse >> matrix. No other sparse format supports this, does it? > > That's not true at all, but the result is implementation-dependent. > For > example, the array for AIJ is different from the array for BAIJ. For > this reason, you shouldn't be calling MatGetArray unless you know the > matrix type, but of course the F90 interface should agree with the C > interface. 
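A small C illustration of the point Jed makes above, hedged and assuming a MATSEQDENSE matrix so that the flat array returned by MatGetArray() can be indexed column-major; the helper name below is made up for illustration and is not a PETSc routine:

#include "petscmat.h"

/* Print entry (i,j) of a MATSEQDENSE matrix through MatGetArray().
   Dense storage is column-major, so (i,j) sits at v[i + j*m]; for
   AIJ/BAIJ the very same call hands back the packed nonzeros instead,
   which is why the meaning of the array is implementation-dependent. */
PetscErrorCode PrintDenseEntry(Mat A,PetscInt i,PetscInt j)
{
  PetscScalar    *v;
  PetscInt       m,n;
  PetscErrorCode ierr;

  ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
  ierr = MatGetArray(A,&v);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF,"A(%d,%d) = %g\n",(int)i,(int)j,
                     (double)PetscRealPart(v[i+j*m]));CHKERRQ(ierr);
  ierr = MatRestoreArray(A,&v);CHKERRQ(ierr);
  return 0;
}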
> > Jed > From knepley at gmail.com Wed Jul 1 17:54:58 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 2 Jul 2009 06:54:58 +0800 Subject: MatGetArrayF90 returns 2d array In-Reply-To: <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Message-ID: On Thu, Jul 2, 2009 at 2:36 AM, Barry Smith wrote: > > Satish, > > Could you please convert these (both the stub code and the F90 > interface) to use a 1d array? In both 3.0.0 and petsc-dev. I am still against this, since it seems much nicer to get a 2D array for the dense matrix, which is the ONLY matrix for which GetArray() makes any sense I think. Matt > > Thanks > > Barry > > On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: > > Matthew Knepley wrote: >> >>> I thought the idea was that MatGetArray() never applies to a sparse >>> matrix. No other sparse format supports this, does it? >>> >> >> That's not true at all, but the result is implementation-dependent. For >> example, the array for AIJ is different from the array for BAIJ. For >> this reason, you shouldn't be calling MatGetArray unless you know the >> matrix type, but of course the F90 interface should agree with the C >> interface. >> >> Jed >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 2 11:47:56 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 2 Jul 2009 11:47:56 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Message-ID: MatGetArray() returns access to the nonzero values of the matrix in a format that depends on the underlying class. It is a perfectly reasonable method and has been there for decades just fine. Barry On Jul 1, 2009, at 5:54 PM, Matthew Knepley wrote: > On Thu, Jul 2, 2009 at 2:36 AM, Barry Smith > wrote: > > Satish, > > Could you please convert these (both the stub code and the F90 > interface) to use a 1d array? In both 3.0.0 and petsc-dev. > > I am still against this, since it seems much nicer to get a 2D array > for the dense matrix, which is the ONLY matrix for which GetArray() > makes any sense I think. > > Matt > > > Thanks > > Barry > > On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: > > Matthew Knepley wrote: > I thought the idea was that MatGetArray() never applies to a sparse > matrix. No other sparse format supports this, does it? > > That's not true at all, but the result is implementation-dependent. > For > example, the array for AIJ is different from the array for BAIJ. For > this reason, you shouldn't be calling MatGetArray unless you know the > matrix type, but of course the F90 interface should agree with the C > interface. > > Jed > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener From dalcinl at gmail.com Thu Jul 2 14:59:41 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Thu, 2 Jul 2009 16:59:41 -0300 Subject: MatGetArrayF90 returns 2d array In-Reply-To: References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Message-ID: On Thu, Jul 2, 2009 at 1:47 PM, Barry Smith wrote: > > ?MatGetArray() returns access to the nonzero values of the matrix in a > format that depends on the underlying class. It is a perfectly reasonable > method and has been there for decades just fine. > BTW, Matt commented about PETSc providing support for obtaining the AIJ structure... I know how to get the "IJ" part, MatGetArray() let me get the "A" part, but ... Where is the call that let one to get the whole data, indices and values? > ?Barry > > On Jul 1, 2009, at 5:54 PM, Matthew Knepley wrote: > >> On Thu, Jul 2, 2009 at 2:36 AM, Barry Smith wrote: >> >> Satish, >> >> ?Could you please convert these (both the stub code and the F90 interface) >> to use a 1d array? In both 3.0.0 and petsc-dev. >> >> I am still against this, since it seems much nicer to get a 2D array >> for the dense matrix, which is the ONLY matrix for which GetArray() >> makes any sense I think. >> >> ?Matt >> >> >> ?Thanks >> >> ?Barry >> >> On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: >> >> Matthew Knepley wrote: >> I thought the idea was that MatGetArray() never applies to a sparse >> matrix. ?No other sparse format supports this, does it? >> >> That's not true at all, but the result is implementation-dependent. ?For >> example, the array for AIJ is different from the array for BAIJ. ?For >> this reason, you shouldn't be calling MatGetArray unless you know the >> matrix type, but of course the F90 interface should agree with the C >> interface. >> >> Jed >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener > > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From bsmith at mcs.anl.gov Thu Jul 2 15:02:04 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 2 Jul 2009 15:02:04 -0500 Subject: MatGetArrayF90 returns 2d array In-Reply-To: References: <4A4A4F29.5070105@imperial.ac.uk> <4A4AA1B5.9030501@59A2.org> <871F13E3-EE03-409A-8470-DEF8C2C21A65@mcs.anl.gov> Message-ID: <76E06F04-2715-410F-A3BB-B86A8398E7BC@mcs.anl.gov> #include "src/mat/impls/aij/seq/aij.h" :-) On Jul 2, 2009, at 2:59 PM, Lisandro Dalcin wrote: > On Thu, Jul 2, 2009 at 1:47 PM, Barry Smith wrote: >> >> MatGetArray() returns access to the nonzero values of the matrix >> in a >> format that depends on the underlying class. It is a perfectly >> reasonable >> method and has been there for decades just fine. >> > > BTW, Matt commented about PETSc providing support for obtaining the > AIJ structure... I know how to get the "IJ" part, MatGetArray() let me > get the "A" part, but ... Where is the call that let one to get the > whole data, indices and values? 
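A rough sketch of what that private-header route looks like, assuming a MATSEQAIJ matrix and the Mat_SeqAIJ fields i, j and a declared in src/mat/impls/aij/seq/aij.h; the walking routine below is hypothetical, not part of the PETSc API:

#include "petscmat.h"
#include "src/mat/impls/aij/seq/aij.h"   /* private header: Mat_SeqAIJ */

/* Walk the CSR triplet of a MATSEQAIJ matrix: row pointers i,
   column indices j, nonzero values a.  This peeks at implementation
   details, so it is only valid for that one matrix type. */
PetscErrorCode WalkSeqAIJ(Mat A)
{
  Mat_SeqAIJ     *aij = (Mat_SeqAIJ*)A->data;
  PetscInt       m,n,row,k;
  PetscErrorCode ierr;

  ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
  for (row=0; row<m; row++) {
    for (k=aij->i[row]; k<aij->i[row+1]; k++) {
      ierr = PetscPrintf(PETSC_COMM_SELF,"A(%d,%d) = %g\n",(int)row,
                         (int)aij->j[k],(double)PetscRealPart(aij->a[k]));CHKERRQ(ierr);
    }
  }
  return 0;
}

MatGetRowIJ() and MatGetArray() give similar access through the public interface, again with type-dependent meaning.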
> > >> Barry >> >> On Jul 1, 2009, at 5:54 PM, Matthew Knepley wrote: >> >>> On Thu, Jul 2, 2009 at 2:36 AM, Barry Smith >>> wrote: >>> >>> Satish, >>> >>> Could you please convert these (both the stub code and the F90 >>> interface) >>> to use a 1d array? In both 3.0.0 and petsc-dev. >>> >>> I am still against this, since it seems much nicer to get a 2D array >>> for the dense matrix, which is the ONLY matrix for which GetArray() >>> makes any sense I think. >>> >>> Matt >>> >>> >>> Thanks >>> >>> Barry >>> >>> On Jun 30, 2009, at 6:37 PM, Jed Brown wrote: >>> >>> Matthew Knepley wrote: >>> I thought the idea was that MatGetArray() never applies to a sparse >>> matrix. No other sparse format supports this, does it? >>> >>> That's not true at all, but the result is implementation- >>> dependent. For >>> example, the array for AIJ is different from the array for BAIJ. >>> For >>> this reason, you shouldn't be calling MatGetArray unless you know >>> the >>> matrix type, but of course the F90 interface should agree with the C >>> interface. >>> >>> Jed >>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which their >>> experiments lead. >>> -- Norbert Wiener >> >> > > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 From sekikawa at msi.co.jp Thu Jul 2 23:30:30 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Fri, 03 Jul 2009 13:30:30 +0900 Subject: matrix creation on LAPACK mode Message-ID: <20090703132511.6ACF.SEKIKAWA@msi.co.jp> Hello I made eigenvalue solver program with SLEPc. in my program, to setup matrix, I use MatCreateSeqAIJ() function. void setupMatrix(int m, int n) { PetscErrorCode ierr; ierr=MatCreateSeqAIJ(PETSC_COMM_WORLD, m, n, nz, PETSC_NULL, &g_A); ... } Normally I select solver as KrylovSchur, but sometimes I switched solver to LAPACK. with using LAPACK, result seems to be no problem. but I suspect calculation time takes longer (because of using MatCreateSeqAIJ) Does switching matrix create function to MatCreateSeqDense() give any effect to speed up on LAPACK mode? Thanks, Takuya --------------------------------------------------------------- Takuya Sekikawa Mathematical Systems, Inc sekikawa at msi.co.jp --------------------------------------------------------------- From sekikawa at msi.co.jp Fri Jul 3 00:59:34 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Fri, 03 Jul 2009 14:59:34 +0900 Subject: calculation time Message-ID: <20090703145230.6ADB.SEKIKAWA@msi.co.jp> Dear PETSc/SLEPc users, I have made eigenproblem solver program with SLEPc. Currently it works well, but it takes very long time to solve big problem. with 10000x10000 random matrix, it takes about 34 hours to solve. (solver = KrylovSchur, on 64bit Linux platform, 16G memory, 1 machine) Is this ordinally time to solve problem like these size? or Is there any good way to shorten calculation time? 
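As the replies further down suggest, the usual way to cut this time is to ask for only a few eigenpairs instead of the whole spectrum. A hedged sketch, assuming the SLEPc 3.0-era EPS interface with the three-value EPSSetDimensions(nev,ncv,mpd); the wrapper name and the choice of nev=10 are purely illustrative:

#include "slepceps.h"

/* Compute a handful of eigenpairs of A with Krylov-Schur.  Keeping
   nev small (and letting SLEPc choose ncv/mpd) is what keeps Krylov
   solvers fast; -eps_nev, -eps_ncv and -eps_mpd can still override
   these values at run time via EPSSetFromOptions(). */
PetscErrorCode SolveFewEigenpairs(Mat A)
{
  EPS            eps;
  PetscErrorCode ierr;

  ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps,A,PETSC_NULL);CHKERRQ(ierr);
  ierr = EPSSetProblemType(eps,EPS_NHEP);CHKERRQ(ierr);
  ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRQ(ierr);
  ierr = EPSSetDimensions(eps,10,PETSC_DECIDE,PETSC_DECIDE);CHKERRQ(ierr);
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
  ierr = EPSSolve(eps);CHKERRQ(ierr);
  ierr = EPSDestroy(eps);CHKERRQ(ierr);
  return 0;
}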
Thanks
Takuya
---------------------------------------------------------------
  Takuya Sekikawa
        Mathematical Systems, Inc
                  sekikawa at msi.co.jp
---------------------------------------------------------------


From socrates.wei at gmail.com Fri Jul 3 02:50:27 2009
From: socrates.wei at gmail.com (Zi-Hao Wei)
Date: Fri, 3 Jul 2009 15:50:27 +0800
Subject: matrix creation on LAPACK mode
In-Reply-To: <20090703132511.6ACF.SEKIKAWA@msi.co.jp>
References: <20090703132511.6ACF.SEKIKAWA@msi.co.jp>
Message-ID: 

Hi

I remember that when you use LAPACK as eigensolver the SLEPc will
automatically convert sparse matrix into dense matrix by the function
SlepcMatConvertSeqDense.

On Fri, Jul 3, 2009 at 12:30 PM, Takuya Sekikawa wrote:
> Hello
>
> I made eigenvalue solver program with SLEPc. in my program, to
> setup matrix, I use MatCreateSeqAIJ() function.
>
> void setupMatrix(int m, int n)
> {
>         PetscErrorCode ierr;
>
>         ierr=MatCreateSeqAIJ(PETSC_COMM_WORLD, m, n, nz, PETSC_NULL,
> &g_A);
>         ...
> }
>
> Normally I select solver as KrylovSchur, but sometimes I switched solver
> to LAPACK. with using LAPACK, result seems to be no problem. but I
> suspect calculation time takes longer (because of using MatCreateSeqAIJ)
>
> Does switching matrix create function to MatCreateSeqDense() give any
> effect to speed up on LAPACK mode?
>
> Thanks,
> Takuya
> ---------------------------------------------------------------
>   Takuya Sekikawa
>         Mathematical Systems, Inc
>                   sekikawa at msi.co.jp
> ---------------------------------------------------------------
>
>
>

--
Zi-Hao Wei
Department of Mathematics
National Central University, Taiwan
Adrienne Gusoff - "Opportunity knocked. My doorman threw him out." -
http://www.brainyquote.com/quotes/authors/a/adrienne_gusoff.html

From socrates.wei at gmail.com Fri Jul 3 02:56:08 2009
From: socrates.wei at gmail.com (Zi-Hao Wei)
Date: Fri, 3 Jul 2009 15:56:08 +0800
Subject: calculation time
In-Reply-To: <20090703145230.6ADB.SEKIKAWA@msi.co.jp>
References: <20090703145230.6ADB.SEKIKAWA@msi.co.jp>
Message-ID: 

Hi

How many eigenvalues did you compute?
I think that these Krylov subspace methods, such as Krylov-Schur,
Arnoldi, Lanczos, and etc. are not suitable for finding whole spectrum.

On Fri, Jul 3, 2009 at 1:59 PM, Takuya Sekikawa wrote:
> Dear PETSc/SLEPc users,
>
> I have made eigenproblem solver program with SLEPc.
> Currently it works well, but it takes very long time to solve big
> problem.
>
> with 10000x10000 random matrix, it takes about 34 hours to solve.
> (solver = KrylovSchur, on 64bit Linux platform, 16G memory, 1 machine)
>
> Is this ordinally time to solve problem like these size?
> or Is there any good way to shorten calculation time?
>
> Thanks
> Takuya
> ---------------------------------------------------------------
>  Takuya Sekikawa
>         Mathematical Systems, Inc
>                   sekikawa at msi.co.jp
> ---------------------------------------------------------------
>
>
>

--
Zi-Hao Wei
Department of Mathematics
National Central University, Taiwan
Rita Rudner - "I was a vegetarian until I started leaning toward the sunlight."
- http://www.brainyquote.com/quotes/authors/r/rita_rudner.html From kuiper at mpia.de Fri Jul 3 03:52:55 2009 From: kuiper at mpia.de (Rolf Kuiper) Date: Fri, 3 Jul 2009 10:52:55 +0200 Subject: MPI-layout of PETSc In-Reply-To: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> Message-ID: Hi, Am 30.06.2009 um 02:24 schrieb Barry Smith: > > On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: > >> Hi PETSc users, >> >> I ran into trouble in combining my developed PETSc application with >> another code (based on another library called "ArrayLib"). >> The problem is the parallel layout for MPI, e.g. in 2D with 6 cpus >> the ArrayLib code gives the names/ranks of the local cpus first in >> y-direction, than in x (from last to first, in the same way the MPI >> arrays are called, like 3Darray[z][y][x]): >> >> y >> ^ >> | 2-4-6 >> | 1-3-5 >> |--------> x >> >> If I call DACreate() from PETSc, it will assume an ordering >> according to names/ranks first set in x-direction, than in y: >> >> y >> ^ >> | 4-5-6 >> | 1-2-3 >> |--------> x >> >> Of course, if I now communicate the boundary values, I mix up the >> domain (build by the other program). >> >> Is there a possibility / a flag to set the name of the ranks? >> Due to the fact that my application is written and working in >> curvilinear coordinates and not in cartesian, I cannot just switch >> the directions. > > What we recommend in this case is to just change the meaning of x, > y, and z when you use the PETSc DA. This does mean changing your > code that uses the PETSc DA. The code is used as a module for many codes, so I would prefer to not change the code (and the meaning of directions, that's not user- friendly), but 'just' change the communicator. > I do not understand why curvilinear coordinates has anything to do > with it. Another choice is to create a new MPI communicator that has > the different ordering of the ranks of the processors and then using > that comm to create the PETSc DA objects; then you would not need to > change your code that calls PETSc. I tried some time before to use the PetscSetCommWorld() routine, but I can't find it anymore, how can I set a new communicator in PETSc3.0? The communicator, I want to use, is the MPI_COMM_WORLD, which takes the first described ordering. Now I read that the MPI_COMM_WORLD is the default communicator for PETSc. But why is the ordering than different? Sorry for all this question, but (as you can see) I really don't understand this comm problem at the moment, Thanks for all, Rolf > Unfortunately PETSc doesn't have any way to flip how the DA > handles the layout automatically. > > Barry > >> >> Thanks a lot for your help, >> Rolf > From bsmith at mcs.anl.gov Fri Jul 3 10:56:20 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 3 Jul 2009 10:56:20 -0500 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> Message-ID: <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> In designing the PETSc DA I did not (by ignorance) follow the layout approach of the MPI cartesian MPI_Cart_create (that gives the first local cpus first in the y-direction). I had it put the first cpus in the x-direction. What you need to do is create a new communicator that changes the order of the processors so that when used by the PETSc DA they lie out in the ordering that matches the other code. You will need to read up on the MPI_Cart stuff. 
To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = yournewcom BEFORE calling PetscInitialize(). Barry On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: > Hi, > > Am 30.06.2009 um 02:24 schrieb Barry Smith: >> >> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >> >>> Hi PETSc users, >>> >>> I ran into trouble in combining my developed PETSc application >>> with another code (based on another library called "ArrayLib"). >>> The problem is the parallel layout for MPI, e.g. in 2D with 6 cpus >>> the ArrayLib code gives the names/ranks of the local cpus first in >>> y-direction, than in x (from last to first, in the same way the >>> MPI arrays are called, like 3Darray[z][y][x]): >>> >>> y >>> ^ >>> | 2-4-6 >>> | 1-3-5 >>> |--------> x >>> >>> If I call DACreate() from PETSc, it will assume an ordering >>> according to names/ranks first set in x-direction, than in y: >>> >>> y >>> ^ >>> | 4-5-6 >>> | 1-2-3 >>> |--------> x >>> >>> Of course, if I now communicate the boundary values, I mix up the >>> domain (build by the other program). >>> >>> Is there a possibility / a flag to set the name of the ranks? >>> Due to the fact that my application is written and working in >>> curvilinear coordinates and not in cartesian, I cannot just switch >>> the directions. >> >> What we recommend in this case is to just change the meaning of x, >> y, and z when you use the PETSc DA. This does mean changing your >> code that uses the PETSc DA. > > The code is used as a module for many codes, so I would prefer to > not change the code (and the meaning of directions, that's not user- > friendly), but 'just' change the communicator. > >> I do not understand why curvilinear coordinates has anything to do >> with it. Another choice is to create a new MPI communicator that >> has the different ordering of the ranks of the processors and then >> using that comm to create the PETSc DA objects; then you would not >> need to change your code that calls PETSc. > > I tried some time before to use the PetscSetCommWorld() routine, but > I can't find it anymore, how can I set a new communicator in PETSc3.0? > The communicator, I want to use, is the MPI_COMM_WORLD, which takes > the first described ordering. > Now I read that the MPI_COMM_WORLD is the default communicator for > PETSc. But why is the ordering than different? > > Sorry for all this question, but (as you can see) I really don't > understand this comm problem at the moment, > Thanks for all, > Rolf > >> Unfortunately PETSc doesn't have any way to flip how the DA >> handles the layout automatically. >> >> Barry >> >>> >>> Thanks a lot for your help, >>> Rolf >> > From bsmith at mcs.anl.gov Fri Jul 3 11:00:39 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 3 Jul 2009 11:00:39 -0500 Subject: matrix creation on LAPACK mode In-Reply-To: References: <20090703132511.6ACF.SEKIKAWA@msi.co.jp> Message-ID: <649C57BF-E0C5-4D04-9CD0-9C94BFEE562F@mcs.anl.gov> On Jul 3, 2009, at 2:50 AM, Zi-Hao Wei wrote: > Hi > > I remember that when you use LAPACK as eigensolver the SLEPc will > automatically convert sparse matrix into dense matrix by the function > SlepcMatConvertSeqDense. That conversion should be very fast so it won't affect the overall time by much. Especially since the LAPACK eigenvalues computation is order N^3 work which will swamp out any oder N^2 work. Barry > > > On Fri, Jul 3, 2009 at 12:30 PM, Takuya Sekikawa > wrote: >> Hello >> >> I made eigenvalue solver program with SLEPc. in my program, to >> setup matrix, I use MatCreateSeqAIJ() function. 
>> >> void setupMatrix(int m, int n) >> { >> PetscErrorCode ierr; >> >> ierr=MatCreateSeqAIJ(PETSC_COMM_WORLD, m, n, nz, PETSC_NULL, >> &g_A); >> ... >> } >> >> Normally I select solver as KrylovSchur, but sometimes I switched >> solver >> to LAPACK. with using LAPACK, result seems to be no problem. but I >> suspect calculation time takes longer (because of using >> MatCreateSeqAIJ) >> >> Does switching matrix create function to MatCreateSeqDense() give any >> effect to speed up on LAPACK mode? >> >> Thanks, >> Takuya >> --------------------------------------------------------------- >> Takuya Sekikawa >> Mathematical Systems, Inc >> sekikawa at msi.co.jp >> --------------------------------------------------------------- >> >> >> > > > > -- > Zi-Hao Wei > Department of Mathematics > National Central University, Taiwan > Adrienne Gusoff - "Opportunity knocked. My doorman threw him out." - > http://www.brainyquote.com/quotes/authors/a/adrienne_gusoff.html From kuiper at mpia-hd.mpg.de Fri Jul 3 12:09:00 2009 From: kuiper at mpia-hd.mpg.de (Rolf Kuiper) Date: Fri, 3 Jul 2009 19:09:00 +0200 Subject: MPI-layout of PETSc In-Reply-To: <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: Hi Barry, I tried that already with: First way by copying: MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); Second way by creating: int dims[3] = {0,0,0}; int ndims=3; MPI_Dims_create(NumberOfProcessors, ndims, dims); int false = 0; int true = 1; int periods[3] = { false, false, true }; int reorder = true; MPI_Comm MyComm; MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, &MyComm); in the end then: PETSC_COMM_WORLD = MyComm; I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. I can the coordinates of the cpus with MPI_Cart_coords(MyComm, LocalRank, ndims, coords); , but I found no way to set/rearrange these coordinates. Do you can help me in that case or have I to ask a MPI-support? Thanks for all, Rolf Am 03.07.2009 um 17:56 schrieb Barry Smith: > > In designing the PETSc DA I did not (by ignorance) follow the > layout approach of the MPI cartesian MPI_Cart_create (that gives the > first local cpus first in the y-direction). > I had it put the first cpus in the x-direction. > > What you need to do is create a new communicator that changes the > order of the processors so that when used by the PETSc DA they lie > out in the ordering that matches the other code. You will need to > read up on the MPI_Cart stuff. > > To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = > yournewcom BEFORE calling PetscInitialize(). > > Barry > > On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: > >> Hi, >> >> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>> >>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>> >>>> Hi PETSc users, >>>> >>>> I ran into trouble in combining my developed PETSc application >>>> with another code (based on another library called "ArrayLib"). >>>> The problem is the parallel layout for MPI, e.g. 
in 2D with 6 >>>> cpus the ArrayLib code gives the names/ranks of the local cpus >>>> first in y-direction, than in x (from last to first, in the same >>>> way the MPI arrays are called, like 3Darray[z][y][x]): >>>> >>>> y >>>> ^ >>>> | 2-4-6 >>>> | 1-3-5 >>>> |--------> x >>>> >>>> If I call DACreate() from PETSc, it will assume an ordering >>>> according to names/ranks first set in x-direction, than in y: >>>> >>>> y >>>> ^ >>>> | 4-5-6 >>>> | 1-2-3 >>>> |--------> x >>>> >>>> Of course, if I now communicate the boundary values, I mix up the >>>> domain (build by the other program). >>>> >>>> Is there a possibility / a flag to set the name of the ranks? >>>> Due to the fact that my application is written and working in >>>> curvilinear coordinates and not in cartesian, I cannot just >>>> switch the directions. >>> >>> What we recommend in this case is to just change the meaning of x, >>> y, and z when you use the PETSc DA. This does mean changing your >>> code that uses the PETSc DA. >> >> The code is used as a module for many codes, so I would prefer to >> not change the code (and the meaning of directions, that's not user- >> friendly), but 'just' change the communicator. >> >>> I do not understand why curvilinear coordinates has anything to do >>> with it. Another choice is to create a new MPI communicator that >>> has the different ordering of the ranks of the processors and then >>> using that comm to create the PETSc DA objects; then you would not >>> need to change your code that calls PETSc. >> >> I tried some time before to use the PetscSetCommWorld() routine, >> but I can't find it anymore, how can I set a new communicator in >> PETSc3.0? >> The communicator, I want to use, is the MPI_COMM_WORLD, which takes >> the first described ordering. >> Now I read that the MPI_COMM_WORLD is the default communicator for >> PETSc. But why is the ordering than different? >> >> Sorry for all this question, but (as you can see) I really don't >> understand this comm problem at the moment, >> Thanks for all, >> Rolf >> >>> Unfortunately PETSc doesn't have any way to flip how the DA >>> handles the layout automatically. >>> >>> Barry >>> >>>> >>>> Thanks a lot for your help, >>>> Rolf >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Jul 3 12:42:20 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 3 Jul 2009 19:42:20 +0200 Subject: calculation time In-Reply-To: <20090703145230.6ADB.SEKIKAWA@msi.co.jp> References: <20090703145230.6ADB.SEKIKAWA@msi.co.jp> Message-ID: <219AC380-90D4-45DF-8A5E-0E70F2556DF2@dsic.upv.es> On 03/07/2009, Takuya Sekikawa wrote: > Dear PETSc/SLEPc users, > > I have made eigenproblem solver program with SLEPc. > Currently it works well, but it takes very long time to solve big > problem. > > with 10000x10000 random matrix, it takes about 34 hours to solve. > (solver = KrylovSchur, on 64bit Linux platform, 16G memory, 1 machine) > > Is this ordinally time to solve problem like these size? > or Is there any good way to shorten calculation time? > > Thanks > Takuya > --------------------------------------------------------------- > Takuya Sekikawa > Mathematical Systems, Inc > sekikawa at msi.co.jp > --------------------------------------------------------------- SLEPc is intended for computing part of the spectrum of a sparse matrix. If you want to compute a few eigenpairs of a 10000 matrix, it should be very fast. 
If you want to compute a large percentage of the spectrum (30% say) then you can do it with SLEPc but need to be more careful (use appropriate values of nev, ncv and mpd parameters). Finally, if you want to compute all eigenvalues, then you should not use SLEPc. The Lapack solver in SLEPc should be used only for debugging purposes in small problems. Please read the documentation. Jose From bsmith at mcs.anl.gov Fri Jul 3 18:44:39 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 3 Jul 2009 18:44:39 -0500 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: Use MPI_Comm_split() with the same color for all processors, then use the second integer argument to indicate the new rank you want for the process. Choice the new rank so its x,y coordinate in the logical grid will match the y,x coordinate in the cartesian grid. Barry On Jul 3, 2009, at 12:09 PM, Rolf Kuiper wrote: > Hi Barry, > > I tried that already with: > First way by copying: > MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); > > Second way by creating: > int dims[3] = {0,0,0}; > int ndims=3; > MPI_Dims_create(NumberOfProcessors, ndims, dims); > int false = 0; int true = 1; > int periods[3] = { false, false, true }; > int reorder = true; > MPI_Comm MyComm; > MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, > &MyComm); > > in the end then: > PETSC_COMM_WORLD = MyComm; > > I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. > I can the coordinates of the cpus with MPI_Cart_coords(MyComm, > LocalRank, ndims, coords); , but I found no way to set/rearrange > these coordinates. > > Do you can help me in that case or have I to ask a MPI-support? > > Thanks for all, > Rolf > > > Am 03.07.2009 um 17:56 schrieb Barry Smith: >> >> In designing the PETSc DA I did not (by ignorance) follow the >> layout approach of the MPI cartesian MPI_Cart_create (that gives >> the first local cpus first in the y-direction). >> I had it put the first cpus in the x-direction. >> >> What you need to do is create a new communicator that changes the >> order of the processors so that when used by the PETSc DA they lie >> out in the ordering that matches the other code. You will need to >> read up on the MPI_Cart stuff. >> >> To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = >> yournewcom BEFORE calling PetscInitialize(). >> >> Barry >> >> On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: >> >>> Hi, >>> >>> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>>> >>>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>>> >>>>> Hi PETSc users, >>>>> >>>>> I ran into trouble in combining my developed PETSc application >>>>> with another code (based on another library called "ArrayLib"). >>>>> The problem is the parallel layout for MPI, e.g. in 2D with 6 >>>>> cpus the ArrayLib code gives the names/ranks of the local cpus >>>>> first in y-direction, than in x (from last to first, in the same >>>>> way the MPI arrays are called, like 3Darray[z][y][x]): >>>>> >>>>> y >>>>> ^ >>>>> | 2-4-6 >>>>> | 1-3-5 >>>>> |--------> x >>>>> >>>>> If I call DACreate() from PETSc, it will assume an ordering >>>>> according to names/ranks first set in x-direction, than in y: >>>>> >>>>> y >>>>> ^ >>>>> | 4-5-6 >>>>> | 1-2-3 >>>>> |--------> x >>>>> >>>>> Of course, if I now communicate the boundary values, I mix up >>>>> the domain (build by the other program). >>>>> >>>>> Is there a possibility / a flag to set the name of the ranks? 
>>>>> Due to the fact that my application is written and working in >>>>> curvilinear coordinates and not in cartesian, I cannot just >>>>> switch the directions. >>>> >>>> What we recommend in this case is to just change the meaning of >>>> x, y, and z when you use the PETSc DA. This does mean changing >>>> your code that uses the PETSc DA. >>> >>> The code is used as a module for many codes, so I would prefer to >>> not change the code (and the meaning of directions, that's not >>> user-friendly), but 'just' change the communicator. >>> >>>> I do not understand why curvilinear coordinates has anything to >>>> do with it. Another choice is to create a new MPI communicator >>>> that has the different ordering of the ranks of the processors >>>> and then using that comm to create the PETSc DA objects; then you >>>> would not need to change your code that calls PETSc. >>> >>> I tried some time before to use the PetscSetCommWorld() routine, >>> but I can't find it anymore, how can I set a new communicator in >>> PETSc3.0? >>> The communicator, I want to use, is the MPI_COMM_WORLD, which >>> takes the first described ordering. >>> Now I read that the MPI_COMM_WORLD is the default communicator for >>> PETSc. But why is the ordering than different? >>> >>> Sorry for all this question, but (as you can see) I really don't >>> understand this comm problem at the moment, >>> Thanks for all, >>> Rolf >>> >>>> Unfortunately PETSc doesn't have any way to flip how the DA >>>> handles the layout automatically. >>>> >>>> Barry >>>> >>>>> >>>>> Thanks a lot for your help, >>>>> Rolf >>>> >>> >> > From kuiper at mpia-hd.mpg.de Sat Jul 4 06:08:44 2009 From: kuiper at mpia-hd.mpg.de (Rolf Kuiper) Date: Sat, 4 Jul 2009 13:08:44 +0200 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: Thanks Barry! It's working. But by the way: You simply should offer such a second communicator inside the PETSc-library. Thanks for all your help, the support we got from this mailing list is amazing, Rolf Am 04.07.2009 um 01:44 schrieb Barry Smith: > > Use MPI_Comm_split() with the same color for all processors, then > use the second integer argument to indicate the new rank you want > for the process. > Choice the new rank so its x,y coordinate in the logical grid will > match the y,x coordinate in the cartesian grid. > > Barry > > On Jul 3, 2009, at 12:09 PM, Rolf Kuiper wrote: > >> Hi Barry, >> >> I tried that already with: >> First way by copying: >> MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); >> >> Second way by creating: >> int dims[3] = {0,0,0}; >> int ndims=3; >> MPI_Dims_create(NumberOfProcessors, ndims, dims); >> int false = 0; int true = 1; >> int periods[3] = { false, false, true }; >> int reorder = true; >> MPI_Comm MyComm; >> MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, >> &MyComm); >> >> in the end then: >> PETSC_COMM_WORLD = MyComm; >> >> I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. >> I can the coordinates of the cpus with MPI_Cart_coords(MyComm, >> LocalRank, ndims, coords); , but I found no way to set/rearrange >> these coordinates. >> >> Do you can help me in that case or have I to ask a MPI-support? 
>> >> Thanks for all, >> Rolf >> >> >> Am 03.07.2009 um 17:56 schrieb Barry Smith: >>> >>> In designing the PETSc DA I did not (by ignorance) follow the >>> layout approach of the MPI cartesian MPI_Cart_create (that gives >>> the first local cpus first in the y-direction). >>> I had it put the first cpus in the x-direction. >>> >>> What you need to do is create a new communicator that changes the >>> order of the processors so that when used by the PETSc DA they lie >>> out in the ordering that matches the other code. You will need to >>> read up on the MPI_Cart stuff. >>> >>> To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = >>> yournewcom BEFORE calling PetscInitialize(). >>> >>> Barry >>> >>> On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: >>> >>>> Hi, >>>> >>>> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>>>> >>>>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>>>> >>>>>> Hi PETSc users, >>>>>> >>>>>> I ran into trouble in combining my developed PETSc application >>>>>> with another code (based on another library called "ArrayLib"). >>>>>> The problem is the parallel layout for MPI, e.g. in 2D with 6 >>>>>> cpus the ArrayLib code gives the names/ranks of the local cpus >>>>>> first in y-direction, than in x (from last to first, in the >>>>>> same way the MPI arrays are called, like 3Darray[z][y][x]): >>>>>> >>>>>> y >>>>>> ^ >>>>>> | 2-4-6 >>>>>> | 1-3-5 >>>>>> |--------> x >>>>>> >>>>>> If I call DACreate() from PETSc, it will assume an ordering >>>>>> according to names/ranks first set in x-direction, than in y: >>>>>> >>>>>> y >>>>>> ^ >>>>>> | 4-5-6 >>>>>> | 1-2-3 >>>>>> |--------> x >>>>>> >>>>>> Of course, if I now communicate the boundary values, I mix up >>>>>> the domain (build by the other program). >>>>>> >>>>>> Is there a possibility / a flag to set the name of the ranks? >>>>>> Due to the fact that my application is written and working in >>>>>> curvilinear coordinates and not in cartesian, I cannot just >>>>>> switch the directions. >>>>> >>>>> What we recommend in this case is to just change the meaning of >>>>> x, y, and z when you use the PETSc DA. This does mean changing >>>>> your code that uses the PETSc DA. >>>> >>>> The code is used as a module for many codes, so I would prefer to >>>> not change the code (and the meaning of directions, that's not >>>> user-friendly), but 'just' change the communicator. >>>> >>>>> I do not understand why curvilinear coordinates has anything to >>>>> do with it. Another choice is to create a new MPI communicator >>>>> that has the different ordering of the ranks of the processors >>>>> and then using that comm to create the PETSc DA objects; then >>>>> you would not need to change your code that calls PETSc. >>>> >>>> I tried some time before to use the PetscSetCommWorld() routine, >>>> but I can't find it anymore, how can I set a new communicator in >>>> PETSc3.0? >>>> The communicator, I want to use, is the MPI_COMM_WORLD, which >>>> takes the first described ordering. >>>> Now I read that the MPI_COMM_WORLD is the default communicator >>>> for PETSc. But why is the ordering than different? >>>> >>>> Sorry for all this question, but (as you can see) I really don't >>>> understand this comm problem at the moment, >>>> Thanks for all, >>>> Rolf >>>> >>>>> Unfortunately PETSc doesn't have any way to flip how the DA >>>>> handles the layout automatically. 
>>>>> >>>>> Barry >>>>> >>>>>> >>>>>> Thanks a lot for your help, >>>>>> Rolf >>>>> >>>> >>> >> > From bsmith at mcs.anl.gov Sat Jul 4 12:24:43 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 4 Jul 2009 12:24:43 -0500 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: Send us the code to do the conversion and we'll include as a utility. Barry On Jul 4, 2009, at 6:08 AM, Rolf Kuiper wrote: > Thanks Barry! > It's working. But by the way: You simply should offer such a second > communicator inside the PETSc-library. > > Thanks for all your help, the support we got from this mailing list > is amazing, > Rolf > > > Am 04.07.2009 um 01:44 schrieb Barry Smith: >> >> Use MPI_Comm_split() with the same color for all processors, then >> use the second integer argument to indicate the new rank you want >> for the process. >> Choice the new rank so its x,y coordinate in the logical grid will >> match the y,x coordinate in the cartesian grid. >> >> Barry >> >> On Jul 3, 2009, at 12:09 PM, Rolf Kuiper wrote: >> >>> Hi Barry, >>> >>> I tried that already with: >>> First way by copying: >>> MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); >>> >>> Second way by creating: >>> int dims[3] = {0,0,0}; >>> int ndims=3; >>> MPI_Dims_create(NumberOfProcessors, ndims, dims); >>> int false = 0; int true = 1; >>> int periods[3] = { false, false, true }; >>> int reorder = true; >>> MPI_Comm MyComm; >>> MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, >>> &MyComm); >>> >>> in the end then: >>> PETSC_COMM_WORLD = MyComm; >>> >>> I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. >>> I can the coordinates of the cpus with MPI_Cart_coords(MyComm, >>> LocalRank, ndims, coords); , but I found no way to set/rearrange >>> these coordinates. >>> >>> Do you can help me in that case or have I to ask a MPI-support? >>> >>> Thanks for all, >>> Rolf >>> >>> >>> Am 03.07.2009 um 17:56 schrieb Barry Smith: >>>> >>>> In designing the PETSc DA I did not (by ignorance) follow the >>>> layout approach of the MPI cartesian MPI_Cart_create (that gives >>>> the first local cpus first in the y-direction). >>>> I had it put the first cpus in the x-direction. >>>> >>>> What you need to do is create a new communicator that changes the >>>> order of the processors so that when used by the PETSc DA they >>>> lie out in the ordering that matches the other code. You will >>>> need to read up on the MPI_Cart stuff. >>>> >>>> To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = >>>> yournewcom BEFORE calling PetscInitialize(). >>>> >>>> Barry >>>> >>>> On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: >>>> >>>>> Hi, >>>>> >>>>> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>>>>> >>>>>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>>>>> >>>>>>> Hi PETSc users, >>>>>>> >>>>>>> I ran into trouble in combining my developed PETSc application >>>>>>> with another code (based on another library called "ArrayLib"). >>>>>>> The problem is the parallel layout for MPI, e.g. 
in 2D with 6 >>>>>>> cpus the ArrayLib code gives the names/ranks of the local cpus >>>>>>> first in y-direction, than in x (from last to first, in the >>>>>>> same way the MPI arrays are called, like 3Darray[z][y][x]): >>>>>>> >>>>>>> y >>>>>>> ^ >>>>>>> | 2-4-6 >>>>>>> | 1-3-5 >>>>>>> |--------> x >>>>>>> >>>>>>> If I call DACreate() from PETSc, it will assume an ordering >>>>>>> according to names/ranks first set in x-direction, than in y: >>>>>>> >>>>>>> y >>>>>>> ^ >>>>>>> | 4-5-6 >>>>>>> | 1-2-3 >>>>>>> |--------> x >>>>>>> >>>>>>> Of course, if I now communicate the boundary values, I mix up >>>>>>> the domain (build by the other program). >>>>>>> >>>>>>> Is there a possibility / a flag to set the name of the ranks? >>>>>>> Due to the fact that my application is written and working in >>>>>>> curvilinear coordinates and not in cartesian, I cannot just >>>>>>> switch the directions. >>>>>> >>>>>> What we recommend in this case is to just change the meaning of >>>>>> x, y, and z when you use the PETSc DA. This does mean changing >>>>>> your code that uses the PETSc DA. >>>>> >>>>> The code is used as a module for many codes, so I would prefer >>>>> to not change the code (and the meaning of directions, that's >>>>> not user-friendly), but 'just' change the communicator. >>>>> >>>>>> I do not understand why curvilinear coordinates has anything to >>>>>> do with it. Another choice is to create a new MPI communicator >>>>>> that has the different ordering of the ranks of the processors >>>>>> and then using that comm to create the PETSc DA objects; then >>>>>> you would not need to change your code that calls PETSc. >>>>> >>>>> I tried some time before to use the PetscSetCommWorld() routine, >>>>> but I can't find it anymore, how can I set a new communicator in >>>>> PETSc3.0? >>>>> The communicator, I want to use, is the MPI_COMM_WORLD, which >>>>> takes the first described ordering. >>>>> Now I read that the MPI_COMM_WORLD is the default communicator >>>>> for PETSc. But why is the ordering than different? >>>>> >>>>> Sorry for all this question, but (as you can see) I really don't >>>>> understand this comm problem at the moment, >>>>> Thanks for all, >>>>> Rolf >>>>> >>>>>> Unfortunately PETSc doesn't have any way to flip how the DA >>>>>> handles the layout automatically. 
>>>>>> >>>>>> Barry >>>>>> >>>>>>> >>>>>>> Thanks a lot for your help, >>>>>>> Rolf >>>>>> >>>>> >>>> >>> >> > From kuiper at mpia-hd.mpg.de Sat Jul 4 16:33:33 2009 From: kuiper at mpia-hd.mpg.de (Rolf Kuiper) Date: Sat, 4 Jul 2009 23:33:33 +0200 Subject: MPI-layout of PETSc In-Reply-To: References: <59B4908D-E3F7-4286-905A-6495C4D24013@mcs.anl.gov> <201D150F-CF06-4746-A2A5-C29D213CDCCC@mcs.anl.gov> Message-ID: No problem, here is the code: // the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively // (no parallelization in direction 'dir' means dir_procs = 1) MPI_Comm NewComm; int MPI_Rank, NewRank, x,y,z; // get rank from MPI ordering: MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank); // calculate coordinates of cpus in MPI ordering: x = MPI_rank / (z_procs*y_procs); y = (MPI_rank % (z_procs*y_procs)) / z_procs; z = (MPI_rank % (z_procs*y_procs)) % z_procs; // set new rank according to PETSc ordering: NewRank = z*y_procs*x_procs + y*x_procs + x; // create communicator with new ranks according to PETSc ordering: MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm); // override the default communicator (was MPI_COMM_WORLD as default) PETSC_COMM_WORLD = NewComm; I hope, this will be useful for some of you. Ciao, Rolf ------------------------------------------------------- Rolf Kuiper Max-Planck Institute for Astronomy K?nigstuhl 17 69117 Heidelberg Office A5, Els?sser Labor Phone: 0049 (0)6221 528 350 Mail: kuiper at mpia.de Homepage: http://www.mpia.de/~kuiper ------------------------------------------------------- Am 04.07.2009 um 19:24 schrieb Barry Smith: > > Send us the code to do the conversion and we'll include as a > utility. > > Barry > > On Jul 4, 2009, at 6:08 AM, Rolf Kuiper wrote: > >> Thanks Barry! >> It's working. But by the way: You simply should offer such a second >> communicator inside the PETSc-library. >> >> Thanks for all your help, the support we got from this mailing list >> is amazing, >> Rolf >> >> >> Am 04.07.2009 um 01:44 schrieb Barry Smith: >>> >>> Use MPI_Comm_split() with the same color for all processors, then >>> use the second integer argument to indicate the new rank you want >>> for the process. >>> Choice the new rank so its x,y coordinate in the logical grid will >>> match the y,x coordinate in the cartesian grid. >>> >>> Barry >>> >>> On Jul 3, 2009, at 12:09 PM, Rolf Kuiper wrote: >>> >>>> Hi Barry, >>>> >>>> I tried that already with: >>>> First way by copying: >>>> MPI_Comm_dup(PETSC_COMM_WORLD, &MyComm); >>>> >>>> Second way by creating: >>>> int dims[3] = {0,0,0}; >>>> int ndims=3; >>>> MPI_Dims_create(NumberOfProcessors, ndims, dims); >>>> int false = 0; int true = 1; >>>> int periods[3] = { false, false, true }; >>>> int reorder = true; >>>> MPI_Comm MyComm; >>>> MPI_Cart_create(PETSC_COMM_WORLD, ndims, dims, periods, reorder, >>>> &MyComm); >>>> >>>> in the end then: >>>> PETSC_COMM_WORLD = MyComm; >>>> >>>> I test the MyComm with MPI_Topo_test(); and it is cartesian, yes. >>>> I can the coordinates of the cpus with MPI_Cart_coords(MyComm, >>>> LocalRank, ndims, coords); , but I found no way to set/rearrange >>>> these coordinates. >>>> >>>> Do you can help me in that case or have I to ask a MPI-support? >>>> >>>> Thanks for all, >>>> Rolf >>>> >>>> >>>> Am 03.07.2009 um 17:56 schrieb Barry Smith: >>>>> >>>>> In designing the PETSc DA I did not (by ignorance) follow the >>>>> layout approach of the MPI cartesian MPI_Cart_create (that gives >>>>> the first local cpus first in the y-direction). 
>>>>> I had it put the first cpus in the x-direction. >>>>> >>>>> What you need to do is create a new communicator that changes >>>>> the order of the processors so that when used by the PETSc DA >>>>> they lie out in the ordering that matches the other code. You >>>>> will need to read up on the MPI_Cart stuff. >>>>> >>>>> To change PETSC_COMM_WORLD you simply set PETSC_COMM_WORLD = >>>>> yournewcom BEFORE calling PetscInitialize(). >>>>> >>>>> Barry >>>>> >>>>> On Jul 3, 2009, at 3:52 AM, Rolf Kuiper wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Am 30.06.2009 um 02:24 schrieb Barry Smith: >>>>>>> >>>>>>> On Jun 29, 2009, at 7:07 PM, Rolf Kuiper wrote: >>>>>>> >>>>>>>> Hi PETSc users, >>>>>>>> >>>>>>>> I ran into trouble in combining my developed PETSc >>>>>>>> application with another code (based on another library >>>>>>>> called "ArrayLib"). >>>>>>>> The problem is the parallel layout for MPI, e.g. in 2D with 6 >>>>>>>> cpus the ArrayLib code gives the names/ranks of the local >>>>>>>> cpus first in y-direction, than in x (from last to first, in >>>>>>>> the same way the MPI arrays are called, like 3Darray[z][y][x]): >>>>>>>> >>>>>>>> y >>>>>>>> ^ >>>>>>>> | 2-4-6 >>>>>>>> | 1-3-5 >>>>>>>> |--------> x >>>>>>>> >>>>>>>> If I call DACreate() from PETSc, it will assume an ordering >>>>>>>> according to names/ranks first set in x-direction, than in y: >>>>>>>> >>>>>>>> y >>>>>>>> ^ >>>>>>>> | 4-5-6 >>>>>>>> | 1-2-3 >>>>>>>> |--------> x >>>>>>>> >>>>>>>> Of course, if I now communicate the boundary values, I mix up >>>>>>>> the domain (build by the other program). >>>>>>>> >>>>>>>> Is there a possibility / a flag to set the name of the ranks? >>>>>>>> Due to the fact that my application is written and working in >>>>>>>> curvilinear coordinates and not in cartesian, I cannot just >>>>>>>> switch the directions. >>>>>>> >>>>>>> What we recommend in this case is to just change the meaning >>>>>>> of x, y, and z when you use the PETSc DA. This does mean >>>>>>> changing your code that uses the PETSc DA. >>>>>> >>>>>> The code is used as a module for many codes, so I would prefer >>>>>> to not change the code (and the meaning of directions, that's >>>>>> not user-friendly), but 'just' change the communicator. >>>>>> >>>>>>> I do not understand why curvilinear coordinates has anything >>>>>>> to do with it. Another choice is to create a new MPI >>>>>>> communicator that has the different ordering of the ranks of >>>>>>> the processors and then using that comm to create the PETSc DA >>>>>>> objects; then you would not need to change your code that >>>>>>> calls PETSc. >>>>>> >>>>>> I tried some time before to use the PetscSetCommWorld() >>>>>> routine, but I can't find it anymore, how can I set a new >>>>>> communicator in PETSc3.0? >>>>>> The communicator, I want to use, is the MPI_COMM_WORLD, which >>>>>> takes the first described ordering. >>>>>> Now I read that the MPI_COMM_WORLD is the default communicator >>>>>> for PETSc. But why is the ordering than different? >>>>>> >>>>>> Sorry for all this question, but (as you can see) I really >>>>>> don't understand this comm problem at the moment, >>>>>> Thanks for all, >>>>>> Rolf >>>>>> >>>>>>> Unfortunately PETSc doesn't have any way to flip how the DA >>>>>>> handles the layout automatically. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>>> >>>>>>>> Thanks a lot for your help, >>>>>>>> Rolf >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
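A self-contained restatement of Rolf's snippet above, with the rank variable spelled consistently (the original declares MPI_Rank but then reads MPI_rank) and with MPI_COMM_WORLD split directly, since PETSC_COMM_WORLD is only being assigned here; the wrapper name and the explicit x_procs/y_procs/z_procs arguments are illustrative only:

#include <mpi.h>
#include "petsc.h"

/* Build a communicator whose ranks, laid out x-fastest the way the
   PETSc DA expects, land on the same processes as the z-fastest
   ordering used by the other code.  Call after MPI_Init() and
   before PetscInitialize(). */
void UsePermutedPetscComm(int x_procs,int y_procs,int z_procs)
{
  int      old_rank,x,y,z,new_rank;
  MPI_Comm NewComm;

  MPI_Comm_rank(MPI_COMM_WORLD,&old_rank);
  /* coordinates of this process in the other code's ordering (z fastest) */
  x = old_rank / (z_procs*y_procs);
  y = (old_rank % (z_procs*y_procs)) / z_procs;
  z = (old_rank % (z_procs*y_procs)) % z_procs;
  /* rank this process must get so that the DA's x-fastest ordering
     assigns it the same (x,y,z) block */
  new_rank = z*y_procs*x_procs + y*x_procs + x;
  /* same color for everyone; the key argument dictates the new rank */
  MPI_Comm_split(MPI_COMM_WORLD,0,new_rank,&NewComm);
  PETSC_COMM_WORLD = NewComm;   /* must happen before PetscInitialize() */
}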
URL: From enjoywm at cs.wm.edu Sun Jul 5 12:46:51 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Sun, 05 Jul 2009 13:46:51 -0400 Subject: make test Message-ID: <4A50E70B.7070208@cs.wm.edu> Hi, After making test, I received a lot of warnings and lib load failures. Running test examples to verify correct installation Possible error running C/C++ src/snes/examples/tutorials/ex19 with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 Possible error running C/C++ src/snes/examples/tutorials/ex19 with 2 MPI processes See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,1]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. 
This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. 
-------------------------------------------------------------------------- lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 Possible error running Graphics examples src/snes/examples/tutorials/ex19 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 lid velocity = 0.0016, prandtl # = 1, grashof # = 1 Number of Newton iterations = 2 Error running Fortran example src/snes/examples/tutorials/ex5f with 1 MPI process See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider
--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider
--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider
--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider
--------------------------------------------------------------------------
WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED].
This may be a real error or it may be an invalid entry in the uDAPL
Registry which is contained in the dat.conf file. Contact your local
System Administrator to confirm the availability of the interfaces in
the dat.conf file.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
[0,1,0]: uDAPL on host md was unable to find any NICs.
Another transport will be used instead, although this may result in
lower performance.
--------------------------------------------------------------------------
Number of Newton iterations = 4
Completed test examples

Thanks.

Yixun

From balay at mcs.anl.gov Sun Jul 5 12:57:28 2009
From: balay at mcs.anl.gov (Satish Balay)
Date: Sun, 5 Jul 2009 12:57:28 -0500 (CDT)
Subject: make test
In-Reply-To: <4A50E70B.7070208@cs.wm.edu>
References: <4A50E70B.7070208@cs.wm.edu>
Message-ID:

Looks like some issue with your MPI. You might want to talk with your
sysadmin about it.

Also send us some compile logs - so we know what's happening. For example:

cd src/ksp/ksp/examples/tutorials/
make ex2
mpiexec -n 2 ./ex2 [or however you are supposed to run MPI binaries on this cluster]

BTW: If you are currently doing development - don't bother with a
cluster MPI - and just use --download-mpich=1

Satish

On Sun, 5 Jul 2009, Yixun Liu wrote:

> Hi,
> After making test, I received a lot of warnings and lib load failures.
> > Yixun > From enjoywm at cs.wm.edu Sun Jul 5 13:05:17 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Sun, 05 Jul 2009 14:05:17 -0400 Subject: make test In-Reply-To: References: <4A50E70B.7070208@cs.wm.edu> Message-ID: <4A50EB5D.8090001@cs.wm.edu> I run it on my computer. md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make ex2 mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include -I/home/scratch/yixun/petsc-3.0.0-p3/include -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas -L/usr/lib64/mpi/gcc/openmpi/lib64 -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s -lpthread -ldl /bin/rm -f ex2.o md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec -n 2 ./ex2 DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. 
This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,1]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- Norm of error 0.000411674 iterations 7 Satish Balay wrote: > Looks like some issue with your MPI. You might want to talk with your > sysadmin about it. > > Also send us some compile logs - so we know whats hapenning. 
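For anyone reproducing this, here is a minimal sketch of the route Satish suggests above - rebuild PETSc against its own MPICH so that the system Open MPI (and its uDAPL providers) is taken out of the picture. The petsc-3.0.0-p3 path is taken from the logs in this thread; the compiler options are illustrative assumptions, and --download-mpich=1 is the option that matters:

  cd /home/scratch/yixun/petsc-3.0.0-p3
  # reconfigure; --download-mpich=1 builds a local MPICH instead of using the cluster MPI
  ./configure --with-cc=gcc --with-fc=gfortran --download-mpich=1
  make all
  make test

  # rebuild and rerun the example Satish asked about, using the mpiexec
  # that the PETSc build reports (the system mpiexec may still be Open MPI)
  cd src/ksp/ksp/examples/tutorials
  make ex2
  mpiexec -n 2 ./ex2

With the bundled MPICH the DAT/uDAPL warnings should disappear, since they come from the Open MPI transport layer rather than from PETSc itself.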
Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> [0,1,0]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> lid velocity = 0.0016, prandtl # = 1, grashof # = 1 >> Number of Newton iterations = 2 >> lid velocity = 0.0016, prandtl # = 1, grashof # = 1 >> Number of Newton iterations = 2 >> Possible error running Graphics examples >> src/snes/examples/tutorials/ex19 1 MPI process >> See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mthca0-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mthca0-2" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. 
>> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mlx4_0-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mlx4_0-2" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-iwarp" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> [0,1,0]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> lid velocity = 0.0016, prandtl # = 1, grashof # = 1 >> Number of Newton iterations = 2 >> lid velocity = 0.0016, prandtl # = 1, grashof # = 1 >> Number of Newton iterations = 2 >> Error running Fortran example src/snes/examples/tutorials/ex5f with 1 >> MPI process >> See http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. 
>> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mthca0-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mthca0-2" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mlx4_0-1" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-mlx4_0-2" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-iwarp" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> -------------------------------------------------------------------------- >> -------------------------------------------------------------------------- >> [0,1,0]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> Number of Newton iterations = 4 >> Completed test examples >> >> >> Thanks. 
>> >> Yixun >> >> > > From balay at mcs.anl.gov Sun Jul 5 13:17:24 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Sun, 5 Jul 2009 13:17:24 -0500 (CDT) Subject: make test In-Reply-To: <4A50EB5D.8090001@cs.wm.edu> References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> Message-ID: On Sun, 5 Jul 2009, Yixun Liu wrote: > I run it on my computer. > > md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make > ex2 > > mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve > -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include > -I/home/scratch/yixun/petsc-3.0.0-p3/include > -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 > -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c > mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o > -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib > -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp > -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas > -L/usr/lib64/mpi/gcc/openmpi/lib64 > -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 > -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl > -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran > -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin > -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s > -lpthread -ldl > /bin/rm -f ex2.o Did you install this OpenMPI - or did someone-else/sysadmin install it for you? > > md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec > -n 2 ./ex2 > > DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: > dat_registry_add_provider > DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: > dat_registry_add_provider > -------------------------------------------------------------------------- > > WARNING: Failed to open "OpenIB-cma" > [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. > This may be a real error or it may be an invalid entry in the uDAPL > Registry which is contained in the dat.conf file. Contact your local > System Administrator to confirm the availability of the interfaces in > the dat.conf file. Your mpiexec is trying to run on infiniban and failing? > -------------------------------------------------------------------------- > [0,1,1]: uDAPL on host md was unable to find any NICs. > Another transport will be used instead, although this may result in > lower performance. > -------------------------------------------------------------------------- > Norm of error 0.000411674 iterations 7 And then it attempts 'sockets' - and then successfully runs the PETSc example.. So something is wrong with your mpi usage. I guess - you'll have to check with your sysadmin - how to correctly use infiniband.. Satish From enjoywm at cs.wm.edu Sun Jul 5 13:27:52 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Sun, 05 Jul 2009 14:27:52 -0400 Subject: make test In-Reply-To: References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> Message-ID: <4A50F0A8.1010700@cs.wm.edu> Satish Balay wrote: > On Sun, 5 Jul 2009, Yixun Liu wrote: > > >> I run it on my computer. 
>> >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make >> ex2 >> >> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve >> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include >> -I/home/scratch/yixun/petsc-3.0.0-p3/include >> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 >> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c >> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o >> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib >> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp >> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas >> -L/usr/lib64/mpi/gcc/openmpi/lib64 >> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 >> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl >> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran >> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin >> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s >> -lpthread -ldl >> /bin/rm -f ex2.o >> > > Did you install this OpenMPI - or did someone-else/sysadmin install it for you? > Sysadmin install it. They let me set LD_LIBRARY_PATH to /usr/lib64/mpi/gcc/openmpi/lib64, but it still doesn't work. > >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec >> -n 2 ./ex2 >> >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> > > >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> > > Your mpiexec is trying to run on infiniban and failing? > > >> -------------------------------------------------------------------------- >> [0,1,1]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> Norm of error 0.000411674 iterations 7 >> > > And then it attempts 'sockets' - and then successfully runs the PETSc example.. > > So something is wrong with your mpi usage. I guess - you'll have to > check with your sysadmin - how to correctly use infiniband.. > > Satish > > From jed at 59A2.org Sun Jul 5 13:33:28 2009 From: jed at 59A2.org (Jed Brown) Date: Sun, 05 Jul 2009 20:33:28 +0200 Subject: make test In-Reply-To: <4A50F0A8.1010700@cs.wm.edu> References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> <4A50F0A8.1010700@cs.wm.edu> Message-ID: <4A50F1F8.40202@59A2.org> Yixun Liu wrote: > Satish Balay wrote: >> On Sun, 5 Jul 2009, Yixun Liu wrote: >> >> >>> I run it on my computer. 
>>> >>> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make >>> ex2 >>> >>> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >>> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve >>> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include >>> -I/home/scratch/yixun/petsc-3.0.0-p3/include >>> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 >>> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c >>> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o >>> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib >>> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp >>> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas >>> -L/usr/lib64/mpi/gcc/openmpi/lib64 >>> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 >>> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl >>> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran >>> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin >>> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s >>> -lpthread -ldl >>> /bin/rm -f ex2.o >>> >> Did you install this OpenMPI - or did someone-else/sysadmin install it for you? >> > Sysadmin install it. They let me set LD_LIBRARY_PATH to > /usr/lib64/mpi/gcc/openmpi/lib64, but it still doesn't work. How about running with 'make runex2_2' or /usr/lib64/mpi/gcc/openmpi/bin/mpiexec? Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From enjoywm at cs.wm.edu Sun Jul 5 14:01:33 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Sun, 05 Jul 2009 15:01:33 -0400 Subject: make test In-Reply-To: <4A50F1F8.40202@59A2.org> References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> <4A50F0A8.1010700@cs.wm.edu> <4A50F1F8.40202@59A2.org> Message-ID: <4A50F88D.9010200@cs.wm.edu> It has the same errors when I use /usr/lib64/mpi/gcc/openmpi/bin/mpiexec. md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>/usr/lib64/mpi/gcc/openmpi/bin/mpiexec -np 2 ./ex2 DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. 
This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,0]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. 
-------------------------------------------------------------------------- DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplscm.so.1: undefined symbol: dat_registry_add_provider DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: dat_registry_add_provider -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-cma-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mthca0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-1" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-mlx4_0-2" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. 
-------------------------------------------------------------------------- -------------------------------------------------------------------------- WARNING: Failed to open "OpenIB-iwarp" [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. This may be a real error or it may be an invalid entry in the uDAPL Registry which is contained in the dat.conf file. Contact your local System Administrator to confirm the availability of the interfaces in the dat.conf file. -------------------------------------------------------------------------- -------------------------------------------------------------------------- [0,1,1]: uDAPL on host md was unable to find any NICs. Another transport will be used instead, although this may result in lower performance. -------------------------------------------------------------------------- Norm of error 0.000411674 iterations 7 Jed Brown wrote: > Yixun Liu wrote: > >> Satish Balay wrote: >> >>> On Sun, 5 Jul 2009, Yixun Liu wrote: >>> >>> >>> >>>> I run it on my computer. >>>> >>>> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make >>>> ex2 >>>> >>>> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >>>> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve >>>> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include >>>> -I/home/scratch/yixun/petsc-3.0.0-p3/include >>>> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 >>>> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c >>>> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o >>>> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib >>>> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp >>>> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas >>>> -L/usr/lib64/mpi/gcc/openmpi/lib64 >>>> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 >>>> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl >>>> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran >>>> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin >>>> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s >>>> -lpthread -ldl >>>> /bin/rm -f ex2.o >>>> >>>> >>> Did you install this OpenMPI - or did someone-else/sysadmin install it for you? >>> >>> >> Sysadmin install it. They let me set LD_LIBRARY_PATH to >> /usr/lib64/mpi/gcc/openmpi/lib64, but it still doesn't work. >> > > How about running with 'make runex2_2' or > /usr/lib64/mpi/gcc/openmpi/bin/mpiexec? > > Jed > > From vyan2000 at gmail.com Mon Jul 6 14:22:51 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 6 Jul 2009 15:22:51 -0400 Subject: PCFIELDSPLIT Message-ID: Hi, All, I am reading a large Block Compressed Row Storage PETSc from an application into PETSc binary files. And I use matload to load this PETSc binar matrix as mpiaij. Since the matrix is resulting from a finite volume discretization with degree of freedom 5 at each cell center, what I am going to is use pcfieldsplit and PCFieldSplitGetSubKSP. For each filed I want to use the pc type hypre, and hypre type euclid. My question is: is there any way to send this euclid information by a function call, instead of command line parameter. The thing is that I want to save some typing, just in case that there are ten fields. The way that I am using now is "-fieldsplit_4_sub_pc_type hypre, -fieldsplit_4_sub_pc_hypre_type euclid". Notice that PCSetType can only pass in the "PCHYPRE". 
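Roughly, the kind of thing I am hoping I can write is sketched below.
This is only an illustration of the call sequence I have in mind;
PetscOptionsSetValue is my guess at the routine to use, and nfields is
made up for the example:

   /* sketch: set hypre/euclid for every split programmatically,
      instead of typing -fieldsplit_i_sub_pc_type hypre etc. by hand;
      assumes the usual petscksp.h include plus <stdio.h> */
   char           key[128];
   PetscInt       i, nfields = 10;
   PetscErrorCode ierr;
   for (i = 0; i < nfields; i++) {
     sprintf(key, "-fieldsplit_%d_sub_pc_type", (int)i);
     ierr = PetscOptionsSetValue(key, "hypre");CHKERRQ(ierr);
     sprintf(key, "-fieldsplit_%d_sub_pc_hypre_type", (int)i);
     ierr = PetscOptionsSetValue(key, "euclid");CHKERRQ(ierr);
   }

Is something like that supported, or is there a cleaner way to set this
per field?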
Thank you very much, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jul 6 14:44:50 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 6 Jul 2009 14:44:50 -0500 Subject: PCFIELDSPLIT In-Reply-To: References: Message-ID: On Jul 6, 2009, at 2:22 PM, Ryan Yan wrote: > Hi, All, > I am reading a large Block Compressed Row Storage PETSc from an > application into PETSc binary files. > > And I use matload to load this PETSc binar matrix as mpiaij. Since > the matrix is resulting from a finite volume discretization with > degree of freedom 5 at each cell center, what I am going to is use > pcfieldsplit and PCFieldSplitGetSubKSP. For each filed I want to > use the pc type hypre, and hypre type euclid. > > My question is: is there any way to send this euclid information by > a function call, instead of command line parameter. The thing is > that I want to save some typing, just in case that there are ten > fields. The way that I am using now is "-fieldsplit_4_sub_pc_type > hypre, -fieldsplit_4_sub_pc_hypre_type euclid". > You can put them in a file called .petscrc or another file and list that filename in PetscInitialize() You can call PetscOptionsSet("- fieldsplit_4_sub_pc_hypre_type","euclid"); in your code right after PetscInitialize(). Barry > > Notice that PCSetType can only pass in the "PCHYPRE". > > Thank you very much, > > Yan From enjoywm at cs.wm.edu Tue Jul 7 12:48:13 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Tue, 07 Jul 2009 13:48:13 -0400 Subject: make test In-Reply-To: References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> Message-ID: <4A538A5D.6000606@cs.wm.edu> Hi, I use the command, ./config/configure.py --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 and make test success. But when I compile my Petsc-based application I got the following errors, Linking CXX executable ../../../bin/PETScSolver /home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib/libpetscvec.a(vpscat.o): In function `VecScatterCreateCommon_PtoS': /home/scratch/yixun/petsc-3.0.0-p3/src/vec/vec/utils/vpscat.c:1770: undefined reference to `MPI_Type_create_indexed_block' /home/scratch/yixun/petsc-3.0.0-p3/src/vec/vec/utils/vpscat.c:1792: undefined reference to `MPI_Type_create_indexed_block' collect2: ld returned 1 exit status /usr/bin/mpiCC: No such file or directory gmake[2]: *** [bin/PETScSolver] Error 1 gmake[1]: *** [PersoPkgs/oclatzPkg/MeshRegister/CMakeFiles/PETScSolver.dir/all] Error 2 gmake: *** [all] Error 2 Does it mean that I need to set LD_LIBRARY_PATH to MPICH2 installation path? Thanks. Satish Balay wrote: > On Sun, 5 Jul 2009, Yixun Liu wrote: > > >> I run it on my computer. 
>> >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make >> ex2 >> >> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 >> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve >> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include >> -I/home/scratch/yixun/petsc-3.0.0-p3/include >> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 >> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c >> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o >> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib >> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp >> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas >> -L/usr/lib64/mpi/gcc/openmpi/lib64 >> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 >> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl >> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran >> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin >> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s >> -lpthread -ldl >> /bin/rm -f ex2.o >> > > Did you install this OpenMPI - or did someone-else/sysadmin install it for you? > > >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec >> -n 2 ./ex2 >> >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: >> dat_registry_add_provider >> > > >> -------------------------------------------------------------------------- >> >> WARNING: Failed to open "OpenIB-cma" >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. >> This may be a real error or it may be an invalid entry in the uDAPL >> Registry which is contained in the dat.conf file. Contact your local >> System Administrator to confirm the availability of the interfaces in >> the dat.conf file. >> > > Your mpiexec is trying to run on infiniban and failing? > > >> -------------------------------------------------------------------------- >> [0,1,1]: uDAPL on host md was unable to find any NICs. >> Another transport will be used instead, although this may result in >> lower performance. >> -------------------------------------------------------------------------- >> Norm of error 0.000411674 iterations 7 >> > > And then it attempts 'sockets' - and then successfully runs the PETSc example.. > > So something is wrong with your mpi usage. I guess - you'll have to > check with your sysadmin - how to correctly use infiniband.. > > Satish > > From balay at mcs.anl.gov Tue Jul 7 12:54:50 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 7 Jul 2009 12:54:50 -0500 (CDT) Subject: make test In-Reply-To: <4A538A5D.6000606@cs.wm.edu> References: <4A50E70B.7070208@cs.wm.edu> <4A50EB5D.8090001@cs.wm.edu> <4A538A5D.6000606@cs.wm.edu> Message-ID: > /usr/bin/mpiCC: No such file or directory You are using --downlod-mpich with PETSc - but compiling your code wiht mpiCC from a different MPI install? It won't work. Is your code c++? If so - sugest building PETSc with additional options: '--with-cxx=g++ --with-clanguage=cxx' And then use PETSc Makefile format for your appliation code [that sets all make variables and targets needed to build PETSc applications]. 
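A minimal application makefile in that format looks roughly like the
sketch below. I am writing this from memory, so treat it only as a
sketch: copy the exact 'include' line from the example makefile
mentioned below, and replace PETScSolver with your own target and
source names.

  CFLAGS   =
  FFLAGS   =
  CPPFLAGS =
  FPPFLAGS =

  include ${PETSC_DIR}/conf/base

  PETScSolver: PETScSolver.o chkopts
  	-${CLINKER} -o PETScSolver PETScSolver.o ${PETSC_KSP_LIB}
  	${RM} PETScSolver.o

[With --with-clanguage=cxx the ${CLINKER} is the C++ compiler, so the
same makefile also works for C++ sources.]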
For eg: check src/ksp/ksp/examples/tutorials/makefile Satish On Tue, 7 Jul 2009, Yixun Liu wrote: > Hi, > I use the command, > ./config/configure.py --with-cc=gcc --with-fc=gfortran --download-f-blas-lapack=1 --download-mpich=1 > > and make test success. > > But when I compile my Petsc-based application I got the following errors, > > > Linking CXX executable ../../../bin/PETScSolver > /home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib/libpetscvec.a(vpscat.o): > In function `VecScatterCreateCommon_PtoS': > /home/scratch/yixun/petsc-3.0.0-p3/src/vec/vec/utils/vpscat.c:1770: > undefined reference to `MPI_Type_create_indexed_block' > /home/scratch/yixun/petsc-3.0.0-p3/src/vec/vec/utils/vpscat.c:1792: > undefined reference to `MPI_Type_create_indexed_block' > collect2: ld returned 1 exit status > /usr/bin/mpiCC: No such file or directory > gmake[2]: *** [bin/PETScSolver] Error 1 > gmake[1]: *** > [PersoPkgs/oclatzPkg/MeshRegister/CMakeFiles/PETScSolver.dir/all] Error 2 > gmake: *** [all] Error 2 > > > > Does it mean that I need to set LD_LIBRARY_PATH to MPICH2 installation path? > > Thanks. > > > > > > > > > > > > > Satish Balay wrote: > > On Sun, 5 Jul 2009, Yixun Liu wrote: > > > > > >> I run it on my computer. > >> > >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>make > >> ex2 > >> > >> mpicc -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > >> -I/home/scratch/yixun/petsc-3.0.0-p3/src/dm/mesh/sieve > >> -I/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/include > >> -I/home/scratch/yixun/petsc-3.0.0-p3/include > >> -I/usr/lib64/mpi/gcc/openmpi/include -I/usr/lib64/mpi/gcc/openmpi/lib64 > >> -D__SDIR__="src/ksp/ksp/examples/tutorials/" ex2.c > >> mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 -o ex2 ex2.o > >> -Wl,-rpath,/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib > >> -L/home/scratch/yixun/petsc-3.0.0-p3/linux-gnu-c-debug/lib -lpetscksp > >> -lpetscdm -lpetscmat -lpetscvec -lpetsc -lX11 -llapack -lblas > >> -L/usr/lib64/mpi/gcc/openmpi/lib64 > >> -L/usr/lib64/gcc/x86_64-suse-linux/4.3 -L/usr/lib64 -L/lib64 > >> -L/usr/x86_64-suse-linux/lib -ldl -lmpi -lopen-rte -lopen-pal -lnsl > >> -lutil -lgcc_s -lpthread -lmpi_f90 -lmpi_f77 -lgfortranbegin -lgfortran > >> -lm -lm -L/usr/lib64/gcc/x86_64-suse-linux -L/usr/x86_64-suse-linux/bin > >> -L/lib -lm -lm -ldl -lmpi -lopen-rte -lopen-pal -lnsl -lutil -lgcc_s > >> -lpthread -ldl > >> /bin/rm -f ex2.o > >> > > > > Did you install this OpenMPI - or did someone-else/sysadmin install it for you? > > > > > >> md[/home/scratch/yixun/petsc-3.0.0-p3/src/ksp/ksp/examples/tutorials>mpiexec > >> -n 2 ./ex2 > >> > >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: > >> dat_registry_add_provider > >> DAT: library load failure: /usr/lib64/libdaplcma.so.1: undefined symbol: > >> dat_registry_add_provider > >> > > > > > >> -------------------------------------------------------------------------- > >> > >> WARNING: Failed to open "OpenIB-cma" > >> [DAT_PROVIDER_NOT_FOUND:DAT_NAME_NOT_REGISTERED]. > >> This may be a real error or it may be an invalid entry in the uDAPL > >> Registry which is contained in the dat.conf file. Contact your local > >> System Administrator to confirm the availability of the interfaces in > >> the dat.conf file. > >> > > > > Your mpiexec is trying to run on infiniban and failing? > > > > > >> -------------------------------------------------------------------------- > >> [0,1,1]: uDAPL on host md was unable to find any NICs. 
> >> Another transport will be used instead, although this may result in > >> lower performance. > >> -------------------------------------------------------------------------- > >> Norm of error 0.000411674 iterations 7 > >> > > > > And then it attempts 'sockets' - and then successfully runs the PETSc example.. > > > > So something is wrong with your mpi usage. I guess - you'll have to > > check with your sysadmin - how to correctly use infiniband.. > > > > Satish > > > > > > From luitjens at cs.utah.edu Tue Jul 7 15:01:39 2009 From: luitjens at cs.utah.edu (Justin Luitjens) Date: Tue, 7 Jul 2009 14:01:39 -0600 Subject: PCILUSetFill in 3.0.0 Message-ID: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> Hi, We are trying to make our code 3.0.0 compliant. We are currently using versions in the 2.3.* range. We currently have a call to PCILUSetFill in order to preallocate memory. What is the equivalent to this call in 3.0.0? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From tyoung at ippt.gov.pl Tue Jul 7 15:18:51 2009 From: tyoung at ippt.gov.pl (Toby D. Young) Date: Tue, 7 Jul 2009 22:18:51 +0200 (CEST) Subject: PCILUSetFill in 3.0.0 In-Reply-To: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> References: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> Message-ID: > We are trying to make our code 3.0.0 compliant. We are currently using > versions in the 2.3.* range. We currently have a call to PCILUSetFill in > order to preallocate memory. What is the equivalent to this call in 3.0.0? Allocating memory for PETSc is a pain in the ass. Check the documentation or (better) ask Barry Smith directly. Then let me know about it and I will write a patch for us deal.ii.ers. I will gladly submit a patch for this. Throw something at me.... like an error message???? ;-) Cheers, Toby ----- Toby D. Young Philosopher-Physicist Adiunkt (Assistant Professor) Polish Academy of Sciences Warszawa, Polska www: http://www.ippt.gov.pl/~tyoung skype: stenografia From bsmith at mcs.anl.gov Tue Jul 7 15:20:21 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 7 Jul 2009 15:20:21 -0500 Subject: PCILUSetFill in 3.0.0 In-Reply-To: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> References: <913d17a50907071301r230c0d30m9a12a33368c40795@mail.gmail.com> Message-ID: <91E99583-E7CB-4756-BEC7-890AB166D8FD@mcs.anl.gov> PCFactorSetFill(). Essentially we introduced a factor class that took all the methods common to the various PCILUXXX, PCICCXXX, PCLUXXX, ... objects and put them together. Barry On Jul 7, 2009, at 3:01 PM, Justin Luitjens wrote: > Hi, > > We are trying to make our code 3.0.0 compliant. We are currently > using versions in the 2.3.* range. We currently have a call to > PCILUSetFill in order to preallocate memory. What is the equivalent > to this call in 3.0.0? > > Thanks, > Justin From john.fettig at gmail.com Tue Jul 7 15:20:50 2009 From: john.fettig at gmail.com (John Fettig) Date: Tue, 7 Jul 2009 15:20:50 -0500 Subject: MatGetSubMatrix performance Message-ID: What kind of performance should one expect with MatGetSubMatrix on a MPIAIJ matrix, and is there anything that I need to know to get the best performance? Or is this routine best avoided? I currently use it, but find that performance varies widely from call to call. One time it will take 0.25 seconds, another time it will take 185 seconds, and I can't figure out what would cause such a disparity. 
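For reference, the call I am timing is essentially the following
(simplified, and I may be misremembering the exact argument list; the
index sets isrow/iscol change from call to call):

   Mat            A, Asub;
   IS             isrow, iscol;
   PetscLogDouble t0, t1;
   PetscErrorCode ierr;
   /* ... A (MPIAIJ), isrow and iscol are already built ... */
   ierr = PetscGetTime(&t0);CHKERRQ(ierr);
   ierr = MatGetSubMatrix(A, isrow, iscol, PETSC_DECIDE,
                          MAT_INITIAL_MATRIX, &Asub);CHKERRQ(ierr);
   ierr = PetscGetTime(&t1);CHKERRQ(ierr);
   ierr = PetscPrintf(PETSC_COMM_WORLD,
                      "MatGetSubMatrix: %g s\n", t1 - t0);CHKERRQ(ierr);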
John From bsmith at mcs.anl.gov Tue Jul 7 15:33:55 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 7 Jul 2009 15:33:55 -0500 Subject: MatGetSubMatrix performance In-Reply-To: References: Message-ID: I've found that it is generally much faster than the numerical parts of the code (for example if you use a MatGetSubmatrix to select a big chunk of the matrix and then solve a linear system on that chunk the get submatrix may take 5 percent of the time to solve the system). So, in general, I don't think there is a reason to avoid it. Are you getting a huge difference in time for the exact same submatrix? This would surprise me. A cluster with gigabyte ethernet will also be slow. The performance will get bad for a poor load balance of the gotten submatrix. For example if some processes get huge chunks of other processes values it will be slow. Generally you want most of the gotten rows to live on the same process they are gotten from. Barry On Jul 7, 2009, at 3:20 PM, John Fettig wrote: > What kind of performance should one expect with MatGetSubMatrix on a > MPIAIJ matrix, and is there anything that I need to know to get the > best performance? Or is this routine best avoided? I currently use > it, but find that performance varies widely from call to call. One > time it will take 0.25 seconds, another time it will take 185 seconds, > and I can't figure out what would cause such a disparity. > > John From yfeng1 at tigers.lsu.edu Wed Jul 8 13:24:05 2009 From: yfeng1 at tigers.lsu.edu (Yin Feng) Date: Wed, 8 Jul 2009 13:24:05 -0500 Subject: A question about parallel computation Message-ID: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> I am a beginner of PETSc. I tried the PETSC example 5(ex5) with 4 nodes, However, it seems every nodes doing the exactly the same things and output the same results again and again. is this the problem of petsc or MPI installation? Thank you in adcance! Sincerely, YIN From balay at mcs.anl.gov Wed Jul 8 13:26:26 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 8 Jul 2009 13:26:26 -0500 (CDT) Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> Message-ID: Perhaps you are using the wrong mpiexec or mpirun. You'll have to use the correspond mpiexec from MPI you've used to build PETSc. Or if the MPI has special instruction on usage - you should follow that [for ex: some clusters require extra options to mpiexec ] Satish On Wed, 8 Jul 2009, Yin Feng wrote: > I am a beginner of PETSc. > I tried the PETSC example 5(ex5) with 4 nodes, > However, it seems every nodes doing the exactly the same things and > output the same results again and again. is this the problem of petsc or > MPI installation? > > Thank you in adcance! > > Sincerely, > YIN > From enjoywm at cs.wm.edu Wed Jul 8 14:49:20 2009 From: enjoywm at cs.wm.edu (Yixun Liu) Date: Wed, 08 Jul 2009 15:49:20 -0400 Subject: rebuild petsc Message-ID: <4A54F840.8060002@cs.wm.edu> Hi, I want to clean the configuration generated at last time. Which command should I use? Thanks. 
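(My guess is that it is enough to remove the old build directory under
PETSC_DIR, e.g. rm -rf linux-gnu-c-debug in my case, and then re-run
config/configure.py, but I want to confirm before deleting anything.)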
Yixun From balay at mcs.anl.gov Wed Jul 8 15:47:30 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 8 Jul 2009 15:47:30 -0500 (CDT) Subject: rebuild petsc In-Reply-To: <4A54F840.8060002@cs.wm.edu> References: <4A54F840.8060002@cs.wm.edu> Message-ID: rm -rf PETSC_ARCH Satish On Wed, 8 Jul 2009, Yixun Liu wrote: > Hi, > I want to clean the configuration generated at last time. Which command > should I use? > > Thanks. > > Yixun > From yfeng1 at tigers.lsu.edu Wed Jul 8 15:48:42 2009 From: yfeng1 at tigers.lsu.edu (Yin Feng) Date: Wed, 8 Jul 2009 15:48:42 -0500 Subject: A question about parallel computation In-Reply-To: References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> Message-ID: <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. But, when I check the load on each node, I found the master node take all the load and others are just free. Did you have any idea about this situation? Thanks in adcance! Sincerely, YIN On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > the correspond mpiexec from MPI you've used to build PETSc. > > Or if the MPI has special instruction on usage - you should follow > that [for ex: some clusters require extra options to mpiexec ] > > Satish > > On Wed, 8 Jul 2009, Yin Feng wrote: > >> I am a beginner of PETSc. >> I tried the PETSC example 5(ex5) with 4 nodes, >> However, it seems every nodes doing the exactly the same things and >> output the same results again and again. is this the problem of petsc or >> MPI installation? >> >> Thank you in adcance! >> >> Sincerely, >> YIN >> > > From balay at mcs.anl.gov Wed Jul 8 16:01:39 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 8 Jul 2009 16:01:39 -0500 (CDT) Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> Message-ID: Sounds like openmpi configuration issue. Perhaps you need to configure hostfile for it? You can try '--default-hostfile' option for mpiexec. Also - you should figure out OpenMPI issues with a simple MPI test code [like cpi.c] - not PETSc. Satish On Wed, 8 Jul 2009, Yin Feng wrote: > I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. > But, when I check the load on each node, I found the master node take > all the load > and others are just free. > > Did you have any idea about this situation? > > Thanks in adcance! > > Sincerely, > YIN > > On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > > the correspond mpiexec from MPI you've used to build PETSc. > > > > Or if the MPI has special instruction on usage - you should follow > > that [for ex: some clusters require extra options to mpiexec ] > > > > Satish > > > > On Wed, 8 Jul 2009, Yin Feng wrote: > > > >> I am a beginner of PETSc. > >> I tried the PETSC example 5(ex5) with 4 nodes, > >> However, it seems every nodes doing the exactly the same things and > >> output the same results again and again. is this the problem of petsc or > >> MPI installation? > >> > >> Thank you in adcance! 
> >> > >> Sincerely, > >> YIN > >> > > > > > From chianshin at gmail.com Wed Jul 8 16:15:18 2009 From: chianshin at gmail.com (Xin Qian) Date: Wed, 8 Jul 2009 17:15:18 -0400 Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> Message-ID: You can try to run sole MPI samples coming with OpenMPI first, make sure the OpenMPI is running all right. Thanks, Xin Qian On Wed, Jul 8, 2009 at 4:48 PM, Yin Feng wrote: > I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. > But, when I check the load on each node, I found the master node take > all the load > and others are just free. > > Did you have any idea about this situation? > > Thanks in adcance! > > Sincerely, > YIN > > On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > > the correspond mpiexec from MPI you've used to build PETSc. > > > > Or if the MPI has special instruction on usage - you should follow > > that [for ex: some clusters require extra options to mpiexec ] > > > > Satish > > > > On Wed, 8 Jul 2009, Yin Feng wrote: > > > >> I am a beginner of PETSc. > >> I tried the PETSC example 5(ex5) with 4 nodes, > >> However, it seems every nodes doing the exactly the same things and > >> output the same results again and again. is this the problem of petsc or > >> MPI installation? > >> > >> Thank you in adcance! > >> > >> Sincerely, > >> YIN > >> > > > > > -- QIAN, Xin (http://pubpages.unh.edu/~xqian/) xqian at unh.edu chianshin at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From yfeng1 at tigers.lsu.edu Thu Jul 9 00:02:37 2009 From: yfeng1 at tigers.lsu.edu (Yin Feng) Date: Thu, 9 Jul 2009 00:02:37 -0500 Subject: A question about parallel computation In-Reply-To: References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> Message-ID: <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> Firstly, thanks for all your replies! I changed compiler to MPICH and tried a sample successfully but the problem is still there. I ran my code in 4 nodes and each node have 8 processors. And the information I saw is like: NODE LOAD 0 32 1 0 2 0 3 0 Normally, in that case, we should see is: NODE LOAD 0 8 1 8 2 8 3 8 So, anyone got any idea about this? Thank you in advance! Sincerely, YIN On Wed, Jul 8, 2009 at 4:15 PM, Xin Qian wrote: > You can try to run sole MPI samples coming with OpenMPI first, make sure the > OpenMPI is running all right. > > Thanks, > > Xin Qian > > On Wed, Jul 8, 2009 at 4:48 PM, Yin Feng wrote: >> >> I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. >> But, when I check the load on each node, I found the master node take >> all the load >> and others are just free. >> >> Did you have any idea about this situation? >> >> Thanks in adcance! >> >> Sincerely, >> YIN >> >> On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: >> > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use >> > the correspond mpiexec from MPI you've used to build PETSc. >> > >> > Or if the MPI has special instruction on usage - you should follow >> > that [for ex: some clusters require extra options to mpiexec ] >> > >> > Satish >> > >> > On Wed, 8 Jul 2009, Yin Feng wrote: >> > >> >> I am a beginner of PETSc. 
>> >> I tried the PETSC example 5(ex5) with 4 nodes, >> >> However, it seems every nodes doing the exactly the same things and >> >> output the same results again and again. is this the problem of petsc >> >> or >> >> MPI installation? >> >> >> >> Thank you in adcance! >> >> >> >> Sincerely, >> >> YIN >> >> >> > >> > > > > > -- > QIAN, Xin (http://pubpages.unh.edu/~xqian/) > xqian at unh.edu chianshin at gmail.com > From sekikawa at msi.co.jp Thu Jul 9 02:50:24 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Thu, 09 Jul 2009 16:50:24 +0900 Subject: PETSc configure with Intel-compiler static linking Message-ID: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> Hello petsc users, I need to know how to configure PETSc with Intel-compiler (icc/icpc) on static linking. shared linking is just fine, but I need to build PETSc with static-linking. so I tried several description. [1] $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} environment variable MKL_DIR is set to intel MKL library directory. this one is fine, (also compiling and running is ok) but is spite of "--with-shared=0" flag, executable still link with .so (libmkl_lapack.so, etc) so I tried another one: [2] $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a this time configure.py failed: ********************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------- You set a value for --with-blas-lapack-lib=, but ['/opt/intel/mkl/10.0.010/lib/em64t/libmkl_lapack.a'] cannot be used ********************************************************************************* Could someone give me good advice? (or examples are greatly appriciated) Thanks in advance Takuya From knepley at gmail.com Thu Jul 9 06:08:00 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 9 Jul 2009 06:08:00 -0500 Subject: PETSc configure with Intel-compiler static linking In-Reply-To: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> References: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> Message-ID: For any configure problem, you MUST send configure.log or we have no idea what happened. Matt On Thu, Jul 9, 2009 at 2:50 AM, Takuya Sekikawa wrote: > Hello petsc users, > > I need to know how to configure PETSc with Intel-compiler (icc/icpc) on > static linking. > > shared linking is just fine, > but I need to build PETSc with static-linking. so I tried several > description. > > [1] > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} > > environment variable MKL_DIR is set to intel MKL library directory. 
> this one is fine, (also compiling and running is ok) > but is spite of "--with-shared=0" flag, executable still link with .so > (libmkl_lapack.so, etc) > > so I tried another one: > > [2] > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a > > this time configure.py failed: > > > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-lib=, but > ['/opt/intel/mkl/10.0.010/lib/em64t/libmkl_lapack.a'] cannot be used > > ********************************************************************************* > > Could someone give me good advice? (or examples are greatly appriciated) > > Thanks in advance > > Takuya > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jul 9 06:15:49 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 9 Jul 2009 06:15:49 -0500 Subject: rebuild petsc In-Reply-To: References: <4A54F840.8060002@cs.wm.edu> Message-ID: cd $PETSC_DIR rm -f $PETSC_ARCH On Wed, Jul 8, 2009 at 3:47 PM, Satish Balay wrote: > rm -rf PETSC_ARCH > > Satish > > On Wed, 8 Jul 2009, Yixun Liu wrote: > > > Hi, > > I want to clean the configuration generated at last time. Which command > > should I use? > > > > Thanks. > > > > Yixun > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jul 9 06:20:13 2009 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 9 Jul 2009 06:20:13 -0500 Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> Message-ID: I think it is time to ask your system administrator for help. Matt On Thu, Jul 9, 2009 at 12:02 AM, Yin Feng wrote: > Firstly, thanks for all your replies! > > I changed compiler to MPICH and tried a sample successfully but the > problem is still there. > I ran my code in 4 nodes and each node have 8 processors. And the > information I saw is like: > NODE LOAD > 0 32 > 1 0 > 2 0 > 3 0 > > Normally, in that case, we should see is: > NODE LOAD > 0 8 > 1 8 > 2 8 > 3 8 > > So, anyone got any idea about this? > > Thank you in advance! > > Sincerely, > YIN > > On Wed, Jul 8, 2009 at 4:15 PM, Xin Qian wrote: > > You can try to run sole MPI samples coming with OpenMPI first, make sure > the > > OpenMPI is running all right. > > > > Thanks, > > > > Xin Qian > > > > On Wed, Jul 8, 2009 at 4:48 PM, Yin Feng wrote: > >> > >> I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. > >> But, when I check the load on each node, I found the master node take > >> all the load > >> and others are just free. > >> > >> Did you have any idea about this situation? > >> > >> Thanks in adcance! 
> >> > >> Sincerely, > >> YIN > >> > >> On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > >> > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > >> > the correspond mpiexec from MPI you've used to build PETSc. > >> > > >> > Or if the MPI has special instruction on usage - you should follow > >> > that [for ex: some clusters require extra options to mpiexec ] > >> > > >> > Satish > >> > > >> > On Wed, 8 Jul 2009, Yin Feng wrote: > >> > > >> >> I am a beginner of PETSc. > >> >> I tried the PETSC example 5(ex5) with 4 nodes, > >> >> However, it seems every nodes doing the exactly the same things and > >> >> output the same results again and again. is this the problem of petsc > >> >> or > >> >> MPI installation? > >> >> > >> >> Thank you in adcance! > >> >> > >> >> Sincerely, > >> >> YIN > >> >> > >> > > >> > > > > > > > > > -- > > QIAN, Xin (http://pubpages.unh.edu/~xqian/ > ) > > xqian at unh.edu chianshin at gmail.com > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Jul 9 09:50:21 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 9 Jul 2009 09:50:21 -0500 (CDT) Subject: PETSc configure with Intel-compiler static linking In-Reply-To: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> References: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> Message-ID: On Thu, 9 Jul 2009, Takuya Sekikawa wrote: > Hello petsc users, > > I need to know how to configure PETSc with Intel-compiler (icc/icpc) on > static linking. Why? > > shared linking is just fine, > but I need to build PETSc with static-linking. so I tried several description. > > [1] > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} > > environment variable MKL_DIR is set to intel MKL library directory. > this one is fine, (also compiling and running is ok) > but is spite of "--with-shared=0" flag, executable still link with .so > (libmkl_lapack.so, etc) --with-shared=0 refers to petsc libraries. It doesn't mean static linking or shared linking. Generally static linking is done by the linker option [with icc/ifort its: -Bstatic]. But since all system libraries might not be available as static libraries - this might not work. Esp with MKL - since the librariry names are different between .so and .a files. [so PETSc configure doesn't explicitly look for tha .a names. > > so I tried another one: > > [2] > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a Generally - you need -lmkl_lapack -lmkl -lpthread -lguide However -lmkl is only available as .so. 
So you'll have to cat libmkl.so to see what the actual libraries it links with: For me I have: [petsc:10.0.2.018/lib/em64t] petsc> cat libmkl.so GROUP (libmkl_intel_lp64.so libmkl_intel_thread.so libmkl_core.so) [petsc:10.0.2.018/lib/em64t] petsc> So you might be able to use: --with-blas-lapack-lib="${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a" Satish > > this time configure.py failed: > > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-lib=, but ['/opt/intel/mkl/10.0.010/lib/em64t/libmkl_lapack.a'] cannot be used > ********************************************************************************* > > Could someone give me good advice? (or examples are greatly appriciated) > > Thanks in advance > > Takuya > From balay at mcs.anl.gov Thu Jul 9 09:52:49 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 9 Jul 2009 09:52:49 -0500 (CDT) Subject: A question about parallel computation In-Reply-To: <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> References: <1e8c69dc0907081124g5485dbceif7f804a3e3faf4fe@mail.gmail.com> <1e8c69dc0907081348n41e8661amc1de5fde0e1dfa29@mail.gmail.com> <1e8c69dc0907082202j539f7c15k3fe86075bf2604e3@mail.gmail.com> Message-ID: You'll have to learn about the MPI you've installed. If its MPICH - how did you install it? Did you install with PETSc or MPICH separately? Did you make sure its install with mpd? [This is the default if its installed separately. However if you've installed with PETSc - you will need additional option: --download-mpich-pm=mpd] And then have you configured mpd correctly across all the nodes you'd like to use? These are all MPI issues - you should figure these out - before attempting PETSc. Satish On Thu, 9 Jul 2009, Yin Feng wrote: > Firstly, thanks for all your replies! > > I changed compiler to MPICH and tried a sample successfully but the > problem is still there. > I ran my code in 4 nodes and each node have 8 processors. And the > information I saw is like: > NODE LOAD > 0 32 > 1 0 > 2 0 > 3 0 > > Normally, in that case, we should see is: > NODE LOAD > 0 8 > 1 8 > 2 8 > 3 8 > > So, anyone got any idea about this? > > Thank you in advance! > > Sincerely, > YIN > > On Wed, Jul 8, 2009 at 4:15 PM, Xin Qian wrote: > > You can try to run sole MPI samples coming with OpenMPI first, make sure the > > OpenMPI is running all right. > > > > Thanks, > > > > Xin Qian > > > > On Wed, Jul 8, 2009 at 4:48 PM, Yin Feng wrote: > >> > >> I tried OpenMPI build PETSc and used mpirun provided by OpenMPI. > >> But, when I check the load on each node, I found the master node take > >> all the load > >> and others are just free. > >> > >> Did you have any idea about this situation? > >> > >> Thanks in adcance! > >> > >> Sincerely, > >> YIN > >> > >> On Wed, Jul 8, 2009 at 1:26 PM, Satish Balay wrote: > >> > Perhaps you are using the wrong mpiexec or mpirun. You'll have to use > >> > the correspond mpiexec from MPI you've used to build PETSc. > >> > > >> > Or if the MPI has special instruction on usage - you should follow > >> > that [for ex: some clusters require extra options to mpiexec ] > >> > > >> > Satish > >> > > >> > On Wed, 8 Jul 2009, Yin Feng wrote: > >> > > >> >> I am a beginner of PETSc. 
> >> >> I tried the PETSC example 5(ex5) with 4 nodes, > >> >> However, it seems every nodes doing the exactly the same things and > >> >> output the same results again and again. is this the problem of petsc > >> >> or > >> >> MPI installation? > >> >> > >> >> Thank you in adcance! > >> >> > >> >> Sincerely, > >> >> YIN > >> >> > >> > > >> > > > > > > > > > -- > > QIAN, Xin (http://pubpages.unh.edu/~xqian/) > > xqian at unh.edu chianshin at gmail.com > > > From balay at mcs.anl.gov Thu Jul 9 21:37:02 2009 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 9 Jul 2009 21:37:02 -0500 (CDT) Subject: PETSc configure with Intel-compiler static linking In-Reply-To: <20090710085311.4E21.SEKIKAWA@msi.co.jp> References: <20090709163521.0DEC.SEKIKAWA@msi.co.jp> <20090710085311.4E21.SEKIKAWA@msi.co.jp> Message-ID: For one - shell is not expanding ${MKL_DIR} for you. Perhaps you used the wrong quotes? Anyway - the current configure interface to --with-blas-lapack-lib prevents listing files as I mentioned before. So you can try the following workarround: - create a different mkl location for just the .a files - and use it with configure - as follows: [choose any convinent location] mkdir /foo/mkl-static cp /opt/intel/mkl/10.0.010/lib/em64t/*.a /foo/mkl-static/ cd $PETSC_DIR ./configure .... --with-blas-lapack-lib=[/foo/mkl-static/libmkl_lapack.a,mkl_intel_lp64.a,libmkl_core.a,libguide.a,libthread.a] Also we prevent flooding the mailing list with configure.log - so such issues [requiring communicating configure.log] can be sent to petsc-maint at mcs.anl.gov Satish On Fri, 10 Jul 2009, Takuya Sekikawa wrote: > Dear Matt and Satish, > > Thank you for quick response. > > On Thu, 9 Jul 2009 06:08:00 -0500 > Matthew Knepley wrote: > > > For any configure problem, you MUST send configure.log or we have no idea > > what happened. > > Oh, Sorry. > I attached latest configure.log. > > On Thu, 9 Jul 2009 09:50:21 -0500 (CDT) > Satish Balay wrote: > > > On Thu, 9 Jul 2009, Takuya Sekikawa wrote: > > > > > Hello petsc users, > > > > > > I need to know how to configure PETSc with Intel-compiler (icc/icpc) on > > > static linking. > > > > Why? > > Mainly because of license problem. > .so version needs target user to purchase licsense. > > > > shared linking is just fine, > > > but I need to build PETSc with static-linking. so I tried several description. > > > > > > [1] > > > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > > > --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} > > > > > > environment variable MKL_DIR is set to intel MKL library directory. > > > this one is fine, (also compiling and running is ok) > > > but is spite of "--with-shared=0" flag, executable still link with .so > > > (libmkl_lapack.so, etc) > > > > --with-shared=0 refers to petsc libraries. It doesn't mean static > > linking or shared linking. > > Ok. I understood. > > > Generally static linking is done by the linker option [with icc/ifort > > its: -Bstatic]. But since all system libraries might not be available > > as static libraries - this might not work. > > > > Esp with MKL - since the librariry names are different between .so and > > .a files. [so PETSc configure doesn't explicitly look for tha .a > > names. 
> > > > > > > > so I tried another one: > > > > > > [2] > > > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > > > --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a > > > > Generally - you need -lmkl_lapack -lmkl -lpthread -lguide > > > > However -lmkl is only available as .so. So you'll have to cat > > libmkl.so to see what the actual libraries it links with: For me I > > have: > > > > [petsc:10.0.2.018/lib/em64t] petsc> cat libmkl.so > > GROUP (libmkl_intel_lp64.so libmkl_intel_thread.so libmkl_core.so) > > [petsc:10.0.2.018/lib/em64t] petsc> > > > > > > So you might be able to use: > > > > --with-blas-lapack-lib="${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a" > > Thank you. > I tried as you wrote, but unsuccessful. > configure.py said: > > ********************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------- > You set a value for --with-blas-lapack-lib=, but ['${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a'] cannot be used > ********************************************************************************* > > What is the real cause of "cannot be used" ? > I cannot make out why "cannot be used" (.a is collapsed? or simply need > to specify more .a?) > > Takuya > From sekikawa at msi.co.jp Fri Jul 10 03:58:12 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Fri, 10 Jul 2009 17:58:12 +0900 Subject: PETSc configure with Intel-compiler static linking In-Reply-To: References: <20090710085311.4E21.SEKIKAWA@msi.co.jp> Message-ID: <20090710174935.4E36.SEKIKAWA@msi.co.jp> Dear Satish, On Thu, 9 Jul 2009 21:37:02 -0500 (CDT) Satish Balay wrote: > For one - shell is not expanding ${MKL_DIR} for you. Perhaps you used > the wrong quotes? As you wrote I suspected ${MKL_DIR} didn't expand by shell so I changed this part to fullpath, but result was same. > Anyway - the current configure interface to --with-blas-lapack-lib > prevents listing files as I mentioned before. So you can try the > following workarround: > > - create a different mkl location for just the .a files - and use it > with configure - as follows: > > [choose any convinent location] > mkdir /foo/mkl-static > cp /opt/intel/mkl/10.0.010/lib/em64t/*.a /foo/mkl-static/ > cd $PETSC_DIR > ./configure .... --with-blas-lapack-lib=[/foo/mkl-static/libmkl_lapack.a,mkl_intel_lp64.a,libmkl_core.a,libguide.a,libthread.a] Thank you for your advice. but seems that it don't work well... Well, situation was changed. I pursaded my client that we have to use .so version of MKL. so for the time I don't have to compile PETSc with Intel static library. Thank you for assistance. > Also we prevent flooding the mailing list with configure.log - so such > issues [requiring communicating configure.log] can be sent to > petsc-maint at mcs.anl.gov I'm so sorry. next time I'll post configure.log to maintainance address. Takuya > Satish > > On Fri, 10 Jul 2009, Takuya Sekikawa wrote: > > > Dear Matt and Satish, > > > > Thank you for quick response. > > > > On Thu, 9 Jul 2009 06:08:00 -0500 > > Matthew Knepley wrote: > > > > > For any configure problem, you MUST send configure.log or we have no idea > > > what happened. > > > > Oh, Sorry. > > I attached latest configure.log. 
> > > > On Thu, 9 Jul 2009 09:50:21 -0500 (CDT) > > Satish Balay wrote: > > > > > On Thu, 9 Jul 2009, Takuya Sekikawa wrote: > > > > > > > Hello petsc users, > > > > > > > > I need to know how to configure PETSc with Intel-compiler (icc/icpc) on > > > > static linking. > > > > > > Why? > > > > Mainly because of license problem. > > .so version needs target user to purchase licsense. > > > > > > shared linking is just fine, > > > > but I need to build PETSc with static-linking. so I tried several description. > > > > > > > > [1] > > > > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > > > > --with-shared=0 --with-blas-lapack-dir=${MKL_DIR} > > > > > > > > environment variable MKL_DIR is set to intel MKL library directory. > > > > this one is fine, (also compiling and running is ok) > > > > but is spite of "--with-shared=0" flag, executable still link with .so > > > > (libmkl_lapack.so, etc) > > > > > > --with-shared=0 refers to petsc libraries. It doesn't mean static > > > linking or shared linking. > > > > Ok. I understood. > > > > > Generally static linking is done by the linker option [with icc/ifort > > > its: -Bstatic]. But since all system libraries might not be available > > > as static libraries - this might not work. > > > > > > Esp with MKL - since the librariry names are different between .so and > > > .a files. [so PETSc configure doesn't explicitly look for tha .a > > > names. > > > > > > > > > > > so I tried another one: > > > > > > > > [2] > > > > $ ./config/configure.py --with-cc=icc --with-cxx=icpc --with-fc=0 > > > > --with-shared=0 --with-blas-lapack-lib=${MKL_DIR}/libmkl_lapack.a > > > > > > Generally - you need -lmkl_lapack -lmkl -lpthread -lguide > > > > > > However -lmkl is only available as .so. So you'll have to cat > > > libmkl.so to see what the actual libraries it links with: For me I > > > have: > > > > > > [petsc:10.0.2.018/lib/em64t] petsc> cat libmkl.so > > > GROUP (libmkl_intel_lp64.so libmkl_intel_thread.so libmkl_core.so) > > > [petsc:10.0.2.018/lib/em64t] petsc> > > > > > > > > > So you might be able to use: > > > > > > --with-blas-lapack-lib="${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a" > > > > Thank you. > > I tried as you wrote, but unsuccessful. > > configure.py said: > > > > ********************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > > --------------------------------------------------------------------------------------- > > You set a value for --with-blas-lapack-lib=, but ['${MKL_DIR}/libmkl_lapack.a ${MKL_DIR}/libmkl_intel_lp64.a ${MKL_DIR}/libmkl_core.a -lpthread ${MKL_DIR}/libguide.a'] cannot be used > > ********************************************************************************* > > > > What is the real cause of "cannot be used" ? > > I cannot make out why "cannot be used" (.a is collapsed? or simply need > > to specify more .a?) > > > > Takuya > > --------------------------------------------------------------- ? Takuya Sekikawa ??? Mathematical Systems, Inc ? 
sekikawa at msi.co.jp --------------------------------------------------------------- From w_subber at yahoo.com Fri Jul 10 20:18:49 2009 From: w_subber at yahoo.com (Waad Subber) Date: Fri, 10 Jul 2009 18:18:49 -0700 (PDT) Subject: Matrix transpose Message-ID: <244334.84354.qm@web38207.mail.mud.yahoo.com> Hi all In the function MatMatMultTranspose(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) is A has to be a square matrix ? And what about the function MatTranspose(Mat mat,MatReuse reuse,Mat *B) is mat has to be a square matrix too ? I am trying to use these functions with a rectangular matrix. but it doesn't work for me ! Thanks Waad -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 10 20:42:47 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 10 Jul 2009 20:42:47 -0500 Subject: Matrix transpose In-Reply-To: <244334.84354.qm@web38207.mail.mud.yahoo.com> References: <244334.84354.qm@web38207.mail.mud.yahoo.com> Message-ID: <424F650E-380B-4B6F-B9F5-BA7BBB430755@mcs.anl.gov> They have not been written or tested to work for general rectangular matrices. They may work for some formats and not for others. You may need to debug and modify them yourself to provide the support you need. Or perhaps another PETSc user can generalize them. Barry We haven't had the time to provide all functionality that would be nice to have. On Jul 10, 2009, at 8:18 PM, Waad Subber wrote: > Hi all > > In the function MatMatMultTranspose(Mat A,Mat B,MatReuse > scall,PetscReal fill,Mat *C) > > is A has to be a square matrix ? > > And what about the function MatTranspose(Mat mat,MatReuse reuse,Mat > *B) is mat has to be a square matrix too ? > > I am trying to use these functions with a rectangular matrix. but it > doesn't work for me ! > > Thanks > Waad > From saswata at umd.edu Sun Jul 12 10:34:21 2009 From: saswata at umd.edu (Saswata Hier-Majumder) Date: Sun, 12 Jul 2009 11:34:21 -0400 Subject: VTK output from DA vectors Message-ID: <4A5A027D.60901@umd.edu> Hi, I would like to generate a vtk output from a multicomponent problem. In the vtk file, I would like the DA coordinates as well as all 4 components stored separately as scalar point data. I have been using the VecView_VTK routine from /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to contain only the local coordinates (may be because DAGetCoordinates is not collective?). Is there a way to fix this? Also, using the same routine, all components of the solution corresponding to a node are dumped together. Is there a way to extract each component separately and ouput them separately as scalar point data? Thanks -- www.geol.umd.edu/~saswata From vyan2000 at gmail.com Sun Jul 12 15:30:09 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 12 Jul 2009 16:30:09 -0400 Subject: about src/mat/examples/tutorials/ex5.c.html Message-ID: http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/src/mat/examples/tutorials/ex5.c.html Hi All, I am tring to read through an example about PetscBinaryRead. It looks like the matrix is reading from a CRS matrix object descriptor "fd1, or fd2". 
I have difficulty of understand the line 50: +++++++++++++++++++++++++++++ 50: PetscBinaryRead(fd2,(char *)header,4,PETSC_INT); +++++++++++++++++++++++++++++ >From the context, my guess is: header[0] unknown header[1] contains the info of how many rows of matrix stored on this processor header[2] contains the info of how many global columns of the matrix header[3] unknown and line: ++++++++++++++++++++++++++++ 86: PetscBinaryRead(fd1,ourlens,m,PETSC_INT); 101: PetscBinaryRead(fd1,mycols,ourlens[i],PETSC_INT); +++++++++++++++++++++++++++++++ >From the context, my guess is: ourlens[i] stores the length of the ith local row for the "local" portion of the matrix. mycols is an array storing the column indices of the nonzero entries of the ith local row for the "local" portion of the matrix(include diagonal and off diagonal). Is there any pointer to the definition of the struct descriptor. Can anyone confirm my guess and provide a pointer or example? How does the PetscBinaryRead() switch smoothly between reading different informations with the same parameter list. Thank you very much, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Jul 12 16:41:56 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 12 Jul 2009 16:41:56 -0500 Subject: about src/mat/examples/tutorials/ex5.c.html In-Reply-To: References: Message-ID: <990EBC48-42A5-45C3-B7F2-A7B26537246E@mcs.anl.gov> The manual page for MatLoad() and VecLoad() contain the definitions of those structs. The file binary format is independent of parallel storage of the matrix so has no information about the "diagonal" and "off-diagonal" parts of the matrix. That is all determined when the binary file is read in. Barry On Jul 12, 2009, at 3:30 PM, Ryan Yan wrote: > http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/src/mat/examples/tutorials/ex5.c.html > > Hi All, > I am tring to read through an example about PetscBinaryRead. It > looks like the matrix is reading from a CRS matrix object descriptor > "fd1, or fd2". > > I have difficulty of understand the line 50: > +++++++++++++++++++++++++++++ > 50: PetscBinaryRead(fd2,(char *)header,4,PETSC_INT); > +++++++++++++++++++++++++++++ > From the context, my guess is: > header[0] unknown > > header[1] contains the info of how many rows of matrix stored on > this processor > > header[2] contains the info of how many global columns of the matrix > > header[3] unknown > > and line: > ++++++++++++++++++++++++++++ > 86: PetscBinaryRead(fd1,ourlens,m,PETSC_INT); > 101: PetscBinaryRead(fd1,mycols,ourlens[i],PETSC_INT); > +++++++++++++++++++++++++++++++ > From the context, my guess is: > ourlens[i] stores the length of the ith local row for the "local" > portion of the matrix. > > mycols is an array storing the column indices of the nonzero entries > of the ith local row for the "local" portion of the matrix(include > diagonal and off diagonal). > > > > > > Is there any pointer to the definition of the struct descriptor. > > Can anyone confirm my guess and provide a pointer or example? How > does the PetscBinaryRead() switch smoothly between reading different > informations with the same parameter list. 
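[Editor's note: the four header entries asked about above are the ones documented on the MatLoad() man page that Barry points to. Below is a minimal sketch of reading them; the file name is a placeholder, error checking is omitted, and the calls follow the petsc-3.0-era interface used elsewhere in this thread.]

    /* Sketch: inspect the header of a PETSc binary Mat file.
       Per the MatLoad() man page: header[0] is the Mat cookie/classid,
       header[1] the global number of rows, header[2] the global number
       of columns, header[3] the total number of nonzeros.  These are
       followed by the per-row nonzero counts, then all column indices,
       then all values; nothing in the file describes per-process or
       diagonal/off-diagonal storage -- that split is made only when the
       matrix is read in. */
    #include "petscmat.h"

    int main(int argc,char **argv)
    {
      int      fd;
      PetscInt header[4];

      PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);
      PetscBinaryOpen("matrix.dat",FILE_MODE_READ,&fd);  /* placeholder file name */
      PetscBinaryRead(fd,header,4,PETSC_INT);
      PetscPrintf(PETSC_COMM_SELF,"cookie %d rows %d cols %d nnz %d\n",
                  header[0],header[1],header[2],header[3]);
      PetscBinaryClose(fd);
      PetscFinalize();
      return 0;
    }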
> > Thank you very much, > > Yan > From knepley at gmail.com Sun Jul 12 16:48:25 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 12 Jul 2009 16:48:25 -0500 Subject: VTK output from DA vectors In-Reply-To: <4A5A027D.60901@umd.edu> References: <4A5A027D.60901@umd.edu> Message-ID: 1) I believe the standard VTK viewer (ASCII Viewer with VTK format) gets the coordinates right. Can you verify this? 2) You should be able to split the 4- component field into 4 fields using the Split filter in VTK or whatever viewer you use (I do this in Mayavi2, but Paraview also works). Matt On Sun, Jul 12, 2009 at 10:34 AM, Saswata Hier-Majumder wrote: > Hi, > I would like to generate a vtk output from a multicomponent problem. In > the vtk file, I would like the DA coordinates as well as all 4 components > stored separately as scalar point data. > > I have been using the VecView_VTK routine from > /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to contain > only the local coordinates (may be because DAGetCoordinates is not > collective?). Is there a way to fix this? > > Also, using the same routine, all components of the solution corresponding > to a node are dumped together. Is there a way to extract each component > separately and ouput them separately as scalar point data? > > Thanks > > -- > www.geol.umd.edu/~saswata > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Sun Jul 12 16:56:41 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 12 Jul 2009 17:56:41 -0400 Subject: about src/mat/examples/tutorials/ex5.c.html In-Reply-To: <990EBC48-42A5-45C3-B7F2-A7B26537246E@mcs.anl.gov> References: <990EBC48-42A5-45C3-B7F2-A7B26537246E@mcs.anl.gov> Message-ID: On Sun, Jul 12, 2009 at 5:41 PM, Barry Smith wrote: > > The manual page for MatLoad() and VecLoad() contain the definitions of > those structs. This pointer is great. > > The file binary format is independent of parallel storage of the matrix > so has no information about the "diagonal" and "off-diagonal" parts of the > matrix. That is all determined when the binary file is read in. That's exactly what I get confused with. Thank you very much, Yan > > Barry > > > > On Jul 12, 2009, at 3:30 PM, Ryan Yan wrote: > > >> http://www.mcs.anl.gov/petsc/petsc-2/snapshots/petsc-current/src/mat/examples/tutorials/ex5.c.html >> >> Hi All, >> I am tring to read through an example about PetscBinaryRead. It looks like >> the matrix is reading from a CRS matrix object descriptor "fd1, or fd2". >> >> I have difficulty of understand the line 50: >> +++++++++++++++++++++++++++++ >> 50: PetscBinaryRead(fd2,(char *)header,4,PETSC_INT); >> +++++++++++++++++++++++++++++ >> From the context, my guess is: >> header[0] unknown >> >> header[1] contains the info of how many rows of matrix stored on this >> processor >> >> header[2] contains the info of how many global columns of the matrix >> >> header[3] unknown >> >> and line: >> ++++++++++++++++++++++++++++ >> 86: PetscBinaryRead(fd1,ourlens,m,PETSC_INT); >> 101: PetscBinaryRead(fd1,mycols,ourlens[i],PETSC_INT); >> +++++++++++++++++++++++++++++++ >> From the context, my guess is: >> ourlens[i] stores the length of the ith local row for the "local" portion >> of the matrix. 
>> >> mycols is an array storing the column indices of the nonzero entries of >> the ith local row for the "local" portion of the matrix(include diagonal and >> off diagonal). >> >> >> >> >> >> Is there any pointer to the definition of the struct descriptor. >> >> Can anyone confirm my guess and provide a pointer or example? How does >> the PetscBinaryRead() switch smoothly between reading different informations >> with the same parameter list. >> >> Thank you very much, >> >> Yan >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From saswata at umd.edu Sun Jul 12 21:09:02 2009 From: saswata at umd.edu (Saswata Hier-Majumder) Date: Sun, 12 Jul 2009 22:09:02 -0400 Subject: VTK output from DA vectors In-Reply-To: References: <4A5A027D.60901@umd.edu> Message-ID: <4A5A973E.7060300@umd.edu> 1) I did. It returns correct values for the x nodes, but fails to do so for the y nodes. Here's a sample output generated from the driven cavity problem. ( I suppressed printing the solutions for brevity). 2) Thanks, I'll try that. Matthew Knepley wrote: > 1) I believe the standard VTK viewer (ASCII Viewer with VTK format) gets the > coordinates > right. Can you verify this? > > 2) You should be able to split the 4- component field into 4 fields using > the Split filter in VTK > or whatever viewer you use (I do this in Mayavi2, but Paraview also > works). > > Matt > > On Sun, Jul 12, 2009 at 10:34 AM, Saswata Hier-Majumder wrote: > > >> Hi, >> I would like to generate a vtk output from a multicomponent problem. In >> the vtk file, I would like the DA coordinates as well as all 4 components >> stored separately as scalar point data. >> >> I have been using the VecView_VTK routine from >> /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to contain >> only the local coordinates (may be because DAGetCoordinates is not >> collective?). Is there a way to fix this? >> >> Also, using the same routine, all components of the solution corresponding >> to a node are dumped together. Is there a way to extract each component >> separately and ouput them separately as scalar point data? >> >> Thanks >> >> -- >> www.geol.umd.edu/~saswata >> >> >> > > > -- www.geol.umd.edu/~saswata -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: solution.vtk URL: From knepley at gmail.com Sun Jul 12 21:13:02 2009 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 12 Jul 2009 21:13:02 -0500 Subject: VTK output from DA vectors In-Reply-To: <4A5A973E.7060300@umd.edu> References: <4A5A027D.60901@umd.edu> <4A5A973E.7060300@umd.edu> Message-ID: On Sun, Jul 12, 2009 at 9:09 PM, Saswata Hier-Majumder wrote: > 1) I did. It returns correct values for the x nodes, but fails to do so for > the y nodes. Here's a sample output generated from the driven cavity > problem. ( I suppressed printing the solutions for brevity). I did not realize you mean as fields (rather than as the mesh). Give me your exact calling sequence. I can output parallel fields fine, so something else must be going on. If you modified the code, send that too. Matt > > 2) Thanks, I'll try that. > > Matthew Knepley wrote: > >> 1) I believe the standard VTK viewer (ASCII Viewer with VTK format) gets >> the >> coordinates >> right. Can you verify this? >> >> 2) You should be able to split the 4- component field into 4 fields using >> the Split filter in VTK >> or whatever viewer you use (I do this in Mayavi2, but Paraview also >> works). 
>> >> Matt >> >> On Sun, Jul 12, 2009 at 10:34 AM, Saswata Hier-Majumder > >wrote: >> >> >> >>> Hi, >>> I would like to generate a vtk output from a multicomponent problem. In >>> the vtk file, I would like the DA coordinates as well as all 4 components >>> stored separately as scalar point data. >>> >>> I have been using the VecView_VTK routine from >>> /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to >>> contain >>> only the local coordinates (may be because DAGetCoordinates is not >>> collective?). Is there a way to fix this? >>> >>> Also, using the same routine, all components of the solution >>> corresponding >>> to a node are dumped together. Is there a way to extract each component >>> separately and ouput them separately as scalar point data? >>> >>> Thanks >>> >>> -- >>> www.geol.umd.edu/~saswata < >>> http://www.geol.umd.edu/%7Esaswata> >>> >>> >>> >>> >> >> >> >> > > -- > www.geol.umd.edu/~saswata > > > # vtk DataFile Version 2.0 > ASCII > DATASET STRUCTURED_POINTS > DIMENSIONS 49 49 1 > ORIGIN 0 0 0 > SPACING 1 1 1 > > POINT_DATA 2401 > SCALARS scalars double 3 > LOOKUP_TABLE default > X_COORDINATES 49 double > 0 0.0208333 0.0416667 0.0625 0.0833333 0.104167 0.125 0.145833 0.166667 > 0.1875 0.208333 0.229167 0.25 0.270833 0.291667 0.3125 0.333333 0.354167 > 0.375 0.395833 0.416667 0.4375 0.458333 0.479167 0.5 0 0.0208333 0.0416667 > 0.0625 0.0833333 0.104167 0.125 0.145833 0.166667 0.1875 0.208333 0.229167 > 0.25 0.270833 0.291667 0.3125 0.333333 0.354167 0.375 0.395833 0.416667 > 0.4375 0.458333 0.479167 > Y_COORDINATES 49 double > 0 0.0208333 0.0625 0.104167 0.145833 0.1875 0.229167 0.270833 0.3125 > 0.354167 0.395833 0.4375 0.479167 5.27367e-317 5.23119e-317 0 5.31253e-317 0 > 0 0 3.41641e-312 3.26575e-311 1.14376e-311 1.54269e-311 1.94163e-311 > 2.34056e-311 5.29014e-311 5.27095e-317 0 0 0 1.82492e-312 1.01431e-311 > 1.84614e-311 1.49455e-320 3.09811e-312 1.17559e-311 2.04136e-311 0 0 0 > 5.23119e-317 0 0 8.06358e-313 5.3312e-317 5.33093e-317 0 5.27392e-317 > Z_COORDINATES 1 double > 0 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From saswata at umd.edu Mon Jul 13 08:27:42 2009 From: saswata at umd.edu (Saswata Hier-Majumder) Date: Mon, 13 Jul 2009 09:27:42 -0400 Subject: VTK output from DA vectors In-Reply-To: References: <4A5A027D.60901@umd.edu> <4A5A973E.7060300@umd.edu> Message-ID: <4A5B364E.9070109@umd.edu> OK, here's the c program and the vtk output. Thanks for your help. Matthew Knepley wrote: > On Sun, Jul 12, 2009 at 9:09 PM, Saswata Hier-Majumder wrote: > > >> 1) I did. It returns correct values for the x nodes, but fails to do so for >> the y nodes. Here's a sample output generated from the driven cavity >> problem. ( I suppressed printing the solutions for brevity). >> > > > I did not realize you mean as fields (rather than as the mesh). Give me your > exact calling sequence. I can output > parallel fields fine, so something else must be going on. If you modified > the code, send that too. > > Matt > > > >> 2) Thanks, I'll try that. >> >> Matthew Knepley wrote: >> >> >>> 1) I believe the standard VTK viewer (ASCII Viewer with VTK format) gets >>> the >>> coordinates >>> right. Can you verify this? 
>>> >>> 2) You should be able to split the 4- component field into 4 fields using >>> the Split filter in VTK >>> or whatever viewer you use (I do this in Mayavi2, but Paraview also >>> works). >>> >>> Matt >>> >>> On Sun, Jul 12, 2009 at 10:34 AM, Saswata Hier-Majumder >> >>>> wrote: >>>> >>> >>> >>>> Hi, >>>> I would like to generate a vtk output from a multicomponent problem. In >>>> the vtk file, I would like the DA coordinates as well as all 4 components >>>> stored separately as scalar point data. >>>> >>>> I have been using the VecView_VTK routine from >>>> /ksp/ksp/ksp/examples/tutorials/ex29.c. But the vtk output seems to >>>> contain >>>> only the local coordinates (may be because DAGetCoordinates is not >>>> collective?). Is there a way to fix this? >>>> >>>> Also, using the same routine, all components of the solution >>>> corresponding >>>> to a node are dumped together. Is there a way to extract each component >>>> separately and ouput them separately as scalar point data? >>>> >>>> Thanks >>>> >>>> -- >>>> www.geol.umd.edu/~saswata < >>>> http://www.geol.umd.edu/%7Esaswata> >>>> >>>> >>>> >>>> >>>> >>> >>> >>> >> -- >> www.geol.umd.edu/~saswata >> >> >> # vtk DataFile Version 2.0 >> ASCII >> DATASET STRUCTURED_POINTS >> DIMENSIONS 49 49 1 >> ORIGIN 0 0 0 >> SPACING 1 1 1 >> >> POINT_DATA 2401 >> SCALARS scalars double 3 >> LOOKUP_TABLE default >> X_COORDINATES 49 double >> 0 0.0208333 0.0416667 0.0625 0.0833333 0.104167 0.125 0.145833 0.166667 >> 0.1875 0.208333 0.229167 0.25 0.270833 0.291667 0.3125 0.333333 0.354167 >> 0.375 0.395833 0.416667 0.4375 0.458333 0.479167 0.5 0 0.0208333 0.0416667 >> 0.0625 0.0833333 0.104167 0.125 0.145833 0.166667 0.1875 0.208333 0.229167 >> 0.25 0.270833 0.291667 0.3125 0.333333 0.354167 0.375 0.395833 0.416667 >> 0.4375 0.458333 0.479167 >> Y_COORDINATES 49 double >> 0 0.0208333 0.0625 0.104167 0.145833 0.1875 0.229167 0.270833 0.3125 >> 0.354167 0.395833 0.4375 0.479167 5.27367e-317 5.23119e-317 0 5.31253e-317 0 >> 0 0 3.41641e-312 3.26575e-311 1.14376e-311 1.54269e-311 1.94163e-311 >> 2.34056e-311 5.29014e-311 5.27095e-317 0 0 0 1.82492e-312 1.01431e-311 >> 1.84614e-311 1.49455e-320 3.09811e-312 1.17559e-311 2.04136e-311 0 0 0 >> 5.23119e-317 0 0 8.06358e-313 5.3312e-317 5.33093e-317 0 5.27392e-317 >> Z_COORDINATES 1 double >> 0 >> >> >> > > > -- www.geol.umd.edu/~saswata -------------- next part -------------- A non-text attachment was scrubbed... Name: dmmg.c Type: text/x-csrc Size: 15473 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: solution.vtk URL: From C.Klaij at marin.nl Tue Jul 14 03:36:15 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Tue, 14 Jul 2009 10:36:15 +0200 Subject: hypre preconditioners Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F780@MAR150CV1.marin.local> I'm solving the steady incompressible Navier-Stokes equations (discretized with FV on unstructured grids) using the SIMPLE Pressure Correction method. I'm using Picard linearization and solve the system for the momentum equations with BICG and for the pressure equation with CG. Currently, for parallel runs, I'm using JACOBI as a preconditioner. My grids typically have a few million cells and I use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux cluster). A significant portion of the CPU time goes into solving the pressure equation. To reach the relative tolerance I need, CG with JACOBI takes about 100 iterations per outer loop for these problems. 
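[Editor's note: for readers following along, a minimal sketch of how a pressure-correction solve like the one described above can be set up so the preconditioner stays switchable at run time. This is not the poster's code: A, b, x are assumed names, and the tolerances simply mirror the rtol=0.05 / maxits=500 visible in the logs later in this thread.]

    #include "petscksp.h"

    /* Sketch: CG with a Jacobi baseline for the pressure equation;
       KSPSetFromOptions() lets the hypre preconditioners discussed below
       be selected at run time with
         -pc_type hypre -pc_hypre_type boomeramg   (or euclid). */
    PetscErrorCode SolvePressure(Mat A,Vec b,Vec x)
    {
      KSP ksp;
      PC  pc;

      KSPCreate(PETSC_COMM_WORLD,&ksp);
      KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);
      KSPSetType(ksp,KSPCG);
      KSPGetPC(ksp,&pc);
      PCSetType(pc,PCJACOBI);
      /* hypre alternative, set in code instead of via options:
         PCSetType(pc,PCHYPRE); PCHYPRESetType(pc,"boomeramg"); */
      KSPSetTolerances(ksp,0.05,PETSC_DEFAULT,PETSC_DEFAULT,500);
      KSPSetFromOptions(ksp);
      KSPSolve(ksp,b,x);
      KSPDestroy(ksp);   /* in practice the KSP would be kept across outer iterations */
      return 0;
    }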
In order to reduce CPU time, I've compiled PETSc with support for Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a preconditioner for the pressure equation. With default settings, both BoomerAMG and Euclid greatly reduce the number of iterations: with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. However, I do not get any reduction in CPU time. With Euclid, CPU time is similar to JACOBI and with BoomerAMG it is approximately doubled. Is this what one can expect? Are BoomerAMG and Euclid meant for much larger problems? I understand Hypre uses a different matrix storage format, is CPU time 'lost in translation' between PETSc and Hypre for these small problems? Are there maybe any settings I should change? Chris dr. ir. Christiaan Klaij CFD Researcher Research & Development mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/ http://www.marin.nl/web/show/id=46836/contentid=2324 First AMT'09 conference, Nantes, France, September 1-2 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 1069 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 1622 bytes Desc: not available URL: From Andreas.Grassl at student.uibk.ac.at Tue Jul 14 10:42:33 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 14 Jul 2009 17:42:33 +0200 Subject: ifort -i8 -r8 options Message-ID: <4A5CA769.4090101@student.uibk.ac.at> Hello, trying external packages (especially MUMPS and HYPRE) I noticed, that PETSc has to be compiled with-32-bit-indices and this is giving me some problems because all Diana-routines from which I'm reading out my data are compiled with ifort -i8 -r8 flags and I run into trouble matching together 64-bit integers from Diana to 32-bit PetscInt's. Recompiling Diana without the ugly flags is no alternative. Wrapping some casting routines around the arrays seems doable but doesn't seem to me the cleanest solution. Does anybody have an advice how to handle the problem? Cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From bsmith at mcs.anl.gov Tue Jul 14 10:42:58 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 14 Jul 2009 10:42:58 -0500 Subject: hypre preconditioners In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F780@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F780@MAR150CV1.marin.local> Message-ID: First run the three cases with -log_summary (also -ksp_view to see exact solver options that are being used) and send those files. This will tell us where the time is being spent; without this information any comments are pure speculation. (For example, the "copy" time to hypre format is trivial compared to the time to build a hypre preconditioner and not the problem). What you report is not uncommon; the setup and per iteration cost of the hypre preconditioners will be much larger than the simpler Jacobi preconditioner. 
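[Editor's note: a small illustration of where that setup cost is paid. It shows up as the PCSetUp event in the -log_summary output mentioned above, and whether it is repeated each outer iteration is controlled by the MatStructure flag passed to KSPSetOperators() in the petsc-3.0-era interface. The names ksp, A, b, x and the matrix_changed test are assumptions for illustration, not code from this thread.]

    /* If the pressure matrix really changes every outer SIMPLE iteration,
       the preconditioner must be rebuilt (PCSetUp runs again); if it does
       not, SAME_PRECONDITIONER keeps the existing Jacobi/Euclid/BoomerAMG
       setup and only the per-iteration application cost remains. */
    if (matrix_changed) {
      KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);   /* preconditioner rebuilt */
    } else {
      KSPSetOperators(ksp,A,A,SAME_PRECONDITIONER);    /* existing one reused    */
    }
    KSPSolve(ksp,b,x);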
Barry On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote: > > I'm solving the steady incompressible Navier-Stokes equations > (discretized with FV on unstructured grids) using the SIMPLE > Pressure Correction method. I'm using Picard linearization and solve > the system for the momentum equations with BICG and for the pressure > equation with CG. Currently, for parallel runs, I'm using JACOBI as > a preconditioner. My grids typically have a few million cells and I > use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux > cluster). A significant portion of the CPU time goes into solving > the pressure equation. To reach the relative tolerance I need, CG > with JACOBI takes about 100 iterations per outer loop for these > problems. > > In order to reduce CPU time, I've compiled PETSc with support for > Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a > preconditioner for the pressure equation. With default settings, > both BoomerAMG and Euclid greatly reduce the number of iterations: > with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. > However, I do not get any reduction in CPU time. With Euclid, CPU > time is similar to JACOBI and with BoomerAMG it is approximately > doubled. > > Is this what one can expect? Are BoomerAMG and Euclid meant for much > larger problems? I understand Hypre uses a different matrix storage > format, is CPU time 'lost in translation' between PETSc and Hypre > for these small problems? Are there maybe any settings I should > change? > > Chris > > > > > > > > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > MARIN > 2, Haagsteeg > c.klaij at marin.nl > P.O. Box 28 > T +31 317 49 39 11 > 6700 AA Wageningen > F +31 317 49 32 45 > T +31 317 49 33 44 > The Netherlands > I www.marin.nl > > > MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2 > > > This e-mail may be confidential, privileged and/or protected by > copyright. If you are not the intended recipient, you should return > it to the sender immediately and delete your copy from your system. > From bsmith at mcs.anl.gov Tue Jul 14 10:51:23 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 14 Jul 2009 10:51:23 -0500 Subject: ifort -i8 -r8 options In-Reply-To: <4A5CA769.4090101@student.uibk.ac.at> References: <4A5CA769.4090101@student.uibk.ac.at> Message-ID: <4B624EC9-98C8-4F70-BAD1-26DBC63FD218@mcs.anl.gov> Mumps and hypre do not currently support using 32 bit integer indices, though the MUMPS folks say they plan to support it eventually. Changing PETSc to convert all 64 bit integers to 32 bit before passing to MUMPS and hypre is a huge project and we will not be doing that. You need to lobby the MUMPS and hypre to properly support 64 bit integers if you want to use them in that mode. Unless you are solving very large problems it seems you should be able to use the -r8 flag but not the -i8 flag. Barry On Jul 14, 2009, at 10:42 AM, Andreas Grassl wrote: > Hello, > > trying external packages (especially MUMPS and HYPRE) I noticed, > that PETSc has > to be compiled with-32-bit-indices and this is giving me some > problems because > all Diana-routines from which I'm reading out my data are compiled > with ifort > -i8 -r8 flags and I run into trouble matching together 64-bit > integers from > Diana to 32-bit PetscInt's. > > Recompiling Diana without the ugly flags is no alternative. > > Wrapping some casting routines around the arrays seems doable but > doesn't seem > to me the cleanest solution. 
> > Does anybody have an advice how to handle the problem? > > Cheers, > > ando > > -- > /"\ Grassl Andreas > \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik > X against HTML email Technikerstr. 13 Zi 709 > / \ +43 (0)512 507 6091 From Andreas.Grassl at student.uibk.ac.at Tue Jul 14 11:18:07 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 14 Jul 2009 18:18:07 +0200 Subject: ifort -i8 -r8 options In-Reply-To: <4B624EC9-98C8-4F70-BAD1-26DBC63FD218@mcs.anl.gov> References: <4A5CA769.4090101@student.uibk.ac.at> <4B624EC9-98C8-4F70-BAD1-26DBC63FD218@mcs.anl.gov> Message-ID: <4A5CAFBF.6020103@student.uibk.ac.at> Barry Smith schrieb: > > Mumps and hypre do not currently support using 32 bit integer indices, ^^^^^^ here you mean 64?! > though the MUMPS folks say they plan to support it eventually. > > Changing PETSc to convert all 64 bit integers to 32 bit before passing > to MUMPS and hypre is a huge project and we will not be doing that. > You need to lobby the MUMPS and hypre to properly support 64 bit > integers if you want to use them in that mode. > > Unless you are solving very large problems it seems you should be able > to use the -r8 flag but not the -i8 flag. For my needs, this is certainly true, but I don't have the whole sourcecode and I am not able to get a working Diana-program if I omit the -i8 flag. so you suggest casting the input data from Diana to PetscInt which is defined 32-bit?! Cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From bsmith at mcs.anl.gov Tue Jul 14 11:42:07 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 14 Jul 2009 11:42:07 -0500 Subject: ifort -i8 -r8 options In-Reply-To: <4A5CAFBF.6020103@student.uibk.ac.at> References: <4A5CA769.4090101@student.uibk.ac.at> <4B624EC9-98C8-4F70-BAD1-26DBC63FD218@mcs.anl.gov> <4A5CAFBF.6020103@student.uibk.ac.at> Message-ID: <60071B53-CF30-47B6-A7B4-B9729F5572DA@mcs.anl.gov> On Jul 14, 2009, at 11:18 AM, Andreas Grassl wrote: > Barry Smith schrieb: >> >> Mumps and hypre do not currently support using 32 bit integer >> indices, > ^^^^^^ > here you mean 64?! > >> though the MUMPS folks say they plan to support it eventually. >> >> Changing PETSc to convert all 64 bit integers to 32 bit before >> passing >> to MUMPS and hypre is a huge project and we will not be doing that. >> You need to lobby the MUMPS and hypre to properly support 64 bit >> integers if you want to use them in that mode. >> >> Unless you are solving very large problems it seems you should be >> able >> to use the -r8 flag but not the -i8 flag. > > For my needs, this is certainly true, but I don't have the whole > sourcecode and > I am not able to get a working Diana-program if I omit the -i8 flag. > > so you suggest casting the input data from Diana to PetscInt which > is defined > 32-bit?! If you can do that. But it means copying any integer arrays from 64 bit integer arrays to 32 bit integer arrays. Barry > > Cheers, > > ando > > -- > /"\ Grassl Andreas > \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik > X against HTML email Technikerstr. 13 Zi 709 > / \ +43 (0)512 507 6091 From C.Klaij at marin.nl Wed Jul 15 03:58:36 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Wed, 15 Jul 2009 10:58:36 +0200 Subject: hypre preconditioners References: Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F783@MAR150CV1.marin.local> Barry, Thanks for your reply! 
Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners. Chris ----------------------------- --- Jacobi preconditioner --- ----------------------------- KSP Object: type: cg maximum iterations=500 tolerances: relative=0.05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: jacobi linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=256576, cols=256576 total: nonzeros=1769552, allocated nonzeros=1769552 not using I-node (on process 0) routines ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:22:04 2009 Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 Max Max/Min Avg Total Time (sec): 6.037e+02 1.00000 6.037e+02 Objects: 9.270e+02 1.00000 9.270e+02 Flops: 5.671e+10 1.00065 5.669e+10 1.134e+11 Flops/sec: 9.393e+07 1.00065 9.390e+07 1.878e+08 MPI Messages: 1.780e+04 1.00000 1.780e+04 3.561e+04 MPI Message Lengths: 5.239e+08 1.00000 2.943e+04 1.048e+09 MPI Reductions: 2.651e+04 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.0374e+02 100.0% 1.1338e+11 100.0% 3.561e+04 100.0% 2.943e+04 100.0% 5.302e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. 
# ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 31370 1.0 1.2887e+01 1.0 6.28e+08 1.0 0.0e+00 0.0e+00 3.1e+04 2 14 0 0 59 2 14 0 0 59 1249 VecNorm 16235 1.0 2.3343e+00 1.0 1.79e+09 1.0 0.0e+00 0.0e+00 1.6e+04 0 7 0 0 31 0 7 0 0 31 3569 VecCopy 1600 1.0 9.4822e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 3732 1.0 8.7824e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 32836 1.0 1.9510e+01 1.0 4.34e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 15 0 0 0 3 15 0 0 0 864 VecAYPX 16701 1.0 7.4898e+00 1.0 5.73e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 8 0 0 0 1 8 0 0 0 1144 VecAssemblyBegin 1200 1.0 3.3916e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 7 0 0 0 0 7 0 VecAssemblyEnd 1200 1.0 1.6778e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 18301 1.0 1.4524e+01 1.0 1.62e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 4 0 0 0 2 4 0 0 0 323 VecScatterBegin 17801 1.0 5.8999e-01 1.0 0.00e+00 0.0 3.6e+04 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 VecScatterEnd 17801 1.0 3.3189e+00 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetup 600 1.0 6.7541e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 600 1.0 1.6520e+02 1.0 3.43e+08 1.0 3.6e+04 2.9e+04 4.8e+04 27100100100 90 27100100100 90 686 PCSetUp 600 1.0 4.4189e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 PCApply 18301 1.0 1.4579e+01 1.0 1.62e+08 1.0 0.0e+00 0.0e+00 1.0e+00 2 4 0 0 0 2 4 0 0 0 322 MatMult 16235 1.0 9.3444e+01 1.0 2.86e+08 1.0 3.2e+04 2.9e+04 0.0e+00 15 47 91 91 0 15 47 91 91 0 570 MatMultTranspose 1566 1.0 8.8825e+00 1.0 3.12e+08 1.0 3.1e+03 2.9e+04 0.0e+00 1 5 9 9 0 1 5 9 9 0 624 MatAssemblyBegin 600 1.0 6.0139e-0125.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 2 0 0 0 0 2 0 MatAssemblyEnd 600 1.0 2.5127e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 1 0 0 0 0 1 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage Index Set 4 4 30272 0 Vec 913 902 926180816 0 Vec Scatter 2 0 0 0 Krylov Solver 1 0 0 0 Preconditioner 1 0 0 0 Matrix 6 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 2.14577e-07 Average time for MPI_Barrier(): 8.10623e-07 Average time for zero size MPI_Send(): 2.0504e-05 ----------------------------------- --- Hypre Euclid preconditioner --- ----------------------------------- KSP Object: type: cg maximum iterations=500 tolerances: relative=0.05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: hypre HYPRE Euclid preconditioning HYPRE Euclid: number of levels 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=256576, cols=256576 total: nonzeros=1769552, allocated nonzeros=1769552 not using I-node (on process 0) routines ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:10:05 2009 Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 Max Max/Min Avg Total Time (sec): 6.961e+02 1.00000 6.961e+02 Objects: 1.227e+03 1.00000 1.227e+03 Flops: 1.340e+10 1.00073 1.340e+10 2.679e+10 Flops/sec: 1.925e+07 1.00073 1.924e+07 3.848e+07 MPI Messages: 4.748e+03 1.00000 4.748e+03 9.496e+03 MPI Message Lengths: 1.397e+08 1.00000 2.943e+04 2.794e+08 MPI Reductions: 7.192e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.9614e+02 100.0% 2.6790e+10 100.0% 9.496e+03 100.0% 2.943e+04 100.0% 1.438e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. 
# ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 5410 1.0 1.1865e+01 4.5 5.26e+08 4.5 0.0e+00 0.0e+00 5.4e+03 1 10 0 0 38 1 10 0 0 38 234 VecNorm 3255 1.0 7.8095e-01 1.0 1.07e+09 1.0 0.0e+00 0.0e+00 3.3e+03 0 6 0 0 23 0 6 0 0 23 2139 VecCopy 1600 1.0 9.5096e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 4746 1.0 8.9868e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 6801 1.0 4.8778e+00 1.0 3.59e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 13 0 0 0 1 13 0 0 0 715 VecAYPX 3646 1.0 2.2348e+00 1.0 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 837 VecAssemblyBegin 1200 1.0 2.7152e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 25 0 0 0 0 25 0 VecAssemblyEnd 1200 1.0 1.7414e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 3982 1.0 4.0871e+00 1.0 1.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 4 0 0 0 1 4 0 0 0 250 VecScatterBegin 4746 1.0 1.8000e-01 1.0 0.00e+00 0.0 9.5e+03 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 VecScatterEnd 4746 1.0 4.6870e+00 5.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetup 600 1.0 6.8991e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 600 1.0 2.5931e+02 1.0 5.17e+07 1.0 9.5e+03 2.9e+04 9.0e+03 37100100100 62 37100100100 62 103 PCSetUp 600 1.0 1.8337e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02 26 0 0 0 1 26 0 0 0 1 0 PCApply 5246 1.0 3.6440e+01 1.3 1.88e+07 1.3 0.0e+00 0.0e+00 1.0e+02 5 4 0 0 1 5 4 0 0 1 28 MatMult 3255 1.0 2.3031e+01 1.2 2.85e+08 1.2 6.5e+03 2.9e+04 0.0e+00 3 40 69 69 0 3 40 69 69 0 464 MatMultTranspose 1491 1.0 8.4907e+00 1.0 3.11e+08 1.0 3.0e+03 2.9e+04 0.0e+00 1 20 31 31 0 1 20 31 31 0 621 MatConvert 100 1.0 1.2686e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatAssemblyBegin 600 1.0 2.3702e+0042.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 8 0 0 0 0 8 0 MatAssemblyEnd 600 1.0 2.5303e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 4 0 0 0 0 4 0 MatGetRow 12828800 1.0 5.2074e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 200 1.0 1.6284e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. 
--- Event Stage 0: Main Stage Index Set 4 4 30272 0 Vec 1213 1202 1234223216 0 Vec Scatter 2 0 0 0 Krylov Solver 1 0 0 0 Preconditioner 1 0 0 0 Matrix 6 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 2.14577e-07 Average time for MPI_Barrier(): 3.8147e-07 Average time for zero size MPI_Send(): 1.39475e-05 -------------------------------------- --- Hypre BoomerAMG preconditioner --- -------------------------------------- KSP Object: type: cg maximum iterations=500 tolerances: relative=0.05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.25 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type Falgout HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=256576, cols=256576 total: nonzeros=1769552, allocated nonzeros=1769552 not using I-node (on process 0) routines ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 09:53:07 2009 Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 Max Max/Min Avg Total Time (sec): 7.080e+02 1.00000 7.080e+02 Objects: 1.227e+03 1.00000 1.227e+03 Flops: 1.054e+10 1.00076 1.054e+10 2.107e+10 Flops/sec: 1.489e+07 1.00076 1.488e+07 2.977e+07 MPI Messages: 3.857e+03 1.00000 3.857e+03 7.714e+03 MPI Message Lengths: 1.135e+08 1.00000 2.942e+04 2.270e+08 MPI Reductions: 5.800e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 7.0799e+02 100.0% 2.1075e+10 100.0% 7.714e+03 100.0% 2.942e+04 100.0% 1.160e+04 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops/sec: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). %T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was run without the PreLoadBegin() # # macros. To get timing results we always recommend # # preloading. otherwise timing numbers may be # # meaningless. 
# ########################################################## Event Count Time (sec) Flops/sec --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage VecDot 3554 1.0 1.8220e+00 1.0 5.03e+08 1.0 0.0e+00 0.0e+00 3.6e+03 0 9 0 0 31 0 9 0 0 31 1001 VecNorm 2327 1.0 6.7031e-01 1.0 9.34e+08 1.0 0.0e+00 0.0e+00 2.3e+03 0 6 0 0 20 0 6 0 0 20 1781 VecCopy 1600 1.0 9.4440e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 3855 1.0 8.0550e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 4982 1.0 3.7953e+00 1.0 3.39e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 12 0 0 0 1 12 0 0 0 674 VecAYPX 2755 1.0 1.8270e+00 1.0 3.89e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 774 VecAssemblyBegin 1200 1.0 1.8679e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 3.6e+03 0 0 0 0 31 0 0 0 0 31 0 VecAssemblyEnd 1200 1.0 1.7717e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 4056 1.0 4.1344e+00 1.0 1.26e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 252 VecScatterBegin 3855 1.0 1.5116e-01 1.0 0.00e+00 0.0 7.7e+03 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 VecScatterEnd 3855 1.0 7.3828e-01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSetup 600 1.0 5.1192e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 600 1.0 2.7194e+02 1.0 3.88e+07 1.0 7.7e+03 2.9e+04 6.2e+03 38100100100 53 38100100100 53 77 PCSetUp 600 1.0 1.6630e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+02 23 0 0 0 2 23 0 0 0 2 0 PCApply 4355 1.0 7.3735e+01 1.0 7.06e+06 1.0 0.0e+00 0.0e+00 1.0e+02 10 5 0 0 1 10 5 0 0 1 14 MatMult 2327 1.0 1.3706e+01 1.0 2.79e+08 1.0 4.7e+03 2.9e+04 0.0e+00 2 36 60 60 0 2 36 60 60 0 557 MatMultTranspose 1528 1.0 8.6412e+00 1.0 3.13e+08 1.0 3.1e+03 2.9e+04 0.0e+00 1 26 40 40 0 1 26 40 40 0 626 MatConvert 100 1.0 1.2962e+01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 MatAssemblyBegin 600 1.0 2.4579e+0096.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+03 0 0 0 0 10 0 0 0 0 10 0 MatAssemblyEnd 600 1.0 2.5257e+00 1.0 0.00e+00 0.0 4.0e+00 1.5e+04 6.1e+02 0 0 0 0 5 0 0 0 0 5 0 MatGetRow 12828800 1.0 5.2907e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 MatGetRowIJ 200 1.0 1.7476e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage Index Set 4 4 30272 0 Vec 1213 1202 1234223216 0 Vec Scatter 2 0 0 0 Krylov Solver 1 0 0 0 Preconditioner 1 0 0 0 Matrix 6 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 1.90735e-07 Average time for MPI_Barrier(): 8.10623e-07 Average time for zero size MPI_Send(): 1.95503e-05 OptionTable: -log_summary -----Original Message----- Date: Tue, 14 Jul 2009 10:42:58 -0500 From: Barry Smith Subject: Re: hypre preconditioners To: PETSc users list Message-ID: Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes First run the three cases with -log_summary (also -ksp_view to see exact solver options that are being used) and send those files. 
This will tell us where the time is being spent; without this information any comments are pure speculation. (For example, the "copy" time to hypre format is trivial compared to the time to build a hypre preconditioner and not the problem). What you report is not uncommon; the setup and per iteration cost of the hypre preconditioners will be much larger than the simpler Jacobi preconditioner. Barry On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote: > > I'm solving the steady incompressible Navier-Stokes equations > (discretized with FV on unstructured grids) using the SIMPLE > Pressure Correction method. I'm using Picard linearization and solve > the system for the momentum equations with BICG and for the pressure > equation with CG. Currently, for parallel runs, I'm using JACOBI as > a preconditioner. My grids typically have a few million cells and I > use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux > cluster). A significant portion of the CPU time goes into solving > the pressure equation. To reach the relative tolerance I need, CG > with JACOBI takes about 100 iterations per outer loop for these > problems. > > In order to reduce CPU time, I've compiled PETSc with support for > Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a > preconditioner for the pressure equation. With default settings, > both BoomerAMG and Euclid greatly reduce the number of iterations: > with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. > However, I do not get any reduction in CPU time. With Euclid, CPU > time is similar to JACOBI and with BoomerAMG it is approximately > doubled. > > Is this what one can expect? Are BoomerAMG and Euclid meant for much > larger problems? I understand Hypre uses a different matrix storage > format, is CPU time 'lost in translation' between PETSc and Hypre > for these small problems? Are there maybe any settings I should > change? > > Chris > > > > > > > > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > MARIN > 2, Haagsteeg > c.klaij at marin.nl > P.O. Box 28 > T +31 317 49 39 11 > 6700 AA Wageningen > F +31 317 49 32 45 > T +31 317 49 33 44 > The Netherlands > I www.marin.nl > > > MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2 > > > This e-mail may be confidential, privileged and/or protected by > copyright. If you are not the intended recipient, you should return > it to the sender immediately and delete your copy from your system. > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 14202 bytes Desc: not available URL: From dalcinl at gmail.com Wed Jul 15 11:23:19 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 15 Jul 2009 13:23:19 -0300 Subject: hypre preconditioners In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F783@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F783@MAR150CV1.marin.local> Message-ID: Did you try Block-Jacobi for the velocity problem? If the matrix of your presure problem changes in each solve (is this your case?) could you try to use ML? In my little experience, ML leads to lower setup times, but higher iteration counts (let say twice); perhaps it will be faster than BommerAMG for you use case. On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan wrote: > Barry, > > Thanks for your reply! Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners. 
> [rest of the quoted message trimmed; it repeats the KSPView and -log_summary listings and the earlier exchange already shown above]

-- 
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594
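All of the preconditioners compared in this thread (Jacobi, hypre Euclid, hypre BoomerAMG, and the ML alternative suggested above) are selected from the PETSc options database, so switching between them needs no code changes. The following is a minimal sketch of an options-driven pressure solver, assuming a PETSc 2.3.3/3.0-era build configured with hypre (and ML, if used); the routine name CreatePressureSolver and the variables A_p and ksp_p are placeholders, not code from this thread.

#include "petscksp.h"

/* Sketch: let the Krylov method and preconditioner for the pressure
   equation be chosen at run time, e.g.
     -ksp_type cg -pc_type jacobi
     -ksp_type cg -pc_type hypre -pc_hypre_type euclid
     -ksp_type cg -pc_type hypre -pc_hypre_type boomeramg
     -ksp_type cg -pc_type ml
   together with -ksp_view -log_summary as in the runs above. */
PetscErrorCode CreatePressureSolver(MPI_Comm comm, Mat A_p, KSP *ksp_p)
{
  PetscErrorCode ierr;

  ierr = KSPCreate(comm, ksp_p);CHKERRQ(ierr);
  /* Use the same matrix as operator and preconditioning matrix. */
  ierr = KSPSetOperators(*ksp_p, A_p, A_p, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  /* Pick up -ksp_type, -pc_type, -pc_hypre_type, tolerances, etc. */
  ierr = KSPSetFromOptions(*ksp_p);CHKERRQ(ierr);
  return 0;
}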
From Andreas.Grassl at student.uibk.ac.at Wed Jul 15 13:18:01 2009
From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl)
Date: Wed, 15 Jul 2009 20:18:01 +0200
Subject: Mumps speedup by passing symmetric matrix
Message-ID: <4A5E1D59.3000905@student.uibk.ac.at>

Hello,

I have solved the 64/32-bit issue; MUMPS now works and gives reasonable results, but I'm wondering whether I could get a speedup by exploiting the symmetry of the matrix. Setting only the option -mat_mumps_sym gives no change in runtime, and INFOG(8) returns 100. Setting MatSetOption(A,MAT_SYMMETRIC,PETSC_TRUE) gives no change either. Does MUMPS automatically recognize and exploit the symmetry?

Cheers,

ando

--
/"\  Grassl Andreas
\ /  ASCII Ribbon Campaign     Uni Innsbruck Institut f. Mathematik
 X   against HTML email        Technikerstr. 13 Zi 709
/ \                            +43 (0)512 507 6091

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 315 bytes
Desc: OpenPGP digital signature
URL: 

From bsmith at mcs.anl.gov Wed Jul 15 15:26:17 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 15 Jul 2009 15:26:17 -0500
Subject: hypre preconditioners
In-Reply-To: 
References: <5D9143EF9FADE942BEF6F2A636A861170800F783@MAR150CV1.marin.local>
Message-ID: <83E2B8C2-9475-45C6-A448-502114D4959D@mcs.anl.gov>

On Jul 15, 2009, at 11:23 AM, Lisandro Dalcin wrote:

> Did you try Block-Jacobi for the velocity problem?

You can try -pc_type sor and it will run block Jacobi with one symmetric sweep of SOR for each iteration. This may be faster than your plain Jacobi.

> If the matrix of your presure problem changes in each solve (is this your case?) could
> you try to use ML? In my little experience, ML leads to lower setup
> times, but higher iteration counts (let say twice); perhaps it will be
> faster than BommerAMG for you use case.

ML is worth trying. Also you might try "playing" with the various BoomerAMG options. I don't know them in detail so cannot make suggestions, but the various coarsening strategies control how expensive the setup is. Finally, if the matrix is not changing much for each new solve, you can reuse the same BoomerAMG preconditioner for several linear solves. Just use SAME_PRECONDITIONER as the argument to KSPSetOperators() and it will not create a new preconditioner until you call it with SAME_NONZERO_PATTERN. I am thinking this might work very well for you.

   Barry
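To make the reuse suggestion above concrete, here is a minimal sketch assuming the PETSc 2.3.3/3.0-era KSPSetOperators() signature (the one that takes a MatStructure flag); the routine name SolvePressure and the variables ksp_p, A_p, b, x, and rebuild_pc are placeholders, not code from this thread.

#include "petscksp.h"

/* Sketch: reuse an expensive preconditioner (e.g. BoomerAMG) across several
   pressure solves, rebuilding it only when the caller asks for it. */
PetscErrorCode SolvePressure(KSP ksp_p, Mat A_p, Vec b, Vec x, PetscTruth rebuild_pc)
{
  PetscErrorCode ierr;

  if (rebuild_pc) {
    /* Rebuild the preconditioner from the current matrix values;
       PCSetUp() (and the hypre setup) runs again on the next solve. */
    ierr = KSPSetOperators(ksp_p, A_p, A_p, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  } else {
    /* Keep the previously built preconditioner; only the operator used
       for the matrix-vector products is refreshed. */
    ierr = KSPSetOperators(ksp_p, A_p, A_p, SAME_PRECONDITIONER);CHKERRQ(ierr);
  }
  ierr = KSPSolve(ksp_p, b, x);CHKERRQ(ierr);
  return 0;
}

Whether this pays off depends on how much the pressure matrix changes between outer SIMPLE iterations; if the frozen preconditioner degrades the CG convergence too much, rebuilding it every few outer iterations rather than every iteration may be a better trade-off.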
>
>
> On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan wrote:
>> Barry,
>>
>> Thanks for your reply! Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners.
>>
>> [quoted KSPView and -log_summary listings for the Jacobi and Euclid runs, and the BoomerAMG KSPView, trimmed; they repeat the output shown earlier in this digest. The quoted BoomerAMG performance summary continues below.]
>>
>> ************************************************************************************************************************
>> ***             WIDEN YOUR WINDOW TO 120 CHARACTERS.
Use 'enscript >> -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance >> Summary: ---------------------------------------------- >> >> ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij >> Wed Jul 15 09:53:07 2009 >> Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 >> CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 >> >> Max Max/Min Avg Total >> Time (sec): 7.080e+02 1.00000 7.080e+02 >> Objects: 1.227e+03 1.00000 1.227e+03 >> Flops: 1.054e+10 1.00076 1.054e+10 2.107e+10 >> Flops/sec: 1.489e+07 1.00076 1.488e+07 2.977e+07 >> MPI Messages: 3.857e+03 1.00000 3.857e+03 7.714e+03 >> MPI Message Lengths: 1.135e+08 1.00000 2.942e+04 2.270e+08 >> MPI Reductions: 5.800e+03 1.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of >> length N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- >> Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> 0: Main Stage: 7.0799e+02 100.0% 2.1075e+10 100.0% 7.714e >> +03 100.0% 2.942e+04 100.0% 1.160e+04 100.0% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops/sec: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all >> processors >> Mess: number of messages sent >> Avg. len: average message length >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with >> PetscLogStagePush() and PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in >> this phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >> time over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was run without the PreLoadBegin() # >> # macros. To get timing results we always recommend # >> # preloading. otherwise timing numbers may be # >> # meaningless. 
# >> ########################################################## >> >> >> Event Count Time (sec) Flops/ >> sec --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg >> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> VecDot 3554 1.0 1.8220e+00 1.0 5.03e+08 1.0 0.0e+00 >> 0.0e+00 3.6e+03 0 9 0 0 31 0 9 0 0 31 1001 >> VecNorm 2327 1.0 6.7031e-01 1.0 9.34e+08 1.0 0.0e+00 >> 0.0e+00 2.3e+03 0 6 0 0 20 0 6 0 0 20 1781 >> VecCopy 1600 1.0 9.4440e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 3855 1.0 8.0550e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 4982 1.0 3.7953e+00 1.0 3.39e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 1 12 0 0 0 1 12 0 0 0 674 >> VecAYPX 2755 1.0 1.8270e+00 1.0 3.89e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 774 >> VecAssemblyBegin 1200 1.0 1.8679e-01 1.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 3.6e+03 0 0 0 0 31 0 0 0 0 31 0 >> VecAssemblyEnd 1200 1.0 1.7717e-03 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecPointwiseMult 4056 1.0 4.1344e+00 1.0 1.26e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 252 >> VecScatterBegin 3855 1.0 1.5116e-01 1.0 0.00e+00 0.0 7.7e+03 >> 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 >> VecScatterEnd 3855 1.0 7.3828e-01 2.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetup 600 1.0 5.1192e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 600 1.0 2.7194e+02 1.0 3.88e+07 1.0 7.7e+03 >> 2.9e+04 6.2e+03 38100100100 53 38100100100 53 77 >> PCSetUp 600 1.0 1.6630e+02 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 2.0e+02 23 0 0 0 2 23 0 0 0 2 0 >> PCApply 4355 1.0 7.3735e+01 1.0 7.06e+06 1.0 0.0e+00 >> 0.0e+00 1.0e+02 10 5 0 0 1 10 5 0 0 1 14 >> MatMult 2327 1.0 1.3706e+01 1.0 2.79e+08 1.0 4.7e+03 >> 2.9e+04 0.0e+00 2 36 60 60 0 2 36 60 60 0 557 >> MatMultTranspose 1528 1.0 8.6412e+00 1.0 3.13e+08 1.0 3.1e+03 >> 2.9e+04 0.0e+00 1 26 40 40 0 1 26 40 40 0 626 >> MatConvert 100 1.0 1.2962e+01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> MatAssemblyBegin 600 1.0 2.4579e+0096.9 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.2e+03 0 0 0 0 10 0 0 0 0 10 0 >> MatAssemblyEnd 600 1.0 2.5257e+00 1.0 0.00e+00 0.0 4.0e+00 >> 1.5e+04 6.1e+02 0 0 0 0 5 0 0 0 0 5 0 >> MatGetRow 12828800 1.0 5.2907e+00 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> MatGetRowIJ 200 1.0 1.7476e-04 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory >> Descendants' Mem. 
>> >> --- Event Stage 0: Main Stage >> >> Index Set 4 4 30272 0 >> Vec 1213 1202 1234223216 0 >> Vec Scatter 2 0 0 0 >> Krylov Solver 1 0 0 0 >> Preconditioner 1 0 0 0 >> Matrix 6 0 0 0 >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> Average time to get PetscTime(): 1.90735e-07 >> Average time for MPI_Barrier(): 8.10623e-07 >> Average time for zero size MPI_Send(): 1.95503e-05 >> OptionTable: -log_summary >> >> >> >> >> -----Original Message----- >> Date: Tue, 14 Jul 2009 10:42:58 -0500 >> From: Barry Smith >> Subject: Re: hypre preconditioners >> To: PETSc users list >> Message-ID: >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes >> >> >> First run the three cases with -log_summary (also -ksp_view to see >> exact solver options that are being used) and send those files. This >> will tell us where the time is being spent; without this information >> any comments are pure speculation. (For example, the "copy" time to >> hypre format is trivial compared to the time to build a hypre >> preconditioner and not the problem). >> >> >> What you report is not uncommon; the setup and per iteration cost >> of the hypre preconditioners will be much larger than the simpler >> Jacobi preconditioner. >> >> Barry >> >> On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote: >> >>> >>> I'm solving the steady incompressible Navier-Stokes equations >>> (discretized with FV on unstructured grids) using the SIMPLE >>> Pressure Correction method. I'm using Picard linearization and solve >>> the system for the momentum equations with BICG and for the pressure >>> equation with CG. Currently, for parallel runs, I'm using JACOBI as >>> a preconditioner. My grids typically have a few million cells and I >>> use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux >>> cluster). A significant portion of the CPU time goes into solving >>> the pressure equation. To reach the relative tolerance I need, CG >>> with JACOBI takes about 100 iterations per outer loop for these >>> problems. >>> >>> In order to reduce CPU time, I've compiled PETSc with support for >>> Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a >>> preconditioner for the pressure equation. With default settings, >>> both BoomerAMG and Euclid greatly reduce the number of iterations: >>> with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. >>> However, I do not get any reduction in CPU time. With Euclid, CPU >>> time is similar to JACOBI and with BoomerAMG it is approximately >>> doubled. >>> >>> Is this what one can expect? Are BoomerAMG and Euclid meant for much >>> larger problems? I understand Hypre uses a different matrix storage >>> format, is CPU time 'lost in translation' between PETSc and Hypre >>> for these small problems? Are there maybe any settings I should >>> change? >>> >>> Chris >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> dr. ir. Christiaan Klaij >>> CFD Researcher >>> Research & Development >>> MARIN >>> 2, Haagsteeg >>> c.klaij at marin.nl >>> P.O. 
Box 28 >>> T +31 317 49 39 11 >>> 6700 AA Wageningen >>> F +31 317 49 32 45 >>> T +31 317 49 33 44 >>> The Netherlands >>> I www.marin.nl >>> >>> >>> MARIN webnews: First AMT'09 conference, Nantes, France, September >>> 1-2 >>> >>> >>> This e-mail may be confidential, privileged and/or protected by >>> copyright. If you are not the intended recipient, you should return >>> it to the sender immediately and delete your copy from your system. >>> >> > > > > -- > Lisandro Dalc?n > --------------- > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) > PTLC - G?emes 3450, (3000) Santa Fe, Argentina > Tel/Fax: +54-(0)342-451.1594 From C.Klaij at marin.nl Thu Jul 16 01:47:27 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 16 Jul 2009 08:47:27 +0200 Subject: hypre preconditioners References: Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F787@MAR150CV1.marin.local> Lisandro, Thanks for your response! The velocity problem is segregated (I use BICG with Jacobi for the 3 linear systems) but these need (much) less iterations than the pressure problem. The pressure matrix changes at each solve. Also, I did try ML and, like you say, it needs about two times more iterations than boomerAMG. Overall, boomerAMG is a bit faster for my cases than ML. Chris -----Original Message----- Date: Wed, 15 Jul 2009 13:23:19 -0300 From: Lisandro Dalcin Subject: Re: hypre preconditioners To: PETSc users list Message-ID: Content-Type: text/plain; charset=ISO-8859-1 Did you try Block-Jacobi for the velocity problem? If the matrix of your presure problem changes in each solve (is this your case?) could you try to use ML? In my little experience, ML leads to lower setup times, but higher iteration counts (let say twice); perhaps it will be faster than BommerAMG for you use case. On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan wrote: > Barry, > > Thanks for your reply! Below is the information from KSPView and -log_summary for the three cases. Indeed PCSetUp takes much more time with the hypre preconditioners. > > Chris > > ----------------------------- > --- Jacobi preconditioner --- > ----------------------------- > > KSP Object: > ?type: cg > ?maximum iterations=500 > ?tolerances: ?relative=0.05, absolute=1e-50, divergence=10000 > ?left preconditioning > PC Object: > ?type: jacobi > ?linear system matrix = precond matrix: > ?Matrix Object: > ? ?type=mpiaij, rows=256576, cols=256576 > ? ?total: nonzeros=1769552, allocated nonzeros=1769552 > ? ? ?not using I-node (on process 0) routines > > ************************************************************************************************************************ > *** ? ? ? ? ? ? WIDEN YOUR WINDOW TO 120 CHARACTERS. ?Use 'enscript -r -fCourier9' to print this document ? ? ? ? ? ?*** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- > > ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij Wed Jul 15 10:22:04 2009 > Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 > > ? ? ? ? ? ? ? ? ? ? ? ? Max ? ? ? Max/Min ? ? ? ?Avg ? ? ?Total > Time (sec): ? ? ? ? ? 6.037e+02 ? ? ?1.00000 ? 6.037e+02 > Objects: ? ? ? ? ? ? 
9.270e+02      1.00000   9.270e+02
>
> [... the full KSPView and -log_summary listings for the Jacobi, Hypre Euclid
> and Hypre BoomerAMG runs, together with Barry's reply of Tue, 14 Jul 2009,
> were quoted again in full here; they are identical to the copies earlier in
> this thread ...]
>
> On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote:
>
>>
>> I'm solving the steady incompressible Navier-Stokes equations
>> (discretized with FV on unstructured grids) using the SIMPLE
>> Pressure Correction method.
I'm using Picard linearization and solve >> the system for the momentum equations with BICG and for the pressure >> equation with CG. Currently, for parallel runs, I'm using JACOBI as >> a preconditioner. My grids typically have a few million cells and I >> use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux >> cluster). A significant portion of the CPU time goes into solving >> the pressure equation. To reach the relative tolerance I need, CG >> with JACOBI takes about 100 iterations per outer loop for these >> problems. >> >> In order to reduce CPU time, I've compiled PETSc with support for >> Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a >> preconditioner for the pressure equation. With default settings, >> both BoomerAMG and Euclid greatly reduce the number of iterations: >> with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. >> However, I do not get any reduction in CPU time. With Euclid, CPU >> time is similar to JACOBI and with BoomerAMG it is approximately >> doubled. >> >> Is this what one can expect? Are BoomerAMG and Euclid meant for much >> larger problems? I understand Hypre uses a different matrix storage >> format, is CPU time 'lost in translation' between PETSc and Hypre >> for these small problems? Are there maybe any settings I should >> change? >> >> Chris >> >> >> >> >> >> >> >> >> >> dr. ir. Christiaan Klaij >> CFD Researcher >> Research & Development >> MARIN >> 2, Haagsteeg >> c.klaij at marin.nl >> P.O. Box 28 >> T +31 317 49 39 11 >> 6700 AA ?Wageningen >> F +31 317 49 32 45 >> T ?+31 317 49 33 44 >> The Netherlands >> I ?www.marin.nl >> >> >> MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2 >> >> >> This e-mail may be confidential, privileged and/or protected by >> copyright. If you are not the intended recipient, you should return >> it to the sender immediately and delete your copy from your system. >> > -- Lisandro Dalc?n --------------- Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC) Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC) Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET) PTLC - G?emes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 15358 bytes Desc: not available URL: From jed at 59A2.org Thu Jul 16 03:06:48 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 16 Jul 2009 10:06:48 +0200 Subject: hypre preconditioners In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F787@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F787@MAR150CV1.marin.local> Message-ID: <4A5EDF98.7090508@59A2.org> Klaij, Christiaan wrote: > The velocity problem is segregated (I use BICG with Jacobi for the 3 > linear systems) but these need (much) less iterations than the pressure > problem. The pressure matrix changes at each solve. It may change, but it might still make a good preconditioner for several time steps. How many dofs are in your pressure system? You mentioned a few million cells, but it makes a huge difference whether you are using tets vs. hexes, and what the pressure space is. If the pressure space is around 1M dofs, the system is relatively well-conditioned to converge in only 100 iterations with Jacobi which means that you stand a good chance of getting acceptable performance from a 1-level DD preconditioner (block jacobi or small-overlap additive Schwarz). 
So try Barry's suggestion of -pc_type sor and also -pc_type asm with a
few choices of subdomain solver (-sub_pc_type).

> Also, I did try ML and, like you say, it needs about two times more
> iterations than boomerAMG. Overall, boomerAMG is a bit faster for my
> cases than ML.

To speed up Hypre, I've found these options to be especially useful.

-pc_hypre_boomeramg_strong_threshold
  defaults to 0.25 which is good for 2D scalar problems, change to 0.5
  or above for 3D problems

-pc_hypre_boomeramg_agg_nl
  set this greater than 0 to use aggressive coarsening

However, I almost always find ML to be faster.  By default, it uses way
more levels than you want (often making the coarse level have only 1 dof
instead of around 1000) so try reducing -pc_ml_maxNlevels.

Jed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
URL: 

From tchouanm at msn.com  Thu Jul 16 04:44:41 2009
From: tchouanm at msn.com (STEPHANE TCHOUANMO)
Date: Thu, 16 Jul 2009 11:44:41 +0200
Subject: Petsc command to get the conditioning of matrices
In-Reply-To: 
References: 
Message-ID: 

Dear all,

I solve a non-linear problem in Petsc using the classical Newton method.
I give to Petsc the Jacobian matrix and residuals at each Newton iteration.
Is there a way to get the conditioning of my Jacobian matrices? Like a
command when running my executable (-ksp_.. or -snes_.. or -pc_..)

Thanks.

Stephane

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From knepley at gmail.com  Thu Jul 16 05:39:45 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 16 Jul 2009 05:39:45 -0500
Subject: Petsc command to get the conditioning of matrices
In-Reply-To: 
References: 
Message-ID: 

You can try -ksp_monitor_singular_value.

  Matt

On Thu, Jul 16, 2009 at 4:44 AM, STEPHANE TCHOUANMO wrote:

> Dear all,
>
> I solve a non-linear problem in Petsc using the classical Newton
> method. I give to Petsc the Jacobian matrix and residuals at each Newton
> iteration.
> Is there a way to get the conditioning of my Jacobian matrices? Like a
> command when running my executable (-ksp_.. or -snes_.. or -pc_..)
>
> Thanks.
>
> Stephane
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From jed at 59A2.org  Thu Jul 16 05:44:36 2009
From: jed at 59A2.org (Jed Brown)
Date: Thu, 16 Jul 2009 12:44:36 +0200
Subject: Petsc command to get the conditioning of matrices
In-Reply-To: 
References: 
Message-ID: <4A5F0494.2090205@59A2.org>

STEPHANE TCHOUANMO wrote:
> Is there a way to get the conditioning of my Jacobian matrices? Like a
> command when running my executable (-ksp_.. or -snes_.. or -pc_..)

I use

-ksp_monitor_singular_value
  estimate at every Krylov iteration, only works with GMRES and CG

-ksp_compute_eigenvalues
  estimate of a few eigenvalues from the iteration (GMRES and CG)

-ksp_compute_eigenvalues_explicitly
  sometimes useful for very small systems, e.g. to find the size of a
  null space while debugging

-ksp_plot_eigenvalues_explicitly
  again, only for very small problems

Note that these eigenvalues and singular values are not reliable for
eigen-analysis, they are only intended to help understand why iterative
methods are working a certain way.  If you care about accurate
eigen/singular values, use SLEPc.

I noticed a few weeks ago that -ksp_compute_singularvalues didn't work
as advertised.  A few lines needed to be added to KSPSolve(), as for
-ksp_compute_eigenvalues.  I usually use -ksp_monitor_singular_value
instead, so it hadn't bothered me, but it's now in petsc-dev:

http://petsc.cs.iit.edu/petsc/petsc-dev/rev/3180aa7f49b4

Jed

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 260 bytes
Desc: OpenPGP digital signature
URL: 
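For readers who prefer the API to the command-line options above, a minimal
sketch of the programmatic route to the same rough estimate follows.  This is
not code from the thread: the routine name is illustrative, and it only gives
the iterative estimate Jed describes (GMRES or CG), not accurate singular
values.

    /* Sketch: condition-number estimate from the Krylov iteration, the API
       analogue of -ksp_monitor_singular_value.  Only an estimate of the
       preconditioned operator, and only meaningful for GMRES/CG. */
    #include "petscksp.h"

    PetscErrorCode EstimateConditionNumber(KSP ksp, Vec b, Vec x)
    {
      PetscErrorCode ierr;
      PetscReal      emax, emin;

      /* ask the KSP to keep the information needed for the estimate */
      ierr = KSPSetComputeSingularValues(ksp, PETSC_TRUE);CHKERRQ(ierr);
      ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
      /* extreme singular value estimates after the solve */
      ierr = KSPComputeExtremeSingularValues(ksp, &emax, &emin);CHKERRQ(ierr);
      ierr = PetscPrintf(PETSC_COMM_WORLD, "condition number estimate: %g\n",
                         (double)(emax/emin));CHKERRQ(ierr);
      return 0;
    }
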
From C.Klaij at marin.nl  Thu Jul 16 09:20:17 2009
From: C.Klaij at marin.nl (Klaij, Christiaan)
Date: Thu, 16 Jul 2009 16:20:17 +0200
Subject: hypre preconditioners
References: 
Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F789@MAR150CV1.marin.local>

Barry,

Thanks for your suggestions, I especially like the idea of keeping the
same preconditioner for several solves; that's definitely worth a try.

Chris

-----Original Message-----
Date: Wed, 15 Jul 2009 15:26:17 -0500
From: Barry Smith
Subject: Re: hypre preconditioners
To: PETSc users list
Message-ID: <83E2B8C2-9475-45C6-A448-502114D4959D at mcs.anl.gov>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed; delsp=yes


On Jul 15, 2009, at 11:23 AM, Lisandro Dalcin wrote:

> Did you try Block-Jacobi for the velocity problem?

    You can try -pc_type sor and it will run block Jacobi with one
symmetric sweep of SOR for each iteration. This may be faster than your
plain Jacobi.

> If the matrix of
> your pressure problem changes in each solve (is this your case?) could
> you try to use ML? In my little experience, ML leads to lower setup
> times, but higher iteration counts (let's say twice); perhaps it will be
> faster than BoomerAMG for your use case.

    ML is worth trying. Also you might try "playing" with the various
boomerAMG options. I don't know them in detail so cannot make
suggestions, but the various ways of coarsening control how quick the
setup is.

    Finally, if the matrix is not changing much for each new solve you
can use the same boomerAMG preconditioner for several linear solves.
Just use SAME_PRECONDITIONER as the argument to KSPSetOperators() and it
will not create a new preconditioner until you call it with
SAME_NONZERO_PATTERN. I am thinking this might work very well for you.

    Barry
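A minimal sketch of how the reuse Barry suggests might look in the calling
code, assuming the 2.3.x/3.0-era KSPSetOperators() signature with a
MatStructure flag; the lag counter and the routine name are illustrative, not
from the thread.  The point is simply that SAME_PRECONDITIONER skips the
expensive PCSetUp (for example the BoomerAMG setup) on that solve.

    /* Sketch: rebuild the preconditioner only every `lag` pressure solves,
       and reuse it (SAME_PRECONDITIONER) in between. */
    #include "petscksp.h"

    PetscErrorCode SolvePressure(KSP ksp, Mat A, Vec rhs, Vec p,
                                 PetscInt outer, PetscInt lag)
    {
      PetscErrorCode ierr;
      MatStructure   flag;

      /* rebuild when the lag interval expires, otherwise keep the old PC */
      flag = (lag <= 1 || outer % lag == 0) ? SAME_NONZERO_PATTERN
                                            : SAME_PRECONDITIONER;
      ierr = KSPSetOperators(ksp, A, A, flag);CHKERRQ(ierr);
      ierr = KSPSolve(ksp, rhs, p);CHKERRQ(ierr);
      return 0;
    }
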
>
>
> On Wed, Jul 15, 2009 at 5:58 AM, Klaij, Christiaan
> wrote:
>> Barry,
>>
>> Thanks for your reply! Below is the information from KSPView and
>> -log_summary for the three cases. Indeed PCSetUp takes much more
>> time with the hypre preconditioners.
>>
>> Chris
>>
>> [... the full KSPView and -log_summary listings for the Jacobi and
>> Hypre Euclid runs were quoted again here; they are identical to the
>> copies earlier in this thread ...]
>>
>> Memory usage is given in bytes:
>>
>> Object Type          Creations
Destructions Memory >> Descendants' Mem. >> >> --- Event Stage 0: Main Stage >> >> Index Set 4 4 30272 0 >> Vec 1213 1202 1234223216 0 >> Vec Scatter 2 0 0 0 >> Krylov Solver 1 0 0 0 >> Preconditioner 1 0 0 0 >> Matrix 6 0 0 0 >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> Average time to get PetscTime(): 2.14577e-07 >> Average time for MPI_Barrier(): 3.8147e-07 >> Average time for zero size MPI_Send(): 1.39475e-05 >> >> >> >> >> -------------------------------------- >> --- Hypre BoomerAMG preconditioner --- >> -------------------------------------- >> >> KSP Object: >> type: cg >> maximum iterations=500 >> tolerances: relative=0.05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: hypre >> HYPRE BoomerAMG preconditioning >> HYPRE BoomerAMG: Cycle type V >> HYPRE BoomerAMG: Maximum number of levels 25 >> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 >> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 >> HYPRE BoomerAMG: Threshold for strong coupling 0.25 >> HYPRE BoomerAMG: Interpolation truncation factor 0 >> HYPRE BoomerAMG: Interpolation: max elements per row 0 >> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 >> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 >> HYPRE BoomerAMG: Maximum row sums 0.9 >> HYPRE BoomerAMG: Sweeps down 1 >> HYPRE BoomerAMG: Sweeps up 1 >> HYPRE BoomerAMG: Sweeps on coarse 1 >> HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi >> HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi >> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination >> HYPRE BoomerAMG: Relax weight (all) 1 >> HYPRE BoomerAMG: Outer relax weight (all) 1 >> HYPRE BoomerAMG: Using CF-relaxation >> HYPRE BoomerAMG: Measure type local >> HYPRE BoomerAMG: Coarsen type Falgout >> HYPRE BoomerAMG: Interpolation type classical >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=256576, cols=256576 >> total: nonzeros=1769552, allocated nonzeros=1769552 >> not using I-node (on process 0) routines >> >> ************************************************************************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript >> -r -fCourier9' to print this document *** >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance >> Summary: ---------------------------------------------- >> >> ./fresco on a linux_32_ named lin0077 with 2 processors, by cklaij >> Wed Jul 15 09:53:07 2009 >> Using Petsc Release Version 2.3.3, Patch 13, Thu May 15 17:29:26 >> CDT 2008 HG revision: 4466c6289a0922df26e20626fd4a0b4dd03c8124 >> >> Max Max/Min Avg Total >> Time (sec): 7.080e+02 1.00000 7.080e+02 >> Objects: 1.227e+03 1.00000 1.227e+03 >> Flops: 1.054e+10 1.00076 1.054e+10 2.107e+10 >> Flops/sec: 1.489e+07 1.00076 1.488e+07 2.977e+07 >> MPI Messages: 3.857e+03 1.00000 3.857e+03 7.714e+03 >> MPI Message Lengths: 1.135e+08 1.00000 2.942e+04 2.270e+08 >> MPI Reductions: 5.800e+03 1.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of >> length N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- >> Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> 0: Main Stage: 7.0799e+02 100.0% 2.1075e+10 100.0% 7.714e >> +03 100.0% 2.942e+04 100.0% 1.160e+04 100.0% >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops/sec: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all >> processors >> Mess: number of messages sent >> Avg. len: average message length >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with >> PetscLogStagePush() and PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in >> this phase >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max >> time over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was run without the PreLoadBegin() # >> # macros. To get timing results we always recommend # >> # preloading. otherwise timing numbers may be # >> # meaningless. 
# >> ########################################################## >> >> >> Event Count Time (sec) Flops/ >> sec --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg >> len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> VecDot 3554 1.0 1.8220e+00 1.0 5.03e+08 1.0 0.0e+00 >> 0.0e+00 3.6e+03 0 9 0 0 31 0 9 0 0 31 1001 >> VecNorm 2327 1.0 6.7031e-01 1.0 9.34e+08 1.0 0.0e+00 >> 0.0e+00 2.3e+03 0 6 0 0 20 0 6 0 0 20 1781 >> VecCopy 1600 1.0 9.4440e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 3855 1.0 8.0550e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 4982 1.0 3.7953e+00 1.0 3.39e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 1 12 0 0 0 1 12 0 0 0 674 >> VecAYPX 2755 1.0 1.8270e+00 1.0 3.89e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 0 7 0 0 0 0 7 0 0 0 774 >> VecAssemblyBegin 1200 1.0 1.8679e-01 1.8 0.00e+00 0.0 0.0e+00 >> 0.0e+00 3.6e+03 0 0 0 0 31 0 0 0 0 31 0 >> VecAssemblyEnd 1200 1.0 1.7717e-03 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecPointwiseMult 4056 1.0 4.1344e+00 1.0 1.26e+08 1.0 0.0e+00 >> 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 252 >> VecScatterBegin 3855 1.0 1.5116e-01 1.0 0.00e+00 0.0 7.7e+03 >> 2.9e+04 0.0e+00 0 0100100 0 0 0100100 0 0 >> VecScatterEnd 3855 1.0 7.3828e-01 2.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSetup 600 1.0 5.1192e-01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 600 1.0 2.7194e+02 1.0 3.88e+07 1.0 7.7e+03 >> 2.9e+04 6.2e+03 38100100100 53 38100100100 53 77 >> PCSetUp 600 1.0 1.6630e+02 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 2.0e+02 23 0 0 0 2 23 0 0 0 2 0 >> PCApply 4355 1.0 7.3735e+01 1.0 7.06e+06 1.0 0.0e+00 >> 0.0e+00 1.0e+02 10 5 0 0 1 10 5 0 0 1 14 >> MatMult 2327 1.0 1.3706e+01 1.0 2.79e+08 1.0 4.7e+03 >> 2.9e+04 0.0e+00 2 36 60 60 0 2 36 60 60 0 557 >> MatMultTranspose 1528 1.0 8.6412e+00 1.0 3.13e+08 1.0 3.1e+03 >> 2.9e+04 0.0e+00 1 26 40 40 0 1 26 40 40 0 626 >> MatConvert 100 1.0 1.2962e+01 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 >> MatAssemblyBegin 600 1.0 2.4579e+0096.9 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.2e+03 0 0 0 0 10 0 0 0 0 10 0 >> MatAssemblyEnd 600 1.0 2.5257e+00 1.0 0.00e+00 0.0 4.0e+00 >> 1.5e+04 6.1e+02 0 0 0 0 5 0 0 0 0 5 0 >> MatGetRow 12828800 1.0 5.2907e+00 1.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 >> MatGetRowIJ 200 1.0 1.7476e-04 1.1 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory >> Descendants' Mem. 
>> >> --- Event Stage 0: Main Stage >> >> Index Set 4 4 30272 0 >> Vec 1213 1202 1234223216 0 >> Vec Scatter 2 0 0 0 >> Krylov Solver 1 0 0 0 >> Preconditioner 1 0 0 0 >> Matrix 6 0 0 0 >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> = >> ===================================================================== >> Average time to get PetscTime(): 1.90735e-07 >> Average time for MPI_Barrier(): 8.10623e-07 >> Average time for zero size MPI_Send(): 1.95503e-05 >> OptionTable: -log_summary >> >> >> >> >> -----Original Message----- >> Date: Tue, 14 Jul 2009 10:42:58 -0500 >> From: Barry Smith >> Subject: Re: hypre preconditioners >> To: PETSc users list >> Message-ID: >> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes >> >> >> First run the three cases with -log_summary (also -ksp_view to see >> exact solver options that are being used) and send those files. This >> will tell us where the time is being spent; without this information >> any comments are pure speculation. (For example, the "copy" time to >> hypre format is trivial compared to the time to build a hypre >> preconditioner and not the problem). >> >> >> What you report is not uncommon; the setup and per iteration cost >> of the hypre preconditioners will be much larger than the simpler >> Jacobi preconditioner. >> >> Barry >> >> On Jul 14, 2009, at 3:36 AM, Klaij, Christiaan wrote: >> >>> >>> I'm solving the steady incompressible Navier-Stokes equations >>> (discretized with FV on unstructured grids) using the SIMPLE >>> Pressure Correction method. I'm using Picard linearization and solve >>> the system for the momentum equations with BICG and for the pressure >>> equation with CG. Currently, for parallel runs, I'm using JACOBI as >>> a preconditioner. My grids typically have a few million cells and I >>> use between 4 and 16 cores (1 to 4 quadcore CPUs on a linux >>> cluster). A significant portion of the CPU time goes into solving >>> the pressure equation. To reach the relative tolerance I need, CG >>> with JACOBI takes about 100 iterations per outer loop for these >>> problems. >>> >>> In order to reduce CPU time, I've compiled PETSc with support for >>> Hypre and I'm looking at BoomerAMG and Euclid to replace JACOBI as a >>> preconditioner for the pressure equation. With default settings, >>> both BoomerAMG and Euclid greatly reduce the number of iterations: >>> with BoomerAMG 1 or 2 iterations are enough, with Euclid about 10. >>> However, I do not get any reduction in CPU time. With Euclid, CPU >>> time is similar to JACOBI and with BoomerAMG it is approximately >>> doubled. >>> >>> Is this what one can expect? Are BoomerAMG and Euclid meant for much >>> larger problems? I understand Hypre uses a different matrix storage >>> format, is CPU time 'lost in translation' between PETSc and Hypre >>> for these small problems? Are there maybe any settings I should >>> change? >>> >>> Chris >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> dr. ir. Christiaan Klaij >>> CFD Researcher >>> Research & Development >>> MARIN >>> 2, Haagsteeg >>> c.klaij at marin.nl >>> P.O. 
Box 28
>>> T +31 317 49 39 11
>>> 6700 AA Wageningen
>>> F +31 317 49 32 45
>>> T +31 317 49 33 44
>>> The Netherlands
>>> I www.marin.nl
>>>
>>> MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2
>>>
>>> This e-mail may be confidential, privileged and/or protected by
>>> copyright. If you are not the intended recipient, you should return
>>> it to the sender immediately and delete your copy from your system.

From ycollet at freesurf.fr Thu Jul 16 14:54:53 2009
From: ycollet at freesurf.fr (Collette Yann)
Date: Thu, 16 Jul 2009 21:54:53 +0200
Subject: Petsc-3 MPIless
Message-ID: <4A5F858D.1040701@freesurf.fr>

Hello,

I am currently interfacing petsc/snes to scilab (http://www.scilab.org).
I worked using petsc-2.3.3-p15 and everything is nearly fine (I had some
convergence problems, but that's not really important).
My petsc-2.3.3 is configured without MPI: config/configure.py --with-mpi=0 --enable-shared

Now, I would like to switch to petsc-3, so I configured petsc-3 using the
same command line as above.

The problem I meet is that petsc-3 still requires MPI.
Is it possible to compile petsc-3 without MPI?

Cheers,

YC

From bsmith at mcs.anl.gov Thu Jul 16 14:57:19 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 16 Jul 2009 14:57:19 -0500
Subject: Petsc-3 MPIless
In-Reply-To: <4A5F858D.1040701@freesurf.fr>
References: <4A5F858D.1040701@freesurf.fr>
Message-ID: <6E6C11DE-E38D-4342-ABE3-2FE564A7BB19@mcs.anl.gov>

PETSc 3.0 does not require MPI, in the same way that 2.3.3 does not require
MPI. You should be able to use the same configure options as before. If that
does not work, please send the configure.log that is generated to
petsc-maint at mcs.anl.gov (not to this email address).

Barry

On Jul 16, 2009, at 2:54 PM, Collette Yann wrote:

> Hello,
>
> I am currently interfacing petsc/snes to scilab (http://www.scilab.org).
> I worked using petsc-2.3.3-p15 and everything is nearly fine (I had
> some convergenge problem, but that's not really important).
> My petsc-2.3.3 is configure without mpi: config/configure.py --with-mpi=0 --enable-shared
>
> Now, I would like to switch to petsc-3. So I configured petsc-3
> using the same command line as above.
>
> The problem I meet is that petsc-3 still required mpi.
> Is it possible to compilte petsc-3 without mpi ?
>
> Cheers,
>
> YC

From C.Klaij at marin.nl Fri Jul 17 02:54:41 2009
From: C.Klaij at marin.nl (Klaij, Christiaan)
Date: Fri, 17 Jul 2009 09:54:41 +0200
Subject: hypre preconditioners
References:
Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F78A@MAR150CV1.marin.local>

Jed,

I'm using a cell-centered discretization on hexahedral grids. I'll try sor and
asm and also changing the default settings in boomeramg and ML. You say you
find ML almost always faster, by how much? Thanks for your help!

Chris

dr. ir. Christiaan Klaij
CFD Researcher
Research & Development
mailto:C.Klaij at marin.nl
T +31 317 49 33 44
MARIN
2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands
T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/
http://www.marin.nl/web/show/id=46836/contentid=2324
First AMT'09 conference, Nantes, France, September 1-2

This e-mail may be confidential, privileged and/or protected by copyright.
If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. -----Original Message----- Date: Thu, 16 Jul 2009 10:06:48 +0200 From: Jed Brown Subject: Re: hypre preconditioners To: PETSc users list Message-ID: <4A5EDF98.7090508 at 59A2.org> Content-Type: text/plain; charset="iso-8859-1" Klaij, Christiaan wrote: > The velocity problem is segregated (I use BICG with Jacobi for the 3 > linear systems) but these need (much) less iterations than the pressure > problem. The pressure matrix changes at each solve. It may change, but it might still make a good preconditioner for several time steps. How many dofs are in your pressure system? You mentioned a few million cells, but it makes a huge difference whether you are using tets vs. hexes, and what the pressure space is. If the pressure space is around 1M dofs, the system is relatively well-conditioned to converge in only 100 iterations with Jacobi which means that you stand a good chance of getting acceptable performance from a 1-level DD preconditioner (block jacobi or small-overlap additive Schwarz). So try Barry's suggestion of -pc_type sor and also -pc_type asm with a few choices of subdomain solver (-sub_pc_type). > Also, I did try ML and, like you say, it needs about two times more > iterations than boomerAMG. Overall, boomerAMG is a bit faster for my > cases than ML. To speed up Hypre, I've found these options to be especially useful. -pc_hypre_boomeramg_strong_threshold defaults to 0.25 which is good for 2D scalar problems, change to 0.5 or above for 3D problems -pc_hypre_boomeramg_agg_nl set this greater than 0 to use aggressive coarsening However, I almost always find ML to be faster. By default, it uses way more levels than you want (often making the coarse level have only 1 dof instead of around 1000) so try reducing -pc_ml_maxNlevels. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/ms-tnef Size: 4122 bytes Desc: not available URL: From C.Klaij at marin.nl Fri Jul 17 04:52:48 2009 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Fri, 17 Jul 2009 11:52:48 +0200 Subject: call PetscOptionsSetValue in fortran Message-ID: <5D9143EF9FADE942BEF6F2A636A861170800F78B@MAR150CV1.marin.local> I'm trying to change the options of the Hypre preconditioner using PetscOptionsSetValue in a fortran program, but I must be doing something wrong, see the session below. It works fine from the command line, though. As an example, I took ex12f from src/ksp/ksp/examples/tests (petsc-2.3.3-p13) and modified it a little. $ cat ex12f.F ! program main implicit none #include "include/finclude/petsc.h" #include "include/finclude/petscvec.h" #include "include/finclude/petscmat.h" #include "include/finclude/petscpc.h" #include "include/finclude/petscksp.h" #include "include/finclude/petscviewer.h" ! ! This example is the Fortran version of ex6.c. The program reads a PETSc matrix ! and vector from a file and solves a linear system. Input arguments are: ! -f : file to load. For a 5X5 example of the 5-pt. stencil ! use the file petsc/src/mat/examples/matbinary.ex ! PetscErrorCode ierr PetscInt its PetscTruth flg PetscScalar norm,none Vec x,b,u Mat A character*(128) f PetscViewer fd MatInfo info(MAT_INFO_SIZE) KSP ksp ! cklaij: adding pc PC pc ! cklaij: adding pc end none = -1.0 call PetscInitialize(PETSC_NULL_CHARACTER,ierr) ! 
Read in matrix and RHS call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',f,flg,ierr) print *,f call PetscViewerBinaryOpen(PETSC_COMM_WORLD,f,FILE_MODE_READ, & & fd,ierr) call MatLoad(fd,MATSEQAIJ,A,ierr) ! Get information about matrix call MatGetInfo(A,MAT_GLOBAL_SUM,info,ierr) write(*,100) info(MAT_INFO_ROWS_GLOBAL), & & info(MAT_INFO_COLUMNS_GLOBAL), & & info(MAT_INFO_ROWS_LOCAL),info(MAT_INFO_COLUMNS_LOCAL), & & info(MAT_INFO_BLOCK_SIZE),info(MAT_INFO_NZ_ALLOCATED), & & info(MAT_INFO_NZ_USED),info(MAT_INFO_NZ_UNNEEDED), & & info(MAT_INFO_MEMORY),info(MAT_INFO_ASSEMBLIES), & & info(MAT_INFO_MALLOCS) 100 format(11(g7.1,1x)) call VecLoad(fd,PETSC_NULL_CHARACTER,b,ierr) call PetscViewerDestroy(fd,ierr) ! Set up solution call VecDuplicate(b,x,ierr) call VecDuplicate(b,u,ierr) ! Solve system call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) call KSPSetFromOptions(ksp,ierr) ! cklaij: try boomeramg call KSPGetPC(ksp,pc,ierr) call PCSetType(pc,PCHYPRE,ierr) call PCHYPRESetType(pc,"boomeramg",ierr) call PetscOptionsSetValue & ("-pc_hypre_boomeramg_strong_threshold","0.5",ierr) ! cklaij: try boomeramg end call KSPSolve(ksp,b,x,ierr) ! Show result call MatMult(A,x,u,ierr) call VecAXPY(u,none,b,ierr) call VecNorm(u,NORM_2,norm,ierr) call KSPGetIterationNumber(ksp,its,ierr) print*, 'Number of iterations = ',its print*, 'Residual norm = ',norm ! Cleanup call KSPDestroy(ksp,ierr) call VecDestroy(b,ierr) call VecDestroy(x,ierr) call VecDestroy(u,ierr) call MatDestroy(A,ierr) call PetscFinalize(ierr) end $ ./ex12f -f ../../../../mat/examples/matbinary.ex -ksp_view | grep Threshold HYPRE BoomerAMG: Threshold for strong coupling 0.25 $ ./ex12f -f ../../../../mat/examples/matbinary.ex -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.5 -ksp_view | grep Threshold HYPRE BoomerAMG: Threshold for strong coupling 0.5 $ dr. ir. Christiaan Klaij CFD Researcher Research & Development mailto:C.Klaij at marin.nl T +31 317 49 33 44 MARIN 2, Haagsteeg, P.O. Box 28, 6700 AA Wageningen, The Netherlands T +31 317 49 39 11, F +31 317 49 32 45, I http://www.marin.nl/ http://www.marin.nl/web/show/id=46836/contentid=2324 First AMT'09 conference, Nantes, France, September 1-2 This e-mail may be confidential, privileged and/or protected by copyright. If you are not the intended recipient, you should return it to the sender immediately and delete your copy from your system. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 1069 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: image/jpeg Size: 1622 bytes Desc: not available URL: From s.kramer at imperial.ac.uk Fri Jul 17 07:44:09 2009 From: s.kramer at imperial.ac.uk (Stephan Kramer) Date: Fri, 17 Jul 2009 13:44:09 +0100 Subject: non-local values being dropped in MatSetValues Message-ID: <4A607219.3070609@imperial.ac.uk> Hello We've spend sometime debugging a problem were in the assembly of a parallel MPIAIJ matrix, some values that were created on a process other than the owner of the row seemed to disappear. I think I narrowed it down to what I think is a bug in MatSetValues_MPIAIJ, but please tell me if I'm wrong. The situation is the following: I'm calling MatSetValues with the flag ADD_VALUES and with matrix option MAT_IGNORE_ZERO_ENTRIES. 
I'm inserting multiple values at once, multiple columns and rows, so I provide a rank-2 matrix of values. As I'm calling this from fortran I'm also using MAT_COLUMN_ORIENTED. Now for provided rows that are not owned by the process, it jumps to mpiaij.c:394 (line numbers as in petsc-dev). On line 399, it checks for zero entries, but only checks the very first entry of the (non-owned) row. If however other entries of that same row are nonzero, the entire row is still dropped. Note that this is independent of row_oriented/column_oriented as line 396 does exactly the same. If I don't set the option MAT_IGNORE_ZERO_ENTRIES the problem disappears. In that case however we would either have to preallocate substantially more nonzeros, or complicate the matrix assembly in our code by taking out the zero entries ourselves and call MatSetValues for each entry seperately. Your help would be much appreciated, Cheers Stephan -- Stephan Kramer Applied Modelling and Computation Group, Department of Earth Science and Engineering, Imperial College London From bsmith at mcs.anl.gov Fri Jul 17 10:51:53 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 17 Jul 2009 10:51:53 -0500 Subject: call PetscOptionsSetValue in fortran In-Reply-To: <5D9143EF9FADE942BEF6F2A636A861170800F78B@MAR150CV1.marin.local> References: <5D9143EF9FADE942BEF6F2A636A861170800F78B@MAR150CV1.marin.local> Message-ID: <47364ED4-B1E5-4713-A784-CF74F52D16AD@mcs.anl.gov> You are calling KSPSetFromOptions() BEFORE setting the PC type to hypre and setting boomeramg and before you call PetscOptionsSetValue(). You should call PetscOptionsSetValue() then PCSetType() then PCHypreSetType() then KSPSetFromOptions(). Barry On Jul 17, 2009, at 4:52 AM, Klaij, Christiaan wrote: > > I'm trying to change the options of the Hypre preconditioner using > PetscOptionsSetValue in a fortran program, but I must be doing > something wrong, see the session below. It works fine from the > command line, though. As an example, I took ex12f from src/ksp/ksp/ > examples/tests (petsc-2.3.3-p13) and modified it a little. > > > $ cat ex12f.F > ! > program main > implicit none > > #include "include/finclude/petsc.h" > #include "include/finclude/petscvec.h" > #include "include/finclude/petscmat.h" > #include "include/finclude/petscpc.h" > #include "include/finclude/petscksp.h" > #include "include/finclude/petscviewer.h" > ! > ! This example is the Fortran version of ex6.c. The program reads > a PETSc matrix > ! and vector from a file and solves a linear system. Input > arguments are: > ! -f : file to load. For a 5X5 example of the 5- > pt. stencil > ! use the file petsc/src/mat/examples/ > matbinary.ex > ! > > PetscErrorCode ierr > PetscInt its > PetscTruth flg > PetscScalar norm,none > Vec x,b,u > Mat A > character*(128) f > PetscViewer fd > MatInfo info(MAT_INFO_SIZE) > KSP ksp > ! cklaij: adding pc > PC pc > ! cklaij: adding pc end > > none = -1.0 > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > ! Read in matrix and RHS > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',f,flg,ierr) > print *,f > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,f,FILE_MODE_READ, & > & fd,ierr) > > call MatLoad(fd,MATSEQAIJ,A,ierr) > > ! 
Get information about matrix > call MatGetInfo(A,MAT_GLOBAL_SUM,info,ierr) > write(*,100) > info(MAT_INFO_ROWS_GLOBAL), & > & > info(MAT_INFO_COLUMNS_GLOBAL), & > & > info(MAT_INFO_ROWS_LOCAL),info(MAT_INFO_COLUMNS_LOCAL), & > & > info(MAT_INFO_BLOCK_SIZE),info(MAT_INFO_NZ_ALLOCATED), & > & > info(MAT_INFO_NZ_USED),info(MAT_INFO_NZ_UNNEEDED), & > & > info(MAT_INFO_MEMORY),info(MAT_INFO_ASSEMBLIES), & > & info(MAT_INFO_MALLOCS) > > 100 format(11(g7.1,1x)) > call VecLoad(fd,PETSC_NULL_CHARACTER,b,ierr) > call PetscViewerDestroy(fd,ierr) > > ! Set up solution > call VecDuplicate(b,x,ierr) > call VecDuplicate(b,u,ierr) > > ! Solve system > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) > call KSPSetFromOptions(ksp,ierr) > ! cklaij: try boomeramg > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc,PCHYPRE,ierr) > call PCHYPRESetType(pc,"boomeramg",ierr) > call PetscOptionsSetValue > & ("-pc_hypre_boomeramg_strong_threshold","0.5",ierr) > ! cklaij: try boomeramg end > call KSPSolve(ksp,b,x,ierr) > > ! Show result > call MatMult(A,x,u,ierr) > call VecAXPY(u,none,b,ierr) > call VecNorm(u,NORM_2,norm,ierr) > call KSPGetIterationNumber(ksp,its,ierr) > print*, 'Number of iterations = ',its > print*, 'Residual norm = ',norm > > ! Cleanup > call KSPDestroy(ksp,ierr) > call VecDestroy(b,ierr) > call VecDestroy(x,ierr) > call VecDestroy(u,ierr) > call MatDestroy(A,ierr) > > call PetscFinalize(ierr) > end > > $ ./ex12f -f ../../../../mat/examples/matbinary.ex -ksp_view | grep > Threshold > HYPRE BoomerAMG: Threshold for strong coupling 0.25 > $ ./ex12f -f ../../../../mat/examples/matbinary.ex -pc_type hypre - > pc_hypre_type boomeramg -pc_hypre_boomeramg_strong_threshold 0.5 - > ksp_view | grep Threshold > HYPRE BoomerAMG: Threshold for strong coupling 0.5 > $ > > > > dr. ir. Christiaan Klaij > CFD Researcher > Research & Development > MARIN > 2, Haagsteeg > c.klaij at marin.nl > P.O. Box 28 > T +31 317 49 39 11 > 6700 AA Wageningen > F +31 317 49 32 45 > T +31 317 49 33 44 > The Netherlands > I www.marin.nl > > > MARIN webnews: First AMT'09 conference, Nantes, France, September 1-2 > > > This e-mail may be confidential, privileged and/or protected by > copyright. If you are not the intended recipient, you should return > it to the sender immediately and delete your copy from your system. > From vyan2000 at gmail.com Fri Jul 17 17:18:08 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Fri, 17 Jul 2009 18:18:08 -0400 Subject: About src/ksp/pc/impls/fieldsplit/fieldsplit.c Message-ID: Hi All, I have some difficulty of understanding the struct PC_FieldSplit. From the definition I can see that it has a data structure like a "list" and each member in the list is representing an object Field(or PC_Field). Suppose that my matrix has a blocksize of 5. I want to Set 3 Fields, i.e. field_0 as {0,1}, field_1 as{2,3},and field_3 as {4}. Then can anyone please help me to fill in the following parameter. PC_FieldSplit *jac = (PC_FieldSplit*)pc->data; what is jac->nsplit? 3, is my guess, since we have split the matrix into 3 split, namely field_0, field_1, field_2. PC_FieldSplitLink ilink = jac->head; what is ilink->nfields? 2, is my guess, since the fields_0 has 2 fields inside. Then ilink=ilink->next; ilink=ilin->next; ilink->nfileds should be 1 right? Thank you very much in advance. Yan -------------- next part -------------- An HTML attachment was scrubbed... 
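For reference, the three splits described in the question above are normally set up
through the public PCFieldSplit interface rather than by filling PC_FieldSplit or its
linked list by hand; Barry's reply below makes the same point. The following is a
minimal C sketch only: it assumes a KSP whose operator is the blocksize-5 MPIAIJ
matrix from the question, and it uses the petsc-3.0-era signature of
PCFieldSplitSetFields (later releases add a split-name argument, so check the manual
page for the version in use).

    #include "petscksp.h"

    /* Sketch only: split a blocksize-5 problem into fields {0,1}, {2,3} and {4}. */
    PetscErrorCode setup_three_splits(KSP ksp)
    {
      PetscErrorCode ierr;
      PC             pc;
      PetscInt       f01[2] = {0, 1}, f23[2] = {2, 3}, f4[1] = {4};

      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCFIELDSPLIT);CHKERRQ(ierr);
      ierr = PCFieldSplitSetBlockSize(pc, 5);CHKERRQ(ierr);
      ierr = PCFieldSplitSetFields(pc, 2, f01);CHKERRQ(ierr);   /* split 0: components 0,1 */
      ierr = PCFieldSplitSetFields(pc, 2, f23);CHKERRQ(ierr);   /* split 1: components 2,3 */
      ierr = PCFieldSplitSetFields(pc, 1, f4);CHKERRQ(ierr);    /* split 2: component 4    */
      return 0;
    }

With such a setup, the internal quantities asked about would come out as jac->nsplit
equal to 3 and nfields of 2, 2 and 1 on the successive links, matching the guesses in
the question.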
From bsmith at mcs.anl.gov Fri Jul 17 18:23:50 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 17 Jul 2009 18:23:50 -0500
Subject: About src/ksp/pc/impls/fieldsplit/fieldsplit.c
In-Reply-To:
References:
Message-ID: <1DDB2681-2D29-4373-900B-A3E8BD958BF0@mcs.anl.gov>

You NEVER want to be building these linked lists yourself. You should use the
PCFieldSplit API to construct the fields you want. If you are interested in
seeing what the result is, then run your code in the debugger and just look at
the various links and fields.

Barry

On Jul 17, 2009, at 5:18 PM, Ryan Yan wrote:

> Hi All,
> I have some difficulty of understanding the struct PC_FieldSplit.
> From the definition I can see that it has a data structure like a
> "list" and each member in the list is representing an object
> Field(or PC_Field).
>
> Suppose that my matrix has a blocksize of 5. I want to Set 3
> Fields, i.e. field_0 as {0,1}, field_1 as{2,3},and field_3 as {4}.
> Then can anyone please help me to fill in the following parameter.
>
> PC_FieldSplit *jac = (PC_FieldSplit*)pc->data;
>
> what is jac->nsplit? 3, is my guess, since we have split the
> matrix into 3 split, namely field_0, field_1, field_2.
>
> PC_FieldSplitLink ilink = jac->head;
>
> what is ilink->nfields? 2, is my guess, since the fields_0 has 2
> fields inside.
>
> Then ilink=ilink->next; ilink=ilin->next; ilink->nfileds should be 1
> right?
>
> Thank you very much in advance.
>
> Yan

From bsmith at mcs.anl.gov Fri Jul 17 20:02:02 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 17 Jul 2009 20:02:02 -0500
Subject: non-local values being dropped in MatSetValues
In-Reply-To: <4A607219.3070609@imperial.ac.uk>
References: <4A607219.3070609@imperial.ac.uk>
Message-ID: <80B2A52B-2566-4336-A52C-1744E21E200E@mcs.anl.gov>

Stephan,

I'm sorry for your wasted time finding this bug. I have fixed it in the
Mercurial version of PETSc 3.0.0 and in petsc-dev. It will be fixed in the
next 3.0.0 patch that we release.

Barry

On Jul 17, 2009, at 7:44 AM, Stephan Kramer wrote:

> Hello
>
> We've spend sometime debugging a problem were in the assembly of a
> parallel MPIAIJ matrix, some values that were created on a process
> other than the owner of the row seemed to disappear. I think I
> narrowed it down to what I think is a bug in MatSetValues_MPIAIJ,
> but please tell me if I'm wrong.
>
> The situation is the following: I'm calling MatSetValues with the
> flag ADD_VALUES and with matrix option MAT_IGNORE_ZERO_ENTRIES. I'm
> inserting multiple values at once, multiple columns and rows, so I
> provide a rank-2 matrix of values. As I'm calling this from fortran
> I'm also using MAT_COLUMN_ORIENTED. Now for provided rows that are
> not owned by the process, it jumps to mpiaij.c:394 (line numbers as
> in petsc-dev). On line 399, it checks for zero entries, but only
> checks the very first entry of the (non-owned) row. If however other
> entries of that same row are nonzero, the entire row is still
> dropped. Note that this is independent of row_oriented/
> column_oriented as line 396 does exactly the same.
>
> If I don't set the option MAT_IGNORE_ZERO_ENTRIES the problem
> disappears. In that case however we would either have to preallocate
> substantially more nonzeros, or complicate the matrix assembly in
> our code by taking out the zero entries ourselves and call
> MatSetValues for each entry seperately.
>
> Your help would be much appreciated,
> Cheers
> Stephan
>
> --
> Stephan Kramer
> Applied Modelling and Computation Group,
> Department of Earth Science and Engineering,
> Imperial College London
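For readers skimming the thread, the failure mode Stephan describes can be condensed
into a short C sketch. This is illustrative only: the matrix, the row and column
indices, and the values are hypothetical, and the MatSetOption call uses the
petsc-3.0-era three-argument signature (petsc-2.3.3 takes two arguments). The
essential point is an off-process row whose first provided entry is 0.0 while the
rest are not; with MAT_IGNORE_ZERO_ENTRIES set, the affected versions tested only
that first entry and dropped the whole row.

    #include "petscmat.h"

    /* Sketch of the assembly pattern from the bug report (hypothetical data).
       A is an MPIAIJ matrix and the rows in idxm[] are owned by another process. */
    PetscErrorCode add_offprocess_block(Mat A)
    {
      PetscErrorCode ierr;
      PetscInt    idxm[2] = {10, 11};     /* rows owned by a different rank (hypothetical) */
      PetscInt    idxn[2] = {10, 11};     /* columns (hypothetical)                        */
      PetscScalar v[4]    = {0.0, 2.0,    /* row 10: first entry zero, second nonzero      */
                             3.0, 4.0};   /* row 11: all nonzero                           */

      ierr = MatSetOption(A, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);
      /* Before the fix, all of row 10 could be silently dropped here, because only
         v[0] was checked against zero for the off-process row. */
      ierr = MatSetValues(A, 2, idxm, 2, idxn, v, ADD_VALUES);CHKERRQ(ierr);
      ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
      return 0;
    }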
From vyan2000 at gmail.com Sat Jul 18 23:24:46 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Sun, 19 Jul 2009 00:24:46 -0400
Subject: PETSC debugger
Message-ID:

Hi All,
I am trying to use the PETSc runtime option -start_in_debugger.

However, when I attach the debugger at run time to each process, there are
error messages and I only get one gdb window (am I supposed to get as many as
the number of processes?)

vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display :0.0

[1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display :0.0 on machine vyan2000-linux
[0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display :0.0 on machine vyan2000-linux

Then, only a *single* gdb window prompts out. When I run with 3 processes,
there are only *two* gdb windows.

Thank you very much in advance,

Yan

From bsmith at mcs.anl.gov Sat Jul 18 23:30:59 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 18 Jul 2009 23:30:59 -0500
Subject: PETSC debugger
In-Reply-To:
References:
Message-ID:

Try using -display vyan2000-linux:0.0

Shouldn't make any difference but since it appears you are running everything
on the same machine what you have given should work.

Barry

On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote:

> Hi All,
> I am tring to use the PETSc runtime option -start_in_debugger.
>
> However, when I attach the debugger at run time to each process,
> there are error messages and I only get one gdb window(Am I suppose
> to get as many as the number of the processes?)
>
> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display :0.0
>
> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on
> display :0.0 on machine vyan2000-linux
> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on
> display :0.0 on machine vyan2000-linux
>
> Then, only a *single* gdb window prompts out. When I run with 3
> process, there are only *two* gdb windows.
>
> Thank you very much in advance,
>
> Yan

From vyan2000 at gmail.com Sat Jul 18 23:50:55 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Sun, 19 Jul 2009 00:50:55 -0400
Subject: PETSC debugger
In-Reply-To:
References:
Message-ID:

Thank you very much, Barry.

After I use the vyan2000-linux:0.0, I got errors without any gdb window.
vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display vyan2000-linux:0.0 [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on display vyan2000-linux:0.0 on machine vyan2000-linux [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on display vyan2000-linux:0.0 on machine vyan2000-linux xterm Xt error: Can't open display: vyan2000-linux:0.0 xterm Xt error: Can't open display: vyan2000-linux:0.0 Then I changed back, vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display :0.0 Same as before, error messages with only a single gdb window, (and the window shows up at different place at different instances). Yan On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are running > everything on the same machine what you have given should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, there are >> error messages and I only get one gdb window(Am I suppose to get as many as >> the number of the processes?) >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display >> :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display >> :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run with 3 process, >> there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jul 18 23:54:18 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 18 Jul 2009 23:54:18 -0500 Subject: PETSC debugger In-Reply-To: References: Message-ID: Are you sure the other window isn't lurking away somewhere off (or nearly) off the screen? Maybe try shutting down the x server and restarting? Barry On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > Thank you very much, Barry. > > After I use the vyan2000-linux:0.0, I got errors without any gdb > window. > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display vyan2000-linux: > 0.0 > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on > display vyan2000-linux:0.0 on machine vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on > display vyan2000-linux:0.0 on machine vyan2000-linux > xterm Xt error: Can't open display: vyan2000-linux:0.0 > xterm Xt error: Can't open display: vyan2000-linux:0.0 > > Then > I changed back, > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display :0.0 > > Same as before, error messages with only a single gdb window, (and > the window shows up at different place at different instances). 
> > Yan > > > On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith > wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are running > everything on the same machine what you have given should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, > I am tring to use the PETSc runtime option -start_in_debugger. > > However, when I attach the debugger at run time to each process, > there are error messages and I only get one gdb window(Am I suppose > to get as many as the number of the processes?) > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display :0.0 > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on > display :0.0 on machine vyan2000-linux > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on > display :0.0 on machine vyan2000-linux > > Then, only a *single* gdb window prompts out. When I run with 3 > process, there are only *two* gdb windows. > > Thank you very much in advance, > > Yan > > From vyan2000 at gmail.com Sun Jul 19 00:06:33 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 01:06:33 -0400 Subject: PETSC debugger In-Reply-To: References: Message-ID: I do not have acess to linux right now, I will check it as the first thing tomorrow. Yan On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith wrote: > > Are you sure the other window isn't lurking away somewhere off (or nearly) > off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without any gdb window. >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> Same as before, error messages with only a single gdb window, (and the >> window shows up at different place at different instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are running >> everything on the same machine what you have given should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, there are >> error messages and I only get one gdb window(Am I suppose to get as many as >> the number of the processes?) 
>> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display >> :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display >> :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run with 3 process, >> there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From b.van-wachem at imperial.ac.uk Sun Jul 19 01:59:58 2009 From: b.van-wachem at imperial.ac.uk (Berend van Wachem) Date: Sun, 19 Jul 2009 07:59:58 +0100 Subject: PETSC debugger In-Reply-To: References: Message-ID: <4A62C46E.6040300@imperial.ac.uk> Dear Ryan, I had a similar issue as you have. I am using KDE as a desktop manager and found that I have to comment out the line "ServerArgsLocal=-nolisten tcp" in kdm, in the kdmrc file (on my system located at /etc/kde/kdm/kdmrc). After restarting kdm, I get all windows of gdb coming up. Regards, Berend. Ryan Yan wrote: > I do not have acess to linux right now, I will check it as the first > thing tomorrow. > > Yan > > On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith > wrote: > > > Are you sure the other window isn't lurking away somewhere off (or > nearly) off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. > > After I use the vyan2000-linux:0.0, I got errors without any gdb > window. > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display vyan2000-linux:0.0 > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 > on display vyan2000-linux:0.0 on machine vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 > on display vyan2000-linux:0.0 on machine vyan2000-linux > xterm Xt error: Can't open display: vyan2000-linux:0.0 > xterm Xt error: Can't open display: vyan2000-linux:0.0 > > Then > I changed back, > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display :0.0 > > Same as before, error messages with only a single gdb window, > (and the window shows up at different place at different instances). > > Yan > > > On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith > > wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are > running everything on the same machine what you have given > should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, > I am tring to use the PETSc runtime option -start_in_debugger. > > However, when I attach the debugger at run time to each process, > there are error messages and I only get one gdb window(Am I > suppose to get as many as the number of the processes?) 
> > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display :0.0 > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 > on display :0.0 on machine vyan2000-linux > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 > on display :0.0 on machine vyan2000-linux > > Then, only a *single* gdb window prompts out. When I run with 3 > process, there are only *two* gdb windows. > > Thank you very much in advance, > > Yan > > > > From s.kramer at imperial.ac.uk Sun Jul 19 04:11:17 2009 From: s.kramer at imperial.ac.uk (Stephan Kramer) Date: Sun, 19 Jul 2009 10:11:17 +0100 Subject: non-local values being dropped in MatSetValues In-Reply-To: <80B2A52B-2566-4336-A52C-1744E21E200E@mcs.anl.gov> References: <4A607219.3070609@imperial.ac.uk> <80B2A52B-2566-4336-A52C-1744E21E200E@mcs.anl.gov> Message-ID: <4A62E335.5030404@imperial.ac.uk> Barry Smith wrote: > Stephan, > > I'm sorry for your wasted time finding this bug. I have fixed it > in the Mecurial version of PETSc 3.0.0 and in petsc-dev. It will be > fixed in the next 3.0.0 patch that we release. > > Barry Excellent. No problem, thanks a lot for your quick response and fix! Cheers Stephan > > On Jul 17, 2009, at 7:44 AM, Stephan Kramer wrote: > >> Hello >> >> We've spend sometime debugging a problem were in the assembly of a >> parallel MPIAIJ matrix, some values that were created on a process >> other than the owner of the row seemed to disappear. I think I >> narrowed it down to what I think is a bug in MatSetValues_MPIAIJ, >> but please tell me if I'm wrong. >> >> The situation is the following: I'm calling MatSetValues with the >> flag ADD_VALUES and with matrix option MAT_IGNORE_ZERO_ENTRIES. I'm >> inserting multiple values at once, multiple columns and rows, so I >> provide a rank-2 matrix of values. As I'm calling this from fortran >> I'm also using MAT_COLUMN_ORIENTED. Now for provided rows that are >> not owned by the process, it jumps to mpiaij.c:394 (line numbers as >> in petsc-dev). On line 399, it checks for zero entries, but only >> checks the very first entry of the (non-owned) row. If however other >> entries of that same row are nonzero, the entire row is still >> dropped. Note that this is independent of row_oriented/ >> column_oriented as line 396 does exactly the same. >> >> If I don't set the option MAT_IGNORE_ZERO_ENTRIES the problem >> disappears. In that case however we would either have to preallocate >> substantially more nonzeros, or complicate the matrix assembly in >> our code by taking out the zero entries ourselves and call >> MatSetValues for each entry seperately. >> >> Your help would be much appreciated, >> Cheers >> Stephan >> >> >> -- >> Stephan Kramer >> Applied Modelling and Computation Group, >> Department of Earth Science and Engineering, >> Imperial College London > > -- Stephan Kramer Applied Modelling and Computation Group, Department of Earth Science and Engineering, Imperial College London From vyan2000 at gmail.com Sun Jul 19 13:47:23 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 14:47:23 -0400 Subject: PETSC debugger In-Reply-To: <4A62C46E.6040300@imperial.ac.uk> References: <4A62C46E.6040300@imperial.ac.uk> Message-ID: Dear Berend, Thanks for your suggestion. I can see that option right there!! 
However, I am still struggling to find a way to turn this option off as I can see from vyan2000 at vyan2000-linux:/etc/gdm$ ps -ef | grep /usr/bin/X root 5287 5285 3 14:19 tty7 00:00:42 /usr/bin/X :0 -br -audit 0 -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7 I am using Ubuntu *gnome*, but there is no such gdmrc file that I can correct. I tried several other ways, but have not succeed yet. vyan2000 at vyan2000-linux:/etc/gdm$ uname -a Linux vyan2000-linux 2.6.24-24-generic #1 SMP Tue Jul 7 19:46:39 UTC 2009 i686 GNU/Linux I am still searching.... Regards, Yan On Sun, Jul 19, 2009 at 2:59 AM, Berend van Wachem < b.van-wachem at imperial.ac.uk> wrote: > Dear Ryan, > > I had a similar issue as you have. I am using KDE as a desktop manager and > found that I have to comment out the line > > "ServerArgsLocal=-nolisten tcp" > > in kdm, in the kdmrc file (on my system located at /etc/kde/kdm/kdmrc). > > After restarting kdm, I get all windows of gdb coming up. > > Regards, > > Berend. > > > Ryan Yan wrote: > >> I do not have acess to linux right now, I will check it as the first thing >> tomorrow. >> Yan >> >> On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith > bsmith at mcs.anl.gov>> wrote: >> >> >> Are you sure the other window isn't lurking away somewhere off (or >> nearly) off the screen? >> >> Maybe try shutting down the x server and restarting? >> >> Barry >> >> >> On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: >> >> Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without any gdb >> window. >> >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 >> on display vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 >> on display vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display :0.0 >> >> Same as before, error messages with only a single gdb window, >> (and the window shows up at different place at different >> instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith >> > wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are >> running everything on the same machine what you have given >> should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, >> there are error messages and I only get one gdb window(Am I >> suppose to get as many as the number of the processes?) 
>> >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 >> on display :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 >> on display :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run with 3 >> process, there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From b.van-wachem at imperial.ac.uk Sun Jul 19 13:52:22 2009 From: b.van-wachem at imperial.ac.uk (Berend van Wachem) Date: Sun, 19 Jul 2009 19:52:22 +0100 Subject: PETSC debugger In-Reply-To: References: <4A62C46E.6040300@imperial.ac.uk> Message-ID: <4A636B66.4070605@imperial.ac.uk> Dear Ryan, I am not a Gnome user, but a colleague of mine suggested: For Gnome you should edit the file /etc/gdm/gdm.conf and change the settings: DisallowTCP=false Regards, Berend. Ryan Yan wrote: > Dear Berend, > Thanks for your suggestion. I can see that option right there!! > > However, I am still struggling to find a way to turn this option off as > I can see from > > vyan2000 at vyan2000-linux:/etc/gdm$ ps -ef | grep /usr/bin/X > root 5287 5285 3 14:19 tty7 00:00:42 /usr/bin/X :0 -br -audit > 0 -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7 > > I am using Ubuntu *gnome*, but there is no such gdmrc file that I can > correct. I tried several other ways, but have not succeed yet. > vyan2000 at vyan2000-linux:/etc/gdm$ uname -a > Linux vyan2000-linux 2.6.24-24-generic #1 SMP Tue Jul 7 19:46:39 UTC > 2009 i686 GNU/Linux > > I am still searching.... > > Regards, > > Yan > > > On Sun, Jul 19, 2009 at 2:59 AM, Berend van Wachem > > wrote: > > Dear Ryan, > > I had a similar issue as you have. I am using KDE as a desktop > manager and found that I have to comment out the line > > "ServerArgsLocal=-nolisten tcp" > > in kdm, in the kdmrc file (on my system located at /etc/kde/kdm/kdmrc). > > After restarting kdm, I get all windows of gdb coming up. > > Regards, > > Berend. > > > Ryan Yan wrote: > > I do not have acess to linux right now, I will check it as the > first thing tomorrow. > Yan > > On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith > > >> wrote: > > > Are you sure the other window isn't lurking away somewhere > off (or > nearly) off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. > > After I use the vyan2000-linux:0.0, I got errors without > any gdb > window. 
> > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display vyan2000-linux:0.0 > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid > 26518 > on display vyan2000-linux:0.0 on machine vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid > 26519 > on display vyan2000-linux:0.0 on machine vyan2000-linux > xterm Xt error: Can't open display: vyan2000-linux:0.0 > xterm Xt error: Can't open display: vyan2000-linux:0.0 > > Then > I changed back, > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display :0.0 > > Same as before, error messages with only a single gdb window, > (and the window shows up at different place at different > instances). > > Yan > > > On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith > > >> wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are > running everything on the same machine what you have given > should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, > I am tring to use the PETSc runtime option > -start_in_debugger. > > However, when I attach the debugger at run time to each > process, > there are error messages and I only get one gdb window(Am I > suppose to get as many as the number of the processes?) > > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -np 2 ./rpisolve -ksp_monitor_true_residual > -start_in_debugger -display :0.0 > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid > 26307 > on display :0.0 on machine vyan2000-linux > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid > 26306 > on display :0.0 on machine vyan2000-linux > > Then, only a *single* gdb window prompts out. When I run > with 3 > process, there are only *two* gdb windows. > > Thank you very much in advance, > > Yan > > > > > From vyan2000 at gmail.com Sun Jul 19 14:54:32 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 15:54:32 -0400 Subject: PETSC debugger In-Reply-To: <4A636B66.4070605@imperial.ac.uk> References: <4A62C46E.6040300@imperial.ac.uk> <4A636B66.4070605@imperial.ac.uk> Message-ID: Dear Berend, Thanks to your suggestion, I have turned it off. As it can be shown here: vyan2000 at vyan2000-linux:~$ ps -ef |grep /usr/bin/X root 5273 5271 4 15:45 tty7 00:00:14 /usr/bin/X :0 -br -audit 0 -auth /var/lib/gdm/:0.Xauth vt7 vyan2000 6116 6075 0 15:50 pts/1 00:00:00 grep /usr/bin/X However, There are still errors when PETSc attach the debugger. And sometimes the number of windows is not equal to the number of processes. vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger -display :0.0 [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 6022 on display :0.0 on machine vyan2000-linux [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 6021 on display :0.0 on machine vyan2000-linux It still need more efforts... Regards, Yan On Sun, Jul 19, 2009 at 2:52 PM, Berend van Wachem < b.van-wachem at imperial.ac.uk> wrote: > Dear Ryan, > > I am not a Gnome user, but a colleague of mine suggested: > > For Gnome you should edit the file /etc/gdm/gdm.conf > and change the settings: > DisallowTCP=false > > Regards, > > Berend. 
> > > Ryan Yan wrote: > >> Dear Berend, >> Thanks for your suggestion. I can see that option right there!! >> >> However, I am still struggling to find a way to turn this option off as I >> can see from >> >> vyan2000 at vyan2000-linux:/etc/gdm$ ps -ef | grep /usr/bin/X >> root 5287 5285 3 14:19 tty7 00:00:42 /usr/bin/X :0 -br -audit 0 >> -auth /var/lib/gdm/:0.Xauth -nolisten tcp vt7 >> >> I am using Ubuntu *gnome*, but there is no such gdmrc file that I can >> correct. I tried several other ways, but have not succeed yet. >> vyan2000 at vyan2000-linux:/etc/gdm$ uname -a >> Linux vyan2000-linux 2.6.24-24-generic #1 SMP Tue Jul 7 19:46:39 UTC 2009 >> i686 GNU/Linux >> >> I am still searching.... >> >> Regards, >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 2:59 AM, Berend van Wachem < >> b.van-wachem at imperial.ac.uk > wrote: >> >> Dear Ryan, >> >> I had a similar issue as you have. I am using KDE as a desktop >> manager and found that I have to comment out the line >> >> "ServerArgsLocal=-nolisten tcp" >> >> in kdm, in the kdmrc file (on my system located at /etc/kde/kdm/kdmrc). >> >> After restarting kdm, I get all windows of gdb coming up. >> >> Regards, >> >> Berend. >> >> >> Ryan Yan wrote: >> >> I do not have acess to linux right now, I will check it as the >> first thing tomorrow. >> Yan >> >> On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith >> >> >> wrote: >> >> >> Are you sure the other window isn't lurking away somewhere >> off (or >> nearly) off the screen? >> >> Maybe try shutting down the x server and restarting? >> >> Barry >> >> >> On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: >> >> Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without >> any gdb >> window. >> >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid >> 26518 >> on display vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid >> 26519 >> on display vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display :0.0 >> >> Same as before, error messages with only a single gdb >> window, >> (and the window shows up at different place at different >> instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith >> >> >> wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are >> running everything on the same machine what you have given >> should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option >> -start_in_debugger. >> >> However, when I attach the debugger at run time to each >> process, >> there are error messages and I only get one gdb window(Am I >> suppose to get as many as the number of the processes?) 
>> >> vyan2000 at vyan2000-linux >> :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual >> -start_in_debugger -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid >> 26307 >> on display :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid >> 26306 >> on display :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run >> with 3 >> process, there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Sun Jul 19 15:49:58 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 16:49:58 -0400 Subject: PETSC debugger In-Reply-To: References: Message-ID: Hi Barry, I restarted x server many times(during turning off the option as ) -nolisten tcp as Berend. And I also checked very carefully each time the bottom margin of the window status panel. The gdb window did not function correctly on my Linux box. Thanks in advance for any suggestions On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith wrote: > > Are you sure the other window isn't lurking away somewhere off (or nearly) > off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without any gdb window. >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> Same as before, error messages with only a single gdb window, (and the >> window shows up at different place at different instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are running >> everything on the same machine what you have given should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, there are >> error messages and I only get one gdb window(Am I suppose to get as many as >> the number of the processes?) >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display >> :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display >> :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. 
When I run with 3 process, >> there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Jul 19 16:00:46 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 19 Jul 2009 16:00:46 -0500 Subject: PETSC debugger In-Reply-To: References: Message-ID: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> This option works by starting up an xterm with gdb running in that xterm. The code is in src/sys/error/adebug.c If you only need to look at one process, you can use the options - start_in_debugger noxterm -debugger_nodes 0 (or 1 or 2) then all the other processes won't use the debugger. Barry On Jul 19, 2009, at 3:49 PM, Ryan Yan wrote: > Hi Barry, > I restarted x server many times(during turning off the option as ) - > nolisten tcp as Berend. And I also checked very carefully each time > the bottom margin of the window status panel. The gdb window did not > function correctly on my Linux box. Thanks in advance for any > suggestions > > > > On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith > wrote: > > Are you sure the other window isn't lurking away somewhere off (or > nearly) off the screen? > > Maybe try shutting down the x server and restarting? > > Barry > > > On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: > > Thank you very much, Barry. > > After I use the vyan2000-linux:0.0, I got errors without any gdb > window. > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display vyan2000-linux: > 0.0 > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on > display vyan2000-linux:0.0 on machine vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on > display vyan2000-linux:0.0 on machine vyan2000-linux > xterm Xt error: Can't open display: vyan2000-linux:0.0 > xterm Xt error: Can't open display: vyan2000-linux:0.0 > > Then > I changed back, > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display :0.0 > > Same as before, error messages with only a single gdb window, (and > the window shows up at different place at different instances). > > Yan > > > On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith > wrote: > > Try using -display vyan2000-linux:0.0 > > Shouldn't make any difference but since it appears you are running > everything on the same machine what you have given should work. > > Barry > > > On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: > > Hi All, > I am tring to use the PETSc runtime option -start_in_debugger. > > However, when I attach the debugger at run time to each process, > there are error messages and I only get one gdb window(Am I suppose > to get as many as the number of the processes?) > > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/ > examples/tutorials/ttt2$ mpirun -np 2 ./rpisolve - > ksp_monitor_true_residual -start_in_debugger -display :0.0 > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on > display :0.0 on machine vyan2000-linux > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on > display :0.0 on machine vyan2000-linux > > Then, only a *single* gdb window prompts out. When I run with 3 > process, there are only *two* gdb windows. 
> > Thank you very much in advance, > > Yan > > > > From vyan2000 at gmail.com Sun Jul 19 17:43:21 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Sun, 19 Jul 2009 18:43:21 -0400 Subject: PETSC debugger In-Reply-To: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> Message-ID: Thank you very much, barry, Only debugging one processes is a way to go for my current bug(s). Yan On Sun, Jul 19, 2009 at 5:00 PM, Barry Smith wrote: > > This option works by starting up an xterm with gdb running in that xterm. > The code is in src/sys/error/adebug.c > > If you only need to look at one process, you can use the options > -start_in_debugger noxterm -debugger_nodes 0 (or 1 or 2) then all the other > processes won't use the debugger. > > Barry > > > > > On Jul 19, 2009, at 3:49 PM, Ryan Yan wrote: > > Hi Barry, >> I restarted x server many times(during turning off the option as ) >> -nolisten tcp as Berend. And I also checked very carefully each time the >> bottom margin of the window status panel. The gdb window did not function >> correctly on my Linux box. Thanks in advance for any suggestions >> >> >> >> On Sun, Jul 19, 2009 at 12:54 AM, Barry Smith wrote: >> >> Are you sure the other window isn't lurking away somewhere off (or >> nearly) off the screen? >> >> Maybe try shutting down the x server and restarting? >> >> Barry >> >> >> On Jul 18, 2009, at 11:50 PM, Ryan Yan wrote: >> >> Thank you very much, Barry. >> >> After I use the vyan2000-linux:0.0, I got errors without any gdb window. >> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display vyan2000-linux:0.0 >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26518 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26519 on display >> vyan2000-linux:0.0 on machine vyan2000-linux >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> xterm Xt error: Can't open display: vyan2000-linux:0.0 >> >> Then >> I changed back, >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> Same as before, error messages with only a single gdb window, (and the >> window shows up at different place at different instances). >> >> Yan >> >> >> On Sun, Jul 19, 2009 at 12:30 AM, Barry Smith wrote: >> >> Try using -display vyan2000-linux:0.0 >> >> Shouldn't make any difference but since it appears you are running >> everything on the same machine what you have given should work. >> >> Barry >> >> >> On Jul 18, 2009, at 11:24 PM, Ryan Yan wrote: >> >> Hi All, >> I am tring to use the PETSc runtime option -start_in_debugger. >> >> However, when I attach the debugger at run time to each process, there are >> error messages and I only get one gdb window(Am I suppose to get as many as >> the number of the processes?) 
>> >> vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ >> mpirun -np 2 ./rpisolve -ksp_monitor_true_residual -start_in_debugger >> -display :0.0 >> >> [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26307 on display >> :0.0 on machine vyan2000-linux >> [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve of pid 26306 on display >> :0.0 on machine vyan2000-linux >> >> Then, only a *single* gdb window prompts out. When I run with 3 process, >> there are only *two* gdb windows. >> >> Thank you very much in advance, >> >> Yan >> >> >> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Mon Jul 20 04:47:35 2009 From: jed at 59A2.org (Jed Brown) Date: Mon, 20 Jul 2009 11:47:35 +0200 Subject: PETSC debugger In-Reply-To: References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> Message-ID: <4A643D37.10708@59A2.org> Ryan Yan wrote: > Only debugging one processes is a way to go for my current bug(s). There is another option for debugging multiple processes using screen instead of X. For the sake of cleanliness, start a new terminal and open a special screen session for debugging $ screen -S sdebug Now in your original terminal, run the program like $ mpirun -n 4 ./myapp -start_in_debugger -debug_terminal "screen -S sdebug -X screen" (the quotes are important) This opens four new windows within the screen session named "sdebug". Recent versions of MPICH2 have a -gdb option which is a lightweight skin for gdb that sends commands to all processes and collates the results, but also allows you to drop to per-process debugging. I had trouble with it prior to release 1.1. Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From vyan2000 at gmail.com Mon Jul 20 12:04:37 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 20 Jul 2009 13:04:37 -0400 Subject: PETSC debugger In-Reply-To: <4A643D37.10708@59A2.org> References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> <4A643D37.10708@59A2.org> Message-ID: Thank you very much, Jed. I will try your suggestion. Yan On Mon, Jul 20, 2009 at 5:47 AM, Jed Brown wrote: > Ryan Yan wrote: > > Only debugging one processes is a way to go for my current bug(s). > > There is another option for debugging multiple processes using screen > instead of X. For the sake of cleanliness, start a new terminal and > open a special screen session for debugging > > $ screen -S sdebug > > Now in your original terminal, run the program like > > $ mpirun -n 4 ./myapp -start_in_debugger -debug_terminal "screen -S sdebug > -X screen" > > (the quotes are important) > > This opens four new windows within the screen session named "sdebug". > > > Recent versions of MPICH2 have a -gdb option which is a lightweight skin > for gdb that sends commands to all processes and collates the results, > but also allows you to drop to per-process debugging. I had trouble > with it prior to release 1.1. > > > Jed > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Mon Jul 20 12:50:21 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 20 Jul 2009 13:50:21 -0400 Subject: PETSC debugger In-Reply-To: <4A643D37.10708@59A2.org> References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> <4A643D37.10708@59A2.org> Message-ID: Hi Jed, My X server set up may be messed up somehow. 
Same error as before. After I creat a new session using, vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ screen -S sdebug Still errors. vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ mpirun -n 4 ./rpisolve -start_in_debugger -debug_terminal "screen -S sdebug -X screen" [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9758 on vyan2000-linux [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9755 on vyan2000-linux [3]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9757 on vyan2000-linux [2]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9756 on vyan2000-linux The number of the screen for gdb is not equal to the number of processes.(Mostly, less than), as you can see in the following "bottom margin" of my screen session. vyan2000-linux | 0.11 0.09 0.08 | 07-20 13:46 |0-$ shell 1$ gdb 2$ gdb 3$* gdb And I use backtrace as the first command in the survived gdb session, I get: Loaded symbols for /usr/lib/libX11.so.6 Reading symbols from /usr/lib/libstdc++.so.6...done. Loaded symbols for /usr/lib/libstdc++.so.6 Reading symbols from /usr/lib/liblapack.so.3gf...done. Loaded symbols for /usr/lib/liblapack.so.3gf Reading symbols from /usr/lib/libblas.so.3gf...done. Loaded symbols for /usr/lib/libblas.so.3gf Reading symbols from /lib/tls/i686/cmov/libdl.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libdl.so.2 Reading symbols from /lib/tls/i686/cmov/libpthread.so.0...done. [Thread debugging using libthread_db enabled] [New Thread 0xb739c6c0 (LWP 10213)] Loaded symbols for /lib/tls/i686/cmov/libpthread.so.0 Reading symbols from /lib/tls/i686/cmov/librt.so.1...done. Loaded symbols for /lib/tls/i686/cmov/librt.so.1 Reading symbols from /lib/libgcc_s.so.1...done. Loaded symbols for /lib/libgcc_s.so.1 Reading symbols from /usr/lib/libgfortran.so.2...done. Loaded symbols for /usr/lib/libgfortran.so.2 Reading symbols from /lib/tls/i686/cmov/libm.so.6...done. Loaded symbols for /lib/tls/i686/cmov/libm.so.6 Reading symbols from /lib/tls/i686/cmov/libc.so.6...done. Loaded symbols for /lib/tls/i686/cmov/libc.so.6 Reading symbols from /usr/lib/libxcb-xlib.so.0...done. Loaded symbols for /usr/lib/libxcb-xlib.so.0 Reading symbols from /usr/lib/libxcb.so.1...done. Loaded symbols for /usr/lib/libxcb.so.1 Reading symbols from /lib/ld-linux.so.2...done. Loaded symbols for /lib/ld-linux.so.2 Reading symbols from /usr/lib/libXau.so.6...done. Loaded symbols for /usr/lib/libXau.so.6 Reading symbols from /usr/lib/libXdmcp.so.6...done. Loaded symbols for /usr/lib/libXdmcp.so.6 Reading symbols from /lib/tls/i686/cmov/libnss_compat.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libnss_compat.so.2 Reading symbols from /lib/tls/i686/cmov/libnsl.so.1...done. Loaded symbols for /lib/tls/i686/cmov/libnsl.so.1 Reading symbols from /lib/tls/i686/cmov/libnss_nis.so.2...done. Loaded symbols for /lib/tls/i686/cmov/libnss_nis.so.2 Reading symbols from /lib/tls/i686/cmov/libnss_files.so.2...done. 
Loaded symbols for /lib/tls/i686/cmov/libnss_files.so.2 0xb7f5b410 in __kernel_vsyscall () (gdb) backtrace #0 0xb7f5b410 in __kernel_vsyscall () #1 0xb7456c90 in nanosleep () from /lib/tls/i686/cmov/libc.so.6 #2 0xb7456ac7 in sleep () from /lib/tls/i686/cmov/libc.so.6 #3 0x085cd836 in PetscSleep (s=10) at psleep.c:40 #4 0x0860976f in PetscAttachDebugger () at adebug.c:412 #5 0x085b87ee in PetscOptionsCheckInitial_Private () at init.c:378 #6 0x085bd223 in PetscInitialize (argc=0xbfacffe0, args=0xbfacffe4, file=0x0, help=0x89cbac0 "output the matrix A, rhs b.\n exact solution x : check \n", ' ' , "\n\n") at pinit.c:573 #7 0x0804bf43 in main (argc=9, args=0x3a0001c6) at rpisolve.c:71 I have no idea how to debug this. Yan On Mon, Jul 20, 2009 at 5:47 AM, Jed Brown wrote: > Ryan Yan wrote: > > Only debugging one processes is a way to go for my current bug(s). > > There is another option for debugging multiple processes using screen > instead of X. For the sake of cleanliness, start a new terminal and > open a special screen session for debugging > > $ screen -S sdebug > > Now in your original terminal, run the program like > > $ mpirun -n 4 ./myapp -start_in_debugger -debug_terminal "screen -S sdebug > -X screen" > > (the quotes are important) > > This opens four new windows within the screen session named "sdebug". > > > Recent versions of MPICH2 have a -gdb option which is a lightweight skin > for gdb that sends commands to all processes and collates the results, > but also allows you to drop to per-process debugging. I had trouble > with it prior to release 1.1. > > > Jed > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Mon Jul 20 14:17:46 2009 From: jed at 59A2.org (Jed Brown) Date: Mon, 20 Jul 2009 21:17:46 +0200 Subject: PETSC debugger In-Reply-To: References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> <4A643D37.10708@59A2.org> Message-ID: <4A64C2DA.6020807@59A2.org> Ryan Yan wrote: > My X server set up may be messed up somehow. Same error as before. Using screen bypasses X, so that seems an unlikely candidate. > After I creat a new session using, > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > screen -S sdebug > > Still errors. No errors at this point though, right? > vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > mpirun -n 4 ./rpisolve -start_in_debugger -debug_terminal "screen -S sdebug > -X screen" > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9758 on > vyan2000-linux > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9755 on > vyan2000-linux > [3]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9757 on > vyan2000-linux > [2]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9756 on > vyan2000-linux > > The number of the screen for gdb is not equal to the number of > processes.(Mostly, less than), as you can see in the following "bottom > margin" of my screen session. Are there *ever* more sessions than the number of processes? Are there ever the same number? Is there any consistency to which process is missing? The output above indicates that the debugger is being run and, from the perspective of PetscAttachDebugger, the operation was successful. I have seen this behavior (one missing debug session) on a couple occasions, but it wasn't reproducible so I couldn't debug it. 
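If a window does go missing, one low-tech alternative to the xterm machinery is to pause each rank yourself right after PetscInitialize() and attach gdb by hand to the printed pid. A minimal sketch, assuming a PETSc 3.0-era build; the helper name WaitForDebugger and the holdme flag are made up for illustration and are not part of PETSc:

#include <unistd.h>
#include "petsc.h"

/* Each rank prints its pid, then spins until someone attaches gdb and
   clears the flag.  This is not PETSc's -start_in_debugger code path,
   just a hand-rolled fallback. */
static PetscErrorCode WaitForDebugger(void)
{
  PetscMPIInt  rank;
  volatile int holdme = 1;   /* in gdb: set var holdme = 0, then continue */

  MPI_Comm_rank(PETSC_COMM_WORLD,&rank);
  PetscPrintf(PETSC_COMM_SELF,"[%d] pid %d waiting; attach with: gdb ./rpisolve %d\n",
              rank,(int)getpid(),(int)getpid());
  while (holdme) sleep(1);
  return 0;
}

Call it once right after PetscInitialize(); from another terminal run gdb on the reported pid, set holdme to 0, and continue that rank.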
If this continues to be a problem, I recommend attaching the debugger yourself (put "set breakpoint pending on" and "break PetscError" in your .gdbinit or run with -on_error_abort). Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From vyan2000 at gmail.com Mon Jul 20 14:46:36 2009 From: vyan2000 at gmail.com (Ryan Yan) Date: Mon, 20 Jul 2009 15:46:36 -0400 Subject: PETSC debugger In-Reply-To: <4A64C2DA.6020807@59A2.org> References: <5908FA08-1DFA-4711-A77D-74900A3C00BA@mcs.anl.gov> <4A643D37.10708@59A2.org> <4A64C2DA.6020807@59A2.org> Message-ID: Thanks, Jed. Please see reply below. On Mon, Jul 20, 2009 at 3:17 PM, Jed Brown wrote: > Ryan Yan wrote: > > My X server set up may be messed up somehow. Same error as before. > > Using screen bypasses X, so that seems an unlikely candidate. > > > After I creat a new session using, > > vyan2000 at vyan2000-linux > :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > > screen -S sdebug > > > > Still errors. > > No errors at this point though, right? > Yes, you are right. There is no error at this point. > > > vyan2000 at vyan2000-linux > :~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ > > mpirun -n 4 ./rpisolve -start_in_debugger -debug_terminal "screen -S > sdebug > > -X screen" > > [0]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9758 on > > vyan2000-linux > > [1]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9755 on > > vyan2000-linux > > [3]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9757 on > > vyan2000-linux > > [2]PETSC ERROR: PETSC: Attaching gdb to ./rpisolve on pid 9756 on > > vyan2000-linux > > > > The number of the screen for gdb is not equal to the number of > > processes.(Mostly, less than), as you can see in the following "bottom > > margin" of my screen session. > > Are there *ever* more sessions than the number of processes? No. > Are there ever the same number? Yes, some time. But the error information *is* always there. > Is there any consistency to which process is > missing? I do not how to check which process is missing... Sorry, I can not answer this question. > The output above indicates that the debugger is being run and, > from the perspective of PetscAttachDebugger, the operation was > successful. I agree with you. > I have seen this behavior (one missing debug session) on a > couple occasions, but it wasn't reproducible so I couldn't debug it. I am using Ubuntu Gnome, maybe the error on this specific linux distribution is reproducible (It may takes you a while). One of my colleague also have the same error, she is using a same linux distribution as mine. vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ uname -a Linux vyan2000-linux 2.6.24-24-generic #1 SMP Tue Jul 7 19:46:39 UTC 2009 i686 GNU/Linux vyan2000 at vyan2000-linux:~/local/PPETSc/petsc-2.3.3-p15/src/ksp/ksp/examples/tutorials/ttt2$ lsb_release -a No LSB modules are available. Distributor ID: Ubuntu Description: Ubuntu 8.04.3 LTS Release: 8.04 Codename: hardy > If this continues to be a problem, I recommend attaching the debugger > yourself (put "set breakpoint pending on" and "break PetscError" in your > .gdbinit or run with -on_error_abort). > I will try it. But I do not have any experience of attaching the debugger to something else. Any pointer to a *reference*? 
Yan > > Jed > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.tabak at tudelft.nl Tue Jul 21 15:16:13 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Tue, 21 Jul 2009 22:16:13 +0200 Subject: Petsc And Slepc, singular system Message-ID: <4A66220D.8010401@tudelft.nl> Dear all, As a fresh user of Petsc libraries, should thank the developers for such a magnificent endeavor and years of work. So the question directly related to Petsc is that if I have a singular system matrix and try to solve for the unknowns(simple enough 3 by 3) (I am using the simple linear system example from the Petsc user manual as a template where a preconditioner is used, I guess it is Jacobi.), I do not get any warnings for zero pivots in LU decomposition which I could not understand why, and the results are on the order of e+16, also the norm of the error. But why is not there some kind of warning. The second part of the question is related to Slepc, this might not find direct answers here perhaps, but let me give it a try. I have a generalized eigenvalue problem, it is a vibration related problem so I will use K and M instead of A and B, respectively. On my problem, K is singular, and if I use slepc to find the solution, petsc warns me about the zero pivot emergence, and breaks down naturally, there after I apply some shift operations that are already implemented in slepc to overcome the problem. The question is what is the effect of preconditioner on a singular matrix for the linear system explained above, somehow, I was thinking in any case that should also warn me but it did not and gave some wrong results. I am a bit weak on the preconditioners, maybe should have done some reading but I know that singular systems can also have solutions by some order tricks, pseudo inverse, temporary links application solutions with respect to rigid body modes(from structural mechanics too specific maybe). Can Petsc handle singular systems as well? I am a bit confused at this point. Best regards, Umut From knepley at gmail.com Tue Jul 21 15:56:11 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Jul 2009 15:56:11 -0500 Subject: Petsc And Slepc, singular system In-Reply-To: <4A66220D.8010401@tudelft.nl> References: <4A66220D.8010401@tudelft.nl> Message-ID: On Tue, Jul 21, 2009 at 3:16 PM, Umut Tabak wrote: > Dear all, > > As a fresh user of Petsc libraries, should thank the developers for such a > magnificent endeavor and years of work. > > So the question directly related to Petsc is that if I have a singular > system matrix and try to solve for the unknowns(simple enough 3 by 3) (I am > using the simple linear system example from the Petsc user manual as a > template where a preconditioner is used, I guess it is Jacobi.), I do not > get any warnings for zero pivots in LU decomposition which I could not > understand why, and the results are on the order of e+16, also the norm of > the error. But why is not there some kind of warning. If your system is badly scaled, roundoff errors could result in a pivot larger than our tolerance. It is also possible that your preconditioner resulted in a badly scaled system. Matt > > The second part of the question is related to Slepc, this might not find > direct answers here perhaps, but let me give it a try. > > I have a generalized eigenvalue problem, it is a vibration related problem > so I will use K and M instead of A and B, respectively. 
On my problem, K is > singular, and if I use slepc to find the solution, petsc warns me about the > zero pivot emergence, and breaks down naturally, there after I apply some > shift operations that are already implemented in slepc to overcome the > problem. > > The question is what is the effect of preconditioner on a singular matrix > for the linear system explained above, somehow, I was thinking in any case > that should also warn me but it did not and gave some wrong results. > > I am a bit weak on the preconditioners, maybe should have done some reading > but I know that singular systems can also have solutions by some order > tricks, pseudo inverse, temporary links application solutions with respect > to rigid body modes(from structural mechanics too specific maybe). > > Can Petsc handle singular systems as well? I am a bit confused at this > point. > > Best regards, > > Umut > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.tabak at tudelft.nl Tue Jul 21 16:09:54 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Tue, 21 Jul 2009 23:09:54 +0200 Subject: Petsc And Slepc, singular system In-Reply-To: References: <4A66220D.8010401@tudelft.nl> Message-ID: <20090721210954.GA31488@dutw689> On Tue, Jul 21, 2009 at 03:56:11PM -0500, Matthew Knepley wrote: > If your system is badly scaled, roundoff errors could result in a pivot > larger than our tolerance.
It is also possible that your preconditioner > resulted in a badly scaled system. > > Matt > Hi, Thanks for the fast reply. since the system is singular condition number will be bad any way, but I was wondering if there were already ways to overcome this problem. And are there ways to prevent the preconditioner to give a badly scaled system, what I mean is that the system is singular so will preconditioning improve that? I definitely read about these. Thanks and best, Umut -- Quote: You can not teach a man anything, you can only help him find it within himself. Source: Galileo Galilei From knepley at gmail.com Tue Jul 21 16:24:30 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Jul 2009 16:24:30 -0500 Subject: Petsc And Slepc, singular system In-Reply-To: <20090721210954.GA31488@dutw689> References: <4A66220D.8010401@tudelft.nl> <20090721210954.GA31488@dutw689> Message-ID: On Tue, Jul 21, 2009 at 4:09 PM, Umut Tabak wrote: > On Tue, Jul 21, 2009 at 03:56:11PM -0500, Matthew Knepley wrote: > > If your system is badly scaled, roundoff errors could result in a pivot > > larger than our tolerance. It is also possible that your preconditioner > > resulted in a badly scaled system. > > > > Matt > > > Hi, > Thanks for the fast reply. > since the system is singular condition number will be bad any way, > but I was wondering if there were already ways to overcome this > problem. And are there ways to prevent the preconditioner to give a > badly scaled system, what I mean is that the system is singular so > will preconditioning improve that? I definitely read about these. > Well, the default is ILU which can do horrendously things. Try running with Jacobi as a test. It should fail like you want. In my opinion, blackbox PCs almost never work, and what you really need is something tailored to your problem (which is linear algebra heresy). Matt > > Thanks and best, > Umut > > -- > Quote: You can not teach a man anything, you can only help him find it > within himself. > Source: Galileo Galilei > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 21 16:29:23 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 21 Jul 2009 16:29:23 -0500 Subject: Petsc And Slepc, singular system In-Reply-To: References: <4A66220D.8010401@tudelft.nl> <20090721210954.GA31488@dutw689> Message-ID: See the manual page for MatNullSpaceCreate() and KSPSetNullSpace() Barry On Jul 21, 2009, at 4:24 PM, Matthew Knepley wrote: > On Tue, Jul 21, 2009 at 4:09 PM, Umut Tabak > wrote: > On Tue, Jul 21, 2009 at 03:56:11PM -0500, Matthew Knepley wrote: > > If your system is badly scaled, roundoff errors could result in a > pivot > > larger than our tolerance. It is also possible that your > preconditioner > > resulted in a badly scaled system. > > > > Matt > > > Hi, > Thanks for the fast reply. > since the system is singular condition number will be bad any way, > but I was wondering if there were already ways to overcome this > problem. And are there ways to prevent the preconditioner to give a > badly scaled system, what I mean is that the system is singular so > will preconditioning improve that? I definitely read about these. > > Well, the default is ILU which can do horrendously things. Try > running with > Jacobi as a test. It should fail like you want. 
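For the linear-solve side of this, a minimal sketch of what Barry's MatNullSpaceCreate()/KSPSetNullSpace() pointer above looks like in code, assuming 'ksp' is your own solver object and the null space of the singular K is just the constant vector (for rigid-body modes you would pass the mode vectors instead); PETSc 3.0-era calling sequence, CHKERRQ error checking left out to keep it short:

MatNullSpace nullsp;
/* PETSC_TRUE: the constant vector spans the null space.  For rigid-body
   modes, pass PETSC_FALSE, the number of modes, and an array of Vecs. */
MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL,&nullsp);
KSPSetNullSpace(ksp,nullsp);   /* the Krylov solver projects the null space out */
/* ... KSPSolve(ksp,b,x) ... */
MatNullSpaceDestroy(nullsp);   /* 3.0-era signature, no address-of */

On the SLEPc side this is what EPSAttachDeflationSpace automates, as Jose notes further down the thread.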
In my opinion, > blackbox PCs > almost never work, and what you really need is something tailored to > your > problem (which is linear algebra heresy). > > Matt > > > Thanks and best, > Umut > > -- > Quote: You can not teach a man anything, you can only help him > find it within himself. > Source: Galileo Galilei > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From jroman at dsic.upv.es Tue Jul 21 17:01:43 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 22 Jul 2009 00:01:43 +0200 Subject: Petsc And Slepc, singular system In-Reply-To: References: <4A66220D.8010401@tudelft.nl> <20090721210954.GA31488@dutw689> Message-ID: <01BBF9CD-0165-473D-B912-24A5F63AB04D@dsic.upv.es> If your eigenproblem is K x = lambda M x and you are solving it as K^-1 M x = 1/lambda x (with shift-and-invert with shift=0 in SLEPc), then if a basis of the nullspace of K is known (rigid-body modes) you can use EPSAttachDeflationSpace. That will make SLEPc automatically do KSPSetNullSpace in the underlying KSP object. Jose On 21/07/2009, Barry Smith wrote: > > See the manual page for MatNullSpaceCreate() and KSPSetNullSpace() > > Barry > > On Jul 21, 2009, at 4:24 PM, Matthew Knepley wrote: > >> On Tue, Jul 21, 2009 at 4:09 PM, Umut Tabak >> wrote: >> On Tue, Jul 21, 2009 at 03:56:11PM -0500, Matthew Knepley wrote: >> > If your system is badly scaled, roundoff errors could result in a >> pivot >> > larger than our tolerance. It is also possible that your >> preconditioner >> > resulted in a badly scaled system. >> > >> > Matt >> > >> Hi, >> Thanks for the fast reply. >> since the system is singular condition number will be bad any way, >> but I was wondering if there were already ways to overcome this >> problem. And are there ways to prevent the preconditioner to give a >> badly scaled system, what I mean is that the system is singular so >> will preconditioning improve that? I definitely read about these. >> >> Well, the default is ILU which can do horrendously things. Try >> running with >> Jacobi as a test. It should fail like you want. In my opinion, >> blackbox PCs >> almost never work, and what you really need is something tailored >> to your >> problem (which is linear algebra heresy). >> >> Matt >> >> >> Thanks and best, >> Umut >> From Andrew.Barker at Colorado.EDU Tue Jul 21 17:44:32 2009 From: Andrew.Barker at Colorado.EDU (Andrew T Barker) Date: Tue, 21 Jul 2009 16:44:32 -0600 (MDT) Subject: set ASM subdomains in sub KSP Message-ID: <20090721164432.AKA01094@batman.int.colorado.edu> I want to do multiple sweeps of ASM to precondition GMRES. So I do something like -ksp_type gmres -pc_type ksp -ksp_ksp_type richardson -ksp_pc_type asm which works fine. Now I want to set my own subdomains with PCASMSetLocalSubdomains(), which isn't accessible from the command line. I can use PCKSPGetKSP() to get the KSP (and then the PC) to use in PCASMSetLocalSubdomains(), but I have to call KSPSetUp() on the parent KSP in order to use PCKSPGetKSP(). And then I'm not allowed to call PCASMSetLocalSubdomains() if KSPSetUp() has already been called - object is in wrong state. I'm at a loss for how to solve this problem - any help would be appreciated. Andrew --- Andrew T. 
Barker andrew.barker at colorado.edu Department of Applied Mathematics University of Colorado, Boulder 526 UCB, Boulder, CO 80309-0526 From knepley at gmail.com Tue Jul 21 17:54:27 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Jul 2009 17:54:27 -0500 Subject: set ASM subdomains in sub KSP In-Reply-To: <20090721164432.AKA01094@batman.int.colorado.edu> References: <20090721164432.AKA01094@batman.int.colorado.edu> Message-ID: On Tue, Jul 21, 2009 at 5:44 PM, Andrew T Barker wrote: > I want to do multiple sweeps of ASM to precondition GMRES. So I do > something like > > -ksp_type gmres -pc_type ksp -ksp_ksp_type richardson -ksp_pc_type asm > > which works fine. Now I want to set my own subdomains with > PCASMSetLocalSubdomains(), which isn't accessible from the command line. > > I can use PCKSPGetKSP() to get the KSP (and then the PC) to use in > PCASMSetLocalSubdomains(), but I have to call KSPSetUp() on the parent KSP > in order to use PCKSPGetKSP(). And then I'm not allowed to call > PCASMSetLocalSubdomains() if KSPSetUp() has already been called - object is > in wrong state. > > I'm at a loss for how to solve this problem - any help would be > appreciated. I will think about it, but the easiest way for now is just to create the inner PC yourself:

PCCreate(&inner_pc)
PCSetType()
PCSetFromOptions()
PCASMSetLocalSubdomains()
KSPSetPC(inner_ksp, inner_pc)

Matt

> Andrew > --- > Andrew T. Barker > andrew.barker at colorado.edu > Department of Applied Mathematics > University of Colorado, Boulder > 526 UCB, Boulder, CO 80309-0526 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From michel.cancelliere at polito.it Wed Jul 22 06:28:15 2009 From: michel.cancelliere at polito.it (Michel Cancelliere) Date: Wed, 22 Jul 2009 13:28:15 +0200 Subject: SNES Convergence test Message-ID: <7f18de3b0907220428u6b2690bay9efd09181d7bbada@mail.gmail.com> Hi there, I am having problems with the SNES: it seems that changing atol or rtol has no effect on the number of Newton iterations. Do you think it could be a problem with the settings of the SNES solver, or maybe a problem in the routines for the Jacobian matrix evaluation? The linear solver does converge at each nonlinear iteration.
*My Code* ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr); /********************************************************/ /* Creation of the matrix and vector data structures*/ /********************************************************/ ierr = VecCreate(PETSC_COMM_WORLD,&x); CHKERRQ(ierr); ierr = VecSetSizes(x,PETSC_DECIDE,2*input.grid.N); CHKERRQ(ierr); ierr = VecSetFromOptions(x); CHKERRQ(ierr); ierr = VecDuplicate(x,&R);CHKERRQ(ierr); ierr = MatCreate(PETSC_COMM_WORLD,&J);CHKERRQ(ierr); ierr = MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,2*input.grid.N,2*input.grid.N);CHKERRQ(ierr); ierr = MatSetFromOptions(J);CHKERRQ(ierr); ///Set function evaluation routine and vector // Assign global variable which is used in the static wrapper function pt2Object = (void*) &system; pt2Object2 = (void*) &system; ierr = SNESSetFunction(snes,R,CSystem::Wrapper_to_FormFunction,&input);CHKERRQ(ierr); // Set Jacobian matrix structure and Jacobian evaluation routine ierr = SNESSetJacobian(snes,J,J,CSystem::Wrapper_to_FormJacobian,&input);CHKERRQ(ierr); /** Customizr non linear solver; set runtime options ***/ /* Set linear solver defaults for this problem. By extracting the KSP,and PC contexts from the SNES context, we can then directly call any KSP and PC routines to set various options*/ ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); ierr = PCSetType(pc,PCILU);CHKERRQ(ierr); ierr = KSPSetTolerances(ksp,1.e-10,PETSC_DEFAULT,PETSC_DEFAULT,30);CHKERRQ(ierr); ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* Set SNES/KSP/PC rountime options, e.g., -snes_view -snes_monitor -ksp_type -pc_type These options will override thos specified above as lon as SNESSetFromOptoons is called _after_ any other customization routines. */ ierr =SNESSetTolerances(snes,1e-100,1e-8,1e-1000,100,1000);CHKERRQ(ierr); * No matter what I put here I am getting the same results (Residual Norm and number of iterations)* ierr = SNESSetFromOptions(snes);CHKERRQ(ierr); /*--------------------------------------------------------------- Evaluate initial guess; then solve nonlinear system -----------------------------------------------------------------*/ ierr = CSystem::Cells2Vec(x,input);CHKERRQ(ierr); for (int i=0;i 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 4.6669e-001 100.0% 1.4174e+004 100.0% 0.000e+000 0.0% 0.000e+000 0.0% 0.000e+000 0.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ ########################################################## # # # WARNING!!! # # # # This code was compiled with a debugging option, # # To get timing results run config/configure.py # # using --with-debugging=no, the performance will # # be generally two or three times faster. # # # ########################################################## Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage SNESSolve 1 1.0 2.7082e-001 1.0 1.42e+004 1.0 0.0e+000 0.0e+000 0.0e+000 58100 0 0 0 58100 0 0 0 0 SNESLineSearch 2 1.0 2.9807e-002 1.0 4.94e+003 1.0 0.0e+000 0.0e+000 0.0e+000 6 35 0 0 0 6 35 0 0 0 0 SNESFunctionEval 13 1.0 3.2095e-002 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 7 0 0 0 0 7 0 0 0 0 0 SNESJacobianEval 2 1.0 2.5123e-003 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 1 0 0 0 0 1 0 0 0 0 0 VecView 2 1.0 2.3364e-001 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 50 0 0 0 0 50 0 0 0 0 0 VecDot 2 1.0 1.2292e-005 1.0 2.54e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 21 VecMDot 2 1.0 1.0337e-005 1.0 2.54e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 VecNorm 20 1.0 1.1873e-004 1.0 2.54e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 18 0 0 0 0 18 0 0 0 21 VecScale 4 1.0 1.7321e-005 1.0 2.56e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 VecCopy 6 1.0 2.6819e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecSet 5 1.0 1.9835e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 2 1.0 1.0337e-005 1.0 2.56e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 VecWAXPY 12 1.0 5.2521e-005 1.0 1.41e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 10 0 0 0 0 10 0 0 0 27 VecMAXPY 4 1.0 1.7321e-005 1.0 5.12e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 30 VecAssemblyBegin 13 1.0 3.4641e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 13 1.0 3.1848e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecReduceArith 2 1.0 1.7321e-005 1.0 2.54e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 VecReduceComm 1 1.0 3.0730e-006 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 4 1.0 8.4368e-005 1.0 7.64e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 9 MatMult 4 1.0 4.4978e-005 1.0 2.75e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 61 MatMultTranspose 1 1.0 2.3467e-005 1.0 7.52e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 32 MatSolve 4 1.0 5.6711e-005 1.0 2.75e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 49 MatLUFactorNum 2 1.0 7.7943e-005 1.0 2.06e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 26 MatILUFactorSym 2 1.0 1.9975e-004 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 6.7048e-006 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 4.1346e-005 1.0 0.00e+000 0.0 0.0e+000 
0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 2 1.0 1.0895e-005 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 2 1.0 2.0254e-004 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 MatView 2 1.0 5.7549e-004 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 2 1.0 5.6152e-005 1.0 5.10e+002 1.0 0.0e+000 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 9 KSPSetup 2 1.0 1.5170e-004 1.0 0.00e+000 0.0 0.0e+000 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 2 1.0 1.8486e-003 1.0 7.97e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 56 0 0 0 0 56 0 0 0 4 PCSetUp 2 1.0 1.2032e-003 1.0 2.06e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 2 PCApply 4 1.0 8.7441e-005 1.0 2.75e+003 1.0 0.0e+000 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 31 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. --- Event Stage 0: Main Stage SNES 1 1 668 0 Vec 11 11 13596 0 Matrix 3 3 7820 0 Krylov Solver 1 1 17392 0 Preconditioner 1 1 500 0 Viewer 2 2 680 0 Draw 1 1 444 0 Axis 1 1 308 0 Line Graph 1 1 1908 0 Index Set 6 6 3360 0 ======================================================================================================================== Average time to get PetscTime(): 2.03937e-006 #PETSc Option Table entries: -log_summary -mat_type baij -snes_max_it 20 -snes_monitor_residual -snes_view #End o PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 sizeof(PetscScalar) 8 Configure run at: Mon Jul 6 16:28:41 2009 Configure options: --with-cc="win32fe cl --nodetect" --download-c-blas-lapack=1 --with-fc=0 --with-mpi=0 --useThreads=0 --with-shared=0 ----------------------------------------- Libraries compiled on Mon Jul 6 16:39:57 2009 on Idrocp03 Machine characteristics: CYGWIN_NT-5.1 Idrocp03 1.5.25(0.156/4/2) 2008-06-12 19:34 i686 Cygwin Using PETSc directory: /home/Administrator/petsc Using PETSc arch: cygwin-c-debug ----------------------------------------- Using C compiler: /home/Administrator/petsc/bin/win32fe/win32fe cl --nodetect -MT -wd4996 -Z7 Using Fortran compiler: ----------------------------------------- Using include paths: -I/home/Administrator/petsc/cygwin-c-debug/include -I/home/Administrator/petsc/include -I/home/Administrator/petsc/include/mpiuni ------------------------------------------ Using C linker: /home/Administrator/petsc/bin/win32fe/win32fe cl --nodetect -MT -wd4996 -Z7 Using Fortran linker: Using libraries: -L/home/Administrator/petsc/cygwin-c-debug/lib -L/home/Administrator/petsc/cygwin-c-debug/lib -lpetscts -lpetscsnes -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -L/home/Administrator/petsc/cygwin-c-debug/lib -lf2clapack -lf2cblas -lmpiuni Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib ------------------------------------------ Thank you in advance for your help, Michel Cancelliere Politecnico di Torino -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Wed Jul 22 08:26:39 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 22 Jul 2009 08:26:39 -0500 Subject: SNES Convergence test In-Reply-To: <7f18de3b0907220428u6b2690bay9efd09181d7bbada@mail.gmail.com> References: <7f18de3b0907220428u6b2690bay9efd09181d7bbada@mail.gmail.com> Message-ID: After SNESSolve() you should call SNESGetConvergedReason() and see if the value is negative. Use -snes_monitor and -snes_converged_reason to see why SNES is ending. Barry On Jul 22, 2009, at 6:28 AM, Michel Cancelliere wrote: > Hi there, > > I am having problems with the SNES, it seems that changing atol or > rtol have no effect on the numbers of Newton Iterations. Do you > think that it can be a problem of settings in snes solver or maybe a > problem on the routines for Jacobian matrix evaluation? The linear > solver do converge at each nonlinear iterations. > > My Code > > ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr); > > /********************************************************/ > /* Creation of the matrix and vector data structures*/ > /********************************************************/ > ierr = VecCreate(PETSC_COMM_WORLD,&x); CHKERRQ(ierr); > ierr = VecSetSizes(x,PETSC_DECIDE,2*input.grid.N); CHKERRQ(ierr); > ierr = VecSetFromOptions(x); CHKERRQ(ierr); > ierr = VecDuplicate(x,&R);CHKERRQ(ierr); > > ierr = MatCreate(PETSC_COMM_WORLD,&J);CHKERRQ(ierr); > ierr = MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,2*input.grid.N, > 2*input.grid.N);CHKERRQ(ierr); > ierr = MatSetFromOptions(J);CHKERRQ(ierr); > > > ///Set function evaluation routine and vector > > // Assign global variable which is used in the static wrapper > function > pt2Object = (void*) &system; > pt2Object2 = (void*) &system; > ierr = > SNESSetFunction > (snes,R,CSystem::Wrapper_to_FormFunction,&input);CHKERRQ(ierr); > > // Set Jacobian matrix structure and Jacobian evaluation routine > ierr = > SNESSetJacobian > (snes,J,J,CSystem::Wrapper_to_FormJacobian,&input);CHKERRQ(ierr); > > /** Customizr non linear solver; set runtime options ***/ > /* Set linear solver defaults for this problem. By extracting the > KSP,and PC contexts from > the SNES context, we can then directly call any KSP and PC routines > to set various options*/ > > ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > ierr = PCSetType(pc,PCILU);CHKERRQ(ierr); > ierr = KSPSetTolerances(ksp,1.e-10,PETSC_DEFAULT,PETSC_DEFAULT, > 30);CHKERRQ(ierr); > ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); > > > > /* > Set SNES/KSP/PC rountime options, e.g., > -snes_view -snes_monitor -ksp_type -pc_type > These options will override thos specified above as lon as > SNESSetFromOptoons is called _after_ any other customization routines. 
> */ > > ierr =SNESSetTolerances(snes, > 1e-100,1e-8,1e-1000,100,1000);CHKERRQ(ierr); No matter what I put > here I am getting the same results (Residual Norm and number of > iterations) > ierr = SNESSetFromOptions(snes);CHKERRQ(ierr); > > /*--------------------------------------------------------------- > Evaluate initial guess; then solve nonlinear system > -----------------------------------------------------------------*/ > > > ierr = CSystem::Cells2Vec(x,input);CHKERRQ(ierr); > > > for (int i=0;i +) > system.delta_t = system.delta_t_V[i]; > system.t = system.t_V[i]; > input.t = system.t; > input.delta_t = system.delta_t; > /*\\\\\\\\\\\\\\Boundary conditions\\\\\\\\\\\\\\\\*/ > Rate_tot = input.F_boundary(input.t,input.delta_t); > input.bc.F(input.interfaces,Rate_tot); > > ierr = SNESSolve(snes,PETSC_NULL,x);CHKERRQ(ierr); > gauge.push_back(input.cells[0].water.p); > } > > > > > > > > ierr = VecDestroy(x);CHKERRQ(ierr); > ierr = VecDestroy(R);CHKERRQ(ierr); > ierr = MatDestroy(J);CHKERRQ(ierr); > ierr = SNESDestroy(snes);CHKERRQ(ierr); > ierr = PetscFinalize();CHKERRQ(ierr); > return 0; > } > > > Output: > SNES Object: > type: ls > line search variant: SNESLineSearchCubic > alpha=0.0001, maxstep=1e+008, minlambda=1e-012 > maximum iterations=20, maximum function evaluations=1000 > tolerances: relative=1e-008, absolute=1e-100, solution=0 > total number of linear solver iterations=2 > total number of function evaluations=13 > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-030 > maximum iterations=30, initial guess is zero > tolerances: relative=1e-010, absolute=1e-050, divergence=10000 > left preconditioning > PC Object: > type: ilu > ILU: 0 levels of fill > ILU: factor fill ratio allocated 1 > ILU: tolerance for zero pivot 1e-012 > ILU: using diagonal shift to prevent zero pivot > ILU: using diagonal shift on blocks to prevent zero pivot > out-of-place factorization > matrix ordering: natural > ILU: factor fill ratio needed 0 > Factored matrix follows > Matrix Object: > type=seqbaij, rows=64, cols=64 > package used to perform factorization: petsc > total: nonzeros=376, allocated nonzeros=920 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqbaij, rows=64, cols=64 > total: nonzeros=376, allocated nonzeros=920 > block size is 1 > ************************************************************************************************************************ > *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript - > r -fCourier9' to print this document *** > ************************************************************************************************************************ > > ---------------------------------------------- PETSc Performance > Summary: ---------------------------------------------- > > ex2.exe on a cygwin-c- named IDROCP03 with 1 processor, by > Administrator Wed Jul 22 13:25:47 2009 > Using Petsc Release Version 3.0.0, Patch 6, Fri Jun 5 13:31:12 CDT > 2009 > > Max Max/Min Avg Total > Time (sec): 4.667e-001 1.00000 4.667e-001 > Objects: 2.800e+001 1.00000 2.800e+001 > Flops: 1.417e+004 1.00000 1.417e+004 1.417e+004 > Flops/sec: 3.037e+004 1.00000 3.037e+004 3.037e+004 > Memory: 9.325e+004 1.00000 9.325e+004 > MPI Messages: 0.000e+000 0.00000 0.000e+000 0.000e+000 > MPI Message Lengths: 0.000e+000 0.00000 0.000e+000 0.000e+000 > MPI Reductions: 0.000e+000 0.00000 > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of > length N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- > Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > 0: Main Stage: 4.6669e-001 100.0% 1.4174e+004 100.0% 0.000e > +000 0.0% 0.000e+000 0.0% 0.000e+000 0.0% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all > processors > Mess: number of messages sent > Avg. len: average message length > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with > PetscLogStagePush() and PetscLogStagePop(). > %T - percent time in this phase %F - percent flops in > this phase > %M - percent messages in this phase %L - percent message > lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max > time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > > > ########################################################## > # # > # WARNING!!! # > # # > # This code was compiled with a debugging option, # > # To get timing results run config/configure.py # > # using --with-debugging=no, the performance will # > # be generally two or three times faster. 
# > # # > ########################################################## > > > Event Count Time (sec) > Flops --- Global --- --- Stage --- > Total > Max Ratio Max Ratio Max Ratio Mess Avg > len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > SNESSolve 1 1.0 2.7082e-001 1.0 1.42e+004 1.0 0.0e+000 > 0.0e+000 0.0e+000 58100 0 0 0 58100 0 0 0 0 > SNESLineSearch 2 1.0 2.9807e-002 1.0 4.94e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 6 35 0 0 0 6 35 0 0 0 0 > SNESFunctionEval 13 1.0 3.2095e-002 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 7 0 0 0 0 7 0 0 0 0 0 > SNESJacobianEval 2 1.0 2.5123e-003 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 1 0 0 0 0 1 0 0 0 0 0 > VecView 2 1.0 2.3364e-001 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 50 0 0 0 0 50 0 0 0 0 0 > VecDot 2 1.0 1.2292e-005 1.0 2.54e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 21 > VecMDot 2 1.0 1.0337e-005 1.0 2.54e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 > VecNorm 20 1.0 1.1873e-004 1.0 2.54e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 18 0 0 0 0 18 0 0 0 21 > VecScale 4 1.0 1.7321e-005 1.0 2.56e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 > VecCopy 6 1.0 2.6819e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecSet 5 1.0 1.9835e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 2 1.0 1.0337e-005 1.0 2.56e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 > VecWAXPY 12 1.0 5.2521e-005 1.0 1.41e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 10 0 0 0 0 10 0 0 0 27 > VecMAXPY 4 1.0 1.7321e-005 1.0 5.12e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 30 > VecAssemblyBegin 13 1.0 3.4641e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecAssemblyEnd 13 1.0 3.1848e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecReduceArith 2 1.0 1.7321e-005 1.0 2.54e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 > VecReduceComm 1 1.0 3.0730e-006 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 4 1.0 8.4368e-005 1.0 7.64e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 9 > MatMult 4 1.0 4.4978e-005 1.0 2.75e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 61 > MatMultTranspose 1 1.0 2.3467e-005 1.0 7.52e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 32 > MatSolve 4 1.0 5.6711e-005 1.0 2.75e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 49 > MatLUFactorNum 2 1.0 7.7943e-005 1.0 2.06e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 26 > MatILUFactorSym 2 1.0 1.9975e-004 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyBegin 2 1.0 6.7048e-006 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatAssemblyEnd 2 1.0 4.1346e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 2 1.0 1.0895e-005 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatGetOrdering 2 1.0 2.0254e-004 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > MatView 2 1.0 5.7549e-004 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 2 1.0 5.6152e-005 1.0 5.10e+002 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 9 > KSPSetup 2 1.0 1.5170e-004 1.0 0.00e+000 0.0 0.0e+000 > 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 
0 0 > KSPSolve 2 1.0 1.8486e-003 1.0 7.97e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 56 0 0 0 0 56 0 0 0 4 > PCSetUp 2 1.0 1.2032e-003 1.0 2.06e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 2 > PCApply 4 1.0 8.7441e-005 1.0 2.75e+003 1.0 0.0e+000 > 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 31 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' > Mem. > > --- Event Stage 0: Main Stage > > SNES 1 1 668 0 > Vec 11 11 13596 0 > Matrix 3 3 7820 0 > Krylov Solver 1 1 17392 0 > Preconditioner 1 1 500 0 > Viewer 2 2 680 0 > Draw 1 1 444 0 > Axis 1 1 308 0 > Line Graph 1 1 1908 0 > Index Set 6 6 3360 0 > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > = > ====================================================================== > Average time to get PetscTime(): 2.03937e-006 > #PETSc Option Table entries: > -log_summary > -mat_type baij > -snes_max_it 20 > -snes_monitor_residual > -snes_view > #End o PETSc Option Table entries > Compiled without FORTRAN kernels > Compiled with full precision matrices (default) > sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 > sizeof(PetscScalar) 8 > Configure run at: Mon Jul 6 16:28:41 2009 > Configure options: --with-cc="win32fe cl --nodetect" --download-c- > blas-lapack=1 --with-fc=0 --with-mpi=0 --useThreads=0 --with-shared=0 > ----------------------------------------- > Libraries compiled on Mon Jul 6 16:39:57 2009 on Idrocp03 > Machine characteristics: CYGWIN_NT-5.1 Idrocp03 1.5.25(0.156/4/2) > 2008-06-12 19:34 i686 Cygwin > Using PETSc directory: /home/Administrator/petsc > Using PETSc arch: cygwin-c-debug > ----------------------------------------- > Using C compiler: /home/Administrator/petsc/bin/win32fe/win32fe cl -- > nodetect -MT -wd4996 -Z7 > Using Fortran compiler: > ----------------------------------------- > Using include paths: -I/home/Administrator/petsc/cygwin-c-debug/ > include -I/home/Administrator/petsc/include -I/home/Administrator/ > petsc/include/mpiuni > ------------------------------------------ > Using C linker: /home/Administrator/petsc/bin/win32fe/win32fe cl -- > nodetect -MT -wd4996 -Z7 > Using Fortran linker: > Using libraries: -L/home/Administrator/petsc/cygwin-c-debug/lib -L/ > home/Administrator/petsc/cygwin-c-debug/lib -lpetscts -lpetscsnes - > lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc -L/home/ > Administrator/petsc/cygwin-c-debug/lib -lf2clapack -lf2cblas - > lmpiuni Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib > ------------------------------------------ > > > > Thank you in advance for your help, > > Michel Cancelliere > Politecnico di Torino > From knepley at gmail.com Wed Jul 22 08:42:50 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 22 Jul 2009 08:42:50 -0500 Subject: SNES Convergence test In-Reply-To: References: <7f18de3b0907220428u6b2690bay9efd09181d7bbada@mail.gmail.com> Message-ID: On Wed, Jul 22, 2009 at 8:26 AM, Barry Smith wrote: > > After SNESSolve() you should call SNESGetConvergedReason() and see if the > value is negative. > > Use -snes_monitor and -snes_converged_reason to see why SNES is ending. Also, -snes_view will print the tolerances it actually used. 
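For instance, a minimal check right after the solve looks roughly like this (a sketch only; the snes/x names follow the code quoted below, and the printed message is illustrative):

  SNESConvergedReason reason;

  ierr = SNESSolve(snes,PETSC_NULL,x);CHKERRQ(ierr);
  ierr = SNESGetConvergedReason(snes,&reason);CHKERRQ(ierr);
  if (reason < 0) {
    /* negative reasons mean the nonlinear solve diverged
       (line search failure, maximum iterations reached, ...) */
    ierr = PetscPrintf(PETSC_COMM_WORLD,"SNES diverged, reason %d\n",(int)reason);CHKERRQ(ierr);
  }

Running with -snes_converged_reason prints the same information without any code change.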
Matt > > Barry > > > On Jul 22, 2009, at 6:28 AM, Michel Cancelliere wrote: > > Hi there, >> >> I am having problems with the SNES, it seems that changing atol or rtol >> have no effect on the numbers of Newton Iterations. Do you think that it can >> be a problem of settings in snes solver or maybe a problem on the routines >> for Jacobian matrix evaluation? The linear solver do converge at each >> nonlinear iterations. >> >> My Code >> >> ierr = SNESCreate(PETSC_COMM_WORLD,&snes);CHKERRQ(ierr); >> >> /********************************************************/ >> /* Creation of the matrix and vector data structures*/ >> /********************************************************/ >> ierr = VecCreate(PETSC_COMM_WORLD,&x); CHKERRQ(ierr); >> ierr = VecSetSizes(x,PETSC_DECIDE,2*input.grid.N); CHKERRQ(ierr); >> ierr = VecSetFromOptions(x); CHKERRQ(ierr); >> ierr = VecDuplicate(x,&R);CHKERRQ(ierr); >> >> ierr = MatCreate(PETSC_COMM_WORLD,&J);CHKERRQ(ierr); >> ierr = >> MatSetSizes(J,PETSC_DECIDE,PETSC_DECIDE,2*input.grid.N,2*input.grid.N);CHKERRQ(ierr); >> ierr = MatSetFromOptions(J);CHKERRQ(ierr); >> >> >> ///Set function evaluation routine and vector >> >> // Assign global variable which is used in the static wrapper function >> pt2Object = (void*) &system; >> pt2Object2 = (void*) &system; >> ierr = >> SNESSetFunction(snes,R,CSystem::Wrapper_to_FormFunction,&input);CHKERRQ(ierr); >> >> // Set Jacobian matrix structure and Jacobian evaluation routine >> ierr = >> SNESSetJacobian(snes,J,J,CSystem::Wrapper_to_FormJacobian,&input);CHKERRQ(ierr); >> >> /** Customizr non linear solver; set runtime options ***/ >> /* Set linear solver defaults for this problem. By extracting the KSP,and >> PC contexts from >> the SNES context, we can then directly call any KSP and PC routines to set >> various options*/ >> >> ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); >> ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); >> ierr = PCSetType(pc,PCILU);CHKERRQ(ierr); >> ierr = >> KSPSetTolerances(ksp,1.e-10,PETSC_DEFAULT,PETSC_DEFAULT,30);CHKERRQ(ierr); >> ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); >> >> >> >> /* >> Set SNES/KSP/PC rountime options, e.g., >> -snes_view -snes_monitor -ksp_type -pc_type >> These options will override thos specified above as lon as >> SNESSetFromOptoons is called _after_ any other customization routines. 
>> */ >> >> ierr =SNESSetTolerances(snes,1e-100,1e-8,1e-1000,100,1000);CHKERRQ(ierr); >> No matter what I put here I am getting the same results (Residual Norm and >> number of iterations) >> ierr = SNESSetFromOptions(snes);CHKERRQ(ierr); >> >> /*--------------------------------------------------------------- >> Evaluate initial guess; then solve nonlinear system >> -----------------------------------------------------------------*/ >> >> >> ierr = CSystem::Cells2Vec(x,input);CHKERRQ(ierr); >> >> >> for (int i=0;i> system.delta_t = system.delta_t_V[i]; >> system.t = system.t_V[i]; >> input.t = system.t; >> input.delta_t = system.delta_t; >> /*\\\\\\\\\\\\\\Boundary conditions\\\\\\\\\\\\\\\\*/ >> Rate_tot = input.F_boundary(input.t,input.delta_t); >> input.bc.F(input.interfaces,Rate_tot); >> >> ierr = SNESSolve(snes,PETSC_NULL,x);CHKERRQ(ierr); >> gauge.push_back(input.cells[0].water.p); >> } >> >> >> >> >> >> >> >> ierr = VecDestroy(x);CHKERRQ(ierr); >> ierr = VecDestroy(R);CHKERRQ(ierr); >> ierr = MatDestroy(J);CHKERRQ(ierr); >> ierr = SNESDestroy(snes);CHKERRQ(ierr); >> ierr = PetscFinalize();CHKERRQ(ierr); >> return 0; >> } >> >> >> Output: >> SNES Object: >> type: ls >> line search variant: SNESLineSearchCubic >> alpha=0.0001, maxstep=1e+008, minlambda=1e-012 >> maximum iterations=20, maximum function evaluations=1000 >> tolerances: relative=1e-008, absolute=1e-100, solution=0 >> total number of linear solver iterations=2 >> total number of function evaluations=13 >> KSP Object: >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-030 >> maximum iterations=30, initial guess is zero >> tolerances: relative=1e-010, absolute=1e-050, divergence=10000 >> left preconditioning >> PC Object: >> type: ilu >> ILU: 0 levels of fill >> ILU: factor fill ratio allocated 1 >> ILU: tolerance for zero pivot 1e-012 >> ILU: using diagonal shift to prevent zero pivot >> ILU: using diagonal shift on blocks to prevent zero pivot >> out-of-place factorization >> matrix ordering: natural >> ILU: factor fill ratio needed 0 >> Factored matrix follows >> Matrix Object: >> type=seqbaij, rows=64, cols=64 >> package used to perform factorization: petsc >> total: nonzeros=376, allocated nonzeros=920 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqbaij, rows=64, cols=64 >> total: nonzeros=376, allocated nonzeros=920 >> block size is 1 >> >> ************************************************************************************************************************ >> *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r >> -fCourier9' to print this document *** >> >> ************************************************************************************************************************ >> >> ---------------------------------------------- PETSc Performance Summary: >> ---------------------------------------------- >> >> ex2.exe on a cygwin-c- named IDROCP03 with 1 processor, by Administrator >> Wed Jul 22 13:25:47 2009 >> Using Petsc Release Version 3.0.0, Patch 6, Fri Jun 5 13:31:12 CDT 2009 >> >> Max Max/Min Avg Total >> Time (sec): 4.667e-001 1.00000 4.667e-001 >> Objects: 2.800e+001 1.00000 2.800e+001 >> Flops: 1.417e+004 1.00000 1.417e+004 1.417e+004 >> Flops/sec: 3.037e+004 1.00000 3.037e+004 3.037e+004 >> Memory: 9.325e+004 1.00000 9.325e+004 >> MPI Messages: 0.000e+000 0.00000 0.000e+000 0.000e+000 >> MPI Message Lengths: 0.000e+000 0.00000 0.000e+000 0.000e+000 >> MPI Reductions: 0.000e+000 0.00000 >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N >> --> 2N flops >> and VecAXPY() for complex vectors of length N >> --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages >> --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts %Total >> Avg %Total counts %Total >> 0: Main Stage: 4.6669e-001 100.0% 1.4174e+004 100.0% 0.000e+000 >> 0.0% 0.000e+000 0.0% 0.000e+000 0.0% >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> Avg. len: average message length >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and >> PetscLogStagePop(). >> %T - percent time in this phase %F - percent flops in this >> phase >> %M - percent messages in this phase %L - percent message lengths >> in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over >> all processors) >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> ########################################################## >> # # >> # WARNING!!! # >> # # >> # This code was compiled with a debugging option, # >> # To get timing results run config/configure.py # >> # using --with-debugging=no, the performance will # >> # be generally two or three times faster. 
# >> # # >> ########################################################## >> >> >> Event Count Time (sec) Flops >> --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg len >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> --- Event Stage 0: Main Stage >> >> SNESSolve 1 1.0 2.7082e-001 1.0 1.42e+004 1.0 0.0e+000 >> 0.0e+000 0.0e+000 58100 0 0 0 58100 0 0 0 0 >> SNESLineSearch 2 1.0 2.9807e-002 1.0 4.94e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 6 35 0 0 0 6 35 0 0 0 0 >> SNESFunctionEval 13 1.0 3.2095e-002 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 7 0 0 0 0 7 0 0 0 0 0 >> SNESJacobianEval 2 1.0 2.5123e-003 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 1 0 0 0 0 1 0 0 0 0 0 >> VecView 2 1.0 2.3364e-001 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 50 0 0 0 0 50 0 0 0 0 0 >> VecDot 2 1.0 1.2292e-005 1.0 2.54e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 21 >> VecMDot 2 1.0 1.0337e-005 1.0 2.54e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 >> VecNorm 20 1.0 1.1873e-004 1.0 2.54e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 18 0 0 0 0 18 0 0 0 21 >> VecScale 4 1.0 1.7321e-005 1.0 2.56e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 >> VecCopy 6 1.0 2.6819e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 5 1.0 1.9835e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 2 1.0 1.0337e-005 1.0 2.56e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 25 >> VecWAXPY 12 1.0 5.2521e-005 1.0 1.41e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 10 0 0 0 0 10 0 0 0 27 >> VecMAXPY 4 1.0 1.7321e-005 1.0 5.12e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 30 >> VecAssemblyBegin 13 1.0 3.4641e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecAssemblyEnd 13 1.0 3.1848e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecReduceArith 2 1.0 1.7321e-005 1.0 2.54e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 2 0 0 0 0 2 0 0 0 15 >> VecReduceComm 1 1.0 3.0730e-006 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> VecNormalize 4 1.0 8.4368e-005 1.0 7.64e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 9 >> MatMult 4 1.0 4.4978e-005 1.0 2.75e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 61 >> MatMultTranspose 1 1.0 2.3467e-005 1.0 7.52e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 5 0 0 0 0 5 0 0 0 32 >> MatSolve 4 1.0 5.6711e-005 1.0 2.75e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 49 >> MatLUFactorNum 2 1.0 7.7943e-005 1.0 2.06e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 26 >> MatILUFactorSym 2 1.0 1.9975e-004 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatAssemblyBegin 2 1.0 6.7048e-006 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatAssemblyEnd 2 1.0 4.1346e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatGetRowIJ 2 1.0 1.0895e-005 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatGetOrdering 2 1.0 2.0254e-004 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> MatView 2 1.0 5.7549e-004 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> KSPGMRESOrthog 2 1.0 5.6152e-005 1.0 5.10e+002 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 4 0 0 0 0 4 0 0 0 9 >> KSPSetup 2 1.0 
1.5170e-004 1.0 0.00e+000 0.0 0.0e+000 >> 0.0e+000 0.0e+000 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 2 1.0 1.8486e-003 1.0 7.97e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 56 0 0 0 0 56 0 0 0 4 >> PCSetUp 2 1.0 1.2032e-003 1.0 2.06e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 15 0 0 0 0 15 0 0 0 2 >> PCApply 4 1.0 8.7441e-005 1.0 2.75e+003 1.0 0.0e+000 >> 0.0e+000 0.0e+000 0 19 0 0 0 0 19 0 0 0 31 >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> >> --- Event Stage 0: Main Stage >> >> SNES 1 1 668 0 >> Vec 11 11 13596 0 >> Matrix 3 3 7820 0 >> Krylov Solver 1 1 17392 0 >> Preconditioner 1 1 500 0 >> Viewer 2 2 680 0 >> Draw 1 1 444 0 >> Axis 1 1 308 0 >> Line Graph 1 1 1908 0 >> Index Set 6 6 3360 0 >> >> ======================================================================================================================== >> Average time to get PetscTime(): 2.03937e-006 >> #PETSc Option Table entries: >> -log_summary >> -mat_type baij >> -snes_max_it 20 >> -snes_monitor_residual >> -snes_view >> #End o PETSc Option Table entries >> Compiled without FORTRAN kernels >> Compiled with full precision matrices (default) >> sizeof(short) 2 sizeof(int) 4 sizeof(long) 4 sizeof(void*) 4 >> sizeof(PetscScalar) 8 >> Configure run at: Mon Jul 6 16:28:41 2009 >> Configure options: --with-cc="win32fe cl --nodetect" >> --download-c-blas-lapack=1 --with-fc=0 --with-mpi=0 --useThreads=0 >> --with-shared=0 >> ----------------------------------------- >> Libraries compiled on Mon Jul 6 16:39:57 2009 on Idrocp03 >> Machine characteristics: CYGWIN_NT-5.1 Idrocp03 1.5.25(0.156/4/2) >> 2008-06-12 19:34 i686 Cygwin >> Using PETSc directory: /home/Administrator/petsc >> Using PETSc arch: cygwin-c-debug >> ----------------------------------------- >> Using C compiler: /home/Administrator/petsc/bin/win32fe/win32fe cl >> --nodetect -MT -wd4996 -Z7 >> Using Fortran compiler: >> ----------------------------------------- >> Using include paths: -I/home/Administrator/petsc/cygwin-c-debug/include >> -I/home/Administrator/petsc/include >> -I/home/Administrator/petsc/include/mpiuni >> ------------------------------------------ >> Using C linker: /home/Administrator/petsc/bin/win32fe/win32fe cl >> --nodetect -MT -wd4996 -Z7 >> Using Fortran linker: >> Using libraries: -L/home/Administrator/petsc/cygwin-c-debug/lib >> -L/home/Administrator/petsc/cygwin-c-debug/lib -lpetscts -lpetscsnes >> -lpetscksp -lpetscdm -lpetscmat -lpetscvec -lpetsc >> -L/home/Administrator/petsc/cygwin-c-debug/lib -lf2clapack -lf2cblas >> -lmpiuni Gdi32.lib User32.lib Advapi32.lib Kernel32.lib Ws2_32.lib >> ------------------------------------------ >> >> >> >> Thank you in advance for your help, >> >> Michel Cancelliere >> Politecnico di Torino >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Wed Jul 22 15:18:05 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Wed, 22 Jul 2009 16:18:05 -0400 Subject: Is DMMGSolve in Petsc a Newton-multigrid? Message-ID: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> Is Newton iteration for the outer iterations and multigrid for the linear inner iterations? 
Thanks -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Wed Jul 22 15:50:55 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 22 Jul 2009 15:50:55 -0500 Subject: Is DMMGSolve in Petsc a Newton-multigrid? In-Reply-To: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> References: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> Message-ID: On Wed, Jul 22, 2009 at 3:18 PM, (Rebecca) Xuefei YUAN wrote: > Is Newton iteration for the outer iterations and multigrid for the linear > inner iterations? This is the usual formulation: Newton to solve F(u) = 0 and Multigrid to precondition F'(u) \delta u = -F(u) Matt > > Thanks > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Wed Jul 22 17:24:52 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Wed, 22 Jul 2009 18:24:52 -0400 Subject: Is DMMGSolve in Petsc a Newton-multigrid? In-Reply-To: References: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> Message-ID: <20090722182452.xcq0hhuyqocows8s@cubmail.cc.columbia.edu> Dear Matt, Does that mean mg is only used as a pc so far? Where could I check on which smoother it uses and options Petsc provided for dmmg? There is not much information on the user manual. Thanks, Rebecca Quoting Matthew Knepley : > On Wed, Jul 22, 2009 at 3:18 PM, (Rebecca) Xuefei YUAN > wrote: > >> Is Newton iteration for the outer iterations and multigrid for the linear >> inner iterations? > > > This is the usual formulation: > > Newton to solve F(u) = 0 > > and > > Multigrid to precondition F'(u) \delta u = -F(u) > > Matt > > >> >> Thanks >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Wed Jul 22 18:41:05 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 22 Jul 2009 18:41:05 -0500 Subject: Is DMMGSolve in Petsc a Newton-multigrid? In-Reply-To: <20090722182452.xcq0hhuyqocows8s@cubmail.cc.columbia.edu> References: <20090722161805.s1rsqdfoggwg8o4c@cubmail.cc.columbia.edu> <20090722182452.xcq0hhuyqocows8s@cubmail.cc.columbia.edu> Message-ID: <0DFE65B3-6AAD-4437-870C-D4EC6AF07D17@mcs.anl.gov> -snes_view will always show the solver options being used. -help will cause all the options that are currently available. Barry On Jul 22, 2009, at 5:24 PM, (Rebecca) Xuefei YUAN wrote: > Dear Matt, > > Does that mean mg is only used as a pc so far? Where could I check > on which smoother it uses and options Petsc provided for dmmg? There > is not much information on the user manual. 
> > Thanks, > > Rebecca > > > Quoting Matthew Knepley : > >> On Wed, Jul 22, 2009 at 3:18 PM, (Rebecca) Xuefei YUAN >> wrote: >> >>> Is Newton iteration for the outer iterations and multigrid for the >>> linear >>> inner iterations? >> >> >> This is the usual formulation: >> >> Newton to solve F(u) = 0 >> >> and >> >> Multigrid to precondition F'(u) \delta u = -F(u) >> >> Matt >> >> >>> >>> Thanks >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their >> experiments >> lead. >> -- Norbert Wiener >> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From Andreas.Grassl at student.uibk.ac.at Thu Jul 23 07:47:28 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Thu, 23 Jul 2009 14:47:28 +0200 Subject: strange behaviour with PetscViewerBinary on MATIS Message-ID: <4A685BE0.1030802@student.uibk.ac.at> Hello, I want to save my Matrix A to disk and process it then with ksp/ksp/ex10. Doing it for type AIJ is working fine. Using type IS, it seems to save only the local matrix from one processor to the disk and dump the others to stdout. PetscViewerBinaryOpen(commw,"matrix.bin",FILE_MODE_WRITE,&viewer1); MatView(A,viewer1); Is the only workaround to save the LocalToGlobalMapping and the local matrices separately and to read in all this information or do you see an easier way? Is there a canonical way to save and restore the LocalToGlobalMapping? Cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From jed at 59A2.org Thu Jul 23 08:43:51 2009 From: jed at 59A2.org (Jed Brown) Date: Thu, 23 Jul 2009 15:43:51 +0200 Subject: strange behaviour with PetscViewerBinary on MATIS In-Reply-To: <4A685BE0.1030802@student.uibk.ac.at> References: <4A685BE0.1030802@student.uibk.ac.at> Message-ID: <4A686917.9020504@59A2.org> Andreas Grassl wrote: > Hello, > > I want to save my Matrix A to disk and process it then with ksp/ksp/ex10. Doing > it for type AIJ is working fine. > > Using type IS, it seems to save only the local matrix from one processor to the > disk and dump the others to stdout. > > PetscViewerBinaryOpen(commw,"matrix.bin",FILE_MODE_WRITE,&viewer1); > MatView(A,viewer1); The viewer for MATIS is really simplistic, it doesn't ascribe any parallel structure at all. The technical explanation for the behavior you are seeing (which is bad) is the following. MatView_IS gets a "singleton" viewer which for a Binary viewer is just a binary viewer on PETSC_COMM_SELF for rank 0, with the NULL (0) viewer for all other ranks. It then calls MatView with this viewer which is a proper binary viewer for rank 0, but MatView creates a new viewer when called with viewer 0. > Is the only workaround to save the LocalToGlobalMapping and the local matrices > separately and to read in all this information or do you see an easier way? You can put this in MatView_IS if you really need it, but I doubt it will actually be useful. Unfortunately, you cannot change the domain decomposition with Neumann preconditioners, hence they will have limited use for solving a system with a saved matrix. 
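(For comparison, the plain AIJ round trip that does work is roughly the sketch below; the viewer and matrix names are illustrative, and MatLoad is shown with the petsc-3.0.x calling sequence, which is essentially what ksp/ksp/ex10 does when it reads the file back.)

  PetscViewer viewer;
  Mat         B;

  /* write: for AIJ the whole parallel matrix goes into matrix.bin */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.bin",FILE_MODE_WRITE,&viewer);CHKERRQ(ierr);
  ierr = MatView(A,viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);

  /* read back, possibly on a different number of processes */
  ierr = PetscViewerBinaryOpen(PETSC_COMM_WORLD,"matrix.bin",FILE_MODE_READ,&viewer);CHKERRQ(ierr);
  ierr = MatLoad(viewer,MATAIJ,&B);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(viewer);CHKERRQ(ierr);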
Why do you want to save the matrix, it's vastly slower and less useful than a function which assembles that matrix? Jed -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 260 bytes Desc: OpenPGP digital signature URL: From Andreas.Grassl at student.uibk.ac.at Thu Jul 23 10:08:26 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Thu, 23 Jul 2009 17:08:26 +0200 Subject: strange behaviour with PetscViewerBinary on MATIS In-Reply-To: <4A686917.9020504@59A2.org> References: <4A685BE0.1030802@student.uibk.ac.at> <4A686917.9020504@59A2.org> Message-ID: <4A687CEA.3040403@student.uibk.ac.at> Jed Brown schrieb: > You can put this in MatView_IS if you really need it, but I doubt it > will actually be useful. Unfortunately, you cannot change the domain > decomposition with Neumann preconditioners, hence they will have limited > use for solving a system with a saved matrix. Why do you want to save > the matrix, it's vastly slower and less useful than a function which > assembles that matrix? I assemble the Matrix by reading out from a data structure produced by a proprietary program and just used this easy approach to compare the solvers on different machines, where this program is not installed. Since the implementation of the NN-preconditioner is suboptimal at all, I will not waste much time on this issues and my post at the list was lead mostly by my curiosity. thanks for the explanation cheers, ando -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From Stephen.R.Ball at awe.co.uk Fri Jul 24 05:22:40 2009 From: Stephen.R.Ball at awe.co.uk (Stephen Ball) Date: Fri, 24 Jul 2009 11:22:40 +0100 Subject: Any examples of how to set Spooles LU and CHOLESKY direct solvers using the Fortran API? Message-ID: <97OBPJ025484@awe.co.uk> Hi I have recently moved from using PETSc v2.3.3 to v3.0.0 and am trying to update my Fortran code accordingly. Do you have any examples of how to set Spooles LU and CHOLESKY direct solvers using the Fortran API? I am struggling somewhat to understand the correct sequence of calls for your new API, including the matrix and PC creation and set up stages when using Spooles LU and CHOLESKY direct solvers. What calls are required or optional and in what circumstances? Kindest regards Stephen This e-mail and any attachments may contain confidential and privileged information. If you are not the intended recipient, please notify the sender immediately by return e-mail, delete this e-mail and destroy any copies. Any dissemination or use of this information by a person other than the intended recipient is unauthorized and may be illegal. From bsmith at mcs.anl.gov Fri Jul 24 08:44:34 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 24 Jul 2009 08:44:34 -0500 Subject: Any examples of how to set Spooles LU and CHOLESKY direct solvers using the Fortran API? In-Reply-To: <97OBPJ025484@awe.co.uk> References: <97OBPJ025484@awe.co.uk> Message-ID: <28FA3B3E-46A3-44D6-ADF4-E9BC16E25AFE@mcs.anl.gov> Stephen, 1) You no longer need to set particular Spooles matrix types. 
Just use AIJ or SBAIJ (for symmetric case) 2) call KSPGetPC(ksp,pc,ierr) Call PCSetType(pc,PCLU,ierr) call PCFactorSetMatSolverPackage(pc,MAT_SOLVER_SPOOLES,ierr) call KSPSolve() Barry On Jul 24, 2009, at 5:22 AM, Stephen Ball wrote: > Hi > > I have recently moved from using PETSc v2.3.3 to v3.0.0 and am > trying to > update my Fortran code accordingly. > > Do you have any examples of how to set Spooles LU and CHOLESKY direct > solvers using the Fortran API? > > I am struggling somewhat to understand the correct sequence of calls > for > your new API, including the matrix and PC creation and set up stages > when using Spooles LU and CHOLESKY direct solvers. > > What calls are required or optional and in what circumstances? > > Kindest regards > > Stephen > > This e-mail and any attachments may contain confidential and > privileged information. If you are not the intended recipient, > please notify the sender immediately by return e-mail, delete this > e-mail and destroy any copies. Any dissemination or use of this > information by a person other than the intended recipient is > unauthorized and may be illegal. From xy2102 at columbia.edu Sat Jul 25 13:20:33 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Sat, 25 Jul 2009 14:20:33 -0400 Subject: vecload_block_size Message-ID: <20090725142033.inixoni34cg0scg0@cubmail.cc.columbia.edu> Hi, I am reading from a binary file of the previous solution and would like it to be my initial guess. However, I found that the "-vecload_block_size" is shown in the PetscOptionsTable, I did not explicitly state this one in my options file, why it shows up in it? And the block size has been assigned as "4". Thanks! Rebecca Here is this PetscOptionsTable: (gdb) p *options $21 = {N = 28, argc = 3, Naliases = 0, args = 0xbf8e9564, names = { 0x8867768 "da_grid_x", 0x8867788 "da_grid_y", 0x88678c8 "dmmg_iscoloring_type", 0x88678a8 "dmmg_levels", 0x8867a20 "ksp_converged_reason", 0x8867a00 "ksp_max_it", 0x8847bb8 "loadbin", 0x8847bd8 "mx_grid", 0x8867748 "my", 0x88677d0 "number_of_time_steps", 0x8867a60 "pc_asm_overlap", 0x8867a40 "pc_type", 0x8867910 "snes_converged_reason", 0x8867998 "snes_ksp_ew", 0x8867940 "snes_max_fail", 0x88679d8 "snes_max_funcs", 0x88679b8 "snes_max_it", 0x8867968 "snes_max_linear_solve_fail", 0x8867930 "snes_mf", 0x88678f8 "snes_monitor", 0x8867aa8 "sub_pc_factor_shift_nonzero", 0x8867a88 "sub_pc_type", 0x8867850 "time_accuracy_order", 0x8867800 "time_step_monitor", 0x88677a8 "time_step_size", 0x8867818 "time_step_to_save_solution_text", 0x8867878 "time_to_generate_grid", 0x8855890 "vecload_block_size", 0x0 }, values = {0x8867778 "8", 0x8867798 "8", 0x88678e8 "global", 0x88678b8 "1", 0x0, 0x8867a10 "50", 0x8847bc8 "true", 0x8867738 "9", 0x8867758 "9", 0x88677f0 "1", 0x8867a78 "1", 0x8867a50 "asm", 0x0, 0x88679a8 "true", 0x8867958 "100", 0x88679f0 "1000000", 0x88679c8 "10", 0x8867988 "100", 0x0, 0x0, 0x0, 0x8867a98 "ilu", 0x8867868 "1", 0x0, 0x88677c0 "0.2", 0x8867840 "1", 0x8867898 "0.2", 0x8847ba8 "4", 0x0 }, aliases1 = { 0x0 }, aliases2 = {0x0 }, used = { PETSC_FALSE, PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_FALSE, PETSC_TRUE, PETSC_FALSE, PETSC_FALSE, PETSC_FALSE, ---Type to continue, or q to quit--- PETSC_FALSE, PETSC_FALSE, PETSC_FALSE, PETSC_TRUE, PETSC_FALSE }, namegiven = PETSC_TRUE, programname = 
"/home/rebecca/linux/code/couple/twoway/twoway_oreggt/codes/tworeggt", '\0' , monitor = {0, 0, 0, 0, 0}, monitordestroy = { 0, 0, 0, 0, 0}, monitorcontext = {0x0, 0x0, 0x0, 0x0, 0x0}, numbermonitors = 0} -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Sat Jul 25 13:50:02 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 25 Jul 2009 13:50:02 -0500 Subject: vecload_block_size In-Reply-To: <20090725142033.inixoni34cg0scg0@cubmail.cc.columbia.edu> References: <20090725142033.inixoni34cg0scg0@cubmail.cc.columbia.edu> Message-ID: <226A9ED7-E5A5-4323-951A-957CAF886C40@mcs.anl.gov> Some binary files have another file with the same name and .info on the end. That file contains additional options such as - vecload_block_size. If you don't want that option you can simply remove the file. On Jul 25, 2009, at 1:20 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > I am reading from a binary file of the previous solution and would > like it to be my initial guess. > > However, I found that the "-vecload_block_size" is shown in the > PetscOptionsTable, I did not explicitly state this one in my options > file, why it shows up in it? And the block size has been assigned as > "4". > > Thanks! > > Rebecca > > > Here is this PetscOptionsTable: > > (gdb) p *options > $21 = {N = 28, argc = 3, Naliases = 0, args = 0xbf8e9564, names = { > 0x8867768 "da_grid_x", 0x8867788 "da_grid_y", > 0x88678c8 "dmmg_iscoloring_type", 0x88678a8 "dmmg_levels", > 0x8867a20 "ksp_converged_reason", 0x8867a00 "ksp_max_it", > 0x8847bb8 "loadbin", 0x8847bd8 "mx_grid", 0x8867748 "my", > 0x88677d0 "number_of_time_steps", 0x8867a60 "pc_asm_overlap", > 0x8867a40 "pc_type", 0x8867910 "snes_converged_reason", > 0x8867998 "snes_ksp_ew", 0x8867940 "snes_max_fail", > 0x88679d8 "snes_max_funcs", 0x88679b8 "snes_max_it", > 0x8867968 "snes_max_linear_solve_fail", 0x8867930 "snes_mf", > 0x88678f8 "snes_monitor", 0x8867aa8 "sub_pc_factor_shift_nonzero", > 0x8867a88 "sub_pc_type", 0x8867850 "time_accuracy_order", > 0x8867800 "time_step_monitor", 0x88677a8 "time_step_size", > 0x8867818 "time_step_to_save_solution_text", > 0x8867878 "time_to_generate_grid", 0x8855890 "vecload_block_size", > 0x0 }, values = {0x8867778 "8", 0x8867798 "8", > 0x88678e8 "global", 0x88678b8 "1", 0x0, 0x8867a10 "50", 0x8847bc8 > "true", > 0x8867738 "9", 0x8867758 "9", 0x88677f0 "1", 0x8867a78 "1", > 0x8867a50 "asm", 0x0, 0x88679a8 "true", 0x8867958 "100", > 0x88679f0 "1000000", 0x88679c8 "10", 0x8867988 "100", 0x0, 0x0, > 0x0, > 0x8867a98 "ilu", 0x8867868 "1", 0x0, 0x88677c0 "0.2", 0x8867840 > "1", > 0x8867898 "0.2", 0x8847ba8 "4", 0x0 }, > aliases1 = { > 0x0 }, aliases2 = {0x0 }, > used = { > PETSC_FALSE, PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, > PETSC_TRUE, > PETSC_FALSE, PETSC_TRUE, PETSC_TRUE, PETSC_FALSE, PETSC_TRUE, > PETSC_TRUE, > PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, PETSC_TRUE, > PETSC_TRUE, > PETSC_FALSE, PETSC_TRUE, PETSC_FALSE, PETSC_FALSE, PETSC_FALSE, > ---Type to continue, or q to quit--- > PETSC_FALSE, PETSC_FALSE, PETSC_FALSE, PETSC_TRUE, > PETSC_FALSE }, namegiven = PETSC_TRUE, > programname = "/home/rebecca/linux/code/couple/twoway/twoway_oreggt/ > codes/tworeggt", '\0' , monitor = {0, 0, 0, 0, > 0}, monitordestroy = { > 0, 0, 0, 0, 0}, monitorcontext = {0x0, 0x0, 0x0, 0x0, 0x0}, > numbermonitors = 0} > > > > > > > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia 
University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From xy2102 at columbia.edu Sat Jul 25 16:06:20 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Sat, 25 Jul 2009 17:06:20 -0400 Subject: Any example of assembling a Jacobian matrix of DMComposite object? Message-ID: <20090725170620.ytfi94xgu888gwck@cubmail.cc.columbia.edu> Hi, I have an optimization problem in 2d with some scalar parameter, thus DMComposite is used to manage the date structure. If I am going to write down the Jacobian matrix for the system, I come up with (mx*my+1,mx*my+1) matrix. Is there any example in PETSc of assembling the Jacobian for optimization problem? Thanks very much! -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Sat Jul 25 18:11:01 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 25 Jul 2009 18:11:01 -0500 Subject: Any example of assembling a Jacobian matrix of DMComposite object? In-Reply-To: <20090725170620.ytfi94xgu888gwck@cubmail.cc.columbia.edu> References: <20090725170620.ytfi94xgu888gwck@cubmail.cc.columbia.edu> Message-ID: <453232BD-FE37-4AAB-B9A6-2F1739E7B40C@mcs.anl.gov> On Jul 25, 2009, at 4:06 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > I have an optimization problem in 2d with some scalar parameter, > thus DMComposite is used to manage the date structure. If I am going > to write down the Jacobian matrix for the system, I come up with > (mx*my+1,mx*my+1) matrix. Is there any example in PETSc of > assembling the Jacobian for optimization problem? No > > Thanks very much! > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From xy2102 at columbia.edu Sun Jul 26 18:51:36 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Sun, 26 Jul 2009 19:51:36 -0400 Subject: possible bug in DMCompositeGetMatrix(). Message-ID: <20090726195136.f07iqopggk0skkg0@cubmail.cc.columbia.edu> Hi, I am working on an optimization problem, in which I would like to assemble a Jacobian matrix. Thus DMMGSetSNES(dmmg,FormFunction,FormJacobian) is called. In damgsnes.c:637, in calling DMGetMatrix(), it calls DMCompositeGetMatrix() where the temp matrix Atmp has been freed before it passes any information to J at pack.c:1722 and 1774. So after calling DMGetMatrix() in DMMGSetSNES, the stencil of the dmmg[i]->B has unchanged, i.e., (gdb) p dmmg[0]->B->stencil $107 = {dim = 0, dims = {0, 0, 0, 0}, starts = {0, 0, 0, 0}, noc = PETSC_FALSE} (gdb) where #0 DMMGSetSNES (dmmg=0x8856208, function=0x804c84f , jacobian=0x8052932 ) at damgsnes.c:641 #1 0x0804c246 in main (argc=Cannot access memory at address 0x0 ) at tworeggt.c:126 I compare this with http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/snes/examples/tutorials/ex18.c.html and it shows that the stencil has been carried out and passed to dmmg[0]->B as follows: (gdb) p dmmg[i]->B->stencil $80 = {dim = 2, dims = {5, 5, 1, 0}, starts = {0, 0, 0, 0}, noc = PETSC_TRUE} (gdb) where #0 DMMGSetSNES (dmmg=0x884b530, function=0x804c364 , jacobian=0x804d34d ) at damgsnes.c:642 #1 0x0804b969 in main (argc=Cannot access memory at address 0x2 ) at ex18.c:100 Because of this missing stencil of Jacobian matrix, I get the error code as follows: Program received signal SIGSEGV, Segmentation fault. 
0x082447c2 in ISLocalToGlobalMappingApply (mapping=0x0, N=1, in=0xbff8f250, out=0xbff8ce14) at /home/rebecca/soft/petsc-3.0.0-p1/include/petscis.h:129 129 PetscInt i,*idx = mapping->indices,Nmax = mapping->n; (gdb) where #0 0x082447c2 in ISLocalToGlobalMappingApply (mapping=0x0, N=1, in=0xbff8f250, out=0xbff8ce14) at /home/rebecca/soft/petsc-3.0.0-p1/include/petscis.h:129 #1 0x0824440c in MatSetValuesLocal (mat=0x88825e8, nrow=1, irow=0xbff8f250, ncol=4, icol=0xbff8ee50, y=0xbff8f628, addv=INSERT_VALUES) at matrix.c:1583 #2 0x08240aae in MatSetValuesStencil (mat=0x88825e8, m=1, idxm=0xbff8f6b8, n=4, idxn=0xbff8f4b4, v=0xbff8f628, addv=INSERT_VALUES) at matrix.c:1099 #3 0x08053835 in FormJacobian (snes=0x8874700, X=0x8856778, J=0x88747d0, B=0x88747d4, flg=0xbff8f8d4, ptr=0x8856338) at tworeggt.c:937 #4 0x0805a5cf in DMMGComputeJacobian_Multigrid (snes=0x8874700, X=0x8856778, J=0x88747d0, B=0x88747d4, flag=0xbff8f8d4, ptr=0x8856208) at damgsnes.c:60 #5 0x0806b18a in SNESComputeJacobian (snes=0x8874700, X=0x8856778, A=0x88747d0, B=0x88747d4, flg=0xbff8f8d4) at snes.c:1111 #6 0x08084945 in SNESSolve_LS (snes=0x8874700) at ls.c:189 #7 0x08073198 in SNESSolve (snes=0x8874700, b=0x0, x=0x8856778) at snes.c:2221 #8 0x0805d5f9 in DMMGSolveSNES (dmmg=0x8856208, level=0) at damgsnes.c:510 #9 0x08056e38 in DMMGSolve (dmmg=0x8856208) at damg.c:372 #10 0x0804c3fe in main (argc=128, argv=0xbff90c04) at tworeggt.c:131 I think there might be a bug in DMCompositeGetMatrix(). Thanks very much! Cheers, -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From tim.kroeger at cevis.uni-bremen.de Mon Jul 27 04:35:10 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Mon, 27 Jul 2009 11:35:10 +0200 (CEST) Subject: Solver problem Message-ID: Dear all, In my application, there is a linear system to be solved in every time step. Steps 0 and 1 work well, but in step 2 PETSc fails to converge. I suspected that the system might be unsolvable in that step and checked that by writing matrix and the right hand side to files and loading them into "octave". Surprisingly, "octave" does find a solution to the system without any problems. The problem occurs even on a single core. I am using PETSc version 2.3.3-p11 with the GMRES solver and ILU preconditioner. Can anybody give me a hint which settings would PETSc reliably enable solving systems of the type that I face? I have put matrix and right hand side on my homepage; they can be downloaded from www.mevis.de/~tim/m-and-v.tar.gz (7MB). In octave, I used the following commands to find and check the solution: octave:1> matrix2 octave:2> vector2 octave:3> x=Mat_0\Vec_1; octave:4> res=Mat_0*x-Vec_1; octave:5> norm(res) ans = 1.0032e-12 octave:6> norm(Vec_1) ans = 27.976 octave:7> norm(Mat_0,"fro") ans = 2.5917e+22 octave:8> norm(x) ans = 3855.3 Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From knepley at gmail.com Mon Jul 27 07:47:30 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 07:47:30 -0500 Subject: Solver problem In-Reply-To: References: Message-ID: On Mon, Jul 27, 2009 at 4:35 AM, Tim Kroeger < tim.kroeger at cevis.uni-bremen.de> wrote: > Dear all, > > In my application, there is a linear system to be solved in every time > step. 
Steps 0 and 1 work well, but in step 2 PETSc fails to converge. I > suspected that the system might be unsolvable in that step and checked that > by writing matrix and the right hand side to files and loading them into > "octave". Surprisingly, "octave" does find a solution to the system without > any problems. > > The problem occurs even on a single core. I am using PETSc version > 2.3.3-p11 with the GMRES solver and ILU preconditioner. > > Can anybody give me a hint which settings would PETSc reliably enable > solving systems of the type that I face? If we could, we would already have retired. There are simply no iterative solvers that work for all systems. The best preconditioners are usually tailored to the particular equations being solved. I would suggest a search of the literature for PCs for your equations. Thanks, Matt > > I have put matrix and right hand side on my homepage; they can be > downloaded from www.mevis.de/~tim/m-and-v.tar.gz(7MB). In octave, I used the following commands to find and check the > solution: > > octave:1> matrix2 > octave:2> vector2 > octave:3> x=Mat_0\Vec_1; > octave:4> res=Mat_0*x-Vec_1; > octave:5> norm(res) > ans = 1.0032e-12 > octave:6> norm(Vec_1) > ans = 27.976 > octave:7> norm(Mat_0,"fro") > ans = 2.5917e+22 > octave:8> norm(x) > ans = 3855.3 > > > Best Regards, > > Tim > > -- > Dr. Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jul 27 09:30:07 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 09:30:07 -0500 Subject: Solver problem In-Reply-To: References: Message-ID: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> On Jul 27, 2009, at 4:35 AM, Tim Kroeger wrote: > Dear all, > > In my application, there is a linear system to be solved in every > time step. Steps 0 and 1 work well, but in step 2 PETSc fails to > converge. I suspected that the system might be unsolvable in that > step and checked that by writing matrix and the right hand side to > files and loading them into "octave". Surprisingly, "octave" does > find a solution to the system without any problems. Octave is using a direct solver. Did you try PETSc's direct solver using -pc_type lu? Barry > > The problem occurs even on a single core. I am using PETSc version > 2.3.3-p11 with the GMRES solver and ILU preconditioner. > > Can anybody give me a hint which settings would PETSc reliably > enable solving systems of the type that I face? > > I have put matrix and right hand side on my homepage; they can be > downloaded from www.mevis.de/~tim/m-and-v.tar.gz (7MB). In octave, > I used the following commands to find and check the solution: > > octave:1> matrix2 > octave:2> vector2 > octave:3> x=Mat_0\Vec_1; > octave:4> res=Mat_0*x-Vec_1; > octave:5> norm(res) > ans = 1.0032e-12 > octave:6> norm(Vec_1) > ans = 27.976 > octave:7> norm(Mat_0,"fro") > ans = 2.5917e+22 > octave:8> norm(x) > ans = 3855.3 > > > Best Regards, > > Tim > > -- > Dr. 
Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > From fernandez858 at gmail.com Mon Jul 27 09:43:17 2009 From: fernandez858 at gmail.com (Michel Cancelliere) Date: Mon, 27 Jul 2009 16:43:17 +0200 Subject: Solver problem In-Reply-To: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> Message-ID: <7f18de3b0907270743i7fe33021tf3b88c2e4f0d80be@mail.gmail.com> Do you mean steps (iterations) 0 and 1 for SNES or KSP? If the iterations are for SNES probably you have problems with you nonlinear solver for which Octave can find a solution to the linear system but the actual problem is not in there. Do you use -snes_converged_reason? Are you sure that Matrix and right hand side routines are working well? Michel On Mon, Jul 27, 2009 at 4:30 PM, Barry Smith wrote: > > On Jul 27, 2009, at 4:35 AM, Tim Kroeger wrote: > > Dear all, >> >> In my application, there is a linear system to be solved in every time >> step. Steps 0 and 1 work well, but in step 2 PETSc fails to converge. I >> suspected that the system might be unsolvable in that step and checked that >> by writing matrix and the right hand side to files and loading them into >> "octave". Surprisingly, "octave" does find a solution to the system without >> any problems. >> > > Octave is using a direct solver. Did you try PETSc's direct solver using > -pc_type lu? > > Barry > > > >> The problem occurs even on a single core. I am using PETSc version >> 2.3.3-p11 with the GMRES solver and ILU preconditioner. >> >> Can anybody give me a hint which settings would PETSc reliably enable >> solving systems of the type that I face? >> >> I have put matrix and right hand side on my homepage; they can be >> downloaded from www.mevis.de/~tim/m-and-v.tar.gz (7MB). In octave, I >> used the following commands to find and check the solution: >> >> octave:1> matrix2 >> octave:2> vector2 >> octave:3> x=Mat_0\Vec_1; >> octave:4> res=Mat_0*x-Vec_1; >> octave:5> norm(res) >> ans = 1.0032e-12 >> octave:6> norm(Vec_1) >> ans = 27.976 >> octave:7> norm(Mat_0,"fro") >> ans = 2.5917e+22 >> octave:8> norm(x) >> ans = 3855.3 >> >> >> Best Regards, >> >> Tim >> >> -- >> Dr. Tim Kroeger >> tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 >> tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 >> >> Fraunhofer MEVIS, Institute for Medical Image Computing >> Universitaetsallee 29, 28359 Bremen, Germany >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Harun.BAYRAKTAR at 3ds.com Mon Jul 27 12:46:36 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Mon, 27 Jul 2009 13:46:36 -0400 Subject: Avoiding assembly with MatDiagonalSet call for zeroed matrix to preserve preallocated space Message-ID: Hello, I have a rather simple question. For a Matrix that was preallocated with the correct diagonal and offdiagonal nonzero counts the following operations cause a deallocation of all data except the diagonal which casues later MatSetValues to have to reallocate. MatCreateSeqAIJ MatZeroEntries MatDiagaonalSet Using -info and the debugger I see that MatDiagonalSet ends up calling MatDiagonalSet_Default which forces an assembly. Is there a way to do the same thing and preserve the preallocated storage for future use by MatSetValues? 
I tried MatSetOption with MAT_KEEP_ZEROED_ROWS and MAT_IGNORE_ZERO_ENTRIED but it did not help. Thanks a lot, Harun -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 27 14:38:08 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 14:38:08 -0500 Subject: Avoiding assembly with MatDiagonalSet call for zeroed matrix to preserve preallocated space In-Reply-To: References: Message-ID: On Mon, Jul 27, 2009 at 12:46 PM, BAYRAKTAR Harun wrote: > Hello, > > > > I have a rather simple question. For a Matrix that was preallocated with > the correct diagonal and offdiagonal nonzero counts the following operations > cause a deallocation of all data except the diagonal which casues later > MatSetValues to have to reallocate. > > > > MatCreateSeqAIJ > > MatZeroEntries > > MatDiagaonalSet > > > > Using -info and the debugger I see that MatDiagonalSet ends up calling > MatDiagonalSet_Default which forces an assembly. Is there a way to do the > same thing and preserve the preallocated storage for future use by > MatSetValues? I tried MatSetOption with MAT_KEEP_ZEROED_ROWS and > MAT_IGNORE_ZERO_ENTRIED but it did not help. > Is it possible to set the diagonal after the rest of the entires? Matt > Thanks a lot, > > Harun > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Harun.BAYRAKTAR at 3ds.com Mon Jul 27 15:09:05 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Mon, 27 Jul 2009 16:09:05 -0400 Subject: Avoiding assembly with MatDiagonalSet call for zeroed matrix to preserve preallocated space In-Reply-To: References: Message-ID: Matt, Your suggestion was what we tried as a workaround before I wrote the message and it fixes the problem completely. I just wanted to know if there was a less restrictive way that allows the diagonal set before the entries. Sounds like there isn't. Thanks, Harun From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Monday, July 27, 2009 3:38 PM To: PETSc users list Subject: Re: Avoiding assembly with MatDiagonalSet call for zeroed matrix to preserve preallocated space On Mon, Jul 27, 2009 at 12:46 PM, BAYRAKTAR Harun wrote: Hello, I have a rather simple question. For a Matrix that was preallocated with the correct diagonal and offdiagonal nonzero counts the following operations cause a deallocation of all data except the diagonal which casues later MatSetValues to have to reallocate. MatCreateSeqAIJ MatZeroEntries MatDiagaonalSet Using -info and the debugger I see that MatDiagonalSet ends up calling MatDiagonalSet_Default which forces an assembly. Is there a way to do the same thing and preserve the preallocated storage for future use by MatSetValues? I tried MatSetOption with MAT_KEEP_ZEROED_ROWS and MAT_IGNORE_ZERO_ENTRIED but it did not help. Is it possible to set the diagonal after the rest of the entires? Matt Thanks a lot, Harun -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
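A minimal sketch of the workaround the thread above settles on -- keeping the preallocated AIJ structure by inserting the diagonal through MatSetValues() together with (or after) the off-diagonal entries, instead of calling MatDiagonalSet() on the freshly zeroed matrix. The size, the tridiagonal fill, and the numerical values below are illustrative only, and the fragment assumes a program that has included petscmat.h and already called PetscInitialize():

    Mat            A;
    PetscInt       n = 10, i, ncols, cols[3];
    PetscScalar    vals[3];
    PetscErrorCode ierr;

    /* preallocate 3 nonzeros per row up front */
    ierr = MatCreateSeqAIJ(PETSC_COMM_SELF, n, n, 3, PETSC_NULL, &A);CHKERRQ(ierr);
    for (i = 0; i < n; i++) {
      ncols = 0;
      if (i > 0)   { cols[ncols] = i-1; vals[ncols] = -1.0; ncols++; }
      if (i < n-1) { cols[ncols] = i+1; vals[ncols] = -1.0; ncols++; }
      /* the diagonal goes in through the same MatSetValues() pass, so there is
         no separate MatDiagonalSet() and no forced assembly of the empty matrix */
      cols[ncols] = i; vals[ncols] = 2.0; ncols++;
      ierr = MatSetValues(A, 1, &i, ncols, cols, vals, INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatDestroy(A);CHKERRQ(ierr);

Inserting only into locations that were preallocated does not trigger the reallocation that -info was reporting; it is new nonzero locations that force AIJ to allocate again.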
URL: From xy2102 at columbia.edu Mon Jul 27 15:51:47 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 16:51:47 -0400 Subject: memory check of /snes/example/tutorials/ex29.c Message-ID: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> Hi, My own code has some left bytes still reachable according to valgrind, then I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and make the files, it gives me different number of bytes left still reachable. Moreover, I picked up the /snes/example/tutorials/ex29.c as another example, and found that some bytes are still reachable, what is the cause of it? It shows that it is from DACreate2D() and the I use -malloc_dump to get those unfreed informations. I understand that for those 5 loss record, the 2nd, 3rd and 4th are true for all examples, but where do 1st and 5th ones come from? Also, the -malloc_dump information shows that there are "[0]Total space allocated 37780 bytes", but valgrind gives the information as "==26628== still reachable: 132,828 bytes in 323 blocks" Why there is a big difference? Thanks very much! Rebecca Here is the message from valgrind of running ex29: ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) ==26628== by 0x804B796: main (ex29.c:139) ==26628== ==26628== ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 2 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x4732FDB: ??? ==26628== by 0x473413C: ??? ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) ==26628== by 0x804B796: main (ex29.c:139) ==26628== ==26628== ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x4732FFB: ??? ==26628== by 0x473413C: ??? ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) ==26628== by 0x804B796: main (ex29.c:139) ==26628== ==26628== ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x4732FFB: ??? ==26628== by 0x473413C: ??? 
==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) ==26628== by 0x804B796: main (ex29.c:139) ==26628== ==26628== ==26628== 132,796 bytes in 321 blocks are still reachable in loss record 5 of 5 ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) ==26628== by 0x804BAFB: main (ex29.c:153) ==26628== ==26628== LEAK SUMMARY: ==26628== definitely lost: 36 bytes in 1 blocks. ==26628== indirectly lost: 120 bytes in 10 blocks. ==26628== possibly lost: 0 bytes in 0 blocks. ==26628== still reachable: 132,828 bytes in 323 blocks. ==26628== suppressed: 0 bytes in 0 blocks. -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Mon Jul 27 16:22:10 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 16:22:10 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> Message-ID: On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > My own code has some left bytes still reachable according to valgrind, then > I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and > make the files, it gives me different number of bytes left still reachable. > Moreover, I picked up the /snes/example/tutorials/ex29.c as another example, > and found that some bytes are still reachable, what is the cause of it? It > shows that it is from DACreate2D() and the I use -malloc_dump to get those > unfreed informations. > > I understand that for those 5 loss record, the 2nd, 3rd and 4th are true > for all examples, but where do 1st and 5th ones come from? Also, the > -malloc_dump information shows that there are > "[0]Total space allocated 37780 bytes", > but valgrind gives the information as > "==26628== still reachable: 132,828 bytes in 323 blocks" > > Why there is a big difference? 1 is fine. It is from PMPI setup, which has some bytes not freed from setting up the MPI processes. The last one looks like an unfreed header for a DA, which is strange. Matt > > Thanks very much! > > Rebecca > > Here is the message from valgrind of running ex29: > ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) > ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) > ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) > ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) > ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) > ==26628== by 0x804B796: main (ex29.c:139) > ==26628== > ==26628== > ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely > lost in loss record 2 of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ > libc-2.7.so) > ==26628== by 0x4732FDB: ??? > ==26628== by 0x473413C: ??? 
> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) > ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) > ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) > ==26628== by 0x804B796: main (ex29.c:139) > ==26628== > ==26628== > ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ > libc-2.7.so) > ==26628== by 0x4732FFB: ??? > ==26628== by 0x473413C: ??? > ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) > ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) > ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) > ==26628== by 0x804B796: main (ex29.c:139) > ==26628== > ==26628== > ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ > libc-2.7.so) > ==26628== by 0x4732FFB: ??? > ==26628== by 0x473413C: ??? > ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) > ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) > ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) > ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) > ==26628== by 0x804B796: main (ex29.c:139) > ==26628== > ==26628== > ==26628== 132,796 bytes in 321 blocks are still reachable in loss record 5 > of 5 > ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) > ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) > ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) > ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) > ==26628== by 0x804BAFB: main (ex29.c:153) > ==26628== > ==26628== LEAK SUMMARY: > ==26628== definitely lost: 36 bytes in 1 blocks. > ==26628== indirectly lost: 120 bytes in 10 blocks. > ==26628== possibly lost: 0 bytes in 0 blocks. > ==26628== still reachable: 132,828 bytes in 323 blocks. > ==26628== suppressed: 0 bytes in 0 blocks. > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Mon Jul 27 16:34:39 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 17:34:39 -0400 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> Message-ID: <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Those unfreed bytes cause "out of memory" when it runs at bigger grid sizes. So I have to find out those unfreed memory and free them... Any suggestions? 
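A general note on the question above: -malloc_dump only lists memory obtained through PetscMalloc() that is still allocated when PetscFinalize() runs, so blocks allocated directly by MPI or libc (such as the MPID_VCRT_Create record) never appear there even though valgrind counts them. Before chasing individual valgrind records it is worth making sure everything the DMMG layer created is destroyed before PetscFinalize(); a minimal teardown sketch, using the dmmg variable name from the examples quoted in this thread and otherwise purely illustrative:

    ierr = DMMGDestroy(dmmg);CHKERRQ(ierr);  /* frees the DA, matrices and vectors DMMG created */
    ierr = PetscFinalize();                  /* with -malloc_dump, anything still allocated is listed here */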
Thanks, R Quoting Matthew Knepley : > On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN > wrote: > >> Hi, >> >> My own code has some left bytes still reachable according to valgrind, then >> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >> make the files, it gives me different number of bytes left still reachable. >> Moreover, I picked up the /snes/example/tutorials/ex29.c as another example, >> and found that some bytes are still reachable, what is the cause of it? It >> shows that it is from DACreate2D() and the I use -malloc_dump to get those >> unfreed informations. >> >> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >> for all examples, but where do 1st and 5th ones come from? Also, the >> -malloc_dump information shows that there are >> "[0]Total space allocated 37780 bytes", >> but valgrind gives the information as >> "==26628== still reachable: 132,828 bytes in 323 blocks" >> >> Why there is a big difference? > > > 1 is fine. It is from PMPI setup, which has some bytes not freed from > setting up the MPI > processes. The last one looks like an unfreed header for a DA, which is > strange. > > Matt > > >> >> Thanks very much! >> >> Rebecca >> >> Here is the message from valgrind of running ex29: >> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >> ==26628== by 0x804B796: main (ex29.c:139) >> ==26628== >> ==26628== >> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >> lost in loss record 2 of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >> libc-2.7.so) >> ==26628== by 0x4732FDB: ??? >> ==26628== by 0x473413C: ??? >> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >> ==26628== by 0x804B796: main (ex29.c:139) >> ==26628== >> ==26628== >> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >> libc-2.7.so) >> ==26628== by 0x4732FFB: ??? >> ==26628== by 0x473413C: ??? 
>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >> ==26628== by 0x804B796: main (ex29.c:139) >> ==26628== >> ==26628== >> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >> libc-2.7.so) >> ==26628== by 0x4732FFB: ??? >> ==26628== by 0x473413C: ??? >> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >> ==26628== by 0x804B796: main (ex29.c:139) >> ==26628== >> ==26628== >> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record 5 >> of 5 >> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >> ==26628== by 0x804BAFB: main (ex29.c:153) >> ==26628== >> ==26628== LEAK SUMMARY: >> ==26628== definitely lost: 36 bytes in 1 blocks. >> ==26628== indirectly lost: 120 bytes in 10 blocks. >> ==26628== possibly lost: 0 bytes in 0 blocks. >> ==26628== still reachable: 132,828 bytes in 323 blocks. >> ==26628== suppressed: 0 bytes in 0 blocks. >> >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> >> > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Mon Jul 27 16:42:10 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 16:42:10 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Message-ID: On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN wrote: > Those unfreed bytes cause "out of memory" when it runs at bigger grid > sizes. So I have to find out those unfreed memory and free them... Any > suggestions? Not from what you mailed in. On that DA line, I see PetscHeaderCreate(). Is that what you see? Matt > > Thanks, > > R > > > Quoting Matthew Knepley : > > On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >> wrote: >> >> Hi, >>> >>> My own code has some left bytes still reachable according to valgrind, >>> then >>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >>> make the files, it gives me different number of bytes left still >>> reachable. 
>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as another >>> example, >>> and found that some bytes are still reachable, what is the cause of it? >>> It >>> shows that it is from DACreate2D() and the I use -malloc_dump to get >>> those >>> unfreed informations. >>> >>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >>> for all examples, but where do 1st and 5th ones come from? Also, the >>> -malloc_dump information shows that there are >>> "[0]Total space allocated 37780 bytes", >>> but valgrind gives the information as >>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>> >>> Why there is a big difference? >>> >> >> >> 1 is fine. It is from PMPI setup, which has some bytes not freed from >> setting up the MPI >> processes. The last one looks like an unfreed header for a DA, which is >> strange. >> >> Matt >> >> >> >>> Thanks very much! >>> >>> Rebecca >>> >>> Here is the message from valgrind of running ex29: >>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >>> lost in loss record 2 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FDB: ??? >>> ==26628== by 0x473413C: ??? >>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>> ) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FFB: ??? >>> ==26628== by 0x473413C: ??? >>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>> ) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FFB: ??? >>> ==26628== by 0x473413C: ??? 
>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>> ) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record >>> 5 >>> of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>> ==26628== by 0x804BAFB: main (ex29.c:153) >>> ==26628== >>> ==26628== LEAK SUMMARY: >>> ==26628== definitely lost: 36 bytes in 1 blocks. >>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>> ==26628== possibly lost: 0 bytes in 0 blocks. >>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>> ==26628== suppressed: 0 bytes in 0 blocks. >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 < >>> http://www.columbia.edu/%7Exy2102> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> >> > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Jul 27 16:53:20 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 16:53:20 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Message-ID: This was due to a small memory leak in DMMGSetNullSpace() of the vectors creating internally to hold the null space. I have pushed a fix to petsc-3.0.0 and petsc-dev It will be fixed in the next 3.0.0 patch. Note this would not cause the "out of memory" for runs at bigger grid sizes. That is likely just coming from trying to run too large a problem for your memory size. Thanks for reporting the memory leak, Barry On Jul 27, 2009, at 4:34 PM, (Rebecca) Xuefei YUAN wrote: > Those unfreed bytes cause "out of memory" when it runs at bigger > grid sizes. So I have to find out those unfreed memory and free > them... Any suggestions? > > Thanks, > > R > > Quoting Matthew Knepley : > >> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >> wrote: >> >>> Hi, >>> >>> My own code has some left bytes still reachable according to >>> valgrind, then >>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to >>> compile and >>> make the files, it gives me different number of bytes left still >>> reachable. >>> Moreover, I picked up the /snes/example/tutorials/ex29.c as >>> another example, >>> and found that some bytes are still reachable, what is the cause >>> of it? 
It >>> shows that it is from DACreate2D() and the I use -malloc_dump to >>> get those >>> unfreed informations. >>> >>> I understand that for those 5 loss record, the 2nd, 3rd and 4th >>> are true >>> for all examples, but where do 1st and 5th ones come from? Also, the >>> -malloc_dump information shows that there are >>> "[0]Total space allocated 37780 bytes", >>> but valgrind gives the information as >>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>> >>> Why there is a big difference? >> >> >> 1 is fine. It is from PMPI setup, which has some bytes not freed from >> setting up the MPI >> processes. The last one looks like an unfreed header for a DA, >> which is >> strange. >> >> Matt >> >> >>> >>> Thanks very much! >>> >>> Rebecca >>> >>> Here is the message from valgrind of running ex29: >>> ==26628== 32 bytes in 2 blocks are still reachable in loss record >>> 1 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are >>> definitely >>> lost in loss record 2 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/ >>> cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FDB: ??? >>> ==26628== by 0x473413C: ??? >>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c: >>> 68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record >>> 3 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/ >>> cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FFB: ??? >>> ==26628== by 0x473413C: ??? >>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c: >>> 68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record >>> 4 of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/ >>> cmov/ >>> libc-2.7.so) >>> ==26628== by 0x4732FFB: ??? >>> ==26628== by 0x473413C: ??? 
>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>> libc-2.7.so) >>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c: >>> 68) >>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>> ==26628== by 0x804B796: main (ex29.c:139) >>> ==26628== >>> ==26628== >>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss >>> record 5 >>> of 5 >>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>> ==26628== by 0x804BAFB: main (ex29.c:153) >>> ==26628== >>> ==26628== LEAK SUMMARY: >>> ==26628== definitely lost: 36 bytes in 1 blocks. >>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>> ==26628== possibly lost: 0 bytes in 0 blocks. >>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>> ==26628== suppressed: 0 bytes in 0 blocks. >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their >> experiments >> lead. >> -- Norbert Wiener >> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From xy2102 at columbia.edu Mon Jul 27 17:04:56 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 18:04:56 -0400 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Message-ID: <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> Dear Matt, I ran the code on a 321*321 grid, with dof=4. The matrix is a sparse matrix with type aij. After set up user defined options calls, the memory status is Mem: 2033752k total, 607456k used, 1426296k free, 4832k buffers ierr = DMMGCreate(comm, parameters.numberOfLevels, &appCtx, &dmmg);CHKERRQ(ierr); ierr = DACreate2d(comm,DA_NONPERIODIC,DA_STENCIL_BOX, -5, -5, PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); ierr = DASetFieldName(DMMGGetDA(dmmg), 0, "phi");CHKERRQ(ierr); ierr = DASetFieldName(DMMGGetDA(dmmg), 1, "vz");CHKERRQ(ierr); ierr = DASetFieldName(DMMGGetDA(dmmg), 2, "psi");CHKERRQ(ierr); ierr = DASetFieldName(DMMGGetDA(dmmg), 3, "bz");CHKERRQ(ierr); before DAGetMatrix() called, the memory status is Mem: 2033752k total, 642940k used, 1390812k free, 4972k buffers ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); In gdb, it uses around 500M memory after DAGetMatrix(), which I do not think it is right, since for a sparse matrix with 13 nonzeros per row, the memory it needs should be 321*321*4(dof)*13(nonzeros per row)*8(PetscReal) = 42865056 bytes ~ 40M. It is strange. i.e., after DAGetMatrix() call, the memory status is Mem: 2033752k total, 1152032k used, 881720k free, 5072k buffers Then when it goes the call of DMMGSetSNESLocal(), I found my memory is using till the message of corruption has appeared. 
ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal,0,0);CHKERRQ(ierr); The memory corruption happens. The error message is: 0 SNES Function norm 4.925849247379e-03 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Out of memory. This could be due to allocating [0]PETSC ERROR: too large an object or bleeding by not properly [0]PETSC ERROR: destroying unneeded objects. [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process 1630638080 [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. [0]PETSC ERROR: Memory requested 327270824! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 13:54:27 CST 2009 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 on a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 [0]PETSC ERROR: Libraries linked from /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 [0]PETSC ERROR: Configure options --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ --download-mpich=1 --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c [0]PETSC ERROR: Solve() line 318 in qffxmhd.c [0]PETSC ERROR: main() line 172 in qffxmhd.c application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 Program exited with code 01. 0 SNES Function norm 4.925849247379e-03 [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Out of memory. This could be due to allocating [0]PETSC ERROR: too large an object or bleeding by not properly [0]PETSC ERROR: destroying unneeded objects. [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process 1630638080 [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. 
[0]PETSC ERROR: Memory requested 327270824! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 13:54:27 CST 2009 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 on a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 [0]PETSC ERROR: Libraries linked from /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 [0]PETSC ERROR: Configure options --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ --download-mpich=1 --with-shared=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in src/mat/impls/aij/seq/aijfact.c [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in src/mat/interface/matrix.c [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c [0]PETSC ERROR: Solve() line 318 in qffxmhd.c [0]PETSC ERROR: main() line 172 in qffxmhd.c application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 Program exited with code 01. I thought it might because of the unfreed memory, so I picked up ex29.c as a comparision. Thanks, Rebecca Quoting Matthew Knepley : > On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN > wrote: > >> Those unfreed bytes cause "out of memory" when it runs at bigger grid >> sizes. So I have to find out those unfreed memory and free them... Any >> suggestions? > > > Not from what you mailed in. On that DA line, I see PetscHeaderCreate(). Is > that what you see? > > Matt > > >> >> Thanks, >> >> R >> >> >> Quoting Matthew Knepley : >> >> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>> wrote: >>> >>> Hi, >>>> >>>> My own code has some left bytes still reachable according to valgrind, >>>> then >>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >>>> make the files, it gives me different number of bytes left still >>>> reachable. >>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as another >>>> example, >>>> and found that some bytes are still reachable, what is the cause of it? 
>>>> It >>>> shows that it is from DACreate2D() and the I use -malloc_dump to get >>>> those >>>> unfreed informations. >>>> >>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >>>> for all examples, but where do 1st and 5th ones come from? Also, the >>>> -malloc_dump information shows that there are >>>> "[0]Total space allocated 37780 bytes", >>>> but valgrind gives the information as >>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>> >>>> Why there is a big difference? >>>> >>> >>> >>> 1 is fine. It is from PMPI setup, which has some bytes not freed from >>> setting up the MPI >>> processes. The last one looks like an unfreed header for a DA, which is >>> strange. >>> >>> Matt >>> >>> >>> >>>> Thanks very much! >>>> >>>> Rebecca >>>> >>>> Here is the message from valgrind of running ex29: >>>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >>>> lost in loss record 2 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FDB: ??? >>>> ==26628== by 0x473413C: ??? >>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>> ) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FFB: ??? >>>> ==26628== by 0x473413C: ??? >>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>> ) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FFB: ??? >>>> ==26628== by 0x473413C: ??? 
>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>> ) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record >>>> 5 >>>> of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>> ==26628== >>>> ==26628== LEAK SUMMARY: >>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>> >>>> >>>> -- >>>> (Rebecca) Xuefei YUAN >>>> Department of Applied Physics and Applied Mathematics >>>> Columbia University >>>> Tel:917-399-8032 >>>> www.columbia.edu/~xy2102 < >>>> http://www.columbia.edu/%7Exy2102> >>>> >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments >>> is infinitely more interesting than any results to which their experiments >>> lead. >>> -- Norbert Wiener >>> >>> >> >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> >> > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From xy2102 at columbia.edu Mon Jul 27 17:08:09 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 18:08:09 -0400 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> Message-ID: <20090727180809.sh1t22n8wsgw8ckc@cubmail.cc.columbia.edu> Dear Barry, Do you mean that this small memory leak has been fixed in the current petsc-3.0.0-p7? Since the version I have is petsc-3.0.0-p1. By the way, have you noticed another email about the possible bug in DMCompositeGetMatrix() function? Thanks, Rebecca Quoting Barry Smith : > > This was due to a small memory leak in DMMGSetNullSpace() of the > vectors creating internally to hold the null space. I have pushed a fix > to petsc-3.0.0 and petsc-dev > It will be fixed in the next 3.0.0 patch. > > Note this would not cause the "out of memory" for runs at bigger > grid sizes. That is likely just coming from trying to run too large a > problem for your memory size. > > Thanks for reporting the memory leak, > > Barry > > > On Jul 27, 2009, at 4:34 PM, (Rebecca) Xuefei YUAN wrote: > >> Those unfreed bytes cause "out of memory" when it runs at bigger >> grid sizes. So I have to find out those unfreed memory and free >> them... Any suggestions? 
>> >> Thanks, >> >> R >> >> Quoting Matthew Knepley : >> >>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>> wrote: >>> >>>> Hi, >>>> >>>> My own code has some left bytes still reachable according to >>>> valgrind, then >>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >>>> make the files, it gives me different number of bytes left still >>>> reachable. >>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as >>>> another example, >>>> and found that some bytes are still reachable, what is the cause of it? It >>>> shows that it is from DACreate2D() and the I use -malloc_dump to get those >>>> unfreed informations. >>>> >>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >>>> for all examples, but where do 1st and 5th ones come from? Also, the >>>> -malloc_dump information shows that there are >>>> "[0]Total space allocated 37780 bytes", >>>> but valgrind gives the information as >>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>> >>>> Why there is a big difference? >>> >>> >>> 1 is fine. It is from PMPI setup, which has some bytes not freed from >>> setting up the MPI >>> processes. The last one looks like an unfreed header for a DA, which is >>> strange. >>> >>> Matt >>> >>> >>>> >>>> Thanks very much! >>>> >>>> Rebecca >>>> >>>> Here is the message from valgrind of running ex29: >>>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >>>> lost in loss record 2 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FDB: ??? >>>> ==26628== by 0x473413C: ??? >>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FFB: ??? >>>> ==26628== by 0x473413C: ??? 
>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>> libc-2.7.so) >>>> ==26628== by 0x4732FFB: ??? >>>> ==26628== by 0x473413C: ??? >>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>> ==26628== by 0x804B796: main (ex29.c:139) >>>> ==26628== >>>> ==26628== >>>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record 5 >>>> of 5 >>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>> ==26628== >>>> ==26628== LEAK SUMMARY: >>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>> >>>> >>>> -- >>>> (Rebecca) Xuefei YUAN >>>> Department of Applied Physics and Applied Mathematics >>>> Columbia University >>>> Tel:917-399-8032 >>>> www.columbia.edu/~xy2102 >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments >>> is infinitely more interesting than any results to which their experiments >>> lead. >>> -- Norbert Wiener >>> >> >> >> >> -- >> (Rebecca) Xuefei YUAN >> Department of Applied Physics and Applied Mathematics >> Columbia University >> Tel:917-399-8032 >> www.columbia.edu/~xy2102 >> -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From knepley at gmail.com Mon Jul 27 17:08:42 2009 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 27 Jul 2009 17:08:42 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> Message-ID: On Mon, Jul 27, 2009 at 5:04 PM, (Rebecca) Xuefei YUAN wrote: > Dear Matt, > > I ran the code on a 321*321 grid, with dof=4. The matrix is a sparse matrix > with type aij. 
> > After set up user defined options calls, the memory status is > Mem: 2033752k total, 607456k used, 1426296k free, 4832k buffers > > ierr = DMMGCreate(comm, parameters.numberOfLevels, &appCtx, > &dmmg);CHKERRQ(ierr); > ierr = DACreate2d(comm,DA_NONPERIODIC,DA_STENCIL_BOX, -5, -5, > PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); > ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 0, "phi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 1, "vz");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 2, "psi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 3, "bz");CHKERRQ(ierr); > > > before DAGetMatrix() called, the memory status is > Mem: 2033752k total, 642940k used, 1390812k free, 4972k buffers > > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, > &jacobian);CHKERRQ(ierr); > > In gdb, it uses around 500M memory after DAGetMatrix(), which I do not > think it is right, since for a sparse matrix with 13 nonzeros per row, the > memory it needs should be 321*321*4(dof)*13(nonzeros per row)*8(PetscReal) = > 42865056 bytes ~ 40M. It is strange. > i.e., after DAGetMatrix() call, the memory status is Are you sure you have stencil width 1? This is not the only memory used by a matrix, but should be close. The number of nonzero is reported by -ksp_view. Check this. > > Mem: 2033752k total, 1152032k used, 881720k free, 5072k buffers > > Then when it goes the call of DMMGSetSNESLocal(), I found my memory is > using till the message of corruption has appeared. > > ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, > FormJacobianLocal,0,0);CHKERRQ(ierr); > > The memory corruption happens. The error message is: This is not corruption, just using up memory. Matt > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process > 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 13:54:27 > CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 on > a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from > /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options > --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ > --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in > src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process > 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 13:54:27 > CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 on > a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from > /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options > --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ > --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in > src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > I thought it might because of the unfreed memory, so I picked up ex29.c as > a comparision. > > Thanks, > > Rebecca > > > > Quoting Matthew Knepley : > > On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN >> wrote: >> >> Those unfreed bytes cause "out of memory" when it runs at bigger grid >>> sizes. So I have to find out those unfreed memory and free them... Any >>> suggestions? >>> >> >> >> Not from what you mailed in. On that DA line, I see PetscHeaderCreate(). >> Is >> that what you see? >> >> Matt >> >> >> >>> Thanks, >>> >>> R >>> >>> >>> Quoting Matthew Knepley : >>> >>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>> >>>> wrote: >>>> >>>> Hi, >>>> >>>>> >>>>> My own code has some left bytes still reachable according to valgrind, >>>>> then >>>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile >>>>> and >>>>> make the files, it gives me different number of bytes left still >>>>> reachable. >>>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as another >>>>> example, >>>>> and found that some bytes are still reachable, what is the cause of it? >>>>> It >>>>> shows that it is from DACreate2D() and the I use -malloc_dump to get >>>>> those >>>>> unfreed informations. 
>>>>> >>>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are >>>>> true >>>>> for all examples, but where do 1st and 5th ones come from? Also, the >>>>> -malloc_dump information shows that there are >>>>> "[0]Total space allocated 37780 bytes", >>>>> but valgrind gives the information as >>>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>>> >>>>> Why there is a big difference? >>>>> >>>>> >>>> >>>> 1 is fine. It is from PMPI setup, which has some bytes not freed from >>>> setting up the MPI >>>> processes. The last one looks like an unfreed header for a DA, which is >>>> strange. >>>> >>>> Matt >>>> >>>> >>>> >>>> Thanks very much! >>>>> >>>>> Rebecca >>>>> >>>>> Here is the message from valgrind of running ex29: >>>>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of >>>>> 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are >>>>> definitely >>>>> lost in loss record 2 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429BC2D: __nss_database_lookup (in >>>>> /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FDB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of >>>>> 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429AFBB: __nss_lookup_function (in >>>>> /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of >>>>> 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429AF7D: __nss_lookup_function (in >>>>> /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? 
>>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss >>>>> record >>>>> 5 >>>>> of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>>> ==26628== >>>>> ==26628== LEAK SUMMARY: >>>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>>> >>>>> >>>>> -- >>>>> (Rebecca) Xuefei YUAN >>>>> Department of Applied Physics and Applied Mathematics >>>>> Columbia University >>>>> Tel:917-399-8032 >>>>> www.columbia.edu/~xy2102 < >>>>> http://www.columbia.edu/%7Exy2102> < >>>>> http://www.columbia.edu/%7Exy2102> >>>>> >>>>> >>>>> >>>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their >>>> experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>>> >>>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 < >>> http://www.columbia.edu/%7Exy2102> >>> >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> >> > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xy2102 at columbia.edu Mon Jul 27 17:14:32 2009 From: xy2102 at columbia.edu ((Rebecca) Xuefei YUAN) Date: Mon, 27 Jul 2009 18:14:32 -0400 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> Message-ID: <20090727181432.w0t5xicq804wkw4o@cubmail.cc.columbia.edu> The memory status after running DMMGSetSNESLocal() is Mem: 2033752k total, 1821800k used, 211952k free, 5944k buffers then when it calls DMMGSolve(), the memory has been used up... till corruption. R Quoting "(Rebecca) Xuefei YUAN" : > Dear Matt, > > I ran the code on a 321*321 grid, with dof=4. The matrix is a sparse > matrix with type aij. 
> > After set up user defined options calls, the memory status is > Mem: 2033752k total, 607456k used, 1426296k free, 4832k buffers > > ierr = DMMGCreate(comm, parameters.numberOfLevels, &appCtx, > &dmmg);CHKERRQ(ierr); > ierr = DACreate2d(comm,DA_NONPERIODIC,DA_STENCIL_BOX, -5, -5, > PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); > ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 0, "phi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 1, "vz");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 2, "psi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 3, "bz");CHKERRQ(ierr); > > > before DAGetMatrix() called, the memory status is > Mem: 2033752k total, 642940k used, 1390812k free, 4972k buffers > > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > In gdb, it uses around 500M memory after DAGetMatrix(), which I do not > think it is right, since for a sparse matrix with 13 nonzeros per row, > the memory it needs should be 321*321*4(dof)*13(nonzeros per > row)*8(PetscReal) = 42865056 bytes ~ 40M. It is strange. > i.e., after DAGetMatrix() call, the memory status is > > Mem: 2033752k total, 1152032k used, 881720k free, 5072k buffers > > Then when it goes the call of DMMGSetSNESLocal(), I found my memory is > using till the message of corruption has appeared. > > ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, > FormJacobianLocal,0,0);CHKERRQ(ierr); > > The memory corruption happens. The error message is: > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 > 13:54:27 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 > on a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from > /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options > --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ > --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in > src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 > 13:54:27 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/qffxmhd3 > on a linux-gnu named YuanWork by rebecca Mon Jul 27 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from > /home/rebecca/soft/petsc-3.0.0-p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options > --with-blas-lapack-dir=./externalpackages/fblaslapack-3.1.1/ > --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in > src/mat/impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in > src/mat/impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in > src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > I thought it might because of the unfreed memory, so I picked up ex29.c > as a comparision. > > Thanks, > > Rebecca > > > Quoting Matthew Knepley : > >> On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN >> wrote: >> >>> Those unfreed bytes cause "out of memory" when it runs at bigger grid >>> sizes. So I have to find out those unfreed memory and free them... Any >>> suggestions? >> >> >> Not from what you mailed in. On that DA line, I see PetscHeaderCreate(). Is >> that what you see? >> >> Matt >> >> >>> >>> Thanks, >>> >>> R >>> >>> >>> Quoting Matthew Knepley : >>> >>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>>> wrote: >>>> >>>> Hi, >>>>> >>>>> My own code has some left bytes still reachable according to valgrind, >>>>> then >>>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to compile and >>>>> make the files, it gives me different number of bytes left still >>>>> reachable. >>>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as another >>>>> example, >>>>> and found that some bytes are still reachable, what is the cause of it? >>>>> It >>>>> shows that it is from DACreate2D() and the I use -malloc_dump to get >>>>> those >>>>> unfreed informations. >>>>> >>>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th are true >>>>> for all examples, but where do 1st and 5th ones come from? 
Also, the >>>>> -malloc_dump information shows that there are >>>>> "[0]Total space allocated 37780 bytes", >>>>> but valgrind gives the information as >>>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>>> >>>>> Why there is a big difference? >>>>> >>>> >>>> >>>> 1 is fine. It is from PMPI setup, which has some bytes not freed from >>>> setting up the MPI >>>> processes. The last one looks like an unfreed header for a DA, which is >>>> strange. >>>> >>>> Matt >>>> >>>> >>>> >>>>> Thanks very much! >>>>> >>>>> Rebecca >>>>> >>>>> Here is the message from valgrind of running ex29: >>>>> ==26628== 32 bytes in 2 blocks are still reachable in loss record 1 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are definitely >>>>> lost in loss record 2 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FDB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss record 3 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss record 4 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? 
>>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 132,796 bytes in 321 blocks are still reachable in loss record >>>>> 5 >>>>> of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>>> ==26628== >>>>> ==26628== LEAK SUMMARY: >>>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>>> >>>>> >>>>> -- >>>>> (Rebecca) Xuefei YUAN >>>>> Department of Applied Physics and Applied Mathematics >>>>> Columbia University >>>>> Tel:917-399-8032 >>>>> www.columbia.edu/~xy2102 < >>>>> http://www.columbia.edu/%7Exy2102> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>>> >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their experiments >> is infinitely more interesting than any results to which their experiments >> lead. >> -- Norbert Wiener >> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 -- (Rebecca) Xuefei YUAN Department of Applied Physics and Applied Mathematics Columbia University Tel:917-399-8032 www.columbia.edu/~xy2102 From bsmith at mcs.anl.gov Mon Jul 27 20:35:37 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 20:35:37 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> <20090727180456.lugnzepfgkcsg0kc@cubmail.cc.columbia.edu> Message-ID: <4D7AAA33-B20F-4D55-B3D6-E0E9F507CC6F@mcs.anl.gov> The memory for the matrix will be (321*321*4) * (25* 4) * (12) = 494,596,800 which is 500 megabytes ( # rows) (nonzeros per row) (1 double + 1 int per nonzero) There are 25 * 4 nonzeros per row because you have box stencil of width 2. DAGetMatrix() has no way of knowing that your equations have only 13 nonzeros per row it has to assume a nonzero for each possible couple in the stencil. You can use DASetBlockFills() to indicate which parts of the stencil are truly nonzero and thus greatly reduce the matrix memory usage. Barry On Jul 27, 2009, at 5:04 PM, (Rebecca) Xuefei YUAN wrote: > Dear Matt, > > I ran the code on a 321*321 grid, with dof=4. The matrix is a sparse > matrix with type aij. 
> > After set up user defined options calls, the memory status is > Mem: 2033752k total, 607456k used, 1426296k free, 4832k > buffers > > ierr = DMMGCreate(comm, parameters.numberOfLevels, &appCtx, > &dmmg);CHKERRQ(ierr); > ierr = DACreate2d(comm,DA_NONPERIODIC,DA_STENCIL_BOX, -5, -5, > PETSC_DECIDE, PETSC_DECIDE, 4, 2, 0, 0, &da);CHKERRQ(ierr); > ierr = DMMGSetDM(dmmg, (DM)da);CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 0, "phi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 1, "vz");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 2, "psi");CHKERRQ(ierr); > ierr = DASetFieldName(DMMGGetDA(dmmg), 3, "bz");CHKERRQ(ierr); > > > before DAGetMatrix() called, the memory status is > Mem: 2033752k total, 642940k used, 1390812k free, 4972k > buffers > > ierr = DAGetMatrix(DMMGGetDA(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr); > > In gdb, it uses around 500M memory after DAGetMatrix(), which I do > not think it is right, since for a sparse matrix with 13 nonzeros > per row, the memory it needs should be 321*321*4(dof)*13(nonzeros > per row)*8(PetscReal) = 42865056 bytes ~ 40M. It is strange. > i.e., after DAGetMatrix() call, the memory status is > > Mem: 2033752k total, 1152032k used, 881720k free, 5072k > buffers > > Then when it goes the call of DMMGSetSNESLocal(), I found my memory > is using till the message of corruption has appeared. > > ierr = DMMGSetSNESLocal(dmmg, FormFunctionLocal, FormJacobianLocal, > 0,0);CHKERRQ(ierr); > > The memory corruption happens. The error message is: > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process > 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 > 13:54:27 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/ > qffxmhd3 on a linux-gnu named YuanWork by rebecca Mon Jul 27 > 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from /home/rebecca/soft/petsc-3.0.0- > p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options --with-blas-lapack-dir=./ > externalpackages/fblaslapack-3.1.1/ --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/ > mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in src/mat/ > impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in src/mat/ > impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in src/mat/ > interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ > ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > > 0 SNES Function norm 4.925849247379e-03 > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Out of memory. This could be due to allocating > [0]PETSC ERROR: too large an object or bleeding by not properly > [0]PETSC ERROR: destroying unneeded objects. > [0]PETSC ERROR: Memory allocated 2134287012 Memory used by process > 1630638080 > [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info. > [0]PETSC ERROR: Memory requested 327270824! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.0.0, Patch 1, Thu Jan 1 > 13:54:27 CST 2009 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: /home/rebecca/linux/code/physics/qffxmhd/tests/ > qffxmhd3 on a linux-gnu named YuanWork by rebecca Mon Jul 27 > 17:49:53 2009 > [0]PETSC ERROR: Libraries linked from /home/rebecca/soft/petsc-3.0.0- > p1/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Mon Apr 20 16:41:56 2009 > [0]PETSC ERROR: Configure options --with-blas-lapack-dir=./ > externalpackages/fblaslapack-3.1.1/ --download-mpich=1 --with-shared=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: PetscMallocAlign() line 61 in src/sys/memory/mal.c > [0]PETSC ERROR: PetscTrMallocDefault() line 194 in src/sys/memory/ > mtr.c > [0]PETSC ERROR: MatDuplicateNoCreate_SeqAIJ() line 3402 in src/mat/ > impls/aij/seq/aij.c > [0]PETSC ERROR: MatILUFactorSymbolic_SeqAIJ() line 1241 in src/mat/ > impls/aij/seq/aijfact.c > [0]PETSC ERROR: MatILUFactorSymbolic() line 5243 in src/mat/ > interface/matrix.c > [0]PETSC ERROR: PCSetUp_ILU() line 293 in src/ksp/pc/impls/factor/ > ilu/ilu.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: PCSetUp_MG() line 516 in src/ksp/pc/impls/mg/mg.c > [0]PETSC ERROR: PCSetUp() line 794 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: SNES_KSPSolve() line 2899 in src/snes/interface/snes.c > [0]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [0]PETSC ERROR: SNESSolve() line 2221 in src/snes/interface/snes.c > [0]PETSC ERROR: DMMGSolveSNES() line 510 in src/snes/utils/damgsnes.c > [0]PETSC ERROR: DMMGSolve() line 372 in src/snes/utils/damg.c > [0]PETSC ERROR: Solve() line 318 in qffxmhd.c > [0]PETSC ERROR: main() line 172 in qffxmhd.c > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0[unset]: > aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0 > > Program exited with code 01. > > I thought it might because of the unfreed memory, so I picked up > ex29.c as a comparision. > > Thanks, > > Rebecca > > > Quoting Matthew Knepley : > >> On Mon, Jul 27, 2009 at 4:34 PM, (Rebecca) Xuefei YUAN >> wrote: >> >>> Those unfreed bytes cause "out of memory" when it runs at bigger >>> grid >>> sizes. So I have to find out those unfreed memory and free them... >>> Any >>> suggestions? >> >> >> Not from what you mailed in. On that DA line, I see >> PetscHeaderCreate(). Is >> that what you see? >> >> Matt >> >> >>> >>> Thanks, >>> >>> R >>> >>> >>> Quoting Matthew Knepley : >>> >>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>>> wrote: >>>> >>>> Hi, >>>>> >>>>> My own code has some left bytes still reachable according to >>>>> valgrind, >>>>> then >>>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to >>>>> compile and >>>>> make the files, it gives me different number of bytes left still >>>>> reachable. >>>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as >>>>> another >>>>> example, >>>>> and found that some bytes are still reachable, what is the cause >>>>> of it? >>>>> It >>>>> shows that it is from DACreate2D() and the I use -malloc_dump to >>>>> get >>>>> those >>>>> unfreed informations. 
>>>>> >>>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th >>>>> are true >>>>> for all examples, but where do 1st and 5th ones come from? Also, >>>>> the >>>>> -malloc_dump information shows that there are >>>>> "[0]Total space allocated 37780 bytes", >>>>> but valgrind gives the information as >>>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>>> >>>>> Why there is a big difference? >>>>> >>>> >>>> >>>> 1 is fine. It is from PMPI setup, which has some bytes not freed >>>> from >>>> setting up the MPI >>>> processes. The last one looks like an unfreed header for a DA, >>>> which is >>>> strange. >>>> >>>> Matt >>>> >>>> >>>> >>>>> Thanks very much! >>>>> >>>>> Rebecca >>>>> >>>>> Here is the message from valgrind of running ex29: >>>>> ==26628== 32 bytes in 2 blocks are still reachable in loss >>>>> record 1 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are >>>>> definitely >>>>> lost in loss record 2 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FDB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss >>>>> record 3 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss >>>>> record 4 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? 
>>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so >>>>> ) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 132,796 bytes in 321 blocks are still reachable in >>>>> loss record >>>>> 5 >>>>> of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>>> ==26628== >>>>> ==26628== LEAK SUMMARY: >>>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>>> >>>>> >>>>> -- >>>>> (Rebecca) Xuefei YUAN >>>>> Department of Applied Physics and Applied Mathematics >>>>> Columbia University >>>>> Tel:917-399-8032 >>>>> www.columbia.edu/~xy2102 < >>>>> http://www.columbia.edu/%7Exy2102> >>>>> >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their >>>> experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>>> >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments >> is infinitely more interesting than any results to which their >> experiments >> lead. >> -- Norbert Wiener >> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From bsmith at mcs.anl.gov Mon Jul 27 20:36:35 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 20:36:35 -0500 Subject: memory check of /snes/example/tutorials/ex29.c In-Reply-To: <20090727180809.sh1t22n8wsgw8ckc@cubmail.cc.columbia.edu> References: <20090727165147.eybcqr8xcswg4k0c@cubmail.cc.columbia.edu> <20090727173439.32t4un0xa80wsgw4@cubmail.cc.columbia.edu> <20090727180809.sh1t22n8wsgw8ckc@cubmail.cc.columbia.edu> Message-ID: <446098C7-F361-485A-A704-F81AB497C591@mcs.anl.gov> On Jul 27, 2009, at 5:08 PM, (Rebecca) Xuefei YUAN wrote: > Dear Barry, > > Do you mean that this small memory leak has been fixed in the > current petsc-3.0.0-p7? No, it is fixed in the Mecurial version and in petsc-dev. It will be fixed in the next patch. As I said before it is not a serious memory leak. Barry > Since the version I have is petsc-3.0.0-p1. > > By the way, have you noticed another email about the possible bug in > DMCompositeGetMatrix() function? > > Thanks, > > Rebecca > > Quoting Barry Smith : > >> >> This was due to a small memory leak in DMMGSetNullSpace() of the >> vectors creating internally to hold the null space. I have pushed a >> fix >> to petsc-3.0.0 and petsc-dev >> It will be fixed in the next 3.0.0 patch. >> >> Note this would not cause the "out of memory" for runs at bigger >> grid sizes. 
That is likely just coming from trying to run too large a >> problem for your memory size. >> >> Thanks for reporting the memory leak, >> >> Barry >> >> >> On Jul 27, 2009, at 4:34 PM, (Rebecca) Xuefei YUAN wrote: >> >>> Those unfreed bytes cause "out of memory" when it runs at bigger >>> grid sizes. So I have to find out those unfreed memory and free >>> them... Any suggestions? >>> >>> Thanks, >>> >>> R >>> >>> Quoting Matthew Knepley : >>> >>>> On Mon, Jul 27, 2009 at 3:51 PM, (Rebecca) Xuefei YUAN >>>> wrote: >>>> >>>>> Hi, >>>>> >>>>> My own code has some left bytes still reachable according to >>>>> valgrind, then >>>>> I use two different version petsc (2.3.3-p15 and 3.0.0-p1) to >>>>> compile and >>>>> make the files, it gives me different number of bytes left >>>>> still reachable. >>>>> Moreover, I picked up the /snes/example/tutorials/ex29.c as >>>>> another example, >>>>> and found that some bytes are still reachable, what is the cause >>>>> of it? It >>>>> shows that it is from DACreate2D() and the I use -malloc_dump to >>>>> get those >>>>> unfreed informations. >>>>> >>>>> I understand that for those 5 loss record, the 2nd, 3rd and 4th >>>>> are true >>>>> for all examples, but where do 1st and 5th ones come from? Also, >>>>> the >>>>> -malloc_dump information shows that there are >>>>> "[0]Total space allocated 37780 bytes", >>>>> but valgrind gives the information as >>>>> "==26628== still reachable: 132,828 bytes in 323 blocks" >>>>> >>>>> Why there is a big difference? >>>> >>>> >>>> 1 is fine. It is from PMPI setup, which has some bytes not freed >>>> from >>>> setting up the MPI >>>> processes. The last one looks like an unfreed header for a DA, >>>> which is >>>> strange. >>>> >>>> Matt >>>> >>>> >>>>> >>>>> Thanks very much! >>>>> >>>>> Rebecca >>>>> >>>>> Here is the message from valgrind of running ex29: >>>>> ==26628== 32 bytes in 2 blocks are still reachable in loss >>>>> record 1 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x86F9A78: MPID_VCRT_Create (mpid_vc.c:62) >>>>> ==26628== by 0x86F743A: MPID_Init (mpid_init.c:116) >>>>> ==26628== by 0x86D040B: MPIR_Init_thread (initthread.c:288) >>>>> ==26628== by 0x86CFF2D: PMPI_Init (init.c:106) >>>>> ==26628== by 0x8613D69: PetscInitialize (pinit.c:503) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 156 (36 direct, 120 indirect) bytes in 1 blocks are >>>>> definitely >>>>> lost in loss record 2 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429B3E2: (within /lib/tls/i686/cmov/libc-2.7.so) >>>>> ==26628== by 0x429BC2D: __nss_database_lookup (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FDB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 40 bytes in 5 blocks are indirectly lost in loss >>>>> record 3 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x429AFBB: __nss_lookup_function (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? 
>>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 80 bytes in 5 blocks are indirectly lost in loss >>>>> record 4 of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x428839B: tsearch (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x429AF7D: __nss_lookup_function (in /lib/tls/ >>>>> i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x4732FFB: ??? >>>>> ==26628== by 0x473413C: ??? >>>>> ==26628== by 0x4247D15: getpwuid_r (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x424765D: getpwuid (in /lib/tls/i686/cmov/ >>>>> libc-2.7.so) >>>>> ==26628== by 0x8623509: PetscGetUserName (fuser.c:68) >>>>> ==26628== by 0x85E0CF0: PetscErrorPrintfInitialize >>>>> (errtrace.c:68) >>>>> ==26628== by 0x8613E23: PetscInitialize (pinit.c:518) >>>>> ==26628== by 0x804B796: main (ex29.c:139) >>>>> ==26628== >>>>> ==26628== >>>>> ==26628== 132,796 bytes in 321 blocks are still reachable in >>>>> loss record 5 >>>>> of 5 >>>>> ==26628== at 0x4022AB8: malloc (vg_replace_malloc.c:207) >>>>> ==26628== by 0x85EF3AC: PetscMallocAlign (mal.c:40) >>>>> ==26628== by 0x85F049B: PetscTrMallocDefault (mtr.c:194) >>>>> ==26628== by 0x81BCD3F: DACreate2d (da2.c:364) >>>>> ==26628== by 0x804BAFB: main (ex29.c:153) >>>>> ==26628== >>>>> ==26628== LEAK SUMMARY: >>>>> ==26628== definitely lost: 36 bytes in 1 blocks. >>>>> ==26628== indirectly lost: 120 bytes in 10 blocks. >>>>> ==26628== possibly lost: 0 bytes in 0 blocks. >>>>> ==26628== still reachable: 132,828 bytes in 323 blocks. >>>>> ==26628== suppressed: 0 bytes in 0 blocks. >>>>> >>>>> >>>>> -- >>>>> (Rebecca) Xuefei YUAN >>>>> Department of Applied Physics and Applied Mathematics >>>>> Columbia University >>>>> Tel:917-399-8032 >>>>> www.columbia.edu/~xy2102 >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments >>>> is infinitely more interesting than any results to which their >>>> experiments >>>> lead. >>>> -- Norbert Wiener >>>> >>> >>> >>> >>> -- >>> (Rebecca) Xuefei YUAN >>> Department of Applied Physics and Applied Mathematics >>> Columbia University >>> Tel:917-399-8032 >>> www.columbia.edu/~xy2102 >>> > > > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From bsmith at mcs.anl.gov Mon Jul 27 21:04:16 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 27 Jul 2009 21:04:16 -0500 Subject: possible bug in DMCompositeGetMatrix(). In-Reply-To: <20090726195136.f07iqopggk0skkg0@cubmail.cc.columbia.edu> References: <20090726195136.f07iqopggk0skkg0@cubmail.cc.columbia.edu> Message-ID: You cannot use the stencil operations to put values into a "composite matrix". The numbering of rows and columns of the composite matrix reflect all the different variables (unknowns) sp do not match what they are for a single component. Barry On Jul 26, 2009, at 6:51 PM, (Rebecca) Xuefei YUAN wrote: > Hi, > > I am working on an optimization problem, in which I would like to > assemble a Jacobian matrix. 
Thus > DMMGSetSNES(dmmg,FormFunction,FormJacobian) is called. > > In damgsnes.c:637, in calling DMGetMatrix(), it calls > DMCompositeGetMatrix() where the temp matrix Atmp has been freed > before it passes any information to J at pack.c:1722 and 1774. > > So after calling DMGetMatrix() in DMMGSetSNES, the stencil of the > dmmg[i]->B has unchanged, i.e., > > (gdb) p dmmg[0]->B->stencil > $107 = {dim = 0, dims = {0, 0, 0, 0}, starts = {0, 0, 0, 0}, noc = > PETSC_FALSE} > (gdb) where > #0 DMMGSetSNES (dmmg=0x8856208, function=0x804c84f , > jacobian=0x8052932 ) at damgsnes.c:641 > #1 0x0804c246 in main (argc=Cannot access memory at address 0x0 > ) at tworeggt.c:126 > > I compare this with > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/snes/examples/tutorials/ex18.c.html > > and it shows that the stencil has been carried out and passed to > dmmg[0]->B as follows: > > (gdb) p dmmg[i]->B->stencil > $80 = {dim = 2, dims = {5, 5, 1, 0}, starts = {0, 0, 0, 0}, noc = > PETSC_TRUE} > (gdb) where > #0 DMMGSetSNES (dmmg=0x884b530, function=0x804c364 , > jacobian=0x804d34d ) at damgsnes.c:642 > #1 0x0804b969 in main (argc=Cannot access memory at address 0x2 > ) at ex18.c:100 > > Because of this missing stencil of Jacobian matrix, I get the error > code as follows: > Program received signal SIGSEGV, Segmentation fault. > 0x082447c2 in ISLocalToGlobalMappingApply (mapping=0x0, N=1, > in=0xbff8f250, > out=0xbff8ce14) at /home/rebecca/soft/petsc-3.0.0-p1/include/ > petscis.h:129 > 129 PetscInt i,*idx = mapping->indices,Nmax = mapping->n; > (gdb) where > #0 0x082447c2 in ISLocalToGlobalMappingApply (mapping=0x0, N=1, > in=0xbff8f250, out=0xbff8ce14) > at /home/rebecca/soft/petsc-3.0.0-p1/include/petscis.h:129 > #1 0x0824440c in MatSetValuesLocal (mat=0x88825e8, nrow=1, > irow=0xbff8f250, > ncol=4, icol=0xbff8ee50, y=0xbff8f628, addv=INSERT_VALUES) at > matrix.c:1583 > #2 0x08240aae in MatSetValuesStencil (mat=0x88825e8, m=1, > idxm=0xbff8f6b8, > n=4, idxn=0xbff8f4b4, v=0xbff8f628, addv=INSERT_VALUES) at > matrix.c:1099 > #3 0x08053835 in FormJacobian (snes=0x8874700, X=0x8856778, > J=0x88747d0, > B=0x88747d4, flg=0xbff8f8d4, ptr=0x8856338) at tworeggt.c:937 > #4 0x0805a5cf in DMMGComputeJacobian_Multigrid (snes=0x8874700, > X=0x8856778, > J=0x88747d0, B=0x88747d4, flag=0xbff8f8d4, ptr=0x8856208) at > damgsnes.c:60 > #5 0x0806b18a in SNESComputeJacobian (snes=0x8874700, X=0x8856778, > A=0x88747d0, B=0x88747d4, flg=0xbff8f8d4) at snes.c:1111 > #6 0x08084945 in SNESSolve_LS (snes=0x8874700) at ls.c:189 > #7 0x08073198 in SNESSolve (snes=0x8874700, b=0x0, x=0x8856778) at > snes.c:2221 > #8 0x0805d5f9 in DMMGSolveSNES (dmmg=0x8856208, level=0) at > damgsnes.c:510 > #9 0x08056e38 in DMMGSolve (dmmg=0x8856208) at damg.c:372 > #10 0x0804c3fe in main (argc=128, argv=0xbff90c04) at tworeggt.c:131 > > I think there might be a bug in DMCompositeGetMatrix(). > > Thanks very much! > > Cheers, > > -- > (Rebecca) Xuefei YUAN > Department of Applied Physics and Applied Mathematics > Columbia University > Tel:917-399-8032 > www.columbia.edu/~xy2102 > From sekikawa at msi.co.jp Tue Jul 28 01:10:44 2009 From: sekikawa at msi.co.jp (Takuya Sekikawa) Date: Tue, 28 Jul 2009 15:10:44 +0900 Subject: eigenvector on singlar matrix Message-ID: <20090728145459.AEB0.SEKIKAWA@msi.co.jp> Hi I have a question about SLEPc. What are EigenVectors calculated when given matrix is singular? 
(ex1) For example, a matrix like this:

    (1 1 1)
A = (1 1 1)
    (1 1 1)

Matrix A has 2 eigenvalues: one is 0 (a double root) and the other is 3. In this case the eigenvector related to eigenvalue 0 is (z1, z2, -(z1+z2))^t, where z1 and z2 can be any values, i.e. there are 2 degrees of freedom.

(ex2)

B = (0 1)
    (0 0)

In this case B's only eigenvalue is 0 (again a double root), but the eigenvector has only 1 degree of freedom: (z1, 0)^t.

My question is: what will the solution computed by SLEPc be in these cases?

Thanks, Takuya --------------------------------------------------------------- Takuya Sekikawa Mathematical Systems, Inc sekikawa at msi.co.jp ---------------------------------------------------------------

From tim.kroeger at cevis.uni-bremen.de Tue Jul 28 01:22:49 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Tue, 28 Jul 2009 08:22:49 +0200 (CEST) Subject: Solver problem In-Reply-To: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> Message-ID: Dear Barry, On Mon, 27 Jul 2009, Barry Smith wrote: > On Jul 27, 2009, at 4:35 AM, Tim Kroeger wrote: > >> In my application, there is a linear system to be solved in every time >> step. Steps 0 and 1 work well, but in step 2 PETSc fails to converge. I >> suspected that the system might be unsolvable in that step and checked that >> by writing matrix and the right hand side to files and loading them into >> "octave". Surprisingly, "octave" does find a solution to the system >> without any problems. > > Octave is using a direct solver. Did you try PETSc's direct solver using > -pc_type lu? Good idea! It works, and I'm actually surprised that it does. I did try ILU(3) before (i.e., -pc_type ilu -pc_factor_levels 3), which took forever to compute. Hence I thought that full LU would take even longer, but this turned out to be not true; LU is acceptable in performance and solves the problem for that test case. I'll see how it will behave on a larger-scale problem and on multiple cores. Would you recommend to try MUMPS as well? (I.e., will MUMPS have a chance to be faster than ILU?)
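If I read the 3.0 documentation correctly, selecting MUMPS from the code should look roughly like the sketch below; the PCFactorSetMatSolverPackage() call and the plain "mumps" string are only my guess at the intended interface, so please correct me if that is not the right way to request it. (On the command line I assume the equivalent is -pc_type lu -pc_factor_mat_solver_package mumps.)

   KSP ksp;    /* the solver I already create for every time step */
   PC  pc;

   KSPGetPC(ksp,&pc);
   PCSetType(pc,PCLU);                       /* full factorization instead of ILU  */
   PCFactorSetMatSolverPackage(pc,"mumps");  /* ask for the MUMPS factorization    */
   KSPSetFromOptions(ksp);                   /* keep runtime options working on top */

Best Regards, Tim -- Dr.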
Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From Andreas.Grassl at student.uibk.ac.at Tue Jul 28 02:30:46 2009 From: Andreas.Grassl at student.uibk.ac.at (Andreas Grassl) Date: Tue, 28 Jul 2009 09:30:46 +0200 Subject: Solver problem In-Reply-To: References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> Message-ID: <4A6EA926.3090400@student.uibk.ac.at> Tim Kroeger schrieb: > Would you recommend to try MUMPS as well? (I.e., will MUMPS have a > change to be faster than ILU?) I would highly recommend to give it a try. MUMPS is a direct sparse solver, so it is not comparable to the combination ilu-gmres, because iterative solver have a complexity of O(N) and highly depend on the spectrum/condition of the matrix. Sparse direct solver have a complexity of around O(N^2), but the runtime is not connected to the condition of the matrix (only the accuracy of the result). In my experience MUMPS is much faster than lu, because lu is only sequential and the implementation is only for "verification"-reasons, what i understood. Maybe -ksp_monitor_singular_value is interesting for you? -- /"\ Grassl Andreas \ / ASCII Ribbon Campaign Uni Innsbruck Institut f. Mathematik X against HTML email Technikerstr. 13 Zi 709 / \ +43 (0)512 507 6091 From jroman at dsic.upv.es Tue Jul 28 02:57:58 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 28 Jul 2009 09:57:58 +0200 Subject: eigenvector on singlar matrix In-Reply-To: <20090728145459.AEB0.SEKIKAWA@msi.co.jp> References: <20090728145459.AEB0.SEKIKAWA@msi.co.jp> Message-ID: <3B6A7B28-1818-4063-A263-A8852E1D4727@dsic.upv.es> On 28/07/2009, Takuya Sekikawa wrote: > Hi > > I have a question about SLEPc. > > What are EigenVectors calculated when given matrix is singular? > > (ex1) > for example, matrix like this: > > (1 1 1) > A = (1 1 1) > (1 1 1) > > matrix A have 2 eigenvalues, one is 0 (double multiple root), > and other is 3. > > in this case eigenvector related to eigenvalue 0, is > (z1, z2, -(z1+z2))t (z1, z2 can be any value. i.e. freedom degree is > 2) > > > (ex2) > > B = (0 1) > (0 0) > > in this case B's eigenvalue is only 0. (double multiple root) > but eigenvector has only 1 freedom degree. > (z1, 0)t > > My question is, what will be solution by SLEPc in these case? > > Thanks, > Takuya > --------------------------------------------------------------- > Takuya Sekikawa > Mathematical Systems, Inc > sekikawa at msi.co.jp > --------------------------------------------------------------- For such small matrices, the computed solution will be the same as the one provided by Lapack. If your problem matrices are small, use Lapack instead of SLEPc. For large matrices, if the dimension of the nullspace is small then you should have no problems when computing the eigenvectors of zero eigenvalues with SLEPc. But if you have a large nullspace then things may get problematic - I have not tried this case. Please report any problems to the SLEPc maintainance email. 
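In any case, the plain solution loop is enough to inspect what is returned for a given matrix. The following is only a sketch (error checking omitted, A is assumed to be assembled already; (ex1) is real symmetric so I set EPS_HEP, while (ex2) would need EPS_NHEP):

  EPS         eps;
  Vec         xr, xi;
  PetscScalar kr, ki;
  PetscInt    i, nconv;

  MatGetVecs(A,&xr,PETSC_NULL);
  MatGetVecs(A,&xi,PETSC_NULL);
  EPSCreate(PETSC_COMM_WORLD,&eps);
  EPSSetOperators(eps,A,PETSC_NULL);                  /* standard problem A x = k x   */
  EPSSetProblemType(eps,EPS_HEP);                     /* (ex1) is real symmetric      */
  EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE);  /* look at the zero eigenvalues */
  EPSSetFromOptions(eps);
  EPSSolve(eps);
  EPSGetConverged(eps,&nconv);
  for (i=0; i<nconv; i++) {
    EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);
    /* for a repeated eigenvalue the vectors xr returned here are just one
       basis of that eigenspace; any other basis is equally valid, so do not
       expect a particular combination of z1 and z2 */
  }
  EPSDestroy(eps);
  VecDestroy(xr);
  VecDestroy(xi);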
Jose From tim.kroeger at cevis.uni-bremen.de Tue Jul 28 08:42:20 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Tue, 28 Jul 2009 15:42:20 +0200 (CEST) Subject: Solver problem In-Reply-To: <4A6EA926.3090400@student.uibk.ac.at> References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> <4A6EA926.3090400@student.uibk.ac.at> Message-ID: Dear Andreas, On Tue, 28 Jul 2009, Andreas Grassl wrote: > Tim Kroeger schrieb: > >> Would you recommend to try MUMPS as well? (I.e., will MUMPS have a >> change to be faster than ILU?) > > I would highly recommend to give it a try. I can't get it running. )-: I used the chance and updated to petsc-3.0.0-p7. I configured with MUMPS and compiled succesfully. I run my application with -pc_factor_mat_solver_package mumps -pc_type lu, and then it crashes with the following message: symbol lookup error: /home/tkroeger/archives/petsc-3.0.0-p7/linux-gnu/lib/libpetscmat.so: undefined symbol: _gfortran_allocate_array Otherwise, petsc-3.0.0-p7 works fine; that is, if I don't use the above options, it doesn't crash. What did I do wrong? Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From tim.kroeger at cevis.uni-bremen.de Tue Jul 28 09:40:28 2009 From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger) Date: Tue, 28 Jul 2009 16:40:28 +0200 (CEST) Subject: Solver problem In-Reply-To: References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> <4A6EA926.3090400@student.uibk.ac.at> Message-ID: On Tue, 28 Jul 2009, Tim Kroeger wrote: > I used the chance and updated to petsc-3.0.0-p7. I configured with > MUMPS and compiled succesfully. I run my application with > -pc_factor_mat_solver_package mumps -pc_type lu, and then it crashes > with the following message: > > symbol lookup error: /home/tkroeger/archives/petsc-3.0.0-p7/linux-gnu/lib/libpetscmat.so: undefined symbol: _gfortran_allocate_array I should add that the crash occurs inside KSPSolve(). I'm at a loss. Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From rlmackie862 at gmail.com Tue Jul 28 10:17:35 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 28 Jul 2009 08:17:35 -0700 Subject: suggestions for debugging code Message-ID: <4A6F168F.5070208@gmail.com> I have run into a very difficult debugging problem. I have recently made some modifications to my PETSc code, to add some new features. When I compiled the code in debug mode (we are using the Intel compilers and mvapich on Infiniband), the code runs fine with any number of processes. When the code is compiled in optimize mode, it runs fine on, say, up to 32 processes, but not 64, bombing out someplace strange, with a Segmentation Violation. I've tried using Valgrind, but you can't use it with PETSc and my code compiled in Debug mode because the code finishes successfully, and the other problem I have with Valgrind + mvapich is there are about a million messages spewed out, making it extremely difficult to see if there are really any issues in MY code. I've thought to have PETSc download and compile MPICH2, which I would hope would produce less output from Valgrind. 
Anyone have any suggestions on how to debug this tricky situation? Any suggestions would be greatly appreciated. Randy From u.tabak at tudelft.nl Tue Jul 28 10:28:00 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Tue, 28 Jul 2009 17:28:00 +0200 Subject: Normalization options in slepc Message-ID: <4A6F1900.10400@tudelft.nl> Dear all, Is there a normalization selection option in Slepc for eigenvectors, as far as I can see, it normalizes the eigenvectors so that their norm is equal to 1. Can this normalization be customized with respect to B, in a generalized problem context. It is not hard to write a function for this, but I wondered if there is already an option for this. Best regards, Umut From bsmith at mcs.anl.gov Tue Jul 28 10:35:56 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 28 Jul 2009 10:35:56 -0500 Subject: Solver problem In-Reply-To: References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> <4A6EA926.3090400@student.uibk.ac.at> Message-ID: <897733B6-228F-4397-9C49-81CC0BE296F9@mcs.anl.gov> > undefined symbol: _gfortran_allocate_array This is likely a symbol in the gfortran compiler libraries. Are you linking your application code against all the libraries it needs to be linked against? In the PETSc directory run make getlinklibs make sure all the libraries are listed in your makefile or use a PETSc makefile as a template. Or this is coming from the fact that you are using shared or dynamic libraries. If you don't need shared libraries then run PETSc's config/configure.py with --with-shared=0 Barry On Jul 28, 2009, at 8:42 AM, Tim Kroeger wrote: > Dear Andreas, > > On Tue, 28 Jul 2009, Andreas Grassl wrote: > >> Tim Kroeger schrieb: >> >>> Would you recommend to try MUMPS as well? (I.e., will MUMPS have a >>> change to be faster than ILU?) >> >> I would highly recommend to give it a try. > > I can't get it running. )-: > > I used the chance and updated to petsc-3.0.0-p7. I configured with > MUMPS and compiled succesfully. I run my application with > -pc_factor_mat_solver_package mumps -pc_type lu, and then it crashes > with the following message: > > symbol lookup error: /home/tkroeger/archives/petsc-3.0.0-p7/linux- > gnu/lib/libpetscmat.so: undefined symbol: _gfortran_allocate_array > > Otherwise, petsc-3.0.0-p7 works fine; that is, if I don't use the > above options, it doesn't crash. > > What did I do wrong? > > Best Regards, > > Tim > > -- > Dr. Tim Kroeger > tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 > tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 > > Fraunhofer MEVIS, Institute for Medical Image Computing > Universitaetsallee 29, 28359 Bremen, Germany > From knepley at gmail.com Tue Jul 28 10:41:11 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Jul 2009 10:41:11 -0500 Subject: suggestions for debugging code In-Reply-To: <4A6F168F.5070208@gmail.com> References: <4A6F168F.5070208@gmail.com> Message-ID: On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie wrote: > I have run into a very difficult debugging problem. I have recently made > some > modifications to my PETSc code, to add some new features. When I compiled > the > code in debug mode (we are using the Intel compilers and mvapich on > Infiniband), > the code runs fine with any number of processes. > > When the code is compiled in optimize mode, it runs fine on, say, up to 32 > processes, > but not 64, bombing out someplace strange, with a Segmentation Violation. 
> > I've tried using Valgrind, but you can't use it with PETSc and my code > compiled in > Debug mode because the code finishes successfully, and the other problem I > have with Sometimes valgrind will catch things even when code does not crash. > > Valgrind + mvapich is there are about a million messages spewed out, making > it > extremely difficult to see if there are really any issues in MY code. I've > thought > to have PETSc download and compile MPICH2, which I would hope would produce > less > output from Valgrind. In order to filter these out, you use a "suppressions file" for valgrind. The manual has a good section on this and it should not be hard to wipre out most of them. Satish designed one for our unit tests. Matt > > Anyone have any suggestions on how to debug this tricky situation? Any > suggestions > would be greatly appreciated. > > Randy > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From darach at tchpc.tcd.ie Tue Jul 28 11:28:13 2009 From: darach at tchpc.tcd.ie (darach at tchpc.tcd.ie) Date: Tue, 28 Jul 2009 17:28:13 +0100 Subject: Compiling Boost & Sieve & complex scalar Message-ID: <20090728162813.GH19239@tchpc.tcd.ie> Hi, I have been trying to compile petsc with boost & sieve & complex scalars using the following configuration commands (I also compiled with PetscScalar=real as a comparison). I'm getting an error when the petsc scalar type is complex, but no error with scalar type real (just warnings). What configuration options am I missing? I include longer output below; first the complex case and then the real case. petsc was compiled from the following tar file: petsc-3.0.0-p7.tar.gz Complex: ./config/configure.py --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-complex --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc3p7-complex/externalpackages/boost.tar.gz --with-sieve=1 Real: ./config/configure.py --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-real --with-scalar-type=real --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc3p7-real/externalpackages/boost.tar.gz --with-sieve=1 Darach Longer Output: Complex Scalar: ./config/configure.py --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-complex --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc3p7-complex/externalpackages/boost.tar.gz --with-sieve=1 .... 
================================================================================= Configuring PETSc to compile on your system ================================================================================= TESTING: alternateConfigureLibrary from PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:69) Compilers: C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g Linkers: Static linker: /usr/bin/ar cr PETSc: ** ** Before running "make" your PETSC_ARCH must be specified with: ** ** setenv PETSC_ARCH linux-gnu-cxx-debug (csh/tcsh) ** ** PETSC_ARCH=linux-gnu-cxx-debug; export PETSC_ARCH (sh/bash) ** PETSC_DIR: /home/user/Compile/petsc3p7-complex ** ** Now build the libraries with "make all" ** Clanguage: Cxx PETSc shared libraries: disabled PETSc dynamic libraries: disabled Scalar type:complex MPI: Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib X11: Includes: [''] Library: ['-lX11'] BLAS/LAPACK: -llapack -lblas Sieve: Includes: -I/home/user/Compile/petsc3p7-complex/include/sieve Boost: Includes: -I/home/user/Compile/petsc3p7-complex/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib Errors: ------------------------------------------------------------------------ ...... ...... libfast in: /home/user/Compile/petsc3p7-complex/src/dm/mesh mesh.c: In function ???PetscErrorCode assembleVector(_p_Vec*, PetscInt, PetscScalar*, InsertMode)???: mesh.c:1104: error: no matching function for call to ???ALE::IMesh > > >::update(const ALE::Obj >, ALE::malloc_allocator > > >&, int, PetscScalar*&)??? /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1715: note: candidates are: void ALE::IMesh::update(const ALE::Obj >&, const typename ALE::IBundle >, ALE::IGeneralSection >, ALE::IGeneralSection >, Label_, ALE::UniformSection, int, 1, ALE::malloc_allocator > >::sieve_type::point_type&, const typename Section::value_type*) [with Section = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >] mesh.c:1106: error: no matching function for call to ???ALE::IMesh > > >::updateAdd(const ALE::Obj >, ALE::malloc_allocator > > >&, int, PetscScalar*&)??? /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1741: note: candidates are: void ALE::IMesh::updateAdd(const ALE::Obj >&, const typename ALE::IBundle >, ALE::IGeneralSection >, ALE::IGeneralSection >, Label_, ALE::UniformSection, int, 1, ALE::malloc_allocator > >::sieve_type::point_type&, const typename Section::value_type*) [with Section = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >] mesh.c: In function ???PetscErrorCode MeshRestrictClosure(_p_Mesh*, _p_SectionReal*, PetscInt, PetscInt, PetscScalar*)???: mesh.c:2523: error: no matching function for call to ???ALE::IMesh > > >::restrictClosure(ALE::Obj >, ALE::malloc_allocator > > >&, PetscInt&, PetscScalar*&, PetscInt&)??? 
/home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1691: note: candidates are: const typename Section::value_type* ALE::IMesh::restrictClosure(const ALE::Obj >&, const typename ALE::IBundle >, ALE::IGeneralSection >, ALE::IGeneralSection >, Label_, ALE::UniformSection, int, 1, ALE::malloc_allocator > >::sieve_type::point_type&, typename Section::value_type*, int) [with Section = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >] mesh.c: In function ???PetscErrorCode MeshUpdateClosure(_p_Mesh*, _p_SectionReal*, PetscInt, PetscScalar*)???: mesh.c:2557: error: no matching function for call to ???ALE::IMesh > > >::update(ALE::Obj >, ALE::malloc_allocator > > >&, PetscInt&, PetscScalar*&)??? /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1715: note: candidates are: void ALE::IMesh::update(const ALE::Obj >&, const typename ALE::IBundle >, ALE::IGeneralSection >, ALE::IGeneralSection >, Label_, ALE::UniformSection, int, 1, ALE::malloc_allocator > >::sieve_type::point_type&, const typename Section::value_type*) [with Section = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >] /home/user/Compile/petsc3p7-complex/include/petscmesh.hh: In function ???PetscErrorCode MeshCreateGlobalScatter(const ALE::Obj >&, const ALE::Obj >&, _p_VecScatter**) [with Mesh = ALE::IMesh > > >, Section = ALE::IGeneralSection >]???: mesh.c:815: instantiated from here /home/user/Compile/petsc3p7-complex/include/petscmesh.hh:93: error: no matching function for call to ???VecCreateSeqWithArray(ompi_communicator_t*, int, const double*, _p_Vec**)??? /home/user/Compile/petsc3p7-complex/include/petscvec.h:66: note: candidates are: PetscErrorCode VecCreateSeqWithArray(ompi_communicator_t*, PetscInt, const PetscScalar*, _p_Vec**) /home/user/Compile/petsc3p7-complex/include/petscvec.h:67: note: PetscErrorCode VecCreateSeqWithArray(PetscInt, PetscScalar*, _p_Vec**) /home/user/Compile/petsc3p7-complex/include/petscmesh.hh: In function ???PetscErrorCode updateOperator(_p_Mat*, const Sieve&, Visitor&, const int&, PetscScalar*, InsertMode) [with Sieve = ALE::IFSieve >, Visitor = updateOperator(_p_Mat*, const ALE::Obj > > >, ALE::malloc_allocator > > > > >&, const ALE::Obj >, ALE::malloc_allocator > > >&, const ALE::Obj, ALE::malloc_allocator > >&, const int&, PetscScalar*, InsertMode)::visitor_type]???: mesh.c:1121: instantiated from here ..... ..... ------------------------------------------------------------------------ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Real Scalar ./config/configure.py --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-real --with-scalar-type=real --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc3p7-real/externalpackages/boost.tar.gz --with-sieve=1 .... 
Configuring PETSc to compile on your system ================================================================================= TESTING: alternateConfigureLibrary from PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:69) Compilers: C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g Linkers: Static linker: /usr/bin/ar cr PETSc: ** ** Before running "make" your PETSC_ARCH must be specified with: ** ** setenv PETSC_ARCH linux-gnu-cxx-debug (csh/tcsh) ** ** PETSC_ARCH=linux-gnu-cxx-debug; export PETSC_ARCH (sh/bash) ** PETSC_DIR: /home/user/Compile/petsc3p7-real ** ** Now build the libraries with "make all" ** Clanguage: Cxx PETSc shared libraries: disabled PETSc dynamic libraries: disabled Scalar type:real MPI: Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib X11: Includes: [''] Library: ['-lX11'] BLAS/LAPACK: -llapack -lblas Sieve: Includes: -I/home/user/Compile/petsc3p7-real/include/sieve Boost: Includes: -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib No Errors; warnings: ------------------------------------------------------------------------ ..... ..... libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/sieve /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1347: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1572: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? meshmgsnes.c:63: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? 
has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshmgsnes.c:349: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/snes/f90-mod mpif90 -c -Wall -Wno-unused-variable -g -I/home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc3p7-real/include -I/home/user/Compile/petsc3p7-real/include/sieve -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -o petscsnesmod.o petscsnesmod.F /usr/bin/ar cr /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/lib/libpetscsnes.a petscsnesmod.o /bin/cp -f *.mod /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include libfast in: /home/user/Compile/petsc3p7-real/src/snes/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/ts libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/euler libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/rk libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/rk/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit/beuler libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit/cn libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/pseudo libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/pseudo/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples/tests libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples/tutorials libfast in: /home/user/Compile/petsc3p7-real/src/ts/f90-mod mpif90 -c -Wall -Wno-unused-variable -g -I/home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc3p7-real/include -I/home/user/Compile/petsc3p7-real/include/sieve -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -o petsctsmod.o petsctsmod.F /usr/bin/ar cr /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/lib/libpetscts.a petsctsmod.o /bin/cp -f *.mod /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include libfast in: /home/user/Compile/petsc3p7-real/src/dm libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao libfast in: 
/home/user/Compile/petsc3p7-real/src/dm/ao/interface libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/basic libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/basic/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples/tests libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples/tutorials libfast in: /home/user/Compile/petsc3p7-real/src/dm/da libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/f90-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples/tests libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples/tutorials libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/ftn-auto libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/ftn-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/f90-custom libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1145: instantiated from ???ALE::IFSieve::IFSieve(ompi_communicator_t*, int) [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? mesh.c:1600: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? mesh.c:2641: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: meshpcice.c:387: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: meshpcice.c:388: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? 
has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1347: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1572: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? meshpflotran.c:235: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshpflotran.c:903: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? 
/home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1331: instantiated from ???ALE::IBundle::IBundle(ompi_communicator_t*, int) [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1598: instantiated from ???ALE::IMesh::IMesh(ompi_communicator_t*, int, int) [with Label_ = ALE::LabelSifter > >]??? meshexodus.c:183: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshexodus.c:364: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/petscmesh_viewers.hh:482: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? section.c:1405: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/sieve libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/impls libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/impls/cartesian /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? 
/home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:1156: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1347: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc3p7-real/include/sieve/Mesh.hh:1572: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? cartesian.c:263: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? cartesian.c:269: instantiated from here /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor ------------------------------------------------------------------------ From jroman at dsic.upv.es Tue Jul 28 12:17:56 2009 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 28 Jul 2009 19:17:56 +0200 Subject: Normalization options in slepc In-Reply-To: <4A6F1900.10400@tudelft.nl> References: <4A6F1900.10400@tudelft.nl> Message-ID: <17BE0BBF-66E5-478C-8E5D-2360E34DABA2@dsic.upv.es> On 28/07/2009, Umut Tabak wrote: > Dear all, > > Is there a normalization selection option in Slepc for eigenvectors, > as far as I can see, it normalizes the eigenvectors so that their > norm is equal to 1. Can this normalization be customized with > respect to B, in a generalized problem context. It is not hard to > write a function for this, but I wondered if there is already an > option for this. > > Best regards, > > Umut In symmetric-definite generalized problems, it makes more sense to return B-normalized eigenvectors. Some time ago we considered changing this but for some reason we didn't. We will change this in the next patch (slepc-3.0.0-p5). Jose From knepley at gmail.com Tue Jul 28 17:15:59 2009 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Jul 2009 17:15:59 -0500 Subject: Compiling Boost & Sieve & complex scalar In-Reply-To: <20090728162813.GH19239@tchpc.tcd.ie> References: <20090728162813.GH19239@tchpc.tcd.ie> Message-ID: Sorry, I need to put an error in 3.0. Sieve does not work with the complex type in 3.0. I have fixed this in petsc-dev. 
There are instructions on the website for getting the development version if you need complex scalars. Thanks, Matt On Tue, Jul 28, 2009 at 11:28 AM, wrote: > Hi, > > I have been trying to compile petsc with boost & sieve & complex > scalars using the following configuration commands (I also compiled > with PetscScalar=real as a comparison). I'm getting an error when the > petsc scalar type is complex, but no error with scalar type real (just > warnings). What configuration options am I missing? I include longer > output below; first the complex case and then the real case. > > petsc was compiled from the following tar file: petsc-3.0.0-p7.tar.gz > > Complex: > ./config/configure.py > --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-complex > --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 > --download-boost=/home/user/Compile/petsc3p7-complex/externalpackages/boost.tar.gz > --with-sieve=1 > > Real: > ./config/configure.py > --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-real > --with-scalar-type=real --with-clanguage=cxx --with-boost=1 > --download-boost=/home/user/Compile/petsc3p7-real/externalpackages/boost.tar.gz > --with-sieve=1 > > Darach > > > Longer Output: > > Complex Scalar: > > ./config/configure.py > --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-complex > --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 > --download-boost=/home/user/Compile/petsc3p7-complex/externalpackages/boost.tar.gz > --with-sieve=1 > .... > > ================================================================================= > Configuring PETSc to compile on your system > > ================================================================================= > TESTING: alternateConfigureLibrary from > PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:69) > Compilers: > C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g > Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g > Linkers: > Static linker: /usr/bin/ar cr > PETSc: > ** > ** Before running "make" your PETSC_ARCH must be specified with: > ** ** setenv PETSC_ARCH linux-gnu-cxx-debug (csh/tcsh) > ** ** PETSC_ARCH=linux-gnu-cxx-debug; export PETSC_ARCH (sh/bash) > ** > PETSC_DIR: /home/user/Compile/petsc3p7-complex > ** > ** Now build the libraries with "make all" > ** > Clanguage: Cxx > PETSc shared libraries: disabled > PETSc dynamic libraries: disabled > Scalar type:complex > MPI: > Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib > X11: > Includes: [''] > Library: ['-lX11'] > BLAS/LAPACK: -llapack -lblas > Sieve: > Includes: -I/home/user/Compile/petsc3p7-complex/include/sieve > Boost: > Includes: -I/home/user/Compile/petsc3p7-complex/externalpackages/Boost/ > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib > > > Errors: > ------------------------------------------------------------------------ > ...... > ...... > > libfast in: /home/user/Compile/petsc3p7-complex/src/dm/mesh > mesh.c: In function ? PetscErrorCode assembleVector(_p_Vec*, PetscInt, > PetscScalar*, InsertMode)? : > mesh.c:1104: error: no matching function for call to ? > ALE::IMesh ALE::malloc_allocator > > >::update(const > ALE::Obj >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, int, PetscScalar*&)? 
> /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1715: note: > candidates are: void ALE::IMesh::update(const ALE::Obj ALE::malloc_allocator >&, const typename ALE::IBundle ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, Label_, > ALE::UniformSection, int, 1, > ALE::malloc_allocator > >::sieve_type::point_type&, const typename > Section::value_type*) [with Section = ALE::IGeneralSection ALE::malloc_allocator >, Label_ = ALE::LabelSifter ALE::malloc_allocator > >] > mesh.c:1106: error: no matching function for call to ? > ALE::IMesh ALE::malloc_allocator > > > >::updateAdd(const ALE::Obj ALE::malloc_allocator >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, int, PetscScalar*&)? > /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1741: note: > candidates are: void ALE::IMesh::updateAdd(const ALE::Obj ALE::malloc_allocator >&, const typename ALE::IBundle ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, Label_, > ALE::UniformSection, int, 1, > ALE::malloc_allocator > >::sieve_type::point_type&, const typename > Section::value_type*) [with Section = ALE::IGeneralSection ALE::malloc_allocator >, Label_ = ALE::LabelSifter ALE::malloc_allocator > >] > mesh.c: In function ? PetscErrorCode MeshRestrictClosure(_p_Mesh*, > _p_SectionReal*, PetscInt, PetscInt, PetscScalar*)? : > mesh.c:2523: error: no matching function for call to ? > ALE::IMesh ALE::malloc_allocator > > > >::restrictClosure(ALE::Obj ALE::malloc_allocator >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, PetscInt&, PetscScalar*&, PetscInt&)? > /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1691: note: > candidates are: const typename Section::value_type* > ALE::IMesh::restrictClosure(const ALE::Obj ALE::malloc_allocator >&, const typename ALE::IBundle ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, Label_, > ALE::UniformSection, int, 1, > ALE::malloc_allocator > >::sieve_type::point_type&, typename > Section::value_type*, int) [with Section = ALE::IGeneralSection ALE::malloc_allocator >, Label_ = ALE::LabelSifter ALE::malloc_allocator > >] > mesh.c: In function ? PetscErrorCode MeshUpdateClosure(_p_Mesh*, > _p_SectionReal*, PetscInt, PetscScalar*)? : > mesh.c:2557: error: no matching function for call to ? > ALE::IMesh ALE::malloc_allocator > > > >::update(ALE::Obj ALE::malloc_allocator >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, PetscInt&, PetscScalar*&)? > /home/user/Compile/petsc3p7-complex/include/sieve/Mesh.hh:1715: note: > candidates are: void ALE::IMesh::update(const ALE::Obj ALE::malloc_allocator >&, const typename ALE::IBundle ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, ALE::IGeneralSection ALE::malloc_allocator >, Label_, > ALE::UniformSection, int, 1, > ALE::malloc_allocator > >::sieve_type::point_type&, const typename > Section::value_type*) [with Section = ALE::IGeneralSection ALE::malloc_allocator >, Label_ = ALE::LabelSifter ALE::malloc_allocator > >] > /home/user/Compile/petsc3p7-complex/include/petscmesh.hh: In function ? > PetscErrorCode MeshCreateGlobalScatter(const ALE::Obj ALE::malloc_allocator >&, const ALE::Obj ALE::malloc_allocator >&, _p_VecScatter**) [with Mesh = > ALE::IMesh ALE::malloc_allocator > > >, Section = > ALE::IGeneralSection >]? 
: > mesh.c:815: instantiated from here > /home/user/Compile/petsc3p7-complex/include/petscmesh.hh:93: error: no > matching function for call to ? VecCreateSeqWithArray(ompi_communicator_t*, > int, const double*, _p_Vec**)? > /home/user/Compile/petsc3p7-complex/include/petscvec.h:66: note: candidates > are: PetscErrorCode VecCreateSeqWithArray(ompi_communicator_t*, PetscInt, > const PetscScalar*, _p_Vec**) > /home/user/Compile/petsc3p7-complex/include/petscvec.h:67: note: > PetscErrorCode VecCreateSeqWithArray(PetscInt, PetscScalar*, _p_Vec**) > /home/user/Compile/petsc3p7-complex/include/petscmesh.hh: In function ? > PetscErrorCode updateOperator(_p_Mat*, const Sieve&, Visitor&, const int&, > PetscScalar*, InsertMode) [with Sieve = ALE::IFSieve ALE::malloc_allocator >, Visitor = updateOperator(_p_Mat*, const > ALE::Obj ALE::malloc_allocator > > >, > ALE::malloc_allocator ALE::malloc_allocator > > > > >&, const > ALE::Obj >, > ALE::malloc_allocator ALE::malloc_allocator > > >&, const ALE::Obj ALE::Point>, ALE::malloc_allocator > >&, > const int&, PetscScalar*, InsertMode)::visitor_type]? : > mesh.c:1121: instantiated from here > ..... > ..... > ------------------------------------------------------------------------ > > > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > Real Scalar > > ./config/configure.py > --prefix=/path-to/user/install/petsc-v3p7-defaultboost-sieve-real > --with-scalar-type=real --with-clanguage=cxx --with-boost=1 > --download-boost=/home/user/Compile/petsc3p7-real/externalpackages/boost.tar.gz > --with-sieve=1 > .... > Configuring PETSc to compile on your system > > ================================================================================= > TESTING: alternateConfigureLibrary from > PETSc.packages.petsc4py(config/PETSc/packages/petsc4py.py:69) > Compilers: > C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 > C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g > Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g > Linkers: > Static linker: /usr/bin/ar cr > PETSc: > ** > ** Before running "make" your PETSC_ARCH must be specified with: > ** ** setenv PETSC_ARCH linux-gnu-cxx-debug (csh/tcsh) > ** ** PETSC_ARCH=linux-gnu-cxx-debug; export PETSC_ARCH (sh/bash) > ** > PETSC_DIR: /home/user/Compile/petsc3p7-real > ** > ** Now build the libraries with "make all" > ** > Clanguage: Cxx > PETSc shared libraries: disabled > PETSc dynamic libraries: disabled > Scalar type:real > MPI: > Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib > X11: > Includes: [''] > Library: ['-lX11'] > BLAS/LAPACK: -llapack -lblas > Sieve: > Includes: -I/home/user/Compile/petsc3p7-real/include/sieve > Boost: > Includes: -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/ > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include > -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib > > > > > No Errors; warnings: > ------------------------------------------------------------------------ > ..... > ..... > > libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/sieve > /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh: In instantiation > of ? ALE::IFSieveDef::Sequence? : > /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:759: > instantiated from ? void ALE::Obj::destroy() [with X = > ALE::IFSieveDef::Sequence, A = > ALE::malloc_allocator >]? > /home/user/Compile/petsc3p7-real/include/sieve/ALE_mem.hh:705: > instantiated from ? 
> [template-instantiation backtrace continues through ISieve.hh:1156, ALE_mem.hh:759 and :705,
>  and Mesh.hh:1347 and :1572, with Point_ = int and ALE::malloc_allocator allocators]
> meshmgsnes.c:63: instantiated from here
> /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:954: warning:
>   'class ALE::IFSieveDef::Sequence<...>' has virtual functions but non-virtual destructor
> /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:991: instantiated from
>   'ALE::IFSieveDef::Sequence<...>::const_iterator ALE::IFSieveDef::Sequence<...>::begin() const [with PointType_ = int]'
> meshmgsnes.c:349: instantiated from here
> /home/user/Compile/petsc3p7-real/include/sieve/ISieve.hh:957: warning:
>   'class ALE::IFSieveDef::Sequence<...>::const_iterator' has virtual functions but non-virtual destructor
> libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/snes/utils/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/snes/f90-mod
> mpif90 -c -Wall -Wno-unused-variable -g
>   -I/home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc3p7-real/include
>   -I/home/user/Compile/petsc3p7-real/include/sieve -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/
>   -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib
>   -o petscsnesmod.o petscsnesmod.F
> /usr/bin/ar cr /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/lib/libpetscsnes.a petscsnesmod.o
> /bin/cp -f *.mod /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include
> libfast in: /home/user/Compile/petsc3p7-real/src/snes/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/ts
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/interface/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/euler
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/rk
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/explicit/rk/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit/beuler
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/implicit/cn
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/pseudo
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/pseudo/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/impls/python/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples/tests
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/examples/tutorials
> libfast in: /home/user/Compile/petsc3p7-real/src/ts/f90-mod
> mpif90 -c -Wall -Wno-unused-variable -g
>   -I/home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc3p7-real/include
>   -I/home/user/Compile/petsc3p7-real/include/sieve -I/home/user/Compile/petsc3p7-real/externalpackages/Boost/
>   -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib
>   -o petsctsmod.o petsctsmod.F
> /usr/bin/ar cr /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/lib/libpetscts.a petsctsmod.o
> /bin/cp -f *.mod /home/user/Compile/petsc3p7-real/linux-gnu-cxx-debug/include
> libfast in: /home/user/Compile/petsc3p7-real/src/dm
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/interface/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/basic
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/basic/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/impls/mapping/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples/tests
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/ao/examples/tutorials
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/src/f90-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples/tests
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/examples/tutorials
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/ftn-auto
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/ftn-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/da/utils/f90-custom
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh
> [while compiling the mesh sources, the same pair of warnings from ISieve.hh:954 and ISieve.hh:957
>  is emitted again, with instantiations triggered from mesh.c:1600 and 2641, meshpcice.c:387 and 388,
>  meshpflotran.c:235 and 903, meshexodus.c:183 and 364, petscmesh_viewers.hh:482 and section.c:1405]
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/sieve
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/impls
> libfast in: /home/user/Compile/petsc3p7-real/src/dm/mesh/impls/cartesian
> [and once more from cartesian.c:263 and cartesian.c:269]
>
> ------------------------------------------------------------------------
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From vyan2000 at gmail.com Tue Jul 28 20:15:48 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 28 Jul 2009 21:18:22 -0400
Subject: How to set the ksp true residual tolerance on command line
Message-ID:

Hi all,
Is there any way to set a true residual tolerance on command line.
-ksp_atol is for preconditioned residual norm only.

Thanks,

Yan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bsmith at mcs.anl.gov Tue Jul 28 20:28:25 2009
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 28 Jul 2009 20:28:25 -0500
Subject: How to set the ksp true residual tolerance on command line
In-Reply-To:
References:
Message-ID: <019038C4-3647-4580-9F0D-4E323D49C64C@mcs.anl.gov>

   This depends on the Krylov solver and whether you are using left or right
preconditioning.  With right preconditioning it always uses the true residual
norm, so you can do -ksp_type gmres -ksp_pc_right

   Some methods, like CG, we do not have implemented for right preconditioning;
in that case you can use -ksp_type cg -ksp_norm_type unpreconditioned

   For CG many people prefer the energy norm, which you can access with
-ksp_norm_type natural

   -ksp_view should always show which it is using.

   Barry

On Jul 28, 2009, at 8:15 PM, Ryan Yan wrote:

> Hi all,
> Is there any way to set a true residual tolerance on command line.
> -ksp_atol is for preconditioned residual norm only.
>
> Thanks,
>
> Yan

From vyan2000 at gmail.com Tue Jul 28 21:18:22 2009
From: vyan2000 at gmail.com (Ryan Yan)
Date: Tue, 28 Jul 2009 22:18:22 -0400
Subject: How to set the ksp true residual tolerance on command line
In-Reply-To: <019038C4-3647-4580-9F0D-4E323D49C64C@mcs.anl.gov>
References: <019038C4-3647-4580-9F0D-4E323D49C64C@mcs.anl.gov>
Message-ID:

Barry, thank you very much for the suggestion and the clarification.

Yan

PS: It should be -ksp_right_pc, instead of -ksp_pc_right:

-ksp_monitor_draw_true_residual: Monitor graphically true residual norm (KSPMonitorSet)
-ksp_monitor_range_draw: Monitor graphically preconditioned residual norm (KSPMonitorSet)
Pick at most one of -------------
-ksp_left_pc: Use left preconditioning (KSPSetPreconditionerSide)
-ksp_right_pc: Use right preconditioning (KSPSetPreconditionerSide)
-ksp_symmetric_pc: Use symmetric (factorized) preconditioning (KSPSetPreconditionerSide)
-ksp_compute_singularvalues: Compute singular values of preconditioned operator (KSPSetComputeSingularValues)
-ksp_compute_eigenvalues: Compute eigenvalues of preconditioned operator (KSPSetComputeSingularValues)

On Tue, Jul 28, 2009 at 9:28 PM, Barry Smith wrote:

>   This depends on the Krylov solver and whether you are using left or right
> preconditioning.  With right preconditioning it always uses the true
> residual norm, so you can do -ksp_type gmres -ksp_pc_right
>   Some methods, like CG, we do not have implemented for right preconditioning;
> in that case you can use -ksp_type cg -ksp_norm_type unpreconditioned
>   For CG many people prefer the energy norm, which you can access with
> -ksp_norm_type natural
>
>   -ksp_view should always show which it is using.
>
>   Barry
>
> On Jul 28, 2009, at 8:15 PM, Ryan Yan wrote:
>
>> Hi all,
>> Is there any way to set a true residual tolerance on command line.
>> -ksp_atol is for preconditioned residual norm only.
>>
>> Thanks,
>>
>> Yan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
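For reference, a minimal command-line sketch of the options discussed in this thread, assuming a
petsc-3.0.0-style build; the executable name ./myapp and the tolerance value are only illustrative:

    mpiexec -n 4 ./myapp -ksp_type gmres -ksp_right_pc \
        -ksp_rtol 1e-8 -ksp_monitor_true_residual -ksp_view

    mpiexec -n 4 ./myapp -ksp_type cg -ksp_norm_type unpreconditioned \
        -ksp_rtol 1e-8 -ksp_monitor_true_residual -ksp_view

With right preconditioning, or with -ksp_norm_type unpreconditioned, the convergence test itself is
applied to the true residual, so -ksp_rtol and -ksp_atol then act on that norm; -ksp_view reports
which norm and preconditioner side are actually in use.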
From tim.kroeger at cevis.uni-bremen.de Wed Jul 29 03:33:30 2009
From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger)
Date: Wed, 29 Jul 2009 10:33:30 +0200 (CEST)
Subject: Solver problem
In-Reply-To: <897733B6-228F-4397-9C49-81CC0BE296F9@mcs.anl.gov>
References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov> <4A6EA926.3090400@student.uibk.ac.at> <897733B6-228F-4397-9C49-81CC0BE296F9@mcs.anl.gov>
Message-ID:

Dear Barry,

On Tue, 28 Jul 2009, Barry Smith wrote:

>> undefined symbol: _gfortran_allocate_array
>
> This is likely a symbol in the gfortran compiler libraries.
>
> Are you linking your application code against all the libraries it needs
> to be linked against?

Thank you for your help.  It seems I have found the reason now.  It has to do
with the cluster I am working on: the operating system on the master (where I
compile my application) does not coincide with the system on the nodes (where
the application is run).  That is, e.g. /usr/lib/libgfortran.so differs between
these installations.  I will ask the cluster's admin for assistance.

Best Regards,

Tim

--
Dr. Tim Kroeger
tim.kroeger at mevis.fraunhofer.de            Phone +49-421-218-7710
tim.kroeger at cevis.uni-bremen.de            Fax   +49-421-218-4236

Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany

From tim.kroeger at cevis.uni-bremen.de Wed Jul 29 08:49:50 2009
From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger)
Date: Wed, 29 Jul 2009 15:49:50 +0200 (CEST)
Subject: Solver problem
In-Reply-To:
References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov>
Message-ID:

Dear all,

On Tue, 28 Jul 2009, Tim Kroeger wrote:

> Would you recommend to try MUMPS as well?  (I.e., will MUMPS have a chance
> to be faster than ILU?)

It seems as if I can't use MUMPS, since the cluster I am working on doesn't
meet some system requirements.  (PETSc otherwise works fine on the cluster.)
However, I understand that PETSc also interfaces a large number of other
sparse direct solvers.  Are there any recommendations about which one might
be a good choice if MUMPS cannot be used?

Best Regards,

Tim

--
Dr. Tim Kroeger
tim.kroeger at mevis.fraunhofer.de            Phone +49-421-218-7710
tim.kroeger at cevis.uni-bremen.de            Fax   +49-421-218-4236

Fraunhofer MEVIS, Institute for Medical Image Computing
Universitaetsallee 29, 28359 Bremen, Germany

From knepley at gmail.com Wed Jul 29 11:31:59 2009
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 29 Jul 2009 11:31:59 -0500
Subject: Solver problem
In-Reply-To:
References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov>
Message-ID:

On Wed, Jul 29, 2009 at 8:49 AM, Tim Kroeger <tim.kroeger at cevis.uni-bremen.de> wrote:

> Dear all,
>
> On Tue, 28 Jul 2009, Tim Kroeger wrote:
>
>> Would you recommend to try MUMPS as well?  (I.e., will MUMPS have a chance
>> to be faster than ILU?)
>
> It seems as if I can't use MUMPS, since the cluster I am working on doesn't
> meet some system requirements.  (PETSc otherwise works fine on the cluster.)
> However, I understand that PETSc also interfaces a large number of other
> sparse direct solvers.  Are there any recommendations about which one might
> be a good choice if MUMPS cannot be used?

You can try SuperLU_dist.

  Matt

> Best Regards,
>
> Tim
>
> --
> Dr. Tim Kroeger
> tim.kroeger at mevis.fraunhofer.de            Phone +49-421-218-7710
> tim.kroeger at cevis.uni-bremen.de            Fax   +49-421-218-4236
>
> Fraunhofer MEVIS, Institute for Medical Image Computing
> Universitaetsallee 29, 28359 Bremen, Germany

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
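A quick way to check for the kind of run-time library mismatch Tim describes; the executable name
./myapp and the node name node01 are hypothetical, and this assumes ldd and ssh to the compute
nodes are available:

    ldd ./myapp | grep -i gfortran                   # on the master where the code is built
    ssh node01 ldd $PWD/myapp | grep -i gfortran     # on a compute node

If the two machines resolve libgfortran.so to different files or versions, the undefined
_gfortran_* symbols come from that mismatch rather than from PETSc itself.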
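A sketch of how SuperLU_dist is typically selected at run time, assuming PETSc was configured with
--download-superlu_dist=1 and that this PETSc release selects external direct solvers through the
factor solver package option (the exact option name can differ between releases):

    mpiexec -n 4 ./myapp -ksp_type preonly -pc_type lu \
        -pc_factor_mat_solver_package superlu_dist -ksp_view

Here -ksp_type preonly makes the parallel LU factorization the entire solve instead of a
preconditioner for a Krylov method.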
URL: From rlmackie862 at gmail.com Wed Jul 29 11:36:07 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 29 Jul 2009 09:36:07 -0700 Subject: suggestions for debugging code In-Reply-To: References: <4A6F168F.5070208@gmail.com> Message-ID: <4A707A77.2050403@gmail.com> Matthew, Thanks - it took me the better part of the day yesterday to get the suppression file so that it cut out most of the MPI stuff, and then I was able to eventually zero in and find the offending bug. Randy Matthew Knepley wrote: > On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie > wrote: > > I have run into a very difficult debugging problem. I have recently > made some > modifications to my PETSc code, to add some new features. When I > compiled the > code in debug mode (we are using the Intel compilers and mvapich on > Infiniband), > the code runs fine with any number of processes. > > When the code is compiled in optimize mode, it runs fine on, say, up > to 32 processes, > but not 64, bombing out someplace strange, with a Segmentation > Violation. > > I've tried using Valgrind, but you can't use it with PETSc and my > code compiled in > Debug mode because the code finishes successfully, and the other > problem I have with > > > Sometimes valgrind will catch things even when code does not crash. > > > > Valgrind + mvapich is there are about a million messages spewed out, > making it > extremely difficult to see if there are really any issues in MY > code. I've thought > to have PETSc download and compile MPICH2, which I would hope would > produce less > output from Valgrind. > > > In order to filter these out, you use a "suppressions file" for > valgrind. The manual has a > good section on this and it should not be hard to wipre out most of > them. Satish designed > one for our unit tests. > > Matt > > > > Anyone have any suggestions on how to debug this tricky situation? > Any suggestions > would be greatly appreciated. > > Randy > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From yaakoub at tacc.utexas.edu Wed Jul 29 12:34:53 2009 From: yaakoub at tacc.utexas.edu (Yaakoub El Khamra) Date: Wed, 29 Jul 2009 12:34:53 -0500 Subject: suggestions for debugging code In-Reply-To: <4A707A77.2050403@gmail.com> References: <4A6F168F.5070208@gmail.com> <4A707A77.2050403@gmail.com> Message-ID: <47a831090907291034r68f51517mc080ff408436c2a6@mail.gmail.com> Just my 2c, but if you might want to check out DDT, it is a parallel debugger with built-in memory checking. If you have a teragrid account, you can probably use it on ranger or lonestar. It is licensed, but until the eclipse PTP project comes around with memory checking, it is an alternative. Regards Yaakoub El Khamra On Wed, Jul 29, 2009 at 11:36 AM, Randall Mackie wrote: > Matthew, > > Thanks - it took me the better part of the day yesterday to get the suppression > file so that it cut out most of the MPI stuff, and then I was able to eventually > zero in and find the offending bug. > > > Randy > > > Matthew Knepley wrote: >> On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie > > wrote: >> >> ? ? I have run into a very difficult debugging problem. I have recently >> ? ? made some >> ? ? modifications to my PETSc code, to add some new features. When I >> ? ? compiled the >> ? ? code in debug mode (we are using the Intel compilers and mvapich on >> ? ? Infiniband), >> ? ? the code runs fine with any number of processes. 
>> >> ? ? When the code is compiled in optimize mode, it runs fine on, say, up >> ? ? to 32 processes, >> ? ? but not 64, bombing out someplace strange, with a Segmentation >> ? ? Violation. >> >> ? ? I've tried using Valgrind, but you can't use it with PETSc and my >> ? ? code compiled in >> ? ? Debug mode because the code finishes successfully, and the other >> ? ? problem I have with >> >> >> Sometimes valgrind will catch things even when code does not crash. >> >> >> >> ? ? Valgrind + mvapich is there are about a million messages spewed out, >> ? ? making it >> ? ? extremely difficult to see if there are really any issues in MY >> ? ? code. I've thought >> ? ? to have PETSc download and compile MPICH2, which I would hope would >> ? ? produce less >> ? ? output from Valgrind. >> >> >> In order to filter these out, you use a "suppressions file" for >> valgrind. The manual has a >> good section on this and it should not be hard to wipre out most of >> them. Satish designed >> one for our unit tests. >> >> ? Matt >> >> >> >> ? ? Anyone have any suggestions on how to debug this tricky situation? >> ? ? Any suggestions >> ? ? would be greatly appreciated. >> >> ? ? Randy >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener > From rlmackie862 at gmail.com Wed Jul 29 14:09:32 2009 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 29 Jul 2009 12:09:32 -0700 Subject: suggestions for debugging code In-Reply-To: <47a831090907291034r68f51517mc080ff408436c2a6@mail.gmail.com> References: <4A6F168F.5070208@gmail.com> <4A707A77.2050403@gmail.com> <47a831090907291034r68f51517mc080ff408436c2a6@mail.gmail.com> Message-ID: <4A709E6C.5030500@gmail.com> That looks interesting. Has anyone tried Totalview or PGDBG? Randy Yaakoub El Khamra wrote: > Just my 2c, but if you might want to check out DDT, it is a parallel > debugger with built-in memory checking. If you have a teragrid > account, you can probably use it on ranger or lonestar. It is > licensed, but until the eclipse PTP project comes around with memory > checking, it is an alternative. > > Regards > Yaakoub El Khamra > > > > > On Wed, Jul 29, 2009 at 11:36 AM, Randall Mackie wrote: >> Matthew, >> >> Thanks - it took me the better part of the day yesterday to get the suppression >> file so that it cut out most of the MPI stuff, and then I was able to eventually >> zero in and find the offending bug. >> >> >> Randy >> >> >> Matthew Knepley wrote: >>> On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie >> > wrote: >>> >>> I have run into a very difficult debugging problem. I have recently >>> made some >>> modifications to my PETSc code, to add some new features. When I >>> compiled the >>> code in debug mode (we are using the Intel compilers and mvapich on >>> Infiniband), >>> the code runs fine with any number of processes. >>> >>> When the code is compiled in optimize mode, it runs fine on, say, up >>> to 32 processes, >>> but not 64, bombing out someplace strange, with a Segmentation >>> Violation. >>> >>> I've tried using Valgrind, but you can't use it with PETSc and my >>> code compiled in >>> Debug mode because the code finishes successfully, and the other >>> problem I have with >>> >>> >>> Sometimes valgrind will catch things even when code does not crash. 
>>> >>> >>> >>> Valgrind + mvapich is there are about a million messages spewed out, >>> making it >>> extremely difficult to see if there are really any issues in MY >>> code. I've thought >>> to have PETSc download and compile MPICH2, which I would hope would >>> produce less >>> output from Valgrind. >>> >>> >>> In order to filter these out, you use a "suppressions file" for >>> valgrind. The manual has a >>> good section on this and it should not be hard to wipre out most of >>> them. Satish designed >>> one for our unit tests. >>> >>> Matt >>> >>> >>> >>> Anyone have any suggestions on how to debug this tricky situation? >>> Any suggestions >>> would be greatly appreciated. >>> >>> Randy >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener From yaakoub at tacc.utexas.edu Wed Jul 29 14:19:00 2009 From: yaakoub at tacc.utexas.edu (Yaakoub El Khamra) Date: Wed, 29 Jul 2009 14:19:00 -0500 Subject: suggestions for debugging code In-Reply-To: <4A709E6C.5030500@gmail.com> References: <4A6F168F.5070208@gmail.com> <4A707A77.2050403@gmail.com> <47a831090907291034r68f51517mc080ff408436c2a6@mail.gmail.com> <4A709E6C.5030500@gmail.com> Message-ID: <47a831090907291219q65b5fb4dx43b28dbbad5a78bc@mail.gmail.com> I used totalview heavily, it has memory debugging but at the time it was only on a single core. That restriction might have been lifted, I am not sure. Regards Yaakoub El Khamra On Wed, Jul 29, 2009 at 2:09 PM, Randall Mackie wrote: > That looks interesting. Has anyone tried Totalview or PGDBG? > > Randy > > Yaakoub El Khamra wrote: >> Just my 2c, but if you might want to check out DDT, it is a parallel >> debugger with built-in memory checking. If you have a teragrid >> account, you can probably use it on ranger or lonestar. It is >> licensed, but until the eclipse PTP project comes around with memory >> checking, it is an alternative. >> >> Regards >> Yaakoub El Khamra >> >> >> >> >> On Wed, Jul 29, 2009 at 11:36 AM, Randall Mackie wrote: >>> Matthew, >>> >>> Thanks - it took me the better part of the day yesterday to get the suppression >>> file so that it cut out most of the MPI stuff, and then I was able to eventually >>> zero in and find the offending bug. >>> >>> >>> Randy >>> >>> >>> Matthew Knepley wrote: >>>> On Tue, Jul 28, 2009 at 10:17 AM, Randall Mackie >>> > wrote: >>>> >>>> ? ? I have run into a very difficult debugging problem. I have recently >>>> ? ? made some >>>> ? ? modifications to my PETSc code, to add some new features. When I >>>> ? ? compiled the >>>> ? ? code in debug mode (we are using the Intel compilers and mvapich on >>>> ? ? Infiniband), >>>> ? ? the code runs fine with any number of processes. >>>> >>>> ? ? When the code is compiled in optimize mode, it runs fine on, say, up >>>> ? ? to 32 processes, >>>> ? ? but not 64, bombing out someplace strange, with a Segmentation >>>> ? ? Violation. >>>> >>>> ? ? I've tried using Valgrind, but you can't use it with PETSc and my >>>> ? ? code compiled in >>>> ? ? Debug mode because the code finishes successfully, and the other >>>> ? ? problem I have with >>>> >>>> >>>> Sometimes valgrind will catch things even when code does not crash. >>>> >>>> >>>> >>>> ? ? Valgrind + mvapich is there are about a million messages spewed out, >>>> ? ? making it >>>> ? ? extremely difficult to see if there are really any issues in MY >>>> ? ? code. I've thought >>>> ? ? 
>>>>     to have PETSc download and compile MPICH2, which I would hope would
>>>>     produce less output from Valgrind.
>>>>
>>>> In order to filter these out, you use a "suppressions file" for valgrind.
>>>> The manual has a good section on this and it should not be hard to wipe
>>>> out most of them.  Satish designed one for our unit tests.
>>>>
>>>>   Matt
>>>>
>>>>     Anyone have any suggestions on how to debug this tricky situation?
>>>>     Any suggestions would be greatly appreciated.
>>>>
>>>>     Randy
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which
>>>> their experiments lead.
>>>> -- Norbert Wiener
>

From Harun.BAYRAKTAR at 3ds.com Wed Jul 29 15:54:35 2009
From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun)
Date: Wed, 29 Jul 2009 16:54:35 -0400
Subject: Smoother settings for AMG
Message-ID:

Hi,

I am trying to solve a system of equations and I am having difficulty picking
the right smoothers for AMG (using ML as pc_type) in PETSc for parallel
execution.  First, here is what happens in terms of CG (ksp_type) iteration
counts (both columns use block Jacobi):

cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4
------------------------------------------------------
  1  |        43        |      243
  4  |       699        |      379

x1 or x4 means 1 or 4 iterations of smoother application at each AMG level
(all details from ksp_view for the 4 cpu run are below).  The main observation
is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls apart in parallel.
SOR, on the other hand, experiences a 1.5X increase in iteration count, which
is totally expected from the quality of coarsening ML delivers in parallel.

I basically would like to find a way (if possible) to have the number of
iterations in parallel stay within 1-2X of the 1 cpu iteration count for the
AMG w/ ICC case.  Is there a way to achieve this?
Thanks, Harun %%%%%%%%%%%%%%%%%%%%%%%%% AMG w/ ICC(0) x1 ksp_view %%%%%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: 
nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines %%%%%%%%%%%%%%%%%%%%%% AMG w/ SOR x4 ksp_view %%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, 
absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines From knepley at gmail.com Wed Jul 29 16:00:08 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Jul 2009 16:00:08 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: On Wed, Jul 29, 2009 at 3:54 PM, 
BAYRAKTAR Harun wrote: > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG (ksp_type) > iteration counts (both columns use block jacobi): Are you sure you have an elliptic system? These iteration counts are extremely high. Matt > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls > apart in parallel. SOR on the other hand experiences a 1.5X increase in > iteration count which is totally expected from the quality of coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? > > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix 
Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > 
type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > %%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > 
linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 29 16:05:19 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 29 Jul 2009 16:05:19 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: <02E52450-1C28-4451-BAE8-E31563BCF1A3@mcs.anl.gov> Can you save the matrix and right hand side with the option - ksp_view_binary and send the file "output" to petsc-maint at mcs.anl.gov (not this email). Barry If it is too big to email you can ftp it to info.mcs.anl.gov (anonymous login) and put it in the directory incoming then send us email petsc-maint at mcs.anl.gov with the filename. On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG > (ksp_type) > iteration counts (both columns use block jacobi): > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The > main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but > falls > apart in parallel. SOR on the other hand experiences a 1.5X increase > in > iteration count which is totally expected from the quality of > coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the > number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? 
> > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > 
ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > 
%%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega 
= 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > From Harun.BAYRAKTAR at 3ds.com Wed Jul 29 16:17:12 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Wed, 29 Jul 2009 17:17:12 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: Matt, It is from the pressure poisson equation for incompressible navier-stokes so it is elliptic. Also on 1 cpu, I am able to solve it with reason able iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the parallel runs that really concern me. Thanks, Harun From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Wednesday, July 29, 2009 5:00 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun wrote: Hi, I am trying to solve a system of equations and I am having difficulty picking the right smoothers for AMG (using ML as pc_type) in PETSc for parallel execution. First here is what happens in terms of CG (ksp_type) iteration counts (both columns use block jacobi): Are you sure you have an elliptic system? These iteration counts are extremely high. Matt cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 ------------------------------------------------------ 1 | 43 | 243 4 | 699 | 379 x1 or x4 means 1 or 4 iterations of smoother application at each AMG level (all details from ksp view for the 4 cpu run are below). The main observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls apart in parallel. SOR on the other hand experiences a 1.5X increase in iteration count which is totally expected from the quality of coarsening ML delivers in parallel. I basically would like to find a way (if possible) to have the number of iterations in parallel stay with 1-2X of 1 cpu iteration count for the AMG w/ ICC case. Is there a way to achieve this? 
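A sketch of how the two smoother setups compared above might be selected at run time, built from the option prefixes that appear in the -ksp_view output (mg_levels_<level>_ for the level smoother, mg_levels_<level>_sub_ for the block Jacobi sub-solves); the exact prefixes and defaults depend on the PETSc/ML version, so treat this as an illustration rather than the exact option set used in these runs:

    # AMG w/ ICC(0) x1: one damped Richardson sweep per level, block Jacobi + ICC(0) sub-solves (repeat for each level, here 1 and 2)
    -ksp_type cg -pc_type ml -mg_levels_1_ksp_type richardson -mg_levels_1_ksp_richardson_scale 0.9 -mg_levels_1_ksp_max_it 1 -mg_levels_1_pc_type bjacobi -mg_levels_1_sub_pc_type icc

    # AMG w/ SOR x4: four Richardson sweeps per level with local symmetric SOR
    -ksp_type cg -pc_type ml -mg_levels_1_ksp_type richardson -mg_levels_1_ksp_max_it 4 -mg_levels_1_pc_type sor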
Thanks, Harun %%%%%%%%%%%%%%%%%%%%%%%%% AMG w/ ICC(0) x1 ksp_view %%%%%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: 
nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines %%%%%%%%%%%%%%%%%%%%%% AMG w/ SOR x4 ksp_view %%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, 
absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jul 29 16:22:39 2009 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Jul 2009 16:22:39 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun wrote: > Matt, > > It is from the pressure poisson equation for incompressible navier-stokes > so it is elliptic. Also on 1 cpu, I am able to solve it with reason able > iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the > parallel runs that really concern me. > Actually, it was the 43 that really concerned me. In my experience, an MG that is doing what it is supposed to on Poisson takes < 10 iterations. However, if your grid is pretty distorted, maybe it can get this bad. Matt > > Thanks, > > Harun > > > > > > *From:* petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] *On Behalf Of *Matthew Knepley > *Sent:* Wednesday, July 29, 2009 5:00 PM > *To:* PETSc users list > *Subject:* Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun > wrote: > > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG (ksp_type) > iteration counts (both columns use block jacobi): > > > Are you sure you have an elliptic system? These iteration counts are > extremely > high. > > Matt > > > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls > apart in parallel. SOR on the other hand experiences a 1.5X increase in > iteration count which is totally expected from the quality of coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? 
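When comparing the 1 and 4 CPU iteration counts it also helps to have each run report its convergence explicitly; a minimal example of standard KSP monitoring options that print the true residual norm at each iteration and the reason the solve stopped, matching the 1e-5 relative tolerance quoted above:

    -ksp_rtol 1e-5 -ksp_monitor_true_residual -ksp_converged_reason -ksp_view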
> > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > 
ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > 
%%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega 
= 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Harun.BAYRAKTAR at 3ds.com Wed Jul 29 16:48:50 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Wed, 29 Jul 2009 17:48:50 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: Matt, Sorry, I thought you meant the ones with iteration counts in the hundreds. I had a typo it was 46 actually (not 43). That's very interesting that you think it should converge in less than 10 iterations, all I can say is I wish I could get there. This mesh is a graded mesh and does have some poor aspect ratio elements. For a uniform mesh I see 20 iterations or so. Thanks, Harun From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Wednesday, July 29, 2009 5:23 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun wrote: Matt, It is from the pressure poisson equation for incompressible navier-stokes so it is elliptic. Also on 1 cpu, I am able to solve it with reason able iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the parallel runs that really concern me. Actually, it was the 43 that really concerned me. In my experience, an MG that is doing what it is supposed to on Poisson takes < 10 iterations. However, if your grid is pretty distorted, maybe it can get this bad. Matt Thanks, Harun From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Wednesday, July 29, 2009 5:00 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun wrote: Hi, I am trying to solve a system of equations and I am having difficulty picking the right smoothers for AMG (using ML as pc_type) in PETSc for parallel execution. First here is what happens in terms of CG (ksp_type) iteration counts (both columns use block jacobi): Are you sure you have an elliptic system? These iteration counts are extremely high. Matt cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 ------------------------------------------------------ 1 | 43 | 243 4 | 699 | 379 x1 or x4 means 1 or 4 iterations of smoother application at each AMG level (all details from ksp view for the 4 cpu run are below). The main observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls apart in parallel. SOR on the other hand experiences a 1.5X increase in iteration count which is totally expected from the quality of coarsening ML delivers in parallel. I basically would like to find a way (if possible) to have the number of iterations in parallel stay with 1-2X of 1 cpu iteration count for the AMG w/ ICC case. Is there a way to achieve this? 
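A sketch of the comparison this points toward, assuming a hypothetical application binary ./solver and a uniform-mesh test case as a placeholder: rerun the same uniform-mesh problem on 1 and 4 processes and see whether the roughly 20-iteration count survives the parallel coarsening.

    mpiexec -n 1 ./solver <uniform mesh case> -ksp_type cg -pc_type ml -ksp_converged_reason
    mpiexec -n 4 ./solver <uniform mesh case> -ksp_type cg -pc_type ml -ksp_converged_reason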
Thanks, Harun %%%%%%%%%%%%%%%%%%%%%%%%% AMG w/ ICC(0) x1 ksp_view %%%%%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_1_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.514899 Factored matrix follows Matrix Object: type=seqsbaij, rows=2813, cols=2813 total: 
nonzeros=48609, allocated nonzeros=48609 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2813, cols=2813 total: nonzeros=94405, allocated nonzeros=94405 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=0.9 maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: bjacobi block Jacobi: number of blocks = 4 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(mg_levels_2_sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_sub_) type: icc ICC: 0 levels of fill ICC: factor fill ratio allocated 1 ICC: using Manteuffel shift ICC: factor fill ratio needed 0.519045 Factored matrix follows Matrix Object: type=seqsbaij, rows=101164, cols=101164 total: nonzeros=1378558, allocated nonzeros=1378558 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=101164, cols=101164 total: nonzeros=2655952, allocated nonzeros=5159364 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines %%%%%%%%%%%%%%%%%%%%%% AMG w/ SOR x4 ksp_view %%%%%%%%%%%%%%%%%%%%%% KSP Object: type: cg maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ml MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1 Coarse gride solver -- level 0 ------------------------------- KSP Object:(mg_coarse_) type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, 
absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_) type: redundant Redundant preconditioner: First (color=0) of 4 PCs follows KSP Object:(mg_coarse_redundant_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_coarse_redundant_) type: lu LU: out-of-place factorization matrix ordering: nd LU: tolerance for zero pivot 1e-12 LU: factor fill ratio needed 2.17227 Factored matrix follows Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=21651, allocated nonzeros=21651 using I-node routines: found 186 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=14150 not using I-node routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=283, cols=283 total: nonzeros=9967, allocated nonzeros=9967 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Up solver (post-smoother) on level 1 ------------------------------- KSP Object:(mg_levels_1_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_1_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=10654, cols=10654 total: nonzeros=376634, allocated nonzeros=376634 not using I-node (on process 0) routines Down solver (pre-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines Up solver (post-smoother) on level 2 ------------------------------- KSP Object:(mg_levels_2_) type: richardson Richardson: damping factor=1 maximum iterations=4 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object:(mg_levels_2_) type: sor SOR: type = local_symmetric, iterations = 1, omega = 1 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=411866, cols=411866 total: nonzeros=10941434, allocated nonzeros=42010332 not using I-node (on process 0) routines -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Wed Jul 29 17:40:26 2009 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Wed, 29 Jul 2009 19:40:26 -0300 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: On Wed, Jul 29, 2009 at 6:48 PM, BAYRAKTAR Harun wrote: > > For a uniform mesh I see 20 iterations or so. > That's around the iteration count I would expect for ML on uniform meshes... Does this iteration count stay at about 20 when you run on 4 CPUs while still using a uniform mesh? > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Wednesday, July 29, 2009 5:23 PM > > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun > wrote: > > Matt, > > It is from the pressure poisson equation for incompressible navier-stokes so > it is elliptic. Also on 1 cpu, I am able to solve it with reasonable > iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the > parallel runs that really concern me. > > Actually, it was the 43 that really concerned me. In my experience, an MG > that is doing what it is supposed > to on Poisson takes < 10 iterations. However, if your grid is pretty > distorted, maybe it can get this bad. > > Matt > > > > Thanks, > > Harun > > > > > > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Wednesday, July 29, 2009 5:00 PM > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun > wrote: > > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG (ksp_type) > iteration counts (both columns use block jacobi): > > Are you sure you have an elliptic system? These iteration counts are > extremely > high. > > Matt > > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls > apart in parallel. SOR on the other hand experiences a 1.5X increase in > iteration count which is totally expected from the quality of coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the number of > iterations in parallel stay within 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? > > Thanks, > Harun
total: nonzeros=376634, allocated nonzeros=376634
>          not using I-node (on process 0) routines
>  Up solver (post-smoother) on level 1 -------------------------------
>    KSP Object:(mg_levels_1_)
>      type: richardson
>        Richardson: damping factor=1
>      maximum iterations=4
>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>      left preconditioning
>    PC Object:(mg_levels_1_)
>      type: sor
>        SOR: type = local_symmetric, iterations = 1, omega = 1
>      linear system matrix = precond matrix:
>      Matrix Object:
>        type=mpiaij, rows=10654, cols=10654
>        total: nonzeros=376634, allocated nonzeros=376634
>          not using I-node (on process 0) routines
>  Down solver (pre-smoother) on level 2 -------------------------------
>    KSP Object:(mg_levels_2_)
>      type: richardson
>        Richardson: damping factor=1
>      maximum iterations=4, initial guess is zero
>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>      left preconditioning
>    PC Object:(mg_levels_2_)
>      type: sor
>        SOR: type = local_symmetric, iterations = 1, omega = 1
>      linear system matrix = precond matrix:
>      Matrix Object:
>        type=mpiaij, rows=411866, cols=411866
>        total: nonzeros=10941434, allocated nonzeros=42010332
>          not using I-node (on process 0) routines
>  Up solver (post-smoother) on level 2 -------------------------------
>    KSP Object:(mg_levels_2_)
>      type: richardson
>        Richardson: damping factor=1
>      maximum iterations=4
>      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>      left preconditioning
>    PC Object:(mg_levels_2_)
>      type: sor
>        SOR: type = local_symmetric, iterations = 1, omega = 1
>      linear system matrix = precond matrix:
>      Matrix Object:
>        type=mpiaij, rows=411866, cols=411866
>        total: nonzeros=10941434, allocated nonzeros=42010332
>          not using I-node (on process 0) routines
>  linear system matrix = precond matrix:
>  Matrix Object:
>    type=mpiaij, rows=411866, cols=411866
>    total: nonzeros=10941434, allocated nonzeros=42010332
>      not using I-node (on process 0) routines
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener



--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From tim.kroeger at cevis.uni-bremen.de  Thu Jul 30 04:52:28 2009
From: tim.kroeger at cevis.uni-bremen.de (Tim Kroeger)
Date: Thu, 30 Jul 2009 11:52:28 +0200 (CEST)
Subject: Solver problem
In-Reply-To: 
References: <6902D9FB-FE90-4C10-A4CC-187834E02988@mcs.anl.gov>
Message-ID: 

Dear Matt,

On Wed, 29 Jul 2009, Matthew Knepley wrote:

> On Wed, Jul 29, 2009 at 8:49 AM, Tim Kroeger wrote:
>
>> It seems as if I can't use MUMPS since the cluster I am working on doesn't
>> meet some system requirements. (PETSc otherwise works fine on the cluster.)
>> However, I understand that PETSc also interfaces a large number of other >> sparse direct solvers. Are there any recommendations about which one might >> be a good choice if MUMPS cannot be used? > > You can try SuperLU_dist. Thank you very much. SuperLU_Dist works great! Best Regards, Tim -- Dr. Tim Kroeger tim.kroeger at mevis.fraunhofer.de Phone +49-421-218-7710 tim.kroeger at cevis.uni-bremen.de Fax +49-421-218-4236 Fraunhofer MEVIS, Institute for Medical Image Computing Universitaetsallee 29, 28359 Bremen, Germany From Harun.BAYRAKTAR at 3ds.com Thu Jul 30 06:28:35 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Thu, 30 Jul 2009 07:28:35 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: I always see a jump going from sequential to parallel runs with ML for all types of problems I have tried (structural mechanics, incompressible flow). But once parallel the iteration counts stay roughly constant given I use a true parallel smoother like Chebychev. My experience is that the iteration count jump is somewhere between 1.5-2.0X. -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Lisandro Dalcin Sent: Wednesday, July 29, 2009 6:40 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Wed, Jul 29, 2009 at 6:48 PM, BAYRAKTAR Harun wrote: > > For a uniform mesh I see 20 iterations or so. > That's around the iteration count I would expect for for ML on uniform meshes... Do this iteration count stays at about 20 when you run in 4 CPU's while still using an uniform mesh? > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Wednesday, July 29, 2009 5:23 PM > > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 4:17 PM, BAYRAKTAR Harun > wrote: > > Matt, > > It is from the pressure poisson equation for incompressible navier-stokes so > it is elliptic. Also on 1 cpu, I am able to solve it with reason able > iteration count (i.e., 43 to 1.e-5 true res norm rel tolerance). It is the > parallel runs that really concern me. > > Actually, it was the 43 that really concerned me. In my experience, an MG > that is doing what it is supposed > to on Poisson takes < 10 iterations. However, if your grid is pretty > distorted, maybe it can get this bad. > > ? Matt > > > > Thanks, > > Harun > > > > > > From: petsc-users-bounces at mcs.anl.gov > [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Matthew Knepley > Sent: Wednesday, July 29, 2009 5:00 PM > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > > On Wed, Jul 29, 2009 at 3:54 PM, BAYRAKTAR Harun > wrote: > > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG (ksp_type) > iteration counts (both columns use block jacobi): > > Are you sure you have an elliptic system? These iteration counts are > extremely > high. > > ? Matt > > > cpus ? ?| ? ? ? AMG w/ ICC(0) x1 ? ? ? ?| ? ? ? AMG w/ SOR x4 > ------------------------------------------------------ > 1 ? ? ? | ? ? ? ? ? ? ? 43 ? ? ? ? ? ? ?| ? ? ? ? ? ? ? 243 > 4 ? ? ? | ? ? ? ? ? ? ? 699 ? ? ? ? ? ? | ? ? ? ? ? ? ? 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). 
The main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but falls > apart in parallel. SOR on the other hand experiences a 1.5X increase in > iteration count which is totally expected from the quality of coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? > > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > ?type: cg > ?maximum iterations=10000 > ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ?left preconditioning > PC Object: > ?type: ml > ? ?MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > ?Coarse gride solver -- level 0 ------------------------------- > ? ?KSP Object:(mg_coarse_) > ? ? ?type: preonly > ? ? ?maximum iterations=1, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_coarse_) > ? ? ?type: redundant > ? ? ? ?Redundant preconditioner: First (color=0) of 4 PCs follows > ? ? ?KSP Object:(mg_coarse_redundant_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_coarse_redundant_) > ? ? ? ?type: lu > ? ? ? ? ?LU: out-of-place factorization > ? ? ? ? ? ?matrix ordering: nd > ? ? ? ? ?LU: tolerance for zero pivot 1e-12 > ? ? ? ? ?LU: factor fill ratio needed 2.17227 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqaij, rows=283, cols=283 > ? ? ? ? ? ? ? ?total: nonzeros=21651, allocated nonzeros=21651 > ? ? ? ? ? ? ? ? ?using I-node routines: found 186 nodes, limit used is > 5 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=283, cols=283 > ? ? ? ? ?total: nonzeros=9967, allocated nonzeros=14150 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=283, cols=283 > ? ? ? ?total: nonzeros=9967, allocated nonzeros=9967 > ? ? ? ? ?not using I-node (on process 0) routines > ?Down solver (pre-smoother) on level 1 ------------------------------- > ? ?KSP Object:(mg_levels_1_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=0.9 > ? ? ?maximum iterations=1, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_1_) > ? ? ?type: bjacobi > ? ? ? ?block Jacobi: number of blocks = 4 > ? ? ? ?Local solve is same for all blocks, in the following KSP and PC > objects: > ? ? ?KSP Object:(mg_levels_1_sub_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_levels_1_sub_) > ? ? ? ?type: icc > ? ? ? ? ?ICC: 0 levels of fill > ? ? ? ? ?ICC: factor fill ratio allocated 1 > ? ? ? ? ?ICC: using Manteuffel shift > ? ? ? ? ?ICC: factor fill ratio needed 0.514899 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqsbaij, rows=2813, cols=2813 > ? ? ? ? ? ? ? ?total: nonzeros=48609, allocated nonzeros=48609 > ? ? ? ? ? ? ? ? ? ?block size is 1 > ? ? ? ?linear system matrix = precond matrix: > ? ? 
? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=2813, cols=2813 > ? ? ? ? ?total: nonzeros=94405, allocated nonzeros=94405 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=10654, cols=10654 > ? ? ? ?total: nonzeros=376634, allocated nonzeros=376634 > ? ? ? ? ?not using I-node (on process 0) routines > ?Up solver (post-smoother) on level 1 ------------------------------- > ? ?KSP Object:(mg_levels_1_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=0.9 > ? ? ?maximum iterations=1 > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_1_) > ? ? ?type: bjacobi > ? ? ? ?block Jacobi: number of blocks = 4 > ? ? ? ?Local solve is same for all blocks, in the following KSP and PC > objects: > ? ? ?KSP Object:(mg_levels_1_sub_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_levels_1_sub_) > ? ? ? ?type: icc > ? ? ? ? ?ICC: 0 levels of fill > ? ? ? ? ?ICC: factor fill ratio allocated 1 > ? ? ? ? ?ICC: using Manteuffel shift > ? ? ? ? ?ICC: factor fill ratio needed 0.514899 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqsbaij, rows=2813, cols=2813 > ? ? ? ? ? ? ? ?total: nonzeros=48609, allocated nonzeros=48609 > ? ? ? ? ? ? ? ? ? ?block size is 1 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=2813, cols=2813 > ? ? ? ? ?total: nonzeros=94405, allocated nonzeros=94405 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=10654, cols=10654 > ? ? ? ?total: nonzeros=376634, allocated nonzeros=376634 > ? ? ? ? ?not using I-node (on process 0) routines > ?Down solver (pre-smoother) on level 2 ------------------------------- > ? ?KSP Object:(mg_levels_2_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=0.9 > ? ? ?maximum iterations=1, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_2_) > ? ? ?type: bjacobi > ? ? ? ?block Jacobi: number of blocks = 4 > ? ? ? ?Local solve is same for all blocks, in the following KSP and PC > objects: > ? ? ?KSP Object:(mg_levels_2_sub_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_levels_2_sub_) > ? ? ? ?type: icc > ? ? ? ? ?ICC: 0 levels of fill > ? ? ? ? ?ICC: factor fill ratio allocated 1 > ? ? ? ? ?ICC: using Manteuffel shift > ? ? ? ? ?ICC: factor fill ratio needed 0.519045 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqsbaij, rows=101164, cols=101164 > ? ? ? ? ? ? ? ?total: nonzeros=1378558, allocated nonzeros=1378558 > ? ? ? ? ? ? ? ? ? ?block size is 1 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=101164, cols=101164 > ? ? ? ? ?total: nonzeros=2655952, allocated nonzeros=5159364 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=411866, cols=411866 > ? ? ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ? ? 
?not using I-node (on process 0) routines > ?Up solver (post-smoother) on level 2 ------------------------------- > ? ?KSP Object:(mg_levels_2_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=0.9 > ? ? ?maximum iterations=1 > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_2_) > ? ? ?type: bjacobi > ? ? ? ?block Jacobi: number of blocks = 4 > ? ? ? ?Local solve is same for all blocks, in the following KSP and PC > objects: > ? ? ?KSP Object:(mg_levels_2_sub_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_levels_2_sub_) > ? ? ? ?type: icc > ? ? ? ? ?ICC: 0 levels of fill > ? ? ? ? ?ICC: factor fill ratio allocated 1 > ? ? ? ? ?ICC: using Manteuffel shift > ? ? ? ? ?ICC: factor fill ratio needed 0.519045 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqsbaij, rows=101164, cols=101164 > ? ? ? ? ? ? ? ?total: nonzeros=1378558, allocated nonzeros=1378558 > ? ? ? ? ? ? ? ? ? ?block size is 1 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=101164, cols=101164 > ? ? ? ? ?total: nonzeros=2655952, allocated nonzeros=5159364 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=411866, cols=411866 > ? ? ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ? ? ?not using I-node (on process 0) routines > ?linear system matrix = precond matrix: > ?Matrix Object: > ? ?type=mpiaij, rows=411866, cols=411866 > ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ?not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > %%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > ?type: cg > ?maximum iterations=10000 > ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ?left preconditioning > PC Object: > ?type: ml > ? ?MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > ?Coarse gride solver -- level 0 ------------------------------- > ? ?KSP Object:(mg_coarse_) > ? ? ?type: preonly > ? ? ?maximum iterations=1, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_coarse_) > ? ? ?type: redundant > ? ? ? ?Redundant preconditioner: First (color=0) of 4 PCs follows > ? ? ?KSP Object:(mg_coarse_redundant_) > ? ? ? ?type: preonly > ? ? ? ?maximum iterations=10000, initial guess is zero > ? ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ? ?left preconditioning > ? ? ?PC Object:(mg_coarse_redundant_) > ? ? ? ?type: lu > ? ? ? ? ?LU: out-of-place factorization > ? ? ? ? ? ?matrix ordering: nd > ? ? ? ? ?LU: tolerance for zero pivot 1e-12 > ? ? ? ? ?LU: factor fill ratio needed 2.17227 > ? ? ? ? ? ? ? Factored matrix follows > ? ? ? ? ? ? ?Matrix Object: > ? ? ? ? ? ? ? ?type=seqaij, rows=283, cols=283 > ? ? ? ? ? ? ? ?total: nonzeros=21651, allocated nonzeros=21651 > ? ? ? ? ? ? ? ? ?using I-node routines: found 186 nodes, limit used is > 5 > ? ? ? ?linear system matrix = precond matrix: > ? ? ? ?Matrix Object: > ? ? ? ? ?type=seqaij, rows=283, cols=283 > ? ? ? ? ?total: nonzeros=9967, allocated nonzeros=14150 > ? ? ? ? ? ?not using I-node routines > ? ? ?linear system matrix = precond matrix: > ? ? 
?Matrix Object: > ? ? ? ?type=mpiaij, rows=283, cols=283 > ? ? ? ?total: nonzeros=9967, allocated nonzeros=9967 > ? ? ? ? ?not using I-node (on process 0) routines > ?Down solver (pre-smoother) on level 1 ------------------------------- > ? ?KSP Object:(mg_levels_1_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=1 > ? ? ?maximum iterations=4, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_1_) > ? ? ?type: sor > ? ? ? ?SOR: type = local_symmetric, iterations = 1, omega = 1 > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=10654, cols=10654 > ? ? ? ?total: nonzeros=376634, allocated nonzeros=376634 > ? ? ? ? ?not using I-node (on process 0) routines > ?Up solver (post-smoother) on level 1 ------------------------------- > ? ?KSP Object:(mg_levels_1_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=1 > ? ? ?maximum iterations=4 > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_1_) > ? ? ?type: sor > ? ? ? ?SOR: type = local_symmetric, iterations = 1, omega = 1 > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=10654, cols=10654 > ? ? ? ?total: nonzeros=376634, allocated nonzeros=376634 > ? ? ? ? ?not using I-node (on process 0) routines > ?Down solver (pre-smoother) on level 2 ------------------------------- > ? ?KSP Object:(mg_levels_2_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=1 > ? ? ?maximum iterations=4, initial guess is zero > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_2_) > ? ? ?type: sor > ? ? ? ?SOR: type = local_symmetric, iterations = 1, omega = 1 > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=411866, cols=411866 > ? ? ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ? ? ?not using I-node (on process 0) routines > ?Up solver (post-smoother) on level 2 ------------------------------- > ? ?KSP Object:(mg_levels_2_) > ? ? ?type: richardson > ? ? ? ?Richardson: damping factor=1 > ? ? ?maximum iterations=4 > ? ? ?tolerances: ?relative=1e-05, absolute=1e-50, divergence=10000 > ? ? ?left preconditioning > ? ?PC Object:(mg_levels_2_) > ? ? ?type: sor > ? ? ? ?SOR: type = local_symmetric, iterations = 1, omega = 1 > ? ? ?linear system matrix = precond matrix: > ? ? ?Matrix Object: > ? ? ? ?type=mpiaij, rows=411866, cols=411866 > ? ? ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ? ? ?not using I-node (on process 0) routines > ?linear system matrix = precond matrix: > ?Matrix Object: > ? ?type=mpiaij, rows=411866, cols=411866 > ? ?total: nonzeros=10941434, allocated nonzeros=42010332 > ? ? ?not using I-node (on process 0) routines > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. 
> -- Norbert Wiener

--
Lisandro Dalcín
---------------
Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC)
Instituto de Desarrollo Tecnológico para la Industria Química (INTEC)
Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET)
PTLC - Güemes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594

From Harun.BAYRAKTAR at 3ds.com  Thu Jul 30 06:30:30 2009
From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun)
Date: Thu, 30 Jul 2009 07:30:30 -0400
Subject: Smoother settings for AMG
In-Reply-To: <02E52450-1C28-4451-BAE8-E31563BCF1A3@mcs.anl.gov>
References: <02E52450-1C28-4451-BAE8-E31563BCF1A3@mcs.anl.gov>
Message-ID: 

Barry,

I sent the matrix and rhs file name yesterday to the petsc-maint address.
Did you get it OK?

Thanks a lot for your help,
Harun

-----Original Message-----
From: petsc-users-bounces at mcs.anl.gov
[mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Wednesday, July 29, 2009 5:05 PM
To: PETSc users list
Subject: Re: Smoother settings for AMG

   Can you save the matrix and right hand side with the option
-ksp_view_binary and send the file "output" to petsc-maint at mcs.anl.gov
(not this email).

   Barry

   If it is too big to email you can ftp it to info.mcs.anl.gov
(anonymous login) and put it in the directory incoming then send us
email petsc-maint at mcs.anl.gov with the filename.

On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote:

> Hi,
>
> I am trying to solve a system of equations and I am having difficulty
> picking the right smoothers for AMG (using ML as pc_type) in PETSc for
> parallel execution. First here is what happens in terms of CG
> (ksp_type)
> iteration counts (both columns use block jacobi):
>
> cpus    |   AMG w/ ICC(0) x1   |   AMG w/ SOR x4
> ------------------------------------------------------
> 1       |          43          |          243
> 4       |          699         |          379
>
> x1 or x4 means 1 or 4 iterations of smoother application at each AMG
> level (all details from ksp view for the 4 cpu run are below). The
> main
> observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but
> falls
> apart in parallel. SOR on the other hand experiences a 1.5X increase
> in
> iteration count which is totally expected from the quality of
> coarsening
> ML delivers in parallel.
>
> I basically would like to find a way (if possible) to have the
> number of
> iterations in parallel stay with 1-2X of 1 cpu iteration count for the
> AMG w/ ICC case. Is there a way to achieve this?
> > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > 
ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > 
%%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega 
= 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > From darach at tchpc.tcd.ie Thu Jul 30 10:08:11 2009 From: darach at tchpc.tcd.ie (darach at tchpc.tcd.ie) Date: Thu, 30 Jul 2009 16:08:11 +0100 Subject: Compiling petsc-dev with c++/boost/sieve Message-ID: <20090730150811.GG23977@tchpc.tcd.ie> Hi, I'm trying to run examples from the $PETSC_DIR/src/dm/mesh/examples/tutorials directory. I'm interested in the mixedpoisson example, but I've tried compiling the ex[1-3] examples, with results that I give below. I've also compiled some files in the $PETSC_DIR/src/dm/mesh/examples/tests directory with mostly failures as well. Details of the petsc compilation are at the bottom of the email These compilations take place with PETSC_DIR=, and with PETSC_ARCH set, but I've had similar problems after a 'make install' Examples elsewhere in the petsc tree appear to compile and run correctly I don't want to waste anyones time wading through reams of output, so I'm really asking whether this output indicates obviously that I have failed to configure/compile petsc correctly with c++/boost/sieve, and therefore have no hope of successfully compiling the examples? Darach petsc-dev: HG revision: f9c1b044f127006244143f415f257bfbc93c7a6e HG Date: Tue Jul 28 16:59:48 2009 -0500 %> gcc --version gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44) ..... %> make ex1 mpicxx -o ex1.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g -I/home/user/Compile/petsc-dev/src/dm/mesh/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -I/home/user/Compile/petsc-dev/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc-dev/include -I/home/user/Compile/petsc-dev/include/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -D__SDIR__="src/dm/mesh/examples/tutorials/" ex1.c ex1.c: In function ???PetscErrorCode CreatePartition(_p_Mesh*, _p_SectionInt**)???: ex1.c:170: error: invalid initialization of reference of type ???ALE::Obj > > >, ALE::malloc_allocator > > > > >&??? from expression of type ???ALE::Obj >??? /home/user/Compile/petsc-dev/include/petscmesh.h:108: error: in passing argument 2 of ???PetscErrorCode MeshGetMesh(_p_Mesh*, ALE::Obj > > >, ALE::malloc_allocator > > > > >&)??? ex1.c:171: error: invalid conversion from ???int??? to ???const char*??? ex1.c:171: error: invalid conversion from ???_p_SectionInt**??? to ???PetscInt??? /home/user/Compile/petsc-dev/include/petscmesh.h:255: error: too few arguments to function ???PetscErrorCode MeshGetCellSectionInt(_p_Mesh*, const char*, PetscInt, _p_SectionInt**)??? ex1.c:171: error: at this point in file ex1.c:173: error: invalid initialization of reference of type ???ALE::Obj >, ALE::malloc_allocator > > >&??? from expression of type ???ALE::Obj, ALE::UniformSection > >, ALE::malloc_allocator, ALE::UniformSection > > > >??? .... 
%> make ex2 mpicxx -o ex2.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g -I/home/user/Compile/petsc-dev/src/dm/mesh/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -I/home/user/Compile/petsc-dev/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc-dev/include -I/home/user/Compile/petsc-dev/include/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -D__SDIR__="src/dm/mesh/examples/tutorials/" ex2.c ex2.c:22:38: error: ../src/dm/mesh/meshpcice.h: No such file or directory ex2.c: In function ???PetscErrorCode CreateSquareBoundary(const ALE::Obj >&)???: ex2.c:148: error: ???class ALE::Mesh::topology_type??? has not been declared ex2.c:148: error: expected initializer before ???patch??? ex2.c:153: error: ???topology_type??? is not a member of ???ALE::Mesh??? ex2.c:153: error: ???topology_type??? is not a member of ???ALE::Mesh??? ex2.c:153: error: template argument 1 is invalid ex2.c:153: error: template argument 2 is invalid ex2.c:153: error: invalid type in declaration before ???=??? token ex2.c:153: error: expected type-specifier ex2.c:153: error: invalid conversion from ???int*??? to ???int??? ex2.c:153: error: expected ???,??? or ???;??? ex2.c:182: error: base operand of ???->??? is not a pointer ex2.c:182: error: ???patch??? was not declared in this scope ex2.c:183: error: base operand of ???->??? is not a pointer ex2.c:184: error: ???class ALE::Mesh??? has no member named ???setTopology??? ex2.c:185: error: ???SieveBuilder??? is not a member of ???ALE::New??? ex2.c:185: error: expected primary-expression before ???>??? token ex2.c:185: error: ???::buildCoordinates??? has not been declared ex2.c:187: error: ???topology_type??? is not a member of ???ALE::Mesh??? ex2.c:187: error: ???topology_type??? is not a member of ???ALE::Mesh??? ex2.c:187: error: template argument 1 is invalid .... %> make ex3 mpicxx -o ex3.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -g -I/home/user/Compile/petsc-dev/src/dm/mesh/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -I/home/user/Compile/petsc-dev/linux-gnu-cxx-debug/include -I/home/user/Compile/petsc-dev/include -I/home/user/Compile/petsc-dev/include/sieve -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib -D__SDIR__="src/dm/mesh/examples/tutorials/" ex3.c ex3.c:26: error: ???Two??? is not a member of ???ALE??? ex3.c:26: error: ???Two??? is not a member of ???ALE??? ex3.c:26: error: template argument 1 is invalid ex3.c:26: error: template argument 2 is invalid ex3.c:27: error: ???Two??? is not a member of ???ALE??? ex3.c:27: error: ???Two??? is not a member of ???ALE??? .... 
Compilation Details: --------------------- [user at machine petsc-dev]$ ./config/configure.py --prefix=/home/user/install_home/petsc-dev-defaultboost-sieve-comp --with-scalar-type=complex --with-clanguage=cxx --with-boost=1 --download-boost=/home/user/Compile/petsc-dev/externalpackages/boost.tar.gz --with-sieve=1 ================================================================================= Configuring PETSc to compile on your system ================================================================================= TESTING: alternateConfigureLibrary from PETSc.packages.mpi4py(config/PETSc/packages/mpi4py.py:54) Compilers: C Compiler: mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -g3 C++ Compiler: mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -g Fortran Compiler: mpif90 -Wall -Wno-unused-variable -g Linkers: Static linker: /usr/bin/ar cr PETSc: PETSC_ARCH: linux-gnu-cxx-debug PETSC_DIR: /home/user/Compile/petsc-dev ** ** Now build the libraries with "make all" ** Clanguage: Cxx Scalar type:complex MPI: Includes: -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib X11: Includes: [''] Library: ['-lX11'] PETSc shared libraries: disabled PETSc dynamic libraries: disabled BLAS/LAPACK: -llapack -lblas Sieve: Includes: -I/home/user/Compile/petsc-dev/include/sieve Boost: Includes: -I/home/user/Compile/petsc-dev/externalpackages/Boost/ -I/misc/shared/apps/openmpi/gcc/64/1.2.8/include -I/misc/shared/apps/openmpi/gcc/64/1.2.8/lib c2html: sowing: Using mpiexec: /misc/shared/apps/openmpi/gcc/64/1.2.8/bin/mpiexec ========================================== /bin/rm -f -f /home/user/Compile/petsc-dev/linux-gnu-cxx-debug/lib/libpetsc*.* /bin/rm -f -f /home/user/Compile/petsc-dev/linux-gnu-cxx-debug/include/petsc*.mod BEGINNING TO COMPILE LIBRARIES IN ALL DIRECTORIES ========================================= .... libfast in: /home/user/Compile/petsc-dev/src/snes/examples/tutorials libfast in: /home/user/Compile/petsc-dev/src/snes/examples/tutorials/ex10d libfast in: /home/user/Compile/petsc-dev/src/snes/utils libfast in: /home/user/Compile/petsc-dev/src/snes/utils/sieve /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1348: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1576: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? 
/home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? meshmgsnes.c:63: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshmgsnes.c:349: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor .... libfast in: /home/user/Compile/petsc-dev/src/dm/da/utils/f90-custom libfast in: /home/user/Compile/petsc-dev/src/dm/mesh /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1148: instantiated from ???ALE::IFSieve::IFSieve(ompi_communicator_t*, int) [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? mesh.c:1642: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? mesh.c:2846: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: meshpcice.c:387: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: meshpcice.c:388: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? 
/home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1348: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1576: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? meshpflotran.c:235: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? meshpflotran.c:903: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1332: instantiated from ???ALE::IBundle::IBundle(ompi_communicator_t*, int) [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1602: instantiated from ???ALE::IMesh::IMesh(ompi_communicator_t*, int, int) [with Label_ = ALE::LabelSifter > >]??? meshexodus.c:183: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? 
meshexodus.c:364: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor section.c: In function ???PetscErrorCode SectionRealCreateLocalVector(_p_SectionReal*, _p_Vec**)???: section.c:582: warning: unused variable ???ierr??? /home/user/Compile/petsc-dev/include/sieve/Field.hh: In member function ???void ALE::GeneralSection::zero() [with Point_ = int, Value_ = int, Alloc_ = ALE::malloc_allocator, Atlas_ = ALE::IUniformSection >, BCAtlas_ = ALE::ISection >]???: section.c:1354: instantiated from here /home/user/Compile/petsc-dev/include/sieve/Field.hh:1740: warning: passing ???double??? for argument 1 to ???void ALE::GeneralSection::set(Value_) [with Point_ = int, Value_ = int, Alloc_ = ALE::malloc_allocator, Atlas_ = ALE::IUniformSection >, BCAtlas_ = ALE::ISection >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: At global scope: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/petscmesh_viewers.hh:490: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? section.c:1532: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/sieve libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/impls libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/impls/cartesian /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence???: /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieveDef::Sequence, A = ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:1159: instantiated from ???ALE::IFSieve::~IFSieve() [with Point_ = int, Allocator_ = ALE::malloc_allocator]??? 
/home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:799: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IFSieve >, A = ALE::malloc_allocator > >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1348: instantiated from ???ALE::IBundle::~IBundle() [with Sieve_ = ALE::IFSieve >, RealSection_ = ALE::IGeneralSection >, IntSection_ = ALE::IGeneralSection >, Label_ = ALE::LabelSifter > >, ArrowSection_ = ALE::UniformSection, int, 1, ALE::malloc_allocator >]??? /home/user/Compile/petsc-dev/include/sieve/Mesh.hh:1576: instantiated from ???void ALE::Obj::destroy() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? /home/user/Compile/petsc-dev/include/sieve/ALE_mem.hh:745: instantiated from ???ALE::Obj::~Obj() [with X = ALE::IMesh > > >, A = ALE::malloc_allocator > > > >]??? cartesian.c:263: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:957: warning: ???class ALE::IFSieveDef::Sequence??? has virtual functions but non-virtual destructor /home/user/Compile/petsc-dev/include/sieve/ISieve.hh: In instantiation of ???ALE::IFSieveDef::Sequence::const_iterator???: /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:994: instantiated from ???ALE::IFSieveDef::Sequence::const_iterator ALE::IFSieveDef::Sequence::begin() const [with PointType_ = int]??? cartesian.c:269: instantiated from here /home/user/Compile/petsc-dev/include/sieve/ISieve.hh:960: warning: ???class ALE::IFSieveDef::Sequence::const_iterator??? has virtual functions but non-virtual destructor libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/ftn-auto libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/ftn-custom libfast in: /home/user/Compile/petsc-dev/src/dm/mesh/f90-custom libfast in: /home/user/Compile/petsc-dev/src/dm/adda libfast in: /home/user/Compile/petsc-dev/src/dm/adda/examples libfast in: /home/user/Compile/petsc-dev/src/dm/adda/examples/tests libfast in: /home/user/Compile/petsc-dev/src/dm/adda/ftn-auto libfast in: /home/user/Compile/petsc-dev/src/dm/f90-mod libfast in: /home/user/Compile/petsc-dev/src/dm/ftn-custom libfast in: /home/user/Compile/petsc-dev/src/contrib libfast in: /home/user/Compile/petsc-dev/src/contrib/fun3d libfast in: /home/user/Compile/petsc-dev/src/benchmarks libfast in: /home/user/Compile/petsc-dev/src/docs libfast in: /home/user/Compile/petsc-dev/include libfast in: /home/user/Compile/petsc-dev/include/finclude libfast in: /home/user/Compile/petsc-dev/include/finclude/ftn-auto libfast in: /home/user/Compile/petsc-dev/include/finclude/ftn-custom libfast in: /home/user/Compile/petsc-dev/include/private libfast in: /home/user/Compile/petsc-dev/include/sieve libfast in: /home/user/Compile/petsc-dev/include/ftn-auto petschf.c: In function ???PetscErrorCode petscmemcpy_(void*, void*, size_t*, int*)???: petschf.c:42: warning: no return statement in function returning non-void libfast in: /home/user/Compile/petsc-dev/tutorials libfast in: /home/user/Compile/petsc-dev/tutorials/multiphysics Completed building libraries From u.tabak at tudelft.nl Thu Jul 30 14:52:37 2009 From: u.tabak at tudelft.nl (Umut Tabak) Date: Thu, 30 Jul 2009 21:52:37 +0200 Subject: About binary matrix formats Message-ID: <4A71FA05.3060206@tudelft.nl> Dear all, I was looking at the format on the MatLoad reference page, I would like to interface some matrices to petsc . 
Reading them in Matrix Market format can take too long if the matrices are large. The first integer is the binary file marker, MAT_FILE_COOKIE. When I read the file in Matlab with fread(fid, 1, 'int') I get the value 1211216, which shows that it is a binary file. But doing the same in C++ with the following code does not give the true value. I know that reinterpret_cast can be quite implementation dependent. K33.bin is a matrix saved in binary format by PETSc.

// fragment from inside main(); needs <fstream> and <iostream>, using namespace std
char c[4];
char inname[] = "K33.bin";
ifstream infile(inname, ios::binary);
if (!infile){
    cout << "Couldn't open file " << inname << " for reading." << endl;
    return 1;
}
infile.read(c, 4);            // read the first 4 bytes of the header
int val;
cout << (val = *(reinterpret_cast<int*>(c)));   // interpret them in the machine's native byte order

If I write some numbers in binary format with some simple code, say 0 1 2, I can convert them back to numerical values with this code, but I could not understand what happened in the case of the PETSc binary file. Should I do something byte by byte? Most probably this is due to my deficient programming knowledge, but clarification is appreciated.
Thanks and best regards,
Umut

From bsmith at mcs.anl.gov Thu Jul 30 18:45:19 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 30 Jul 2009 18:45:19 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID:
Harun,
I have played around with this matrix. It is a nasty matrix; I think it is really beyond the normal capacity of ML (and hypre's boomerAMG).
Even the "convergence" you were getting below is BOGUS. If you run with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual you'll see that the "true" residual norm is actually only creeping toward zero, and at the converged 43 iterations below the true residual norm has decreased by less than a factor of ten. (The preconditioned residual norm has decreased by 1.e5, so the iteration stops and you think it has converged. In really hard problems preconditioners sometimes scale things in a funky way, so a large decrease in the preconditioned residual norm does not mean a large decrease in the true residual norm.) In other words, the "answer" you got out of the runs below is garbage.
I suggest:
1) check carefully that the matrix being created actually matches the model's equations; if they seem right, then
2) see if you can change the model so it does not generate such hopeless matrices. If you MUST solve this nasty matrix,
3) bite the bullet and use a parallel direct solver from PETSc. Try both MUMPS and SuperLU_dist.
Good luck,
Barry
On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution. First here is what happens in terms of CG > (ksp_type) > iteration counts (both columns use block jacobi): > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The > main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but > falls > apart in parallel. SOR on the other hand experiences a 1.5X increase > in > iteration count which is totally expected from the quality of > coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the > number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this?
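Coming back to the binary-format question above: PETSc writes its binary files in big-endian byte order, while most workstations (x86) are little-endian, so reading the first four bytes and reinterpreting them in native order prints a byte-swapped value instead of 1211216. Below is a minimal sketch that decodes the header integer explicitly from big-endian bytes; the file name K33.bin is taken from the question and the rest is illustrative, not code from PETSc.

// Read the first header integer of a PETSc binary file and decode it
// as a big-endian 32-bit value, independent of the host byte order.
#include <fstream>
#include <iostream>

int main()
{
    const char inname[] = "K33.bin";                 // file name from the question
    std::ifstream infile(inname, std::ios::binary);
    if (!infile) {
        std::cout << "Couldn't open file " << inname << " for reading." << std::endl;
        return 1;
    }
    unsigned char c[4];
    infile.read(reinterpret_cast<char*>(c), 4);      // first 4 bytes of the header
    // assemble the value most-significant byte first (big-endian)
    const int cookie = (c[0] << 24) | (c[1] << 16) | (c[2] << 8) | c[3];
    std::cout << "header value: " << cookie << std::endl;   // 1211216 for a Mat file
    return 0;
}

The integers and scalars that follow the header are stored the same way; MatLoad byte-swaps them internally on little-endian machines, so reading the file through PETSc (or swapping bytes as above) avoids the problem.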
> > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > 
ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > 
%%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega 
= 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > >
From Harun.BAYRAKTAR at 3ds.com Fri Jul 31 13:15:17 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Fri, 31 Jul 2009 14:15:17 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID:
Barry,
Thanks a lot for looking into this. One thing I want to clarify is that the 43 iterations on 1 cpu (should have been 46, sorry for the typo) seem like a real convergence to me. I do look at the unpreconditioned residual norm to determine convergence. For this I use:
ierr = KSPSetNormType(m_solver, KSP_NORM_UNPRECONDITIONED); CHKERRQ(ierr);
Then I check convergence through KSPSetConvergenceTest. As an experiment I commented out the line above where I tell KSP to use the unpreconditioned norm, and while the ||r|| values changed (naturally), it still converged in a slightly larger number of iterations (56).
I am familiar with the preconditioned norm going down 6 orders while the true relative norm is 0.1 or so (i.e., the problem is not solved at all). This usually happens to me in structural mechanics problems with ill-conditioned systems when I use a KSP method that does not allow the unpreconditioned residual to be monitored. However, this does not seem to be one of those cases; maybe I am missing something.
Out of curiosity, did you use ksp/ksp/examples/tutorials/ex10.c to solve this?
Thanks again,
Harun

-----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Thursday, July 30, 2009 7:45 PM To: PETSc users list Subject: Re: Smoother settings for AMG Harun, I have played around with this matrix. It is a nasty matrix; I think it is really beyond the normal capacity of ML (and hypre's boomerAMG). Even the "convergence" you were getting below is BOGUS. If you run with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual you'll see that the "true" residual norm is actually creeping to zero and at the converged 43 iterations below the true residual norm has decreased by like less than 1/10. (The preconditioned residual norm has decreased by 1.e 5 so the iteration stops and you think it has converged. In really hard problems preconditioners sometimes scales things in a funky way so a large decrease in preconditioned residual norm does not mean a large decrease in true residual norm). In other words the "answer" you got out of the runs below is garbage. I suggest, 1) check carefully that the matrix being created actually matches the model's equations, if they seem right then 2) see if you can change the model so it does not generate such hopeless matrices. If you MUST solve this nasty matrix 3) bite the bullet and use a parallel direct solver from PETSc. Try both MUMPS and SuperLU_dist Good luck, Barry On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > Hi, > > I am trying to solve a system of equations and I am having difficulty > picking the right smoothers for AMG (using ML as pc_type) in PETSc for > parallel execution.
First here is what happens in terms of CG > (ksp_type) > iteration counts (both columns use block jacobi): > > cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 > ------------------------------------------------------ > 1 | 43 | 243 > 4 | 699 | 379 > > x1 or x4 means 1 or 4 iterations of smoother application at each AMG > level (all details from ksp view for the 4 cpu run are below). The > main > observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but > falls > apart in parallel. SOR on the other hand experiences a 1.5X increase > in > iteration count which is totally expected from the quality of > coarsening > ML delivers in parallel. > > I basically would like to find a way (if possible) to have the > number of > iterations in parallel stay with 1-2X of 1 cpu iteration count for the > AMG w/ ICC case. Is there a way to achieve this? > > Thanks, > Harun > > %%%%%%%%%%%%%%%%%%%%%%%%% > AMG w/ ICC(0) x1 ksp_view > %%%%%%%%%%%%%%%%%%%%%%%%% > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node 
routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_1_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.514899 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=2813, cols=2813 > total: nonzeros=48609, allocated nonzeros=48609 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2813, cols=2813 > total: nonzeros=94405, allocated nonzeros=94405 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=0.9 > maximum iterations=1 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: bjacobi > block Jacobi: number of blocks = 4 > Local solve is same for all blocks, in the following KSP and PC > objects: > KSP Object:(mg_levels_2_sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_sub_) > type: icc > ICC: 0 levels of fill > ICC: factor fill ratio allocated 1 > ICC: using Manteuffel 
shift > ICC: factor fill ratio needed 0.519045 > Factored matrix follows > Matrix Object: > type=seqsbaij, rows=101164, cols=101164 > total: nonzeros=1378558, allocated nonzeros=1378558 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=101164, cols=101164 > total: nonzeros=2655952, allocated nonzeros=5159364 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > %%%%%%%%%%%%%%%%%%%%%% > AMG w/ SOR x4 ksp_view > %%%%%%%%%%%%%%%%%%%%%% > > KSP Object: > type: cg > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ml > MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, > post-smooths=1 > Coarse gride solver -- level 0 ------------------------------- > KSP Object:(mg_coarse_) > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_) > type: redundant > Redundant preconditioner: First (color=0) of 4 PCs follows > KSP Object:(mg_coarse_redundant_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_coarse_redundant_) > type: lu > LU: out-of-place factorization > matrix ordering: nd > LU: tolerance for zero pivot 1e-12 > LU: factor fill ratio needed 2.17227 > Factored matrix follows > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=21651, allocated nonzeros=21651 > using I-node routines: found 186 nodes, limit used is > 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=14150 > not using I-node routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=283, cols=283 > total: nonzeros=9967, allocated nonzeros=9967 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 1 ------------------------------- > KSP Object:(mg_levels_1_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_1_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=10654, cols=10654 > total: nonzeros=376634, allocated nonzeros=376634 > not using I-node (on process 0) routines > Down solver (pre-smoother) on level 2 ------------------------------- > KSP 
Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > Up solver (post-smoother) on level 2 ------------------------------- > KSP Object:(mg_levels_2_) > type: richardson > Richardson: damping factor=1 > maximum iterations=4 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object:(mg_levels_2_) > type: sor > SOR: type = local_symmetric, iterations = 1, omega = 1 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=411866, cols=411866 > total: nonzeros=10941434, allocated nonzeros=42010332 > not using I-node (on process 0) routines > > From bsmith at mcs.anl.gov Fri Jul 31 13:25:13 2009 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 31 Jul 2009 13:25:13 -0500 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: On Jul 31, 2009, at 1:15 PM, BAYRAKTAR Harun wrote: > Barry, > > Thanks a lot for looking in to this. One thing I want to clarify is > that the 43 (should have been 46 sorry for the typo) iterations on 1 > cpu seems like a real convergence to me. I do look at the > unpreconditioned residual norm to determine convergence. For this I > use: > > ierr = KSPSetNormType(m_solver, KSP_NORM_UNPRECONDITIONED); > CHKERRQ(ierr); > > Then I check convergence through KSPSetConvergenceTest. As an > experiment I commented out the line above where I tell KSP to use > the unpreconditioned norm and while the ||r|| values changed > (naturally), it still converged in slightly more number of > iterations (56). > > I am familiar with the preconditioned norm going down 6 orders while > the true relative norm is 0.1 or so (i.e., problem not solved at > all). This usually happens to me in structural mechanics problems > with ill conditioned systems and I use a KSP method that does not > allow for the unpreconditioned residual to be monitored. However, > this does not seem to be one of those cases though, maybe I am > missing something. Ok. I didn't see what you report (I saw it just iterating away for a long time with the unpreconditioned norm) but then you never sent the command line options for the solver you used so I may have run it differently. > > Out of curiosity did you use ksp/ksp/examples/tutorials/ex10.c to > solve this? Yes. > > Thanks again, > Harun > > > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov > ] On Behalf Of Barry Smith > Sent: Thursday, July 30, 2009 7:45 PM > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > Harun, > > I have played around with this matrix. It is a nasty matrix; I > think it is really beyond the normal capacity of ML (and hypre's > boomerAMG). > > Even the "convergence" you were getting below is BOGUS. 
If you run > with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual > you'll see that the "true" residual norm is actually creeping to zero > and at the converged 43 iterations below the true residual norm has > decreased by like less than 1/10. (The preconditioned residual norm > has decreased by 1.e 5 so the iteration stops and you think it has > converged. In really hard problems preconditioners sometimes scales > things in a funky way so a large decrease in preconditioned residual > norm does not mean a large decrease in true residual norm). In other > words the "answer" you got out of the runs below is garbage. > > I suggest, > 1) check carefully that the matrix being created actually matches the > model's equations, if they seem right then > 2) see if you can change the model so it does not generate such > hopeless matrices. If you MUST solve this nasty matrix > 3) bite the bullet and use a parallel direct solver from PETSc. Try > both MUMPS and SuperLU_dist > > Good luck, > > Barry > > > > > On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > >> Hi, >> >> I am trying to solve a system of equations and I am having difficulty >> picking the right smoothers for AMG (using ML as pc_type) in PETSc >> for >> parallel execution. First here is what happens in terms of CG >> (ksp_type) >> iteration counts (both columns use block jacobi): >> >> cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 >> ------------------------------------------------------ >> 1 | 43 | 243 >> 4 | 699 | 379 >> >> x1 or x4 means 1 or 4 iterations of smoother application at each AMG >> level (all details from ksp view for the 4 cpu run are below). The >> main >> observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but >> falls >> apart in parallel. SOR on the other hand experiences a 1.5X increase >> in >> iteration count which is totally expected from the quality of >> coarsening >> ML delivers in parallel. >> >> I basically would like to find a way (if possible) to have the >> number of >> iterations in parallel stay with 1-2X of 1 cpu iteration count for >> the >> AMG w/ ICC case. Is there a way to achieve this? 
>> >> Thanks, >> Harun >> >> %%%%%%%%%%%%%%%%%%%%%%%%% >> AMG w/ ICC(0) x1 ksp_view >> %%%%%%%%%%%%%%%%%%%%%%%%% >> KSP Object: >> type: cg >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: ml >> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, >> post-smooths=1 >> Coarse gride solver -- level 0 ------------------------------- >> KSP Object:(mg_coarse_) >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_coarse_) >> type: redundant >> Redundant preconditioner: First (color=0) of 4 PCs follows >> KSP Object:(mg_coarse_redundant_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_coarse_redundant_) >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: factor fill ratio needed 2.17227 >> Factored matrix follows >> Matrix Object: >> type=seqaij, rows=283, cols=283 >> total: nonzeros=21651, allocated nonzeros=21651 >> using I-node routines: found 186 nodes, limit used is >> 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=283, cols=283 >> total: nonzeros=9967, allocated nonzeros=14150 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=283, cols=283 >> total: nonzeros=9967, allocated nonzeros=9967 >> not using I-node (on process 0) routines >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object:(mg_levels_1_) >> type: richardson >> Richardson: damping factor=0.9 >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_) >> type: bjacobi >> block Jacobi: number of blocks = 4 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(mg_levels_1_sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_sub_) >> type: icc >> ICC: 0 levels of fill >> ICC: factor fill ratio allocated 1 >> ICC: using Manteuffel shift >> ICC: factor fill ratio needed 0.514899 >> Factored matrix follows >> Matrix Object: >> type=seqsbaij, rows=2813, cols=2813 >> total: nonzeros=48609, allocated nonzeros=48609 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=2813, cols=2813 >> total: nonzeros=94405, allocated nonzeros=94405 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=10654, cols=10654 >> total: nonzeros=376634, allocated nonzeros=376634 >> not using I-node (on process 0) routines >> Up solver (post-smoother) on level 1 ------------------------------- >> KSP Object:(mg_levels_1_) >> type: richardson >> Richardson: damping factor=0.9 >> maximum iterations=1 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_) >> type: bjacobi >> block Jacobi: number of blocks = 4 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(mg_levels_1_sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: 
relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_sub_) >> type: icc >> ICC: 0 levels of fill >> ICC: factor fill ratio allocated 1 >> ICC: using Manteuffel shift >> ICC: factor fill ratio needed 0.514899 >> Factored matrix follows >> Matrix Object: >> type=seqsbaij, rows=2813, cols=2813 >> total: nonzeros=48609, allocated nonzeros=48609 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=2813, cols=2813 >> total: nonzeros=94405, allocated nonzeros=94405 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=10654, cols=10654 >> total: nonzeros=376634, allocated nonzeros=376634 >> not using I-node (on process 0) routines >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object:(mg_levels_2_) >> type: richardson >> Richardson: damping factor=0.9 >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_) >> type: bjacobi >> block Jacobi: number of blocks = 4 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(mg_levels_2_sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_sub_) >> type: icc >> ICC: 0 levels of fill >> ICC: factor fill ratio allocated 1 >> ICC: using Manteuffel shift >> ICC: factor fill ratio needed 0.519045 >> Factored matrix follows >> Matrix Object: >> type=seqsbaij, rows=101164, cols=101164 >> total: nonzeros=1378558, allocated nonzeros=1378558 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=101164, cols=101164 >> total: nonzeros=2655952, allocated nonzeros=5159364 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> Up solver (post-smoother) on level 2 ------------------------------- >> KSP Object:(mg_levels_2_) >> type: richardson >> Richardson: damping factor=0.9 >> maximum iterations=1 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_) >> type: bjacobi >> block Jacobi: number of blocks = 4 >> Local solve is same for all blocks, in the following KSP and PC >> objects: >> KSP Object:(mg_levels_2_sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_sub_) >> type: icc >> ICC: 0 levels of fill >> ICC: factor fill ratio allocated 1 >> ICC: using Manteuffel shift >> ICC: factor fill ratio needed 0.519045 >> Factored matrix follows >> Matrix Object: >> type=seqsbaij, rows=101164, cols=101164 >> total: nonzeros=1378558, allocated nonzeros=1378558 >> block size is 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=101164, cols=101164 >> total: nonzeros=2655952, allocated nonzeros=5159364 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Matrix 
Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> >> %%%%%%%%%%%%%%%%%%%%%% >> AMG w/ SOR x4 ksp_view >> %%%%%%%%%%%%%%%%%%%%%% >> >> KSP Object: >> type: cg >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object: >> type: ml >> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, >> post-smooths=1 >> Coarse gride solver -- level 0 ------------------------------- >> KSP Object:(mg_coarse_) >> type: preonly >> maximum iterations=1, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_coarse_) >> type: redundant >> Redundant preconditioner: First (color=0) of 4 PCs follows >> KSP Object:(mg_coarse_redundant_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_coarse_redundant_) >> type: lu >> LU: out-of-place factorization >> matrix ordering: nd >> LU: tolerance for zero pivot 1e-12 >> LU: factor fill ratio needed 2.17227 >> Factored matrix follows >> Matrix Object: >> type=seqaij, rows=283, cols=283 >> total: nonzeros=21651, allocated nonzeros=21651 >> using I-node routines: found 186 nodes, limit used is >> 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=283, cols=283 >> total: nonzeros=9967, allocated nonzeros=14150 >> not using I-node routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=283, cols=283 >> total: nonzeros=9967, allocated nonzeros=9967 >> not using I-node (on process 0) routines >> Down solver (pre-smoother) on level 1 ------------------------------- >> KSP Object:(mg_levels_1_) >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=4, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_) >> type: sor >> SOR: type = local_symmetric, iterations = 1, omega = 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=10654, cols=10654 >> total: nonzeros=376634, allocated nonzeros=376634 >> not using I-node (on process 0) routines >> Up solver (post-smoother) on level 1 ------------------------------- >> KSP Object:(mg_levels_1_) >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=4 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_1_) >> type: sor >> SOR: type = local_symmetric, iterations = 1, omega = 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=10654, cols=10654 >> total: nonzeros=376634, allocated nonzeros=376634 >> not using I-node (on process 0) routines >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object:(mg_levels_2_) >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=4, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_) >> type: sor >> SOR: type = local_symmetric, iterations = 1, omega = 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> Up solver (post-smoother) on level 2 
------------------------------- >> KSP Object:(mg_levels_2_) >> type: richardson >> Richardson: damping factor=1 >> maximum iterations=4 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> PC Object:(mg_levels_2_) >> type: sor >> SOR: type = local_symmetric, iterations = 1, omega = 1 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=411866, cols=411866 >> total: nonzeros=10941434, allocated nonzeros=42010332 >> not using I-node (on process 0) routines >> >> > From Harun.BAYRAKTAR at 3ds.com Fri Jul 31 13:55:49 2009 From: Harun.BAYRAKTAR at 3ds.com (BAYRAKTAR Harun) Date: Fri, 31 Jul 2009 14:55:49 -0400 Subject: Smoother settings for AMG In-Reply-To: References: Message-ID: Barry, On Monday I'll use ex10.c to reproduce and send you the full options. Thanks, Harun -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Friday, July 31, 2009 2:25 PM To: PETSc users list Subject: Re: Smoother settings for AMG On Jul 31, 2009, at 1:15 PM, BAYRAKTAR Harun wrote: > Barry, > > Thanks a lot for looking in to this. One thing I want to clarify is > that the 43 (should have been 46 sorry for the typo) iterations on 1 > cpu seems like a real convergence to me. I do look at the > unpreconditioned residual norm to determine convergence. For this I > use: > > ierr = KSPSetNormType(m_solver, KSP_NORM_UNPRECONDITIONED); > CHKERRQ(ierr); > > Then I check convergence through KSPSetConvergenceTest. As an > experiment I commented out the line above where I tell KSP to use > the unpreconditioned norm and while the ||r|| values changed > (naturally), it still converged in slightly more number of > iterations (56). > > I am familiar with the preconditioned norm going down 6 orders while > the true relative norm is 0.1 or so (i.e., problem not solved at > all). This usually happens to me in structural mechanics problems > with ill conditioned systems and I use a KSP method that does not > allow for the unpreconditioned residual to be monitored. However, > this does not seem to be one of those cases though, maybe I am > missing something. Ok. I didn't see what you report (I saw it just iterating away for a long time with the unpreconditioned norm) but then you never sent the command line options for the solver you used so I may have run it differently. > > Out of curiosity did you use ksp/ksp/examples/tutorials/ex10.c to > solve this? Yes. > > Thanks again, > Harun > > > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov > ] On Behalf Of Barry Smith > Sent: Thursday, July 30, 2009 7:45 PM > To: PETSc users list > Subject: Re: Smoother settings for AMG > > > Harun, > > I have played around with this matrix. It is a nasty matrix; I > think it is really beyond the normal capacity of ML (and hypre's > boomerAMG). > > Even the "convergence" you were getting below is BOGUS. If you run > with -ksp_norm_type unpreconditioned or -ksp_monitor_true_residual > you'll see that the "true" residual norm is actually creeping to zero > and at the converged 43 iterations below the true residual norm has > decreased by like less than 1/10. 
(The preconditioned residual norm > has decreased by 1.e 5 so the iteration stops and you think it has > converged. In really hard problems preconditioners sometimes scales > things in a funky way so a large decrease in preconditioned residual > norm does not mean a large decrease in true residual norm). In other > words the "answer" you got out of the runs below is garbage. > > I suggest, > 1) check carefully that the matrix being created actually matches the > model's equations, if they seem right then > 2) see if you can change the model so it does not generate such > hopeless matrices. If you MUST solve this nasty matrix > 3) bite the bullet and use a parallel direct solver from PETSc. Try > both MUMPS and SuperLU_dist > > Good luck, > > Barry > > > > > On Jul 29, 2009, at 3:54 PM, BAYRAKTAR Harun wrote: > >> Hi, >> >> I am trying to solve a system of equations and I am having difficulty >> picking the right smoothers for AMG (using ML as pc_type) in PETSc >> for >> parallel execution. First here is what happens in terms of CG >> (ksp_type) >> iteration counts (both columns use block jacobi): >> >> cpus | AMG w/ ICC(0) x1 | AMG w/ SOR x4 >> ------------------------------------------------------ >> 1 | 43 | 243 >> 4 | 699 | 379 >> >> x1 or x4 means 1 or 4 iterations of smoother application at each AMG >> level (all details from ksp view for the 4 cpu run are below). The >> main >> observation is that on 1 cpu, AMG w/ ICC(0) is a clear winner but >> falls >> apart in parallel. SOR on the other hand experiences a 1.5X increase >> in >> iteration count which is totally expected from the quality of >> coarsening >> ML delivers in parallel. >> >> I basically would like to find a way (if possible) to have the >> number of >> iterations in parallel stay with 1-2X of 1 cpu iteration count for >> the >> AMG w/ ICC case. Is there a way to achieve this? 
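For reference, here is a minimal sketch of the convergence setup Harun describes earlier in the thread: the KSP bases its convergence test on the unpreconditioned residual norm, and the true-residual monitor Barry recommends is attached so the kind of false convergence discussed above shows up immediately. The helper routine and the KSP handle name are illustrative (not code from the thread), and the monitor routine is named as in petsc-3.0.

#include "petscksp.h"

/* illustrative helper, not part of PETSc or of the thread */
PetscErrorCode ConfigureSolverForTrueResidual(KSP solver)
{
  PetscErrorCode ierr;
  /* convergence test based on ||b - A*x|| rather than the preconditioned norm */
  ierr = KSPSetNormType(solver, KSP_NORM_UNPRECONDITIONED); CHKERRQ(ierr);
  /* same tolerances as shown in the ksp_view output above */
  ierr = KSPSetTolerances(solver, 1.e-5, 1.e-50, 1.e4, 10000); CHKERRQ(ierr);
  /* print the true residual every iteration, like -ksp_monitor_true_residual */
  ierr = KSPMonitorSet(solver, KSPMonitorTrueResidualNorm, PETSC_NULL, PETSC_NULL); CHKERRQ(ierr);
  return 0;
}

On the command line the equivalent for ex10.c is something like ./ex10 -f0 matrix.bin -ksp_type cg -pc_type ml -ksp_norm_type unpreconditioned -ksp_monitor_true_residual -ksp_rtol 1e-5, where matrix.bin stands in for the actual matrix file.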
>>
>> Thanks,
>> Harun
>>
>> %%%%%%%%%%%%%%%%%%%%%%%%%
>> AMG w/ ICC(0) x1 ksp_view
>> %%%%%%%%%%%%%%%%%%%%%%%%%
>> KSP Object:
>> type: cg
>> maximum iterations=10000
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:
>> type: ml
>> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1
>> Coarse gride solver -- level 0 -------------------------------
>> KSP Object:(mg_coarse_)
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_)
>> type: redundant
>> Redundant preconditioner: First (color=0) of 4 PCs follows
>> KSP Object:(mg_coarse_redundant_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_redundant_)
>> type: lu
>> LU: out-of-place factorization
>> matrix ordering: nd
>> LU: tolerance for zero pivot 1e-12
>> LU: factor fill ratio needed 2.17227
>> Factored matrix follows
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=21651, allocated nonzeros=21651
>> using I-node routines: found 186 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=14150
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=9967
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:(mg_levels_1_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.514899
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=2813, cols=2813
>> total: nonzeros=48609, allocated nonzeros=48609
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=2813, cols=2813
>> total: nonzeros=94405, allocated nonzeros=94405
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:(mg_levels_1_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.514899
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=2813, cols=2813
>> total: nonzeros=48609, allocated nonzeros=48609
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=2813, cols=2813
>> total: nonzeros=94405, allocated nonzeros=94405
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:(mg_levels_2_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.519045
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=101164, cols=101164
>> total: nonzeros=1378558, allocated nonzeros=1378558
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=101164, cols=101164
>> total: nonzeros=2655952, allocated nonzeros=5159364
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=0.9
>> maximum iterations=1
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: bjacobi
>> block Jacobi: number of blocks = 4
>> Local solve is same for all blocks, in the following KSP and PC objects:
>> KSP Object:(mg_levels_2_sub_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_sub_)
>> type: icc
>> ICC: 0 levels of fill
>> ICC: factor fill ratio allocated 1
>> ICC: using Manteuffel shift
>> ICC: factor fill ratio needed 0.519045
>> Factored matrix follows
>> Matrix Object:
>> type=seqsbaij, rows=101164, cols=101164
>> total: nonzeros=1378558, allocated nonzeros=1378558
>> block size is 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=101164, cols=101164
>> total: nonzeros=2655952, allocated nonzeros=5159364
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>>
>> %%%%%%%%%%%%%%%%%%%%%%
>> AMG w/ SOR x4 ksp_view
>> %%%%%%%%%%%%%%%%%%%%%%
>>
>> KSP Object:
>> type: cg
>> maximum iterations=10000
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:
>> type: ml
>> MG: type is MULTIPLICATIVE, levels=3 cycles=v, pre-smooths=1, post-smooths=1
>> Coarse gride solver -- level 0 -------------------------------
>> KSP Object:(mg_coarse_)
>> type: preonly
>> maximum iterations=1, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_)
>> type: redundant
>> Redundant preconditioner: First (color=0) of 4 PCs follows
>> KSP Object:(mg_coarse_redundant_)
>> type: preonly
>> maximum iterations=10000, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_coarse_redundant_)
>> type: lu
>> LU: out-of-place factorization
>> matrix ordering: nd
>> LU: tolerance for zero pivot 1e-12
>> LU: factor fill ratio needed 2.17227
>> Factored matrix follows
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=21651, allocated nonzeros=21651
>> using I-node routines: found 186 nodes, limit used is 5
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=seqaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=14150
>> not using I-node routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=283, cols=283
>> total: nonzeros=9967, allocated nonzeros=9967
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 1 -------------------------------
>> KSP Object:(mg_levels_1_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_1_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=10654, cols=10654
>> total: nonzeros=376634, allocated nonzeros=376634
>> not using I-node (on process 0) routines
>> Down solver (pre-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4, initial guess is zero
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> Up solver (post-smoother) on level 2 -------------------------------
>> KSP Object:(mg_levels_2_)
>> type: richardson
>> Richardson: damping factor=1
>> maximum iterations=4
>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000
>> left preconditioning
>> PC Object:(mg_levels_2_)
>> type: sor
>> SOR: type = local_symmetric, iterations = 1, omega = 1
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>> linear system matrix = precond matrix:
>> Matrix Object:
>> type=mpiaij, rows=411866, cols=411866
>> total: nonzeros=10941434, allocated nonzeros=42010332
>> not using I-node (on process 0) routines
>>
>>
>
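A note on Barry's first point above: the cheapest sanity check is to compute the unpreconditioned residual norm ||b - A*x|| yourself after KSPSolve() and compare it with the norm the Krylov method reports (running with -ksp_monitor_true_residual, where your PETSc build provides it, gives the same information per iteration). The following is a minimal sketch, not code from this thread, written against the petsc-3.0-era C API and assuming A, b and x are the user's matrix, right-hand side and solution:

#include "petscksp.h"

/* Sketch only: report ||b - A*x|| after a solve so it can be compared with
   the preconditioned residual norm the KSP monitors print.  Uses the
   petsc-3.0-era calling sequence (VecDestroy() takes the Vec itself in that
   release; newer releases take its address). */
PetscErrorCode CheckTrueResidual(Mat A, Vec b, Vec x)
{
  Vec            r;
  PetscReal      rnorm, bnorm;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
  ierr = MatMult(A, x, r);CHKERRQ(ierr);      /* r = A*x     */
  ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);   /* r = b - A*x */
  ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
  ierr = VecNorm(b, NORM_2, &bnorm);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "true residual norm %g (relative %g)\n",
                     (double)rnorm, (double)(rnorm/bnorm));CHKERRQ(ierr);
  ierr = VecDestroy(r);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If that check confirms the solve is not really converging, Barry's third suggestion amounts to switching to an LU preconditioner backed by an external factorization package (MUMPS or SuperLU_dist) selected at run time; the exact option name for choosing the package differs between releases, so check the -help output for your version rather than trusting memory.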
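On Harun's question, it may help to see the two smoother setups from the iteration-count table written out explicitly. The sketch below is not from the thread: it fills the options database before KSPSetFromOptions() would be called, the mg_levels_1_/mg_levels_2_ prefixes are taken from the -ksp_view output above, and the option names themselves are assumptions to be verified against -help for the PETSc version in use (newer releases also accept an unnumbered -mg_levels_ form that applies to every level at once).

#include "petscksp.h"

/* Sketch only: the AMG w/ ICC(0) x1 column -- one Richardson sweep per
   level, preconditioned by block Jacobi with ICC(0) on each block. */
PetscErrorCode UseICCSmoothers(void)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscOptionsSetValue("-pc_type", "ml");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_ksp_type", "richardson");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_ksp_max_it", "1");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_pc_type", "bjacobi");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_sub_pc_type", "icc");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_ksp_type", "richardson");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_ksp_max_it", "1");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_pc_type", "bjacobi");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_sub_pc_type", "icc");CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Sketch only: the AMG w/ SOR x4 column -- four Richardson sweeps per
   level, each preconditioned by processor-local symmetric SOR. */
PetscErrorCode UseSORSmoothers(void)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscOptionsSetValue("-pc_type", "ml");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_ksp_type", "richardson");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_ksp_max_it", "4");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_1_pc_type", "sor");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_ksp_type", "richardson");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_ksp_max_it", "4");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-mg_levels_2_pc_type", "sor");CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Passing the same strings on the command line is equivalent; the programmatic form is shown only to make the two columns of the table concrete.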