From li76pan at yahoo.com  Mon Apr  3 10:42:33 2006
From: li76pan at yahoo.com (li pan)
Date: Mon, 3 Apr 2006 08:42:33 -0700 (PDT)
Subject: blaslapack
Message-ID: <20060403154233.51893.qmail@web36814.mail.mud.yahoo.com>

I download blas lapack from the ftp which was
suggested by petsc download homepage 
ftp://ftp.mcs.anl.gov/pub/petsc/fblaslapack.tar.gz
and transfer it to a suse laptop, since this has no
internet connection. I set a value for
--with-blas-lapack-dir to the unziped the directory.
But got this error information :
..../fblaslapack can not be used
why?

best

pan

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


From balay at mcs.anl.gov  Mon Apr  3 10:53:34 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Mon, 3 Apr 2006 10:53:34 -0500 (CDT)
Subject: blaslapack
In-Reply-To: <20060403154233.51893.qmail@web36814.mail.mud.yahoo.com>
References: <20060403154233.51893.qmail@web36814.mail.mud.yahoo.com>
Message-ID: <Pine.LNX.4.64.0604031051520.4549@asterix>

The option --with-blas-lapack-dir is useful if you already have
blaslapack libraries compiled & installed. If you've manually
downloaded fblaslapack.tar.gz - then use the option:

--download-f-blas-lapack=/home/petsc/fblaslapack.tar.gz

[with the correct patch to fblaslapack.tar.gz file]

Satish

On Mon, 3 Apr 2006, li pan wrote:

> I download blas lapack from the ftp which was
> suggested by petsc download homepage 
> ftp://ftp.mcs.anl.gov/pub/petsc/fblaslapack.tar.gz
> and transfer it to a suse laptop, since this has no
> internet connection. I set a value for
> --with-blas-lapack-dir to the unziped the directory.
> But got this error information :
> ..../fblaslapack can not be used
> why?
> 
> best
> 
> pan
> 
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around 
> http://mail.yahoo.com 
> 
> 


From balay at mcs.anl.gov  Mon Apr  3 11:20:30 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Mon, 3 Apr 2006 11:20:30 -0500 (CDT)
Subject: blaslapack
In-Reply-To: <20060403160727.63846.qmail@web36814.mail.mud.yahoo.com>
References: <20060403160727.63846.qmail@web36814.mail.mud.yahoo.com>
Message-ID: <Pine.LNX.4.64.0604031119080.25406@asterix>

Use the latest 2.3.1 release of PETSc.

Satish

On Mon, 3 Apr 2006, li pan wrote:

> hi 
> but --download-f-blas-lapack only takes "no, yes .."
> boolean value.
> 
> pan
> 
> 
>     
> The option --with-blas-lapack-dir is useful if you
> already have
> blaslapack libraries compiled & installed. If you've
> manually
> downloaded fblaslapack.tar.gz - then use the option:
> 
> --download-f-blas-lapack=/home/petsc/fblaslapack.tar.gz
> 
> [with the correct patch to fblaslapack.tar.gz file]
> 
> Satish
> 
> On Mon, 3 Apr 2006, li pan wrote:
> 
> > I download blas lapack from the ftp which was
> > suggested by petsc download homepage 
> > ftp://ftp.mcs.anl.gov/pub/petsc/fblaslapack.tar.gz
> > and transfer it to a suse laptop, since this has no
> > internet connection. I set a value for
> > --with-blas-lapack-dir to the unziped the directory.
> > But got this error information :
> > ..../fblaslapack can not be used
> > why?
> > 
> > best
> > 
> > pan
> 
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around 
> http://mail.yahoo.com 
> 
> 


From li76pan at yahoo.com  Mon Apr  3 11:42:59 2006
From: li76pan at yahoo.com (li pan)
Date: Mon, 3 Apr 2006 09:42:59 -0700 (PDT)
Subject: how to install 
Message-ID: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com>

Dear all,
could anybody tell how to install petsc into a pc
without internet connection? 

best

pan

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


From knepley at gmail.com  Mon Apr  3 11:53:52 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 3 Apr 2006 11:53:52 -0500
Subject: how to install
In-Reply-To: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com>
References: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com>
Message-ID: <a9f269830604030953k56a19547laf4697df906fb278@mail.gmail.com>

If you can get the tarball to your machine, you should not need the
internet, unless you need another package like MPI.

   Matt

On 4/3/06, li pan <li76pan at yahoo.com> wrote:
>
> Dear all,
> could anybody tell how to install petsc into a pc
> without internet connection?
>
> best
>
> pan
>
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
>
>


--
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060403/8e4e1296/attachment.htm>

From balay at mcs.anl.gov  Mon Apr  3 13:00:44 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Mon, 3 Apr 2006 13:00:44 -0500 (CDT)
Subject: blaslapack
In-Reply-To: <20060403162523.22522.qmail@web36802.mail.mud.yahoo.com>
References: <20060403162523.22522.qmail@web36802.mail.mud.yahoo.com>
Message-ID: <Pine.LNX.4.64.0604031258550.4560@asterix>

Please send replies to the list..

If you are not using 2.3.1 - then do the following:

cd petsc-2.3.0
mkdir externalpackages
cd externalpackages
tar -xzf ~/fblaslapack.tar.gz
cd ..
./config/configure.py --download-f-blas-lapack=1

Satish


On Mon, 3 Apr 2006, li pan wrote:

> hmmmmmmmm, I'm not sure whether my libmesh version
> supports new version of petsc.
> 
> pan
> 
> 
> 
> Satish
> 
> On Mon, 3 Apr 2006, li pan wrote:
> 
> > hi 
> > but --download-f-blas-lapack only takes "no, yes .."
> > boolean value.
> > 
> > pan
> > 
> 
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around 
> http://mail.yahoo.com 
> 
> 


From balay at mcs.anl.gov  Mon Apr  3 13:03:03 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Mon, 3 Apr 2006 13:03:03 -0500 (CDT)
Subject: how to install 
In-Reply-To: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com>
References: <20060403164259.61719.qmail@web36813.mail.mud.yahoo.com>
Message-ID: <Pine.LNX.4.64.0604031302410.4560@asterix>

already responded to this query in the previous thread.

Satish

On Mon, 3 Apr 2006, li pan wrote:

> Dear all,
> could anybody tell how to install petsc into a pc
> without internet connection? 
> 
> best
> 
> pan
> 
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around 
> http://mail.yahoo.com 
> 
> 


From billy at dem.uminho.pt  Tue Apr  4 09:20:31 2006
From: billy at dem.uminho.pt (billy at dem.uminho.pt)
Date: Tue,  4 Apr 2006 15:20:31 +0100
Subject: Combine ghost updates 
Message-ID: <1144160431.443280afadef6@serv-g1.ccom.uminho.pt>


Hi,

I read in a paper that if you combine all updates in a single vector, you can
speed up the comunication. They combined gradient x, y, z values in one vector.
Does anyone know how this can be done? 

For example vectors ux, uy, and uz:

VecGhostUpdateBegin(ux,INSERT_VALUES,SCATTER_FORWARD);
VecGhostUpdateEnd(ux,INSERT_VALUES,SCATTER_FORWARD);

VecGhostUpdateBegin(uy,INSERT_VALUES,SCATTER_FORWARD);
VecGhostUpdateEnd(uy,INSERT_VALUES,SCATTER_FORWARD);

VecGhostUpdateBegin(uz,INSERT_VALUES,SCATTER_FORWARD);
VecGhostUpdateEnd(uz,INSERT_VALUES,SCATTER_FORWARD);

Could they be combined in a vector U:

VecGhostUpdateBegin(U,INSERT_VALUES,SCATTER_FORWARD);
VecGhostUpdateEnd(U,INSERT_VALUES,SCATTER_FORWARD);

and would it be faster?


Billy.


From bsmith at mcs.anl.gov  Tue Apr  4 13:50:53 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 4 Apr 2006 13:50:53 -0500 (CDT)
Subject: Combine ghost updates 
In-Reply-To: <1144160431.443280afadef6@serv-g1.ccom.uminho.pt>
References: <1144160431.443280afadef6@serv-g1.ccom.uminho.pt>
Message-ID: <Pine.OSX.4.64.0604041337390.14239@barrys-computer.local>


   Billy,

    Since the three vectors are independent there is currently
no way to do this. We would need to add additional support to the
scatter operations to allow packing several scatters together
(actually not a bad idea).

    This problem does not usually come up because we recommend
under most circumstances to "interlace" field variables rather than
keep them in seperate vectors. For example, in your case you would
have a single U vector that looked like (ux_0,uy_0,uz_0,ux_1,uy_1,uz_1,...)
The reason to interlace is for efficiency; in most codes when ux_i is loaded
the uy_i and uz_i are also needed "at the same time"; by combining them
you can get less cache thrashing and less tlb misses: see
for example http://www-fp.mcs.anl.gov/petsc-fun3d/Papers/manke.pdf table 3
where performance is more than doubled by simply interlacing. There are
many other papers that discuss this at 
http://www-fp.mcs.anl.gov/petsc-fun3d/Papers/papers.html

   It is likely that switching to interlaced variables would give much
more of a performance boost then combining the scatters. But you could
much through the code in src/vec/vec/utils/vpscat.c you see how it would
be possible to manage multiple vectors with some hacking.


    Barry


On Tue, 4 Apr 2006, billy at dem.uminho.pt wrote:

>
> Hi,
>
> I read in a paper that if you combine all updates in a single vector, you can
> speed up the comunication. They combined gradient x, y, z values in one vector.
> Does anyone know how this can be done?
>
> For example vectors ux, uy, and uz:
>
> VecGhostUpdateBegin(ux,INSERT_VALUES,SCATTER_FORWARD);
> VecGhostUpdateEnd(ux,INSERT_VALUES,SCATTER_FORWARD);
>
> VecGhostUpdateBegin(uy,INSERT_VALUES,SCATTER_FORWARD);
> VecGhostUpdateEnd(uy,INSERT_VALUES,SCATTER_FORWARD);
>
> VecGhostUpdateBegin(uz,INSERT_VALUES,SCATTER_FORWARD);
> VecGhostUpdateEnd(uz,INSERT_VALUES,SCATTER_FORWARD);
>
> Could they be combined in a vector U:
>
> VecGhostUpdateBegin(U,INSERT_VALUES,SCATTER_FORWARD);
> VecGhostUpdateEnd(U,INSERT_VALUES,SCATTER_FORWARD);
>
> and would it be faster?
>
>
> Billy.
>
>


From buket at be.itu.edu.tr  Sat Apr  8 13:30:44 2006
From: buket at be.itu.edu.tr (buket at be.itu.edu.tr)
Date: Sat, 8 Apr 2006 21:30:44 +0300 (EEST)
Subject: what is default ordering method
Message-ID: <1981.160.75.78.141.1144521044.squirrel@www.be.itu.edu.tr>

Hi,

I have several questions;

Which ordering method does petsc use by default before matrix factorization?

How can I change the restarted value of Gmres(m)?

Is there any parallel overhead when I run petsc program on single
processor (with mprun -np 1)?


Thank you,
Best Regards,
Buket Benek


From knepley at gmail.com  Sat Apr  8 13:42:19 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Sat, 8 Apr 2006 13:42:19 -0500
Subject: what is default ordering method
In-Reply-To: <1981.160.75.78.141.1144521044.squirrel@www.be.itu.edu.tr>
References: <1981.160.75.78.141.1144521044.squirrel@www.be.itu.edu.tr>
Message-ID: <a9f269830604081142s2a76b18dhc06959a506886940@mail.gmail.com>

On 4/8/06, buket at be.itu.edu.tr <buket at be.itu.edu.tr> wrote:
>
> Hi,
>
> I have several questions;
>
> Which ordering method does petsc use by default before matrix
> factorization?


None, but external packages may use a default ordering.

How can I change the restarted value of Gmres(m)?


-ksp_gmres_restart

Is there any parallel overhead when I run petsc program on single
> processor (with mprun -np 1)?


Not normally, since all classes are Seq by default.

   Matt

Thank you,
> Best Regards,
> Buket Benek
>
>
>
>
>


--
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060408/4b8250a8/attachment.htm>

From petsc-maint at mcs.anl.gov  Sat Apr  8 14:09:36 2006
From: petsc-maint at mcs.anl.gov (Barry Smith)
Date: Sat, 8 Apr 2006 14:09:36 -0500 (CDT)
Subject: what is default ordering method
In-Reply-To: <a9f269830604081142s2a76b18dhc06959a506886940@mail.gmail.com>
References: <1981.160.75.78.141.1144521044.squirrel@www.be.itu.edu.tr>
 <a9f269830604081142s2a76b18dhc06959a506886940@mail.gmail.com>
Message-ID: <Pine.OSX.4.64.0604081406370.228@barrysmith-57.mcs.anl.gov>


On Sat, 8 Apr 2006, Matthew Knepley wrote:

> On 4/8/06, buket at be.itu.edu.tr <buket at be.itu.edu.tr> wrote:
>>
>> Hi,
>>
>> I have several questions;
>>
>> Which ordering method does petsc use by default before matrix
>> factorization?
>
>

For ILU it is none (i.e. natural) for LU it is nested dissection.

For ICC and Cholesky it is natural (note the reason it is natural
for Cholesky is because of the storage of only the upper triangular
reordering is expensive.)

   Barry

> None, but external packages may use a default ordering.
>
> How can I change the restarted value of Gmres(m)?
>
>
> -ksp_gmres_restart
>
> Is there any parallel overhead when I run petsc program on single
>> processor (with mprun -np 1)?
>
>
> Not normally, since all classes are Seq by default.
>
>   Matt
>
> Thank you,
>> Best Regards,
>> Buket Benek
>>
>>
>>
>>
>>
>
>
> --
> "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
> Guiness
>


From abdul-rahman at tu-harburg.de  Tue Apr 11 11:33:18 2006
From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de)
Date: Tue, 11 Apr 2006 18:33:18 +0200 (METDST)
Subject: Q about matsingle
Message-ID: <Pine.HPX.4.53.0604111825510.15246@para1.rz.tu-harburg.de>

Dear all,

if I were to compute in single precision complex, should I configure

--with-precision=single or matsingle, or both?

Thank you,

Regards,

Razi


From balay at mcs.anl.gov  Tue Apr 11 12:17:46 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 11 Apr 2006 12:17:46 -0500 (CDT)
Subject: Q about matsingle
In-Reply-To: <Pine.HPX.4.53.0604111825510.15246@para1.rz.tu-harburg.de>
References: <Pine.HPX.4.53.0604111825510.15246@para1.rz.tu-harburg.de>
Message-ID: <Pine.LNX.4.64.0604111137280.6382@asterix>

That would be --with-precision=single. However PETSc currently doesn't
compile in this mode.

Satish

On Tue, 11 Apr 2006, abdul-rahman at tu-harburg.de wrote:

> Dear all,
> 
> if I were to compute in single precision complex, should I configure
> 
> --with-precision=single or matsingle, or both?
> 
> Thank you,
> 
> Regards,
> 
> Razi
> 
> 


From abdul-rahman at tu-harburg.de  Tue Apr 11 12:26:22 2006
From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de)
Date: Tue, 11 Apr 2006 19:26:22 +0200 (METDST)
Subject: Q about matsingle
In-Reply-To: <Pine.LNX.4.64.0604111137280.6382@asterix>
References: <Pine.HPX.4.53.0604111825510.15246@para1.rz.tu-harburg.de>
 <Pine.LNX.4.64.0604111137280.6382@asterix>
Message-ID: <Pine.HPX.4.53.0604111923430.15914@para1.rz.tu-harburg.de>


On Tue, 11 Apr 2006, Satish Balay wrote:

> That would be --with-precision=single. However PETSc currently doesn't
> compile in this mode.
>
> Satish

Thanks for telling me. saves me from compiling. Out of curiousity, what is
matsingle for? Is it also not usable?


Razi

> On Tue, 11 Apr 2006, abdul-rahman at tu-harburg.de wrote:
>
> > Dear all,
> >
> > if I were to compute in single precision complex, should I configure
> >
> > --with-precision=single or matsingle, or both?
> >
> > Thank you,
> >
> > Regards,
> >
> > Razi
> >
> >
>
>


From balay at mcs.anl.gov  Tue Apr 11 12:45:59 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 11 Apr 2006 12:45:59 -0500 (CDT)
Subject: Q about matsingle
In-Reply-To: <Pine.HPX.4.53.0604111923430.15914@para1.rz.tu-harburg.de>
References: <Pine.HPX.4.53.0604111825510.15246@para1.rz.tu-harburg.de>
 <Pine.LNX.4.64.0604111137280.6382@asterix> <Pine.HPX.4.53.0604111923430.15914@para1.rz.tu-harburg.de>
Message-ID: <Pine.LNX.4.64.0604111235281.6382@asterix>

On Tue, 11 Apr 2006, abdul-rahman at tu-harburg.de wrote:

> 
> 
> On Tue, 11 Apr 2006, Satish Balay wrote:
> 
> > That would be --with-precision=single. However PETSc currently doesn't
> > compile in this mode.
> >
> > Satish
> 
> Thanks for telling me. saves me from compiling. Out of curiousity, what is
> matsingle for? Is it also not usable?

Even matsingle is not tested for complex mode. [Both
--with-precision=single,matsingle should work for the default
--with-scalar-type=real mode]

matsingle mode stores the matrix in single precision - but all
operations are done in double precision.  This is a performance
optimization mode [the performance gain is from the lower memory
bandwidth requirements of this single precision matrix storage].

Satish


From letian.wang at ghiocel-tech.com  Wed Apr 12 17:31:46 2006
From: letian.wang at ghiocel-tech.com (Letian Wang)
Date: Wed, 12 Apr 2006 18:31:46 -0400
Subject: get the preconditioner matrix
Message-ID: <000001c65e80$e1e51770$0b00a8c0@lele>

Dear all,

 
Is it possible to obtain the PETSc pre-conditioner matrix and output it to a
file? How can I do that?  Thanks.

 
Letian Wang

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060412/c23fc82c/attachment.htm>

From bsmith at mcs.anl.gov  Wed Apr 12 19:55:34 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 12 Apr 2006 19:55:34 -0500 (CDT)
Subject: get the preconditioner matrix
In-Reply-To: <000001c65e80$e1e51770$0b00a8c0@lele>
References: <000001c65e80$e1e51770$0b00a8c0@lele>
Message-ID: <Pine.OSX.4.64.0604121949480.2787@barrys-computer.local>


   Letian,

   What do you mean be "pre-conditioner matrix"? It is very rare
that a preconditioner is explicitly represented as a matrix; it is almost
always just some code that applies the operator. In general an explicitly
represented preconditioner would actually be a dense matrix.

   If you truly want this dense matrix you can call PCComputeExplicitOperator()
and store the resulting matrix to a file.

    Barry


On Wed, 12 Apr 2006, Letian Wang wrote:

> Dear all,
>
>
>
> Is it possible to obtain the PETSc pre-conditioner matrix and output it to a
> file? How can I do that?  Thanks.
>
>
>
> Letian Wang
>
>
>
>


From abdul-rahman at tu-harburg.de  Thu Apr 13 04:57:32 2006
From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de)
Date: Thu, 13 Apr 2006 11:57:32 +0200 (METDST)
Subject: petsc-2.3.1 on FC4 with Intel Compilers
Message-ID: <Pine.HPX.4.53.0604131148000.8212@para1.rz.tu-harburg.de>

Dear all,

I'd like to know if anyone has successfully built petsc 2.3.1-p12 on
Fedora Core 4 with the Intel compilers package (icc and ifort 9.0 Build
20051201Z ). I used to have problems in building the complex type
but haven't really looked into it. Real-valued is perfect though.

Many thanks,


Razi


From bsmith at mcs.anl.gov  Thu Apr 13 11:50:37 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 13 Apr 2006 11:50:37 -0500 (CDT)
Subject: get the preconditioner matrix
In-Reply-To: <000601c65f05$154b7620$0b00a8c0@lele>
References: <000601c65f05$154b7620$0b00a8c0@lele>
Message-ID: <Pine.OSX.4.64.0604131143580.2787@barrys-computer.local>


   Again, this does not make sense. Prometheus as a dense matrix
will 1) require much much to much memory and 2) take much to
long to apply O(n^2) flops. The whole idea behind multilevel
methods is to be roughly order O(n) to apply.

   If Mark has provided a PCView() and PCLoad() for Prometheus
then it could be saved and reused (but it would not be saved
as a dense matrix), BUT I don't think Mark has done this (plus
it would very likely require rerunning on the same number of
processors).

   You just need to calculate the preconditioner for each time
you run the program.

    Barry

On Thu, 13 Apr 2006, Letian Wang wrote:

> Barry:
>
> Thank you for your reply. What I want to do is to use PETSc for
> optimization. I use Prometheus pre-conditioner to solve the initial problem.
> Usually it spends much time on getting the pre-conditioner, then the
> iterations are relatively going faster. I'm thinking to save the
> pre-conditioner matrix for the initial problem (Now I know I can do that by
> PCComputeExplicitOperator), then for other very similar problems, I can pass
> the saved pre-conditioner and apply them directly to the new problem. Do you
> think it will work?
>
> Is there any routine to apply the saved matrix as pre-conditioner? Or I have
> to program an user-defined PC routine?
>
> Thanks.
>
> Letian
>
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov]
> On Behalf Of Barry Smith
> Sent: Wednesday, April 12, 2006 7:56 PM
> To: petsc-users at mcs.anl.gov
> Subject: Re: get the preconditioner matrix
>
>
>   Letian,
>
>   What do you mean be "pre-conditioner matrix"? It is very rare
> that a preconditioner is explicitly represented as a matrix; it is almost
> always just some code that applies the operator. In general an explicitly
> represented preconditioner would actually be a dense matrix.
>
>   If you truly want this dense matrix you can call
> PCComputeExplicitOperator()
> and store the resulting matrix to a file.
>
>    Barry
>
>
> On Wed, 12 Apr 2006, Letian Wang wrote:
>
>> Dear all,
>>
>>
>>
>> Is it possible to obtain the PETSc pre-conditioner matrix and output it to
> a
>> file? How can I do that?  Thanks.
>>
>>
>>
>> Letian Wang
>>
>>
>>
>>
>
>
>


From balay at mcs.anl.gov  Thu Apr 13 14:06:33 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Thu, 13 Apr 2006 14:06:33 -0500 (CDT)
Subject: petsc-2.3.1 on FC4 with Intel Compilers
In-Reply-To: <Pine.HPX.4.53.0604131148000.8212@para1.rz.tu-harburg.de>
References: <Pine.HPX.4.53.0604131148000.8212@para1.rz.tu-harburg.de>
Message-ID: <Pine.LNX.4.64.0604131403460.4614@asterix>

What problems?

Send us the logs at petsc-maint at mcs.anl.gov - and we can take
a look at them.

Satish

On Thu, 13 Apr 2006, abdul-rahman at tu-harburg.de wrote:

> Dear all,
> 
> I'd like to know if anyone has successfully built petsc 2.3.1-p12 on
> Fedora Core 4 with the Intel compilers package (icc and ifort 9.0 Build
> 20051201Z ). I used to have problems in building the complex type
> but haven't really looked into it. Real-valued is perfect though.
> 
> Many thanks,
> 
> 
> 
> Razi
> 
> 


From adams at pppl.gov  Thu Apr 13 19:40:42 2006
From: adams at pppl.gov (Mark Adams)
Date: Thu, 13 Apr 2006 20:40:42 -0400
Subject: get the preconditioner matrix
In-Reply-To: <Pine.OSX.4.64.0604131143580.2787@barrys-computer.local>
References: <000601c65f05$154b7620$0b00a8c0@lele> <Pine.OSX.4.64.0604131143580.2787@barrys-computer.local>
Message-ID: <BDC8826B-8272-4927-85DA-F74D71E48135@pppl.gov>

Letian,

I think what you are asking for is the capacity to save the state of  
the PC after setup which I have not implemented.  As Barry said,  
saving the explicit operator would not be practical and I don't know  
of a linear solver that provides for saving the state in the way that  
I think you are asking (but it would be more natural and easy for a  
direct solver to save factors).  But you could keep using the same PC  
for different problems - at your own risk of course - in one run of  
your code.

Mark

On Apr 13, 2006, at 12:50 PM, Barry Smith wrote:

>
>   Again, this does not make sense. Prometheus as a dense matrix
> will 1) require much much to much memory and 2) take much to
> long to apply O(n^2) flops. The whole idea behind multilevel
> methods is to be roughly order O(n) to apply.
>
>   If Mark has provided a PCView() and PCLoad() for Prometheus
> then it could be saved and reused (but it would not be saved
> as a dense matrix), BUT I don't think Mark has done this (plus
> it would very likely require rerunning on the same number of
> processors).
>
>   You just need to calculate the preconditioner for each time
> you run the program.
>
>    Barry
>
> On Thu, 13 Apr 2006, Letian Wang wrote:
>
>> Barry:
>>
>> Thank you for your reply. What I want to do is to use PETSc for
>> optimization. I use Prometheus pre-conditioner to solve the  
>> initial problem.
>> Usually it spends much time on getting the pre-conditioner, then the
>> iterations are relatively going faster. I'm thinking to save the
>> pre-conditioner matrix for the initial problem (Now I know I can  
>> do that by
>> PCComputeExplicitOperator), then for other very similar problems,  
>> I can pass
>> the saved pre-conditioner and apply them directly to the new  
>> problem. Do you
>> think it will work?
>>
>> Is there any routine to apply the saved matrix as pre-conditioner?  
>> Or I have
>> to program an user-defined PC routine?
>>
>> Thanks.
>>
>> Letian
>>
>> -----Original Message-----
>> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc- 
>> users at mcs.anl.gov]
>> On Behalf Of Barry Smith
>> Sent: Wednesday, April 12, 2006 7:56 PM
>> To: petsc-users at mcs.anl.gov
>> Subject: Re: get the preconditioner matrix
>>
>>
>>   Letian,
>>
>>   What do you mean be "pre-conditioner matrix"? It is very rare
>> that a preconditioner is explicitly represented as a matrix; it is  
>> almost
>> always just some code that applies the operator. In general an  
>> explicitly
>> represented preconditioner would actually be a dense matrix.
>>
>>   If you truly want this dense matrix you can call
>> PCComputeExplicitOperator()
>> and store the resulting matrix to a file.
>>
>>    Barry
>>
>>
>> On Wed, 12 Apr 2006, Letian Wang wrote:
>>
>>> Dear all,
>>>
>>>
>>>
>>> Is it possible to obtain the PETSc pre-conditioner matrix and  
>>> output it to
>> a
>>> file? How can I do that?  Thanks.
>>>
>>>
>>>
>>> Letian Wang
>>>
>>>
>>>
>>>
>>
>>
>>

**********************************************************************
Mark Adams Ph.D.                                   Columbia University
289 Engineering Terrace                                        MC 4701
New York NY 10027
adams at pppl.gov                                www.columbia.edu/~ma2325
voice: 212.854.4485                                  fax: 212.854.8257
**********************************************************************


From abdul-rahman at tu-harburg.de  Tue Apr 18 03:26:14 2006
From: abdul-rahman at tu-harburg.de (abdul-rahman at tu-harburg.de)
Date: Tue, 18 Apr 2006 10:26:14 +0200 (METDST)
Subject: petsc-2.3.1 on FC4 with Intel Compilers
In-Reply-To: <Pine.LNX.4.64.0604131403460.4614@asterix>
References: <Pine.HPX.4.53.0604131148000.8212@para1.rz.tu-harburg.de>
 <Pine.LNX.4.64.0604131403460.4614@asterix>
Message-ID: <Pine.HPX.4.53.0604181017420.2798@para1.rz.tu-harburg.de>


On Thu, 13 Apr 2006, Satish Balay wrote:

> What problems?
>
> Send us the logs at petsc-maint at mcs.anl.gov - and we can take
> a look at them.

Thanks Satish. I sent them already. I think the compilation problem has to
do with gcc 4.0.2 c++ headers. I don't get it why PETSc insists on gcc
headers and not icc 9.0, although I did point to the intel compilers in my
config.


Razi


From knepley at gmail.com  Tue Apr 18 08:46:22 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 18 Apr 2006 08:46:22 -0500
Subject: petsc-2.3.1 on FC4 with Intel Compilers
In-Reply-To: <Pine.HPX.4.53.0604181017420.2798@para1.rz.tu-harburg.de>
References: <Pine.HPX.4.53.0604131148000.8212@para1.rz.tu-harburg.de>
	 <Pine.LNX.4.64.0604131403460.4614@asterix>
	 <Pine.HPX.4.53.0604181017420.2798@para1.rz.tu-harburg.de>
Message-ID: <a9f269830604180646j150cb6faq549b166f635b8d73@mail.gmail.com>

I can't find any logs. However, we do not require any specific headers, and
use intel compilers on a lot of our machines. This sounds like a compiler
installation problem.

   Matt

On 4/18/06, abdul-rahman at tu-harburg.de <abdul-rahman at tu-harburg.de> wrote:
>
>
>
> On Thu, 13 Apr 2006, Satish Balay wrote:
>
> > What problems?
> >
> > Send us the logs at petsc-maint at mcs.anl.gov - and we can take
> > a look at them.
>
> Thanks Satish. I sent them already. I think the compilation problem has to
> do with gcc 4.0.2 c++ headers. I don't get it why PETSc insists on gcc
> headers and not icc 9.0, although I did point to the intel compilers in my
> config.
>
>
> Razi
>
>
>


--
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060418/6d573e0f/attachment.htm>

From shma7099 at student.uu.se  Tue Apr 18 09:09:29 2006
From: shma7099 at student.uu.se (Sh.M)
Date: Tue, 18 Apr 2006 16:09:29 +0200 (MEST)
Subject: My settings are overriden when I call the solver more than once?
Message-ID: <Pine.SOL.4.53.0604181544200.10258@vega.it.uu.se>

Hi all,

I am solving a matrix with boomerAMG, and I set the boomerAMG settings
thru the console. The matrix(it is a preconditioner matrix) is solved
several times... I set the boomerAMG preconditioner/solver to have a
convergence tolerance of 1.0e-06 and max number of iterations 4. I print out boomerAMG status to
verify that it works as it should.. But for some reason it doesnt..

Here is a print from boomerAMG when it starts, I guess when it constructs
the boomerAMG solver/preconditioner:

.
.
.
.
.
.
BoomerAMG SOLVER PARAMETERS:

  Maximum number of cycles:         4
  Stopping Tolerance:               1.000000e-06
  Cycle type (1 = V, 2 = W, etc.):  1

  Relaxation Parameters:
   Visiting Grid:                     fine  down   up  coarse
            Number of partial sweeps:   1     1    1     1
   Type 0=Jac, 1=GS, 3=Hybrid 9=GE:     3     3    3     9
   Point types, partial sweeps (1=C, -1=F):
                               Finest grid:   1  -1
                  Pre-CG relaxation (down):   1  -1
                   Post-CG relaxation (up):  -1   1
                             Coarsest grid:   0


Immediately after the above message, wich is when it starts to
solve, it changes its parameters to this:


BoomerAMG SOLVER PARAMETERS:

  Maximum number of cycles:         10000
  Stopping Tolerance:               1.000000e-05
  Cycle type (1 = V, 2 = W, etc.):  1

  Relaxation Parameters:
   Visiting Grid:                     fine  down   up  coarse
            Number of partial sweeps:   1     1    1     1
   Type 0=Jac, 1=GS, 3=Hybrid 9=GE:     3     3    3     9
   Point types, partial sweeps (1=C, -1=F):
                               Finest grid:   1  -1
                  Pre-CG relaxation (down):   1  -1
                   Post-CG relaxation (up):  -1   1
                             Coarsest grid:   0
.
.
.
.
.
.


As you see my settings have been changed.

I have used boomerAMG in the past, not exactly this way.,... but my
settings were preserved before and after solve.

here is what I run from the command line:

mprun -np 1 petscSolver
-a ../hypre/data/nr_hvl40.csr
-b ../hypre/data/nr_hvl40.csr.blockMatrix -inner_ksp_type richardson
-inner_pc_type hypre -inner_pc_hypre_type boomeramg
-inner_pc_hypre_boomeramg_max_iter 4
-inner_pc_hypre_boomeramg_tol 1.0e-06
-ksp_monitor
-ksp_type gmres
-inner_pc_hypre_boomeramg_print_statistics


And a piece of code:

.
.
.
.
.
.
BLOCK_PC  user_pc;
,
,
,
,
,
.
KSPSetOptionsPrefix(user_pc.block_solver,"inner_");
KSPSetOperators(user_pc.block_solver,user_pc.M,user_pc.M,DIFFERENT_NONZERO_PATTERN);
KSPSetFromOptions(user_pc.block_solver);


I am running on solaris64.

With best regards, Shaman Mahmoudi


From bsmith at mcs.anl.gov  Tue Apr 18 12:16:47 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 18 Apr 2006 12:16:47 -0500 (CDT)
Subject: My settings are overriden when I call the solver more than once?
In-Reply-To: <Pine.SOL.4.53.0604181544200.10258@vega.it.uu.se>
References: <Pine.SOL.4.53.0604181544200.10258@vega.it.uu.se>
Message-ID: <Pine.OSX.4.64.0604181157200.5943@barrysmith-57.mcs.anl.gov>


    What controls the iteration max and tolerance for Richardson 
solver is -inner_ksp_max_its <its> and -inner_ksp_rtol <rtol>
This is true for any preconditioner including hypre boomeramg.
The defaults are 10000 and 1.e-5.

The problem is that the options
-inner_pc_hypre_boomeramg_max_iter 4
-inner_pc_hypre_boomeramg_tol 1.0e-06
sure look like THEY should control this. Hence the confusion.

Here is the issue, pc_hypre_boomeramg_max_iter and pc_hypre_boomeramg_tol are
suppose to control (with PETSc) how many iterations boomeramg does WITHIN a 
single call to ITS solve. Say we used a ksp of gmres and a pc_hypre_boomeramg_max_iter 4
this means we are using preconditioned gmres with a preconditioner of FOUR V cycles
of boomeramg (this is different then using preconditioned gmres with a preconditioner of
ONE cycle). Completely possible and maybe a reasonable thing to do.

The error in PETSc is that paragraph three above does NOT hold for Richardsons method
(but it does hold for all other KSP methods). This is because Richardson's method
has a "special method" PCApplyRichardson() that avoids the overhead of explicitly
applying Richardson's method (see src/ksp/ksp/impls/rich/rich.c). I did not handle
this properly when I wrote the PCApplyRichardson_BoomerAMG. I will fix this error.

Anyways, the short answer is when using Richardson with hypre 
use -inner_ksp_max_its <its> and -inner_ksp_rtol <rtol> to control 
the iterations and tolerance, NOT 
-inner_pc_hypre_boomeramg_max_iter
and -inner_pc_hypre_boomeramg_tol
I will attempt to make the docs clearer.


    Barry


On Tue, 18 Apr 2006, Sh.M wrote:

> Hi all,
>
> I am solving a matrix with boomerAMG, and I set the boomerAMG settings
> thru the console. The matrix(it is a preconditioner matrix) is solved
> several times... I set the boomerAMG preconditioner/solver to have a
> convergence tolerance of 1.0e-06 and max number of iterations 4. I print out boomerAMG status to
> verify that it works as it should.. But for some reason it doesnt..
>
> Here is a print from boomerAMG when it starts, I guess when it constructs
> the boomerAMG solver/preconditioner:
>
> .
> .
> .
> .
> .
> .
> BoomerAMG SOLVER PARAMETERS:
>
>  Maximum number of cycles:         4
>  Stopping Tolerance:               1.000000e-06
>  Cycle type (1 = V, 2 = W, etc.):  1
>
>  Relaxation Parameters:
>   Visiting Grid:                     fine  down   up  coarse
>            Number of partial sweeps:   1     1    1     1
>   Type 0=Jac, 1=GS, 3=Hybrid 9=GE:     3     3    3     9
>   Point types, partial sweeps (1=C, -1=F):
>                               Finest grid:   1  -1
>                  Pre-CG relaxation (down):   1  -1
>                   Post-CG relaxation (up):  -1   1
>                             Coarsest grid:   0
>
>
>
> Immediately after the above message, wich is when it starts to
> solve, it changes its parameters to this:
>
>
> BoomerAMG SOLVER PARAMETERS:
>
>  Maximum number of cycles:         10000
>  Stopping Tolerance:               1.000000e-05
>  Cycle type (1 = V, 2 = W, etc.):  1
>
>  Relaxation Parameters:
>   Visiting Grid:                     fine  down   up  coarse
>            Number of partial sweeps:   1     1    1     1
>   Type 0=Jac, 1=GS, 3=Hybrid 9=GE:     3     3    3     9
>   Point types, partial sweeps (1=C, -1=F):
>                               Finest grid:   1  -1
>                  Pre-CG relaxation (down):   1  -1
>                   Post-CG relaxation (up):  -1   1
>                             Coarsest grid:   0
> .
> .
> .
> .
> .
> .
>
>
> As you see my settings have been changed.
>
> I have used boomerAMG in the past, not exactly this way.,... but my
> settings were preserved before and after solve.
>
> here is what I run from the command line:
>
> mprun -np 1 petscSolver
> -a ../hypre/data/nr_hvl40.csr
> -b ../hypre/data/nr_hvl40.csr.blockMatrix -inner_ksp_type richardson
> -inner_pc_type hypre -inner_pc_hypre_type boomeramg
> -inner_pc_hypre_boomeramg_max_iter 4
> -inner_pc_hypre_boomeramg_tol 1.0e-06
> -ksp_monitor
> -ksp_type gmres
> -inner_pc_hypre_boomeramg_print_statistics
>
>
> And a piece of code:
>
> .
> .
> .
> .
> .
> .
> BLOCK_PC  user_pc;
> ,
> ,
> ,
> ,
> ,
> .
> KSPSetOptionsPrefix(user_pc.block_solver,"inner_");
> KSPSetOperators(user_pc.block_solver,user_pc.M,user_pc.M,DIFFERENT_NONZERO_PATTERN);
> KSPSetFromOptions(user_pc.block_solver);
>
>
> I am running on solaris64.
>
> With best regards, Shaman Mahmoudi
>
>


From shma7099 at student.uu.se  Thu Apr 20 05:21:43 2006
From: shma7099 at student.uu.se (Sh.M)
Date: Thu, 20 Apr 2006 12:21:43 +0200 (MEST)
Subject: My settings are overriden when I call the solver more than once?
In-Reply-To: <Pine.OSX.4.64.0604181157200.5943@barrysmith-57.mcs.anl.gov>
References: <Pine.SOL.4.53.0604181544200.10258@vega.it.uu.se>
 <Pine.OSX.4.64.0604181157200.5943@barrysmith-57.mcs.anl.gov>
Message-ID: <Pine.SOL.4.53.0604201221090.3606@vega.it.uu.se>

Hi,

Thanks for the thorough explanation!

With best regards, Shaman Mahmoudi

On Tue, 18 Apr 2006, Barry Smith wrote:

>
>     What controls the iteration max and tolerance for Richardson
> solver is -inner_ksp_max_its <its> and -inner_ksp_rtol <rtol>
> This is true for any preconditioner including hypre boomeramg.
> The defaults are 10000 and 1.e-5.
>
> The problem is that the options
> -inner_pc_hypre_boomeramg_max_iter 4
> -inner_pc_hypre_boomeramg_tol 1.0e-06
> sure look like THEY should control this. Hence the confusion.
>
> Here is the issue, pc_hypre_boomeramg_max_iter and pc_hypre_boomeramg_tol are
> suppose to control (with PETSc) how many iterations boomeramg does WITHIN a
> single call to ITS solve. Say we used a ksp of gmres and a pc_hypre_boomeramg_max_iter 4
> this means we are using preconditioned gmres with a preconditioner of FOUR V cycles
> of boomeramg (this is different then using preconditioned gmres with a preconditioner of
> ONE cycle). Completely possible and maybe a reasonable thing to do.
>
> The error in PETSc is that paragraph three above does NOT hold for Richardsons method
> (but it does hold for all other KSP methods). This is because Richardson's method
> has a "special method" PCApplyRichardson() that avoids the overhead of explicitly
> applying Richardson's method (see src/ksp/ksp/impls/rich/rich.c). I did not handle
> this properly when I wrote the PCApplyRichardson_BoomerAMG. I will fix this error.
>
> Anyways, the short answer is when using Richardson with hypre
> use -inner_ksp_max_its <its> and -inner_ksp_rtol <rtol> to control
> the iterations and tolerance, NOT
> -inner_pc_hypre_boomeramg_max_iter
> and -inner_pc_hypre_boomeramg_tol
> I will attempt to make the docs clearer.
>
>
>     Barry
>
>
>
>
>
> On Tue, 18 Apr 2006, Sh.M wrote:
>
> > Hi all,
> >
> > I am solving a matrix with boomerAMG, and I set the boomerAMG settings
> > thru the console. The matrix(it is a preconditioner matrix) is solved
> > several times... I set the boomerAMG preconditioner/solver to have a
> > convergence tolerance of 1.0e-06 and max number of iterations 4. I print out boomerAMG status to
> > verify that it works as it should.. But for some reason it doesnt..
> >
> > Here is a print from boomerAMG when it starts, I guess when it constructs
> > the boomerAMG solver/preconditioner:
> >
> > .
> > .
> > .
> > .
> > .
> > .
> > BoomerAMG SOLVER PARAMETERS:
> >
> >  Maximum number of cycles:         4
> >  Stopping Tolerance:               1.000000e-06
> >  Cycle type (1 = V, 2 = W, etc.):  1
> >
> >  Relaxation Parameters:
> >   Visiting Grid:                     fine  down   up  coarse
> >            Number of partial sweeps:   1     1    1     1
> >   Type 0=Jac, 1=GS, 3=Hybrid 9=GE:     3     3    3     9
> >   Point types, partial sweeps (1=C, -1=F):
> >                               Finest grid:   1  -1
> >                  Pre-CG relaxation (down):   1  -1
> >                   Post-CG relaxation (up):  -1   1
> >                             Coarsest grid:   0
> >
> >
> >
> > Immediately after the above message, wich is when it starts to
> > solve, it changes its parameters to this:
> >
> >
> > BoomerAMG SOLVER PARAMETERS:
> >
> >  Maximum number of cycles:         10000
> >  Stopping Tolerance:               1.000000e-05
> >  Cycle type (1 = V, 2 = W, etc.):  1
> >
> >  Relaxation Parameters:
> >   Visiting Grid:                     fine  down   up  coarse
> >            Number of partial sweeps:   1     1    1     1
> >   Type 0=Jac, 1=GS, 3=Hybrid 9=GE:     3     3    3     9
> >   Point types, partial sweeps (1=C, -1=F):
> >                               Finest grid:   1  -1
> >                  Pre-CG relaxation (down):   1  -1
> >                   Post-CG relaxation (up):  -1   1
> >                             Coarsest grid:   0
> > .
> > .
> > .
> > .
> > .
> > .
> >
> >
> > As you see my settings have been changed.
> >
> > I have used boomerAMG in the past, not exactly this way.,... but my
> > settings were preserved before and after solve.
> >
> > here is what I run from the command line:
> >
> > mprun -np 1 petscSolver
> > -a ../hypre/data/nr_hvl40.csr
> > -b ../hypre/data/nr_hvl40.csr.blockMatrix -inner_ksp_type richardson
> > -inner_pc_type hypre -inner_pc_hypre_type boomeramg
> > -inner_pc_hypre_boomeramg_max_iter 4
> > -inner_pc_hypre_boomeramg_tol 1.0e-06
> > -ksp_monitor
> > -ksp_type gmres
> > -inner_pc_hypre_boomeramg_print_statistics
> >
> >
> > And a piece of code:
> >
> > .
> > .
> > .
> > .
> > .
> > .
> > BLOCK_PC  user_pc;
> > ,
> > ,
> > ,
> > ,
> > ,
> > .
> > KSPSetOptionsPrefix(user_pc.block_solver,"inner_");
> > KSPSetOperators(user_pc.block_solver,user_pc.M,user_pc.M,DIFFERENT_NONZERO_PATTERN);
> > KSPSetFromOptions(user_pc.block_solver);
> >
> >
> > I am running on solaris64.
> >
> > With best regards, Shaman Mahmoudi
> >
> >
>
>


From letian.wang at ghiocel-tech.com  Mon Apr 24 11:56:36 2006
From: letian.wang at ghiocel-tech.com (Letian Wang)
Date: Mon, 24 Apr 2006 12:56:36 -0400
Subject: How to loop Petsc in Fortran?
Message-ID: <000201c667c0$0c716e60$0b00a8c0@lele>

Dear All:

 
Question 1):

 
For an optimization task, I need to loop Petsc (I'm using Petsc-2.3.0). But
I had problems to reinitialize Petsc after finalize, here is a simple
FORTRAN program to explain my problem:

 
      program petsc_test

#     include "include/finclude/petsc.h"

      call PetscInitialize(PETSC_NULL_CHARACTER, ierr)

      call PetscFinalize(ierr)

      print*,'ierr=',ierr

      call PetscInitialize(PETSC_NULL_CHARACTER, ierr)

      call PetscFinalize(ierr)

      end

 
When the program excuted to the second PetscInitial line, it shows error
message: "Error encountered before initializing MPICH". Can anyone help me
on this? Thanks.

 
Question 2):

 
Follow up my previous question, I also tried to Initiallize and Finalize
Petsc only once and perform the do-loop between Petscinitialize and
PetscFinalize. I used KSP CR solver with Prometheus PCs to solver large
linear equations. After several loops, the program was interrupted by
segmentation violation error. I suppose there was a memory leak somewhere.
The error message is like this: Any suggestion for this? Thanks

 
*********Doing job -- nosort0001

 
Task No.    1    Total CPU=       52.3    

 ---------------------------------------------------

 
 *********Doing job -- nosort0002

 
Task No.    2    Total CPU=       52.1    

 ---------------------------------------------------

 
 *********Doing job -- nosort0003

--------------------------------------------------------------------------

Petsc Release Version 2.3.0, Patch 44, April, 26, 2005

See docs/changes/index.html for recent updates.

See docs/faq.html for hints about trouble shooting.

See docs/index.html for manual pages.

-----------------------------------------------------------------------

../feap on a linux-gnu named GPTnode3.cl.ghiocel-tech.com by ltwang Mon Apr
24 15:25:04 2006

Libraries linked from /home/ltwang/Library/petsc-2.3.0/lib/linux-gnu

Configure run at Tue Mar 14 11:19:49 2006

Configure options --with-mpi-dir=/usr --with-debugging=0
--download-spooles=1 --download-f-blas-lapack=1 --download-parmetis=1
--download-prometheus=1 --with-shared=0

-----------------------------------------------------------------------

[1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
probably memory access out of range

[1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger

[1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
run 

[1]PETSC ERROR: to get more information on the crash.

[1]PETSC ERROR: User provided function() line 0 in unknown directory unknown
file

[1]PETSC ERROR: Signal received!

[1]PETSC ERROR:  !

[cli_1]: aborting job:

application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1

[cli_0]: aborting job:

Fatal error in MPI_Allgather: Other MPI error, error stack:

MPI_Allgather(949)........................: MPI_Allgather(sbuf=0xbffeea14,
scount=1, MPI_INT, rbuf=0x8bf0a0c, rcount=1, MPI_INT, comm=0x84000000)
failed

MPIR_Allgather(180).......................: 

MPIC_Sendrecv(161)........................: 

MPIC_Wait(321)............................: 

MPIDI_CH3_Progress_wait(199)..............: an error occurred while handling
an event returned by MPIDU_Sock_Wait()

MPIDI_CH3I_Progress_handle_sock_event(422): 

MPIDU_Socki_handle_read(649)..............: connection failure
(set=0,sock=2,errno=104:(strerror() not found))

rank 1 in job 477  GPTMaster_53830   caused collective abort of all ranks

  exit status of rank 1: return code 59 

 
Letian 

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060424/d5957cda/attachment.htm>

From knepley at gmail.com  Mon Apr 24 12:01:36 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 24 Apr 2006 12:01:36 -0500
Subject: How to loop Petsc in Fortran?
In-Reply-To: <000201c667c0$0c716e60$0b00a8c0@lele>
References: <000201c667c0$0c716e60$0b00a8c0@lele>
Message-ID: <a9f269830604241001r527d006bv23f530365b1022c5@mail.gmail.com>

On 4/24/06, Letian Wang <letian.wang at ghiocel-tech.com> wrote:
>
>  Dear All:
>
>
>
> Question 1):
>
>
>
> For an optimization task, I need to loop Petsc (I'm using Petsc-2.3.0).
> But I had problems to reinitialize Petsc after finalize, here is a simple
> FORTRAN program to explain my problem:
>
It is not possible to call MPI_Init() after an MPI_Finalize(). Therefore you
should only call PetscInitialize/Finalize() once.


> Question 2):
>
>
>
> Follow up my previous question, I also tried to Initiallize and Finalize
> Petsc only once and perform the do-loop between Petscinitialize and
> PetscFinalize. I used KSP CR solver with Prometheus PCs to solver large
> linear equations. After several loops, the program was interrupted by
> segmentation violation error. I suppose there was a memory leak somewhere.
> The error message is like this: Any suggestion for this? Thanks
>

This is a memory corruption problem. Use the debugger (-start_in_debugger)
to get a stack trace so at
least we know where  the SEGV is occurring. Then we can try to fix it.

   Thanks,

      Matt

> *********Doing job -- nosort0001
>
>
>
> Task No.    1    Total CPU=       52.3
>
>  ---------------------------------------------------
>
>
>
>  *********Doing job -- nosort0002
>
>
>
> Task No.    2    Total CPU=       52.1
>
>  ---------------------------------------------------
>
>
>
>  *********Doing job -- nosort0003
>
> --------------------------------------------------------------------------
>
> Petsc Release Version 2.3.0, Patch 44, April, 26, 2005
>
> See docs/changes/index.html for recent updates.
>
> See docs/faq.html for hints about trouble shooting.
>
> See docs/index.html for manual pages.
>
> -----------------------------------------------------------------------
>
> ../feap on a linux-gnu named GPTnode3.cl.ghiocel-tech.com by ltwang Mon
> Apr 24 15:25:04 2006
>
> Libraries linked from /home/ltwang/Library/petsc-2.3.0/lib/linux-gnu
>
> Configure run at Tue Mar 14 11:19:49 2006
>
> Configure options --with-mpi-dir=/usr --with-debugging=0
> --download-spooles=1 --download-f-blas-lapack=1 --download-parmetis=1
> --download-prometheus=1 --with-shared=0
>
> -----------------------------------------------------------------------
>
> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
>
> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>
> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
> run
>
> [1]PETSC ERROR: to get more information on the crash.
>
> [1]PETSC ERROR: User provided function() line 0 in unknown directory
> unknown file
>
> [1]PETSC ERROR: Signal received!
>
> [1]PETSC ERROR:  !
>
> [cli_1]: aborting job:
>
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1
>
> [cli_0]: aborting job:
>
> Fatal error in MPI_Allgather: Other MPI error, error stack:
>
> MPI_Allgather(949)........................: MPI_Allgather(sbuf=0xbffeea14,
> scount=1, MPI_INT, rbuf=0x8bf0a0c, rcount=1, MPI_INT, comm=0x84000000)
> failed
>
> MPIR_Allgather(180).......................:
>
> MPIC_Sendrecv(161)........................:
>
> MPIC_Wait(321)............................:
>
> MPIDI_CH3_Progress_wait(199)..............: an error occurred while
> handling an event returned by MPIDU_Sock_Wait()
>
> MPIDI_CH3I_Progress_handle_sock_event(422):
>
> MPIDU_Socki_handle_read(649)..............: connection failure
> (set=0,sock=2,errno=104:(strerror() not found))
>
> rank 1 in job 477  GPTMaster_53830   caused collective abort of all ranks
>
>   exit status of rank 1: return code 59
>
>
>
>
>
>
>
> Letian
>
--
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060424/624ca46b/attachment.htm>

From bsmith at mcs.anl.gov  Mon Apr 24 13:02:16 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 24 Apr 2006 13:02:16 -0500 (CDT)
Subject: How to loop Petsc in Fortran?
In-Reply-To: <a9f269830604241001r527d006bv23f530365b1022c5@mail.gmail.com>
References: <000201c667c0$0c716e60$0b00a8c0@lele>
 <a9f269830604241001r527d006bv23f530365b1022c5@mail.gmail.com>
Message-ID: <Pine.OSX.4.64.0604241301260.5943@barrysmith-57.mcs.anl.gov>


    Question 2) You should also use http://valgrind.org/ to determine
where the memory corruption is taking place.

    Barry


On Mon, 24 Apr 2006, Matthew Knepley wrote:

> On 4/24/06, Letian Wang <letian.wang at ghiocel-tech.com> wrote:
>>
>>  Dear All:
>>
>>
>>
>> Question 1):
>>
>>
>>
>> For an optimization task, I need to loop Petsc (I'm using Petsc-2.3.0).
>> But I had problems to reinitialize Petsc after finalize, here is a simple
>> FORTRAN program to explain my problem:
>>
> It is not possible to call MPI_Init() after an MPI_Finalize(). Therefore you
> should only call PetscInitialize/Finalize() once.
>
>
>> Question 2):
>>
>>
>>
>> Follow up my previous question, I also tried to Initiallize and Finalize
>> Petsc only once and perform the do-loop between Petscinitialize and
>> PetscFinalize. I used KSP CR solver with Prometheus PCs to solver large
>> linear equations. After several loops, the program was interrupted by
>> segmentation violation error. I suppose there was a memory leak somewhere.
>> The error message is like this: Any suggestion for this? Thanks
>>
>
> This is a memory corruption problem. Use the debugger (-start_in_debugger)
> to get a stack trace so at
> least we know where  the SEGV is occurring. Then we can try to fix it.
>
>   Thanks,
>
>      Matt
>
>> *********Doing job -- nosort0001
>>
>>
>>
>> Task No.    1    Total CPU=       52.3
>>
>>  ---------------------------------------------------
>>
>>
>>
>>  *********Doing job -- nosort0002
>>
>>
>>
>> Task No.    2    Total CPU=       52.1
>>
>>  ---------------------------------------------------
>>
>>
>>
>>  *********Doing job -- nosort0003
>>
>> --------------------------------------------------------------------------
>>
>> Petsc Release Version 2.3.0, Patch 44, April, 26, 2005
>>
>> See docs/changes/index.html for recent updates.
>>
>> See docs/faq.html for hints about trouble shooting.
>>
>> See docs/index.html for manual pages.
>>
>> -----------------------------------------------------------------------
>>
>> ../feap on a linux-gnu named GPTnode3.cl.ghiocel-tech.com by ltwang Mon
>> Apr 24 15:25:04 2006
>>
>> Libraries linked from /home/ltwang/Library/petsc-2.3.0/lib/linux-gnu
>>
>> Configure run at Tue Mar 14 11:19:49 2006
>>
>> Configure options --with-mpi-dir=/usr --with-debugging=0
>> --download-spooles=1 --download-f-blas-lapack=1 --download-parmetis=1
>> --download-prometheus=1 --with-shared=0
>>
>> -----------------------------------------------------------------------
>>
>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>> probably memory access out of range
>>
>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>
>> [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
>> run
>>
>> [1]PETSC ERROR: to get more information on the crash.
>>
>> [1]PETSC ERROR: User provided function() line 0 in unknown directory
>> unknown file
>>
>> [1]PETSC ERROR: Signal received!
>>
>> [1]PETSC ERROR:  !
>>
>> [cli_1]: aborting job:
>>
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1
>>
>> [cli_0]: aborting job:
>>
>> Fatal error in MPI_Allgather: Other MPI error, error stack:
>>
>> MPI_Allgather(949)........................: MPI_Allgather(sbuf=0xbffeea14,
>> scount=1, MPI_INT, rbuf=0x8bf0a0c, rcount=1, MPI_INT, comm=0x84000000)
>> failed
>>
>> MPIR_Allgather(180).......................:
>>
>> MPIC_Sendrecv(161)........................:
>>
>> MPIC_Wait(321)............................:
>>
>> MPIDI_CH3_Progress_wait(199)..............: an error occurred while
>> handling an event returned by MPIDU_Sock_Wait()
>>
>> MPIDI_CH3I_Progress_handle_sock_event(422):
>>
>> MPIDU_Socki_handle_read(649)..............: connection failure
>> (set=0,sock=2,errno=104:(strerror() not found))
>>
>> rank 1 in job 477  GPTMaster_53830   caused collective abort of all ranks
>>
>>   exit status of rank 1: return code 59
>>
>>
>>
>>
>>
>>
>>
>> Letian
>>
> --
> "Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
> Guiness
>


From randy at geosystem.us  Wed Apr 26 19:42:10 2006
From: randy at geosystem.us (Randall Mackie)
Date: Wed, 26 Apr 2006 17:42:10 -0700
Subject: question on DA's and performance
Message-ID: <44501362.6080600@geosystem.us>

I've been using Petsc for a few years with reasonably good success. My application is
3D EM forward modeling and inversion.

What has been working well is basically an adaptation of what I did in serial mode,
by solving the following system of equations:

|Mxx Mxy Mxz| |Hx|   |bx|
|Myx Myy Myz| |Hy| = |by|
|Mzx Mzy Mzz| |Hz|   |bz|

Because this system is very stiff and ill-conditioned, the preconditioner that has
been successfully used is the ILU(k) of the diagonal sub-blocks of the coefficient
matrix only, not the ILU(k) of the entire matrix.


I've tried, with less success, converting to distributed arrays, because in reality
my program requires alternating between solving the system above, and solving
another system based on the divergences of the magnetic field, and so this requires
a lot of message passing between the nodes. I think that using DA's would require
passing only the ghost values instead of all the values as I am now doing.

So I coded up DA's, and it works, but only okay, and not as well as the case above.
The main problem is that it seems to work best for ILU(0), whereas the case above
works better and better with the more fill-in specified. I think I've tried every
possible option, but I can't get any improvement over just using ILU(0), but the problem
is that with ILU(0), it takes too many iterations, and so the total time is more
than when I use my original method with, say, ILU(8).

The differences between using DA's and the approach above is that the fields are
interlaced in DA's, and the boundary values are included as unknowns (with coefficient
matrix values set to 1.0), whereas my original way the boundary values are
incorporated in the right-hand side. Maybe that hurts the condition of my system.

I would really like to use DA's to improve on the efficiency of the code, but I
can't figure out how to do that.

It was suggested to me last year to use PCFIELDSPLIT, but there are no examples,
and I'm not a c programmer so it's hard for me to look at the source code and
know what to do. (I'm only just now able to get back to this).

Does anyone do any prescaling of their systems? If so, does anyone have examples
of how this can be done?

Any advice is greatly appreciated.

Thanks, Randy

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034


From bsmith at mcs.anl.gov  Wed Apr 26 20:31:55 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 26 Apr 2006 20:31:55 -0500 (CDT)
Subject: question on DA's and performance
In-Reply-To: <44501362.6080600@geosystem.us>
References: <44501362.6080600@geosystem.us>
Message-ID: <Pine.OSX.4.64.0604262023430.5943@barrysmith-57.mcs.anl.gov>


   Randy,

     1) I'd first change the scaling of the 
> with coefficient matrix values set to 1.0), 
to match the other diagonal entries in the matrix. For example, if the
diagonal entries are usually 10,000 then multiply the "boundary condition"
rows by 10,000. (Note the diagonal entries may be proportional to h, or 1/h etc
so make sure you get the same scaling in your boundary condition rows. For example,
if the diagonal entries double when you refine the grid make sure your
"boundary condition" equations scale the same way.)

   2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better
in your "new" code because it uses point block ILU(k) instead of point ILU(k).

   3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do in 
the previous version?).

   4) If you use LU on the blocks does it work better than ILU(0) in terms of 
iterations? A little, a lot? If just a little I suggest trying -pc_type asm

   Let us know how things go, you should be able to get similar convergence
using the DAs.

    Barry


On Wed, 26 Apr 2006, Randall Mackie wrote:

> I've been using Petsc for a few years with reasonably good success. My 
> application is
> 3D EM forward modeling and inversion.
>
> What has been working well is basically an adaptation of what I did in serial 
> mode,
> by solving the following system of equations:
>
> |Mxx Mxy Mxz| |Hx|   |bx|
> |Myx Myy Myz| |Hy| = |by|
> |Mzx Mzy Mzz| |Hz|   |bz|
>
> Because this system is very stiff and ill-conditioned, the preconditioner 
> that has
> been successfully used is the ILU(k) of the diagonal sub-blocks of the 
> coefficient
> matrix only, not the ILU(k) of the entire matrix.
>
>
> I've tried, with less success, converting to distributed arrays, because in 
> reality
> my program requires alternating between solving the system above, and solving
> another system based on the divergences of the magnetic field, and so this 
> requires
> a lot of message passing between the nodes. I think that using DA's would 
> require
> passing only the ghost values instead of all the values as I am now doing.
>
> So I coded up DA's, and it works, but only okay, and not as well as the case 
> above.
> The main problem is that it seems to work best for ILU(0), whereas the case 
> above
> works better and better with the more fill-in specified. I think I've tried 
> every
> possible option, but I can't get any improvement over just using ILU(0), but 
> the problem
> is that with ILU(0), it takes too many iterations, and so the total time is 
> more
> than when I use my original method with, say, ILU(8).
>
> The differences between using DA's and the approach above is that the fields 
> are
> interlaced in DA's, and the boundary values are included as unknowns (with 
> coefficient
> matrix values set to 1.0), whereas my original way the boundary values are
> incorporated in the right-hand side. Maybe that hurts the condition of my 
> system.
>
> I would really like to use DA's to improve on the efficiency of the code, but 
> I
> can't figure out how to do that.
>
> It was suggested to me last year to use PCFIELDSPLIT, but there are no 
> examples,
> and I'm not a c programmer so it's hard for me to look at the source code and
> know what to do. (I'm only just now able to get back to this).
>
> Does anyone do any prescaling of their systems? If so, does anyone have 
> examples
> of how this can be done?
>
> Any advice is greatly appreciated.
>
> Thanks, Randy
>
>


From randy at geosystem.us  Thu Apr 27 00:23:37 2006
From: randy at geosystem.us (Randall Mackie)
Date: Wed, 26 Apr 2006 22:23:37 -0700
Subject: question on DA's and performance
In-Reply-To: <Pine.OSX.4.64.0604262023430.5943@barrysmith-57.mcs.anl.gov>
References: <44501362.6080600@geosystem.us> <Pine.OSX.4.64.0604262023430.5943@barrysmith-57.mcs.anl.gov>
Message-ID: <44505559.9010405@geosystem.us>


Barry Smith wrote:
> 
>   Randy,
> 
>     1) I'd first change the scaling of the
>> with coefficient matrix values set to 1.0), 
> to match the other diagonal entries in the matrix. For example, if the
> diagonal entries are usually 10,000 then multiply the "boundary condition"
> rows by 10,000. (Note the diagonal entries may be proportional to h, or 
> 1/h etc
> so make sure you get the same scaling in your boundary condition rows. 
> For example,
> if the diagonal entries double when you refine the grid make sure your
> "boundary condition" equations scale the same way.)

I suspected this might be a problem. The diagonal entries have a wide dynamic
range, all the way up to 1e10. I will try this and let you know if it helps.

> 
>   2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better
> in your "new" code because it uses point block ILU(k) instead of point 
> ILU(k).

I am using parallel AIJ matrices. Are there any examples that show how
to set up the BAIJ matrices? (in fortran?)

> 
>   3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do 
> in the previous version?).

In the DA code, I am using:

          -em_ksp_type bcgs \
          -em_sub_pc_type ilu \
          -divh_ksp_type cr \
          -divh_sub_pc_type ilu \


In my previous code, I was using:

          -em_ksp_type bcgs \
          -em_sub_pc_type ilu \
          -em_sub_pc_factor_levels 8 \
          -em_sub_pc_factor_fill 4 \
          -divh_ksp_type cr \
          -divh_sub_pc_type icc \


> 
>   4) If you use LU on the blocks does it work better than ILU(0) in 
> terms of iterations? A little, a lot? If just a little I suggest trying 
> -pc_type asm

I have tried asm, and it helps a little, but not enough to be competitive
with my previous code.


> 
>   Let us know how things go, you should be able to get similar convergence
> using the DAs.

I will try the scaling and the BAIJ matrix suggestions.

Thanks, Randy


> 
>    Barry
> 
> 
> 
> 
> On Wed, 26 Apr 2006, Randall Mackie wrote:
> 
>> I've been using Petsc for a few years with reasonably good success. My 
>> application is
>> 3D EM forward modeling and inversion.
>>
>> What has been working well is basically an adaptation of what I did in 
>> serial mode,
>> by solving the following system of equations:
>>
>> |Mxx Mxy Mxz| |Hx|   |bx|
>> |Myx Myy Myz| |Hy| = |by|
>> |Mzx Mzy Mzz| |Hz|   |bz|
>>
>> Because this system is very stiff and ill-conditioned, the 
>> preconditioner that has
>> been successfully used is the ILU(k) of the diagonal sub-blocks of the 
>> coefficient
>> matrix only, not the ILU(k) of the entire matrix.
>>
>>
>> I've tried, with less success, converting to distributed arrays, 
>> because in reality
>> my program requires alternating between solving the system above, and 
>> solving
>> another system based on the divergences of the magnetic field, and so 
>> this requires
>> a lot of message passing between the nodes. I think that using DA's 
>> would require
>> passing only the ghost values instead of all the values as I am now 
>> doing.
>>
>> So I coded up DA's, and it works, but only okay, and not as well as 
>> the case above.
>> The main problem is that it seems to work best for ILU(0), whereas the 
>> case above
>> works better and better with the more fill-in specified. I think I've 
>> tried every
>> possible option, but I can't get any improvement over just using 
>> ILU(0), but the problem
>> is that with ILU(0), it takes too many iterations, and so the total 
>> time is more
>> than when I use my original method with, say, ILU(8).
>>
>> The differences between using DA's and the approach above is that the 
>> fields are
>> interlaced in DA's, and the boundary values are included as unknowns 
>> (with coefficient
>> matrix values set to 1.0), whereas my original way the boundary values 
>> are
>> incorporated in the right-hand side. Maybe that hurts the condition of 
>> my system.
>>
>> I would really like to use DA's to improve on the efficiency of the 
>> code, but I
>> can't figure out how to do that.
>>
>> It was suggested to me last year to use PCFIELDSPLIT, but there are no 
>> examples,
>> and I'm not a c programmer so it's hard for me to look at the source 
>> code and
>> know what to do. (I'm only just now able to get back to this).
>>
>> Does anyone do any prescaling of their systems? If so, does anyone 
>> have examples
>> of how this can be done?
>>
>> Any advice is greatly appreciated.
>>
>> Thanks, Randy
>>
>>
> 

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034


From randy at geosystem.us  Thu Apr 27 11:56:50 2006
From: randy at geosystem.us (Randall Mackie)
Date: Thu, 27 Apr 2006 09:56:50 -0700
Subject: question on DA's and performance
In-Reply-To: <Pine.OSX.4.64.0604262023430.5943@barrysmith-57.mcs.anl.gov>
References: <44501362.6080600@geosystem.us> <Pine.OSX.4.64.0604262023430.5943@barrysmith-57.mcs.anl.gov>
Message-ID: <4450F7D2.4080304@geosystem.us>

Barry,

Can you give some advice, or do you have any examples, on how to set
the block size in MatCreateMPIBAIJ?

Randy


Barry Smith wrote:
> 
>   Randy,
> 
>     1) I'd first change the scaling of the
>> with coefficient matrix values set to 1.0), 
> to match the other diagonal entries in the matrix. For example, if the
> diagonal entries are usually 10,000 then multiply the "boundary condition"
> rows by 10,000. (Note the diagonal entries may be proportional to h, or 
> 1/h etc
> so make sure you get the same scaling in your boundary condition rows. 
> For example,
> if the diagonal entries double when you refine the grid make sure your
> "boundary condition" equations scale the same way.)
> 
>   2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better
> in your "new" code because it uses point block ILU(k) instead of point 
> ILU(k).
> 
>   3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do 
> in the previous version?).
> 
>   4) If you use LU on the blocks does it work better than ILU(0) in 
> terms of iterations? A little, a lot? If just a little I suggest trying 
> -pc_type asm
> 
>   Let us know how things go, you should be able to get similar convergence
> using the DAs.
> 
>    Barry
> 
> 
> 
> 
> On Wed, 26 Apr 2006, Randall Mackie wrote:
> 
>> I've been using Petsc for a few years with reasonably good success. My 
>> application is
>> 3D EM forward modeling and inversion.
>>
>> What has been working well is basically an adaptation of what I did in 
>> serial mode,
>> by solving the following system of equations:
>>
>> |Mxx Mxy Mxz| |Hx|   |bx|
>> |Myx Myy Myz| |Hy| = |by|
>> |Mzx Mzy Mzz| |Hz|   |bz|
>>
>> Because this system is very stiff and ill-conditioned, the 
>> preconditioner that has
>> been successfully used is the ILU(k) of the diagonal sub-blocks of the 
>> coefficient
>> matrix only, not the ILU(k) of the entire matrix.
>>
>>
>> I've tried, with less success, converting to distributed arrays, 
>> because in reality
>> my program requires alternating between solving the system above, and 
>> solving
>> another system based on the divergences of the magnetic field, and so 
>> this requires
>> a lot of message passing between the nodes. I think that using DA's 
>> would require
>> passing only the ghost values instead of all the values as I am now 
>> doing.
>>
>> So I coded up DA's, and it works, but only okay, and not as well as 
>> the case above.
>> The main problem is that it seems to work best for ILU(0), whereas the 
>> case above
>> works better and better with the more fill-in specified. I think I've 
>> tried every
>> possible option, but I can't get any improvement over just using 
>> ILU(0), but the problem
>> is that with ILU(0), it takes too many iterations, and so the total 
>> time is more
>> than when I use my original method with, say, ILU(8).
>>
>> The differences between using DA's and the approach above is that the 
>> fields are
>> interlaced in DA's, and the boundary values are included as unknowns 
>> (with coefficient
>> matrix values set to 1.0), whereas my original way the boundary values 
>> are
>> incorporated in the right-hand side. Maybe that hurts the condition of 
>> my system.
>>
>> I would really like to use DA's to improve on the efficiency of the 
>> code, but I
>> can't figure out how to do that.
>>
>> It was suggested to me last year to use PCFIELDSPLIT, but there are no 
>> examples,
>> and I'm not a c programmer so it's hard for me to look at the source 
>> code and
>> know what to do. (I'm only just now able to get back to this).
>>
>> Does anyone do any prescaling of their systems? If so, does anyone 
>> have examples
>> of how this can be done?
>>
>> Any advice is greatly appreciated.
>>
>> Thanks, Randy
>>
>>
> 

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034


From bsmith at mcs.anl.gov  Thu Apr 27 15:07:02 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 27 Apr 2006 15:07:02 -0500 (CDT)
Subject: question on DA's and performance
In-Reply-To: <4450F7D2.4080304@geosystem.us>
References: <44501362.6080600@geosystem.us> <Pine.OSX.4.64.0604262023430.5943@barrysmith-57.mcs.anl.gov>
 <4450F7D2.4080304@geosystem.us>
Message-ID: <Pine.OSX.4.64.0604271505030.5943@barrysmith-57.mcs.anl.gov>


   In your case it is 2, since you have 2 degree's of freedom per cell.
But note: Aren't you using DAGetMatrix() to get your matrix instead of
MatCreate....() directory? You would pass MATBAIJ into DAGetMatrix() to
get the BAIJ matrix.

     Barry


On Thu, 27 Apr 2006, Randall Mackie wrote:

> Barry,
>
> Can you give some advice, or do you have any examples, on how to set
> the block size in MatCreateMPIBAIJ?
>
> Randy
>
>
> Barry Smith wrote:
>>
>>   Randy,
>>
>>     1) I'd first change the scaling of the
>>> with coefficient matrix values set to 1.0), 
>> to match the other diagonal entries in the matrix. For example, if the
>> diagonal entries are usually 10,000 then multiply the "boundary condition"
>> rows by 10,000. (Note the diagonal entries may be proportional to h, or 1/h 
>> etc
>> so make sure you get the same scaling in your boundary condition rows. For 
>> example,
>> if the diagonal entries double when you refine the grid make sure your
>> "boundary condition" equations scale the same way.)
>>
>>   2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better
>> in your "new" code because it uses point block ILU(k) instead of point 
>> ILU(k).
>>
>>   3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do in 
>> the previous version?).
>>
>>   4) If you use LU on the blocks does it work better than ILU(0) in terms 
>> of iterations? A little, a lot? If just a little I suggest trying -pc_type 
>> asm
>>
>>   Let us know how things go, you should be able to get similar convergence
>> using the DAs.
>>
>>    Barry
>> 
>> 
>> 
>> 
>> On Wed, 26 Apr 2006, Randall Mackie wrote:
>> 
>>> I've been using Petsc for a few years with reasonably good success. My 
>>> application is
>>> 3D EM forward modeling and inversion.
>>> 
>>> What has been working well is basically an adaptation of what I did in 
>>> serial mode,
>>> by solving the following system of equations:
>>> 
>>> |Mxx Mxy Mxz| |Hx|   |bx|
>>> |Myx Myy Myz| |Hy| = |by|
>>> |Mzx Mzy Mzz| |Hz|   |bz|
>>> 
>>> Because this system is very stiff and ill-conditioned, the preconditioner 
>>> that has
>>> been successfully used is the ILU(k) of the diagonal sub-blocks of the 
>>> coefficient
>>> matrix only, not the ILU(k) of the entire matrix.
>>> 
>>> 
>>> I've tried, with less success, converting to distributed arrays, because 
>>> in reality
>>> my program requires alternating between solving the system above, and 
>>> solving
>>> another system based on the divergences of the magnetic field, and so this 
>>> requires
>>> a lot of message passing between the nodes. I think that using DA's would 
>>> require
>>> passing only the ghost values instead of all the values as I am now doing.
>>> 
>>> So I coded up DA's, and it works, but only okay, and not as well as the 
>>> case above.
>>> The main problem is that it seems to work best for ILU(0), whereas the 
>>> case above
>>> works better and better with the more fill-in specified. I think I've 
>>> tried every
>>> possible option, but I can't get any improvement over just using ILU(0), 
>>> but the problem
>>> is that with ILU(0), it takes too many iterations, and so the total time 
>>> is more
>>> than when I use my original method with, say, ILU(8).
>>> 
>>> The differences between using DA's and the approach above is that the 
>>> fields are
>>> interlaced in DA's, and the boundary values are included as unknowns (with 
>>> coefficient
>>> matrix values set to 1.0), whereas my original way the boundary values are
>>> incorporated in the right-hand side. Maybe that hurts the condition of my 
>>> system.
>>> 
>>> I would really like to use DA's to improve on the efficiency of the code, 
>>> but I
>>> can't figure out how to do that.
>>> 
>>> It was suggested to me last year to use PCFIELDSPLIT, but there are no 
>>> examples,
>>> and I'm not a c programmer so it's hard for me to look at the source code 
>>> and
>>> know what to do. (I'm only just now able to get back to this).
>>> 
>>> Does anyone do any prescaling of their systems? If so, does anyone have 
>>> examples
>>> of how this can be done?
>>> 
>>> Any advice is greatly appreciated.
>>> 
>>> Thanks, Randy
>>> 
>>> 
>> 
>
>


From randy at geosystem.us  Thu Apr 27 15:20:22 2006
From: randy at geosystem.us (Randall Mackie)
Date: Thu, 27 Apr 2006 13:20:22 -0700
Subject: question on DA's and performance
In-Reply-To: <Pine.OSX.4.64.0604271505030.5943@barrysmith-57.mcs.anl.gov>
References: <44501362.6080600@geosystem.us> <Pine.OSX.4.64.0604262023430.5943@barrysmith-57.mcs.anl.gov> <4450F7D2.4080304@geosystem.us> <Pine.OSX.4.64.0604271505030.5943@barrysmith-57.mcs.anl.gov>
Message-ID: <44512786.1070404@geosystem.us>

Actually, I have 3 degrees of freedom (Hx, Hy, Hz) per cell. I'm not using
DAGetMatrix because as of last year, that returns the full coupling
and a lot of structure that I don't use, and I couldn't figure out
how to use DASetBlockFills() to get the right amount of coupling I needed,
so I just coded it up use MatCreateMPIAIJ, and it works.

I suggested last year that the actual coupling could be based on the
values actually entered, and you said you'd put that on the todo list.

For example, in my problem, I'm solving the curl curl equations, so
the coupling is like

 >>>
 >>>	Hx(i,j,k) is coupled to Hx(i,j,k-1), Hx(i,j,k+1), Hx(i,j+1,k),
 >>>               Hx(i,j-1,k), Hy(i+1,j,k), Hy(i,j,k), Hy(i+1,j-1,k), Hy(i,j-1,k),
 >>>               Hz(i,j,k-1), Hz(i+1,j,k-1), Hz(i,j,k), Hz(i+1,j,k)
 >>>
 >>>
 >>>	Similarly for Hy(i,j,k) and Hz(i,j,k)


Randy


Barry Smith wrote:
> 
>   In your case it is 2, since you have 2 degree's of freedom per cell.
> But note: Aren't you using DAGetMatrix() to get your matrix instead of
> MatCreate....() directory? You would pass MATBAIJ into DAGetMatrix() to
> get the BAIJ matrix.
> 
>     Barry
> 
> 
> On Thu, 27 Apr 2006, Randall Mackie wrote:
> 
>> Barry,
>>
>> Can you give some advice, or do you have any examples, on how to set
>> the block size in MatCreateMPIBAIJ?
>>
>> Randy
>>
>>
>> Barry Smith wrote:
>>>
>>>   Randy,
>>>
>>>     1) I'd first change the scaling of the
>>>> with coefficient matrix values set to 1.0), 
>>> to match the other diagonal entries in the matrix. For example, if the
>>> diagonal entries are usually 10,000 then multiply the "boundary 
>>> condition"
>>> rows by 10,000. (Note the diagonal entries may be proportional to h, 
>>> or 1/h etc
>>> so make sure you get the same scaling in your boundary condition 
>>> rows. For example,
>>> if the diagonal entries double when you refine the grid make sure your
>>> "boundary condition" equations scale the same way.)
>>>
>>>   2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better
>>> in your "new" code because it uses point block ILU(k) instead of 
>>> point ILU(k).
>>>
>>>   3) Are you using -pc_type bjacobi or -pc_type asm (and what did you 
>>> do in the previous version?).
>>>
>>>   4) If you use LU on the blocks does it work better than ILU(0) in 
>>> terms of iterations? A little, a lot? If just a little I suggest 
>>> trying -pc_type asm
>>>
>>>   Let us know how things go, you should be able to get similar 
>>> convergence
>>> using the DAs.
>>>
>>>    Barry
>>>
>>>
>>>
>>>
>>> On Wed, 26 Apr 2006, Randall Mackie wrote:
>>>
>>>> I've been using Petsc for a few years with reasonably good success. 
>>>> My application is
>>>> 3D EM forward modeling and inversion.
>>>>
>>>> What has been working well is basically an adaptation of what I did 
>>>> in serial mode,
>>>> by solving the following system of equations:
>>>>
>>>> |Mxx Mxy Mxz| |Hx|   |bx|
>>>> |Myx Myy Myz| |Hy| = |by|
>>>> |Mzx Mzy Mzz| |Hz|   |bz|
>>>>
>>>> Because this system is very stiff and ill-conditioned, the 
>>>> preconditioner that has
>>>> been successfully used is the ILU(k) of the diagonal sub-blocks of 
>>>> the coefficient
>>>> matrix only, not the ILU(k) of the entire matrix.
>>>>
>>>>
>>>> I've tried, with less success, converting to distributed arrays, 
>>>> because in reality
>>>> my program requires alternating between solving the system above, 
>>>> and solving
>>>> another system based on the divergences of the magnetic field, and 
>>>> so this requires
>>>> a lot of message passing between the nodes. I think that using DA's 
>>>> would require
>>>> passing only the ghost values instead of all the values as I am now 
>>>> doing.
>>>>
>>>> So I coded up DA's, and it works, but only okay, and not as well as 
>>>> the case above.
>>>> The main problem is that it seems to work best for ILU(0), whereas 
>>>> the case above
>>>> works better and better with the more fill-in specified. I think 
>>>> I've tried every
>>>> possible option, but I can't get any improvement over just using 
>>>> ILU(0), but the problem
>>>> is that with ILU(0), it takes too many iterations, and so the total 
>>>> time is more
>>>> than when I use my original method with, say, ILU(8).
>>>>
>>>> The differences between using DA's and the approach above is that 
>>>> the fields are
>>>> interlaced in DA's, and the boundary values are included as unknowns 
>>>> (with coefficient
>>>> matrix values set to 1.0), whereas my original way the boundary 
>>>> values are
>>>> incorporated in the right-hand side. Maybe that hurts the condition 
>>>> of my system.
>>>>
>>>> I would really like to use DA's to improve on the efficiency of the 
>>>> code, but I
>>>> can't figure out how to do that.
>>>>
>>>> It was suggested to me last year to use PCFIELDSPLIT, but there are 
>>>> no examples,
>>>> and I'm not a c programmer so it's hard for me to look at the source 
>>>> code and
>>>> know what to do. (I'm only just now able to get back to this).
>>>>
>>>> Does anyone do any prescaling of their systems? If so, does anyone 
>>>> have examples
>>>> of how this can be done?
>>>>
>>>> Any advice is greatly appreciated.
>>>>
>>>> Thanks, Randy
>>>>
>>>>
>>>
>>
>>
> 

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034


From bsmith at mcs.anl.gov  Thu Apr 27 15:35:00 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 27 Apr 2006 15:35:00 -0500 (CDT)
Subject: question on DA's and performance
In-Reply-To: <44512786.1070404@geosystem.us>
References: <44501362.6080600@geosystem.us> <Pine.OSX.4.64.0604262023430.5943@barrysmith-57.mcs.anl.gov>
 <4450F7D2.4080304@geosystem.us> <Pine.OSX.4.64.0604271505030.5943@barrysmith-57.mcs.anl.gov>
 <44512786.1070404@geosystem.us>
Message-ID: <Pine.OSX.4.64.0604271533190.5943@barrysmith-57.mcs.anl.gov>


   Randy,

    If the codes are in a form that I could build and run them could you just 
send me the two codes (petsc-maint at mcs.anl.gov) so I can play around with the
convergence of both? Also send sample input data if there is some.

      Barry


On Thu, 27 Apr 2006, Randall Mackie wrote:

> Actually, I have 3 degrees of freedom (Hx, Hy, Hz) per cell. I'm not using
> DAGetMatrix because as of last year, that returns the full coupling
> and a lot of structure that I don't use, and I couldn't figure out
> how to use DASetBlockFills() to get the right amount of coupling I needed,
> so I just coded it up use MatCreateMPIAIJ, and it works.
>
> I suggested last year that the actual coupling could be based on the
> values actually entered, and you said you'd put that on the todo list.
>
> For example, in my problem, I'm solving the curl curl equations, so
> the coupling is like
>
>>>>
>>>> 	Hx(i,j,k) is coupled to Hx(i,j,k-1), Hx(i,j,k+1), Hx(i,j+1,k),
>>>>               Hx(i,j-1,k), Hy(i+1,j,k), Hy(i,j,k), Hy(i+1,j-1,k), 
> Hy(i,j-1,k),
>>>>               Hz(i,j,k-1), Hz(i+1,j,k-1), Hz(i,j,k), Hz(i+1,j,k)
>>>>
>>>>
>>>> 	Similarly for Hy(i,j,k) and Hz(i,j,k)
>
>
> Randy
>
>
> Barry Smith wrote:
>>
>>   In your case it is 2, since you have 2 degree's of freedom per cell.
>> But note: Aren't you using DAGetMatrix() to get your matrix instead of
>> MatCreate....() directory? You would pass MATBAIJ into DAGetMatrix() to
>> get the BAIJ matrix.
>>
>>     Barry
>> 
>> 
>> On Thu, 27 Apr 2006, Randall Mackie wrote:
>> 
>>> Barry,
>>> 
>>> Can you give some advice, or do you have any examples, on how to set
>>> the block size in MatCreateMPIBAIJ?
>>> 
>>> Randy
>>> 
>>> 
>>> Barry Smith wrote:
>>>>
>>>>   Randy,
>>>>
>>>>     1) I'd first change the scaling of the
>>>>> with coefficient matrix values set to 1.0), 
>>>> to match the other diagonal entries in the matrix. For example, if the
>>>> diagonal entries are usually 10,000 then multiply the "boundary 
>>>> condition"
>>>> rows by 10,000. (Note the diagonal entries may be proportional to h, or 
>>>> 1/h etc
>>>> so make sure you get the same scaling in your boundary condition rows. 
>>>> For example,
>>>> if the diagonal entries double when you refine the grid make sure your
>>>> "boundary condition" equations scale the same way.)
>>>>
>>>>   2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better
>>>> in your "new" code because it uses point block ILU(k) instead of point 
>>>> ILU(k).
>>>>
>>>>   3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do 
>>>> in the previous version?).
>>>>
>>>>   4) If you use LU on the blocks does it work better than ILU(0) in terms 
>>>> of iterations? A little, a lot? If just a little I suggest trying 
>>>> -pc_type asm
>>>>
>>>>   Let us know how things go, you should be able to get similar 
>>>> convergence
>>>> using the DAs.
>>>>
>>>>    Barry
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Wed, 26 Apr 2006, Randall Mackie wrote:
>>>> 
>>>>> I've been using Petsc for a few years with reasonably good success. My 
>>>>> application is
>>>>> 3D EM forward modeling and inversion.
>>>>> 
>>>>> What has been working well is basically an adaptation of what I did in 
>>>>> serial mode,
>>>>> by solving the following system of equations:
>>>>> 
>>>>> |Mxx Mxy Mxz| |Hx|   |bx|
>>>>> |Myx Myy Myz| |Hy| = |by|
>>>>> |Mzx Mzy Mzz| |Hz|   |bz|
>>>>> 
>>>>> Because this system is very stiff and ill-conditioned, the 
>>>>> preconditioner that has
>>>>> been successfully used is the ILU(k) of the diagonal sub-blocks of the 
>>>>> coefficient
>>>>> matrix only, not the ILU(k) of the entire matrix.
>>>>> 
>>>>> 
>>>>> I've tried, with less success, converting to distributed arrays, because 
>>>>> in reality
>>>>> my program requires alternating between solving the system above, and 
>>>>> solving
>>>>> another system based on the divergences of the magnetic field, and so 
>>>>> this requires
>>>>> a lot of message passing between the nodes. I think that using DA's 
>>>>> would require
>>>>> passing only the ghost values instead of all the values as I am now 
>>>>> doing.
>>>>> 
>>>>> So I coded up DA's, and it works, but only okay, and not as well as the 
>>>>> case above.
>>>>> The main problem is that it seems to work best for ILU(0), whereas the 
>>>>> case above
>>>>> works better and better with the more fill-in specified. I think I've 
>>>>> tried every
>>>>> possible option, but I can't get any improvement over just using ILU(0), 
>>>>> but the problem
>>>>> is that with ILU(0), it takes too many iterations, and so the total time 
>>>>> is more
>>>>> than when I use my original method with, say, ILU(8).
>>>>> 
>>>>> The differences between using DA's and the approach above is that the 
>>>>> fields are
>>>>> interlaced in DA's, and the boundary values are included as unknowns 
>>>>> (with coefficient
>>>>> matrix values set to 1.0), whereas my original way the boundary values 
>>>>> are
>>>>> incorporated in the right-hand side. Maybe that hurts the condition of 
>>>>> my system.
>>>>> 
>>>>> I would really like to use DA's to improve on the efficiency of the 
>>>>> code, but I
>>>>> can't figure out how to do that.
>>>>> 
>>>>> It was suggested to me last year to use PCFIELDSPLIT, but there are no 
>>>>> examples,
>>>>> and I'm not a c programmer so it's hard for me to look at the source 
>>>>> code and
>>>>> know what to do. (I'm only just now able to get back to this).
>>>>> 
>>>>> Does anyone do any prescaling of their systems? If so, does anyone have 
>>>>> examples
>>>>> of how this can be done?
>>>>> 
>>>>> Any advice is greatly appreciated.
>>>>> 
>>>>> Thanks, Randy
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
>
>