From khalid_eee at yahoo.com Fri Apr 1 00:53:23 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Thu, 31 Mar 2011 22:53:23 -0700 (PDT) Subject: [petsc-users] FFT with PETSC Message-ID: <932406.30502.qm@web112615.mail.gq1.yahoo.com> I tried to use the FFTW using the PETSC interface. It gives the following error ex142.c(8): catastrophic error: could not open source file "petscmat.h" #include ^ The same error with Do I need to configure petsc with something else ? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Fri Apr 1 01:44:42 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 1 Apr 2011 09:44:42 +0300 Subject: [petsc-users] PETSC with SuperLU In-Reply-To: <4D9536AE.4050301@UManitoba.ca> References: <4D9536AE.4050301@UManitoba.ca> Message-ID: On Fri, Apr 1, 2011 at 05:21, Ormiston, Scott J. wrote: > I rebuilt PETSc with SuperLU and ParMETIS, and then I tried to run ex15f.F > (it was working with a previous build of PETSc). > SuperLU is not the same package as SuperLU_Dist (don't ask me why they organize software that way). You also misspelled the command line option. If you run the example below in serial with -pc_factor_mat_solver_package superlu it should work (using SuperLU instead of PETSc's native direct solver). To use SuperLU_Dist (which works in parallel), configure with --download-superlu_dist and run with -pc_factor_mat_solver_package superlu_dist. You do not have to touch the source code. > > I ran > > % mpiexec -n 4 ex15f -pc_type lu -pc_factor_mat_solver_type superlu_dist > > but this generated the following error messages: > > > ==================================================================================== > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: No support for this operation for this object type! > [0]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct > solver! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 > CST 2010 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ex15f on a linux-gnu named mecfd02 by sormist Thu Mar 31 > 21:15:34 2011 > [0]PETSC ERROR: Libraries linked from > /home/mecfd/common/sw/petsc-3.1-p6/linux-gnu-c-debug/lib > [0]PETSC ERROR: Configure run at Thu Mar 31 19:59:09 2011 > [0]PETSC ERROR: Configure options > --with-mpi-dir=/home/mecfd/common/sw/openmpi-1.4.3 --download-superlu_dist > --download-parmetis > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [0]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [0]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [0]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [1]PETSC ERROR: No support for this operation for this object type! > [1]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct > solver! 
> [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 > CST 2010 > [1]PETSC ERROR: See docs/changes/index.html for recent updates. > [1]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [1]PETSC ERROR: See docs/index.html for manual pages. > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: ex15f on a linux-gnu named mecfd02 by sormist Thu Mar 31 > 21:15:34 2011 > [1]PETSC ERROR: Libraries linked from > /home/mecfd/common/sw/petsc-3.1-p6/linux-gnu-c-debug/lib > [1]PETSC ERROR: Configure run at Thu Mar 31 19:59:09 2011 > [1]PETSC ERROR: Configure options > --with-mpi-dir=/home/mecfd/common/sw/openmpi-1.4.3 --download-superlu_dist > --download-parmetis > [1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [1]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [1]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [1]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [2]PETSC ERROR: No support for this operation for this object type! > [2]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct > solver! > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 > CST 2010 > [2]PETSC ERROR: See docs/changes/index.html for recent updates. > [2]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [2]PETSC ERROR: See docs/index.html for manual pages. > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: ex15f on a linux-gnu named mecfd02 by sormist Thu Mar 31 > 21:15:34 2011 > [2]PETSC ERROR: Libraries linked from > /home/mecfd/common/sw/petsc-3.1-p6/linux-gnu-c-debug/lib > [2]PETSC ERROR: Configure run at Thu Mar 31 19:59:09 2011 > [2]PETSC ERROR: Configure options > --with-mpi-dir=/home/mecfd/common/sw/openmpi-1.4.3 --download-superlu_dist > --download-parmetis > [2]PETSC ERROR: > ------------------------------------------------------------------------ > [2]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [2]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [2]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [2]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [2]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > [3]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [3]PETSC ERROR: No support for this operation for this object type! > [3]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct > solver! > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: Petsc Release Version 3.1.0, Patch 6, Tue Nov 16 17:02:32 > CST 2010 > [3]PETSC ERROR: See docs/changes/index.html for recent updates. > [3]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [3]PETSC ERROR: See docs/index.html for manual pages. 
> [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: ex15f on a linux-gnu named mecfd02 by sormist Thu Mar 31 > 21:15:34 2011 > [3]PETSC ERROR: Libraries linked from > /home/mecfd/common/sw/petsc-3.1-p6/linux-gnu-c-debug/lib > [3]PETSC ERROR: Configure run at Thu Mar 31 19:59:09 2011 > [3]PETSC ERROR: Configure options > --with-mpi-dir=/home/mecfd/common/sw/openmpi-1.4.3 --download-superlu_dist > --download-parmetis > [3]PETSC ERROR: > ------------------------------------------------------------------------ > [3]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [3]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [3]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [3]PETSC ERROR: KSPSetUp() line 237 in src/ksp/ksp/interface/itfunc.c > [3]PETSC ERROR: KSPSolve() line 353 in src/ksp/ksp/interface/itfunc.c > Norm of error 2.0050E+02 iterations 0 > WARNING! There are options you set that were not used! > WARNING! could be spelling mistake, etc! > Option left: name:-pc_factor_mat_solver_type value: superlu_dist > > ==================================================================================== > > Does this require changing the code in ex15f.F or is there something else > that I am doing wrong? > > Scott Ormiston > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Fri Apr 1 01:56:37 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 1 Apr 2011 09:56:37 +0300 Subject: [petsc-users] FFT with PETSC In-Reply-To: <932406.30502.qm@web112615.mail.gq1.yahoo.com> References: <932406.30502.qm@web112615.mail.gq1.yahoo.com> Message-ID: On Fri, Apr 1, 2011 at 08:53, khalid ashraf wrote: > ex142.c(8): catastrophic error: could not open source file "petscmat.h" > #include > ^ > What commands are printed when you typed "make ex142"? Are you trying to invoke the compiler yourself (thus missing some include paths)? > The same error with > > Do I need to configure petsc with something else ? > Did you use --download-fftw or --with-fftw-dir=/path/to/your/fftw3 ? -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Fri Apr 1 02:54:17 2011 From: gdiso at ustc.edu (Gong Ding) Date: Fri, 1 Apr 2011 15:54:17 +0800 (CST) Subject: [petsc-users] Slepc SVDCyclicSetExplicitMatrix does not prealloc memory? Message-ID: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> Hi, I am stiall dealing with the ill conditioned problem. :-( Yesderday, I installed slepc-3.1-p6 for SVD calculation of my matrix. The SVD solver works well for the largest singualr value calculation. But for the smallest singualr value, all most all the methods fail. Finally, I chosen the most inefficient way. That build the cyclic matrix explicitly with shift-and-invert spectral transformation. And solve the eigen value problem by LU preconditioned GMRES. The preconditioner should be superlu rather than others. I guess the reason is superlu use static pivot. Because solver with partial pivot such as mumps can not work. Anyway, slepc solved my problem. However, the explicit building cyclic matrix takes too long to finish. 
The log info says [0] MatSetUpPreallocation(): Warning not preallocating matrix storage [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 35038 X 35038; storage space: 325302 unneeded,482948 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 42204 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 33 [0] Mat_CheckInode(): Found 26800 nodes of 35038. Limit used: 5. Using Inode routines It seems no preallocation for cyclic matrix. Is it a bug or I forgot something? From jroman at dsic.upv.es Fri Apr 1 03:11:06 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 1 Apr 2011 10:11:06 +0200 Subject: [petsc-users] Slepc SVDCyclicSetExplicitMatrix does not prealloc memory? In-Reply-To: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> References: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> Message-ID: On 01/04/2011, Gong Ding wrote: > Hi, > I am stiall dealing with the ill conditioned problem. :-( > Yesderday, I installed slepc-3.1-p6 for SVD calculation of my matrix. > > The SVD solver works well for the largest singualr value calculation. > But for the smallest singualr value, all most all the methods fail. > > Finally, I chosen the most inefficient way. > That build the cyclic matrix explicitly with > shift-and-invert spectral transformation. > And solve the eigen value problem by LU preconditioned GMRES. > The preconditioner should be superlu rather than others. > I guess the reason is superlu use static pivot. > Because solver with partial pivot such as mumps can not work. > Anyway, slepc solved my problem. > > However, the explicit building cyclic matrix takes too long to finish. > The log info says > > [0] MatSetUpPreallocation(): Warning not preallocating matrix storage > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 35038 X 35038; storage space: 325302 unneeded,482948 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 42204 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 33 > [0] Mat_CheckInode(): Found 26800 nodes of 35038. Limit used: 5. Using Inode routines > > It seems no preallocation for cyclic matrix. Is it a bug or I forgot something? > Yes, you are right. No preallocation is done in this case within SLEPc. This is a problem also in SLEPc's QEPLINEAR. This is pending, my intention is to get it fixed for the next release. Thanks. Jose From gdiso at ustc.edu Fri Apr 1 03:45:00 2011 From: gdiso at ustc.edu (Gong Ding) Date: Fri, 1 Apr 2011 16:45:00 +0800 (CST) Subject: [petsc-users] Slepc SVDCyclicSetExplicitMatrix does not prealloc memory? In-Reply-To: References: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> Message-ID: <5531913.32841301647500088.JavaMail.coremail@mail.ustc.edu> > On 01/04/2011, Gong Ding wrote: > > > > > Hi, > > > I am stiall dealing with the ill conditioned problem. :-( > > > Yesderday, I installed slepc-3.1-p6 for SVD calculation of my matrix. > > > > > > The SVD solver works well for the largest singualr value calculation. > > > But for the smallest singualr value, all most all the methods fail. > > > > > > Finally, I chosen the most inefficient way. > > > That build the cyclic matrix explicitly with > > > shift-and-invert spectral transformation. > > > And solve the eigen value problem by LU preconditioned GMRES. > > > The preconditioner should be superlu rather than others. > > > I guess the reason is superlu use static pivot. > > > Because solver with partial pivot such as mumps can not work. > > > Anyway, slepc solved my problem. 
> > > > > > However, the explicit building cyclic matrix takes too long to finish. > > > The log info says > > > > > > [0] MatSetUpPreallocation(): Warning not preallocating matrix storage > > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 35038 X 35038; storage space: 325302 unneeded,482948 used > > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 42204 > > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 33 > > > [0] Mat_CheckInode(): Found 26800 nodes of 35038. Limit used: 5. Using Inode routines > > > > > > It seems no preallocation for cyclic matrix. Is it a bug or I forgot something? > > > > > > > Yes, you are right. No preallocation is done in this case within SLEPc. This is a problem also in SLEPc's QEPLINEAR. This is pending, my intention is to get it fixed for the next release. > Hope the problem can be solved soon. And do you have some comment on how to solve the smallest singular value? I guess i am not on the right way since matlab (with arpack) can calculate smallest singular value. But I never make arpack work. Even for smallest eigen value problem, arpack report no eigen value are found. From jroman at dsic.upv.es Fri Apr 1 07:02:50 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 1 Apr 2011 14:02:50 +0200 Subject: [petsc-users] Slepc SVDCyclicSetExplicitMatrix does not prealloc memory? In-Reply-To: <5531913.32841301647500088.JavaMail.coremail@mail.ustc.edu> References: <13795867.32711301644457561.JavaMail.coremail@mail.ustc.edu> <5531913.32841301647500088.JavaMail.coremail@mail.ustc.edu> Message-ID: On 01/04/2011, Gong Ding wrote: > Hope the problem can be solved soon. > And do you have some comment on how to solve the smallest singular value? > I guess i am not on the right way since matlab (with arpack) can calculate smallest singular value. > But I never make arpack work. > Even for smallest eigen value problem, arpack report no eigen value are found. > As discussed in section 3.3 of our report http://www.grycap.upv.es/slepc/documentation/reports/str8.pdf this is a difficult case. Probably the best choice is to use harmonic extraction in trlanczos. But this is not implemented in SLEPc, and there is no guarantee it works for difficult problems. Jose From SJ_Ormiston at UManitoba.ca Fri Apr 1 08:43:12 2011 From: SJ_Ormiston at UManitoba.ca (Ormiston, Scott J.) Date: Fri, 01 Apr 2011 08:43:12 -0500 Subject: [petsc-users] Performance of superlu_dist Message-ID: <4D95D670.3050102@UManitoba.ca> I am just starting to try superlu_dist to get a direct solver that runs in parallel with PETSc. My first tests (with ex15f) show that it takes longer and longer as the number of cores increases. For example 4 cores takes 8 times longer than 2 cores and 8 cores takes 25 times longer than 4 cores. Obviously I expected a speed-up; has anyone else seen this behaviour with superlu_dist? If not, what could be going wrong here? Scott Ormiston -------------- next part -------------- A non-text attachment was scrubbed... Name: SJ_Ormiston.vcf Type: text/x-vcard Size: 321 bytes Desc: not available URL: From desire.nuentsa_wakam at inria.fr Fri Apr 1 09:48:45 2011 From: desire.nuentsa_wakam at inria.fr (Desire NUENTSA WAKAM) Date: Fri, 01 Apr 2011 16:48:45 +0200 Subject: [petsc-users] Performance of superlu_dist In-Reply-To: <4D95D670.3050102@UManitoba.ca> References: <4D95D670.3050102@UManitoba.ca> Message-ID: <4D95E5CD.8040107@inria.fr> On a multicore node, you may not get a very good speedup if the bandwidth is heavily shared between all the cores. 
I guess this is what Petsc people have explained here http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#computers If you have a multi-socket multicore node, my guess would be to keep one MPI process on each socket and then to use a multithreaded BLAS (like Goto) inside each socket to keep the cores busy during BLAS operations. Hope this helps Desire On 04/01/2011 03:43 PM, Ormiston, Scott J. wrote: > I am just starting to try superlu_dist to get a direct solver that > runs in parallel with PETSc. > > My first tests (with ex15f) show that it takes longer and longer as > the number of cores increases. For example 4 cores takes 8 times > longer than 2 cores and 8 cores takes 25 times longer than 4 cores. > Obviously I expected a speed-up; has anyone else seen this behaviour > with superlu_dist? If not, what could be going wrong here? > > Scott Ormiston From SJ_Ormiston at UManitoba.ca Fri Apr 1 10:40:16 2011 From: SJ_Ormiston at UManitoba.ca (Ormiston, Scott J.) Date: Fri, 01 Apr 2011 10:40:16 -0500 Subject: [petsc-users] Performance of superlu_dist In-Reply-To: <4D95E5CD.8040107@inria.fr> References: <4D95D670.3050102@UManitoba.ca> <4D95E5CD.8040107@inria.fr> Message-ID: <4D95F1E0.6020203@UManitoba.ca> Desire NUENTSA WAKAM wrote: > On a multicore node, you may not get a very good speedup if the > bandwidth is heavily shared between all the cores. I guess this is what > Petsc people have explained here > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#computers > If you have a multi-socket multicore node, my guess would be to keep one > MPI process on each socket and then to use a multithreaded BLAS (like > Goto) inside each socket to keep the cores busy during BLAS operations. > Hope this helps It was very helpful. Merci infiniment. Scott Ormiston -------------- next part -------------- A non-text attachment was scrubbed... Name: SJ_Ormiston.vcf Type: text/x-vcard Size: 321 bytes Desc: not available URL: From kenway at utias.utoronto.ca Fri Apr 1 12:13:28 2011 From: kenway at utias.utoronto.ca (Gaetan Kenway) Date: Fri, 1 Apr 2011 13:13:28 -0400 Subject: [petsc-users] Performance of superlu_dist Message-ID: I have seen the same thing with SuperLU_dist as Scott Ormiston has. I've been using to solve (small-ish) 3D solid finite element structural system with rarely more than ~30,000 dof. Basically, if you use more than 2 cores, SuperLU_dist tanks and the factorization time goes through the roof exponentially. However, if you solve the same system with Spooles, its orders of magnitude faster. I'm not overly concerned with speed, since I only do this factorization once in my code and as such I don't have precise timing results. WIth 22,000 dof on an dual socket Xeon X5500 series machine (8 cores per node), with spooles, there's a speed up going from 1-8 procs. I could go up to about 32 procs before it takes longer than the single processor case. I hope this is of some use. Gaetan -------------- next part -------------- An HTML attachment was scrubbed... URL: From SJ_Ormiston at UManitoba.ca Fri Apr 1 12:36:05 2011 From: SJ_Ormiston at UManitoba.ca (Ormiston, Scott J.) Date: Fri, 01 Apr 2011 12:36:05 -0500 Subject: [petsc-users] Performance of superlu_dist In-Reply-To: References: Message-ID: <4D960D05.4030908@UManitoba.ca> Gaetan Kenway wrote: > I have seen the same thing with SuperLU_dist as Scott Ormiston has. I've > been using to solve (small-ish) 3D solid finite element structural > system with rarely more than ~30,000 dof. 
Basically, if you use more > than 2 cores, SuperLU_dist tanks and the factorization time goes through > the roof exponentially. However, if you solve the same system with > Spooles, its orders of magnitude faster. I'm not overly concerned with > speed, since I only do this factorization once in my code and as such I > don't have precise timing results. WIth 22,000 dof on an dual socket > Xeon X5500 series machine (8 cores per node), with spooles, there's a > speed up going from 1-8 procs. I could go up to about 32 procs before it > takes longer than the single processor case. Following the suggestion of Desire Nuentsa Wakam (who pointed me to the FAQ), I have had better performance from superlu_dist using mpiexec --cpus-per-proc 4 --bind-to-core -np 3 executable_name \ -pc_type lu -pc_factor_mat_solver_package superlu_dist on a server that has 4 quad-core CPUS and 64 Gb of RAM. I assume other option settings will be needed on other arrangements of cores and interconnects. I have not done enough tests to see about any speed-up. Thank you for your pointer to Spooles. Scott Ormiston -------------- next part -------------- A non-text attachment was scrubbed... Name: SJ_Ormiston.vcf Type: text/x-vcard Size: 321 bytes Desc: not available URL: From gaurish108 at gmail.com Fri Apr 1 18:02:38 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Fri, 1 Apr 2011 19:02:38 -0400 Subject: [petsc-users] Implementing a new routine in PETSc. Message-ID: Hi, I am planning to implement the LSMR algorithm in PETSc which does least squares and is supposed to have more favourable mathematical properties than LSQR which has already been implemented in PETSc. But I am not really sure how to go about this, since the implementations of the standard KSP methods themselves look quite complicated. The manual does not seeem to say much about how to go about adding routines to the PETSc library. Could you give me guidelines I should follow while implementing a KSP method? Also is it necessary to build the PETSc library again after implementing this routine. If so is it necessary to make any changes to makefiles ? Regards -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Apr 1 18:18:07 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 1 Apr 2011 18:18:07 -0500 Subject: [petsc-users] Implementing a new routine in PETSc. In-Reply-To: References: Message-ID: You must install petsc-dev http://www.mcs.anl.gov/petsc/petsc-as/developers/index.html to develop this new code. The CG implementation has detailed information about what needs to be provided for a new Krylov method. Start by making a new directory src/ksp/ksp/impls/lsmr and copy over to it the files src/ksp/ksp/impls/cg/ makefile cg.c cgimpl.h call them lsmr.c and lsmrimpl.h modify the copied over makefile to list the lsmr stuff instead of the cg In lsmrimpl.h put in the data structure you'll need to store all the vectors and other information needed by lsmr in lsmr.c go through the current code and change it all for the lsmr algorithm. You do not need to recompile all of PETSc to access the new solver, just run make in that new directory. You will also need to edit src/ksp/ksp/interface/itregis.c to register your new method. Join petsc-dev http://www.mcs.anl.gov/petsc/petsc-as/miscellaneous/mailing-lists.html to correspond with the petsc developers if questions/issues come up. 
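A rough sketch of the skeleton Barry describes, with the boilerplate modeled on cg.c from petsc-dev of this era; every LSMR-specific name here (KSP_LSMR, KSPCreate_LSMR, the work-vector count) is a placeholder and the LSMR recurrence itself is only indicated by a comment. The existing LSQR implementation in src/ksp/ksp/impls/lsqr/ is probably the closest model to copy from, since LSMR also needs products with the transpose:

  #include "private/kspimpl.h"                 /* KSP internals, as in cg.c */

  typedef struct {                             /* plays the role of KSP_CG in cgimpl.h */
    Vec u,v,h,hbar;                            /* placeholder LSMR work vectors */
  } KSP_LSMR;

  static PetscErrorCode KSPSetUp_LSMR(KSP ksp)
  {
    PetscErrorCode ierr;
    PetscFunctionBegin;
    ierr = KSPDefaultGetWork(ksp,4);CHKERRQ(ierr);  /* let KSP manage the work vectors */
    PetscFunctionReturn(0);
  }

  static PetscErrorCode KSPSolve_LSMR(KSP ksp)
  {
    PetscFunctionBegin;
    /* The Golub-Kahan bidiagonalization and the LSMR recurrences go here, using
       KSP_MatMult()/KSP_MatMultTranspose() for the products and following the
       structure of KSPSolve_CG (or KSPSolve_LSQR in src/ksp/ksp/impls/lsqr/). */
    PetscFunctionReturn(0);
  }

  EXTERN_C_BEGIN
  PetscErrorCode KSPCreate_LSMR(KSP ksp)
  {
    KSP_LSMR       *lsmr;
    PetscErrorCode ierr;
    PetscFunctionBegin;
    ierr = PetscNewLog(ksp,KSP_LSMR,&lsmr);CHKERRQ(ierr);
    ksp->data                = (void*)lsmr;
    ksp->ops->setup          = KSPSetUp_LSMR;
    ksp->ops->solve          = KSPSolve_LSMR;
    ksp->ops->destroy        = KSPDefaultDestroy;
    ksp->ops->buildsolution  = KSPDefaultBuildSolution;
    ksp->ops->buildresidual  = KSPDefaultBuildResidual;
    ksp->ops->setfromoptions = 0;
    ksp->ops->view           = 0;
    PetscFunctionReturn(0);
  }
  EXTERN_C_END

  /* and in src/ksp/ksp/interface/itregis.c, as Barry says, a registration line
     along the lines of KSPRegisterDynamic("lsmr",path,"KSPCreate_LSMR",KSPCreate_LSMR) */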
Have fun, Barry On Apr 1, 2011, at 6:02 PM, Gaurish Telang wrote: > Hi, > > I am planning to implement the LSMR algorithm in PETSc which does least squares and is supposed to have more favourable mathematical properties than LSQR which has already been implemented in PETSc. > > But I am not really sure how to go about this, since the implementations of the standard KSP methods themselves look quite complicated. The manual does not seeem to say much > about how to go about adding routines to the PETSc library. > > Could you give me guidelines I should follow while implementing a KSP method? > > Also is it necessary to build the PETSc library again after implementing this routine. If so is it necessary to make any changes to makefiles ? > > Regards From gaurish108 at gmail.com Sun Apr 3 19:26:55 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Sun, 3 Apr 2011 20:26:55 -0400 Subject: [petsc-users] meaning of KSP_MatMult Message-ID: Hi What is the difference between KSP_MatMult and MatMult? I am trying to implement a new KSP method and see that all Matrix vector multiplies are done with KSP_MatMult in cg.c which implements the conjugate gradient. Regards, Gaurish -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Apr 3 19:57:49 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 3 Apr 2011 19:57:49 -0500 Subject: [petsc-users] meaning of KSP_MatMult In-Reply-To: References: Message-ID: Usually it is best just to look at the code. #define KSP_MatMult(ksp,A,x,y) (!ksp->transpose_solve) ? MatMult(A,x,y) : MatMultTranspose(A,x,y) It is only there to allow the same code work to solve with A or the transpose system with A'. Of course with CG it doesn't even need to be used since the matrix is symmetric. Barry On Apr 3, 2011, at 7:26 PM, Gaurish Telang wrote: > Hi > > What is the difference between KSP_MatMult and MatMult? I am trying to implement a new KSP method and see that all Matrix vector multiplies are done with > KSP_MatMult in cg.c which implements the conjugate gradient. > > Regards, > > Gaurish From gaurish108 at gmail.com Sun Apr 3 20:45:22 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Sun, 3 Apr 2011 21:45:22 -0400 Subject: [petsc-users] KSP structure and understanding some PETSc functions, . Message-ID: Hi, Where can I find the details of the ksp data structure? Specifically I wish to understand what ksp->converged, ksp->reason, and ksp>cnvP mean. Also I think the functions PetscObjecttakeAccess and PetscObjectGrantAccess are undocumented. What do these functions do? Thanks, Gaurish. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Apr 3 21:51:57 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 3 Apr 2011 21:51:57 -0500 Subject: [petsc-users] KSP structure and understanding some PETSc functions, . In-Reply-To: References: Message-ID: <50D51353-F3B5-437B-A35C-12EF8F2DEAAE@mcs.anl.gov> On Apr 3, 2011, at 8:45 PM, Gaurish Telang wrote: > Hi, > > Where can I find the details of the ksp data structure? Specifically I wish to understand what ksp->converged, ksp->reason, and ksp>cnvP mean. include/private/kspimpl.h > > Also I think the functions PetscObjecttakeAccess and PetscObjectGrantAccess are undocumented. What do these functions do? These are currently unused and can be ignored; there is no reason to put them in the code. Barry > > Thanks, > > Gaurish. 
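As a rough illustration of how the fields asked about above are used inside an implementation such as cg.c, here is a condensed version of the per-iteration pattern; the helper function is not a PETSc routine, just a sketch, and the variable names are illustrative:

  #include "private/kspimpl.h"

  /* Illustrative helper (not part of PETSc): the per-iteration bookkeeping from cg.c */
  static PetscErrorCode MyKSPCheckConvergence(KSP ksp,PetscInt it,PetscReal rnorm)
  {
    PetscErrorCode ierr;
    PetscFunctionBegin;
    ksp->rnorm = rnorm;                 /* current (possibly preconditioned) residual norm */
    KSPLogResidualHistory(ksp,rnorm);   /* append it to the residual history */
    KSPMonitor(ksp,it,rnorm);           /* run any monitors the user registered */
    /* ksp->converged is a function pointer to the active convergence test
       (KSPDefaultConverged unless replaced with KSPSetConvergenceTest), ksp->cnvP is
       the context that test was registered with, and the test writes its verdict
       into ksp->reason (KSP_CONVERGED_RTOL, KSP_DIVERGED_ITS, ...). */
    ierr = (*ksp->converged)(ksp,it,rnorm,&ksp->reason,ksp->cnvP);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }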
> From ckontzialis at lycos.com Mon Apr 4 00:59:08 2011 From: ckontzialis at lycos.com (Kontsantinos Kontzialis) Date: Mon, 04 Apr 2011 08:59:08 +0300 Subject: [petsc-users] help with snes Message-ID: <4D995E2C.8060106@lycos.com> Dear all, I use snes to apply an implicit Runge-Kutta method. I see that the snes and ksp solvers converge very slowly. Here is the runtime options I use: mpiexec -np 4 ./hoac cylinder -snes_mf_operator -pc_type hypre -pc_hypre_type euclid -pc_hypre_euclid_levels 2 -ksp_type gmres -dt 1.0e-1 -n_out 10 -file_out cylinder.txt -mat_inode_limit 5 -snes_monitor -end_time 1.0e+1 -roe_flux -snes_converged_reason -snes_max_fail 50 -u_mom 0.2 -implicit -implicit_type 2 -snes_atol 1.0e-6 -snes_ksp_ew_conv -ksp_gmres_cgs_refinement_type REFINE_IFNEEDED -ksp_gmres_classicalgramschmidt Also, I use coloring to compute the jacobian of the system. Any suggestions? Thank you, Costas From jed at 59A2.org Mon Apr 4 01:21:02 2011 From: jed at 59A2.org (Jed Brown) Date: Mon, 4 Apr 2011 08:21:02 +0200 Subject: [petsc-users] help with snes In-Reply-To: <4D995E2C.8060106@lycos.com> References: <4D995E2C.8060106@lycos.com> Message-ID: On Mon, Apr 4, 2011 at 07:59, Kontsantinos Kontzialis wrote: > I use snes to apply an implicit Runge-Kutta method. I see that the snes and > ksp solvers converge very slowly. Here is the runtime options I use: > > mpiexec -np 4 ./hoac cylinder -snes_mf_operator -pc_type hypre > -pc_hypre_type euclid -pc_hypre_euclid_levels 2 -ksp_type gmres -dt 1.0e-1 > -n_out 10 -file_out cylinder.txt -mat_inode_limit 5 -snes_monitor -end_time > 1.0e+1 -roe_flux -snes_converged_reason -snes_max_fail 50 -u_mom 0.2 > -implicit -implicit_type 2 -snes_atol 1.0e-6 -snes_ksp_ew_conv > -ksp_gmres_cgs_refinement_type REFINE_IFNEEDED > -ksp_gmres_classicalgramschmidt > > Also, I use coloring to compute the jacobian of the system. Any > suggestions? > This is not enough information to do more than guess. What equations are you solving, what methods have you tried, and how do they perform (show convergence history, "very slowly" means very different things to different people)? To speed up Newton, you can (a) more accurate linear solve, (b) better initial guess, e.g. provided by stable extrapolation or grid sequencing, (c) more exotic things like nonlinear Schwarz. For the linear solve, you usually have to improve the preconditioner (unless something is being done "wrong" like failing to acknowledge a low-dimensional null space). -------------- next part -------------- An HTML attachment was scrubbed... URL: From ckontzialis at lycos.com Mon Apr 4 04:21:53 2011 From: ckontzialis at lycos.com (Kontsantinos Kontzialis) Date: Mon, 04 Apr 2011 12:21:53 +0300 Subject: [petsc-users] help with snes In-Reply-To: References: <4D995E2C.8060106@lycos.com> Message-ID: <4D998DB1.7000601@lycos.com> On 04/04/2011 09:21 AM, Jed Brown wrote: > On Mon, Apr 4, 2011 at 07:59, Kontsantinos Kontzialis > > wrote: > > I use snes to apply an implicit Runge-Kutta method. I see that the > snes and ksp solvers converge very slowly. 
Here is the runtime > options I use: > > mpiexec -np 4 ./hoac cylinder -snes_mf_operator -pc_type hypre > -pc_hypre_type euclid -pc_hypre_euclid_levels 2 -ksp_type gmres > -dt 1.0e-1 -n_out 10 -file_out cylinder.txt -mat_inode_limit 5 > -snes_monitor -end_time 1.0e+1 -roe_flux -snes_converged_reason > -snes_max_fail 50 -u_mom 0.2 -implicit -implicit_type 2 -snes_atol > 1.0e-6 -snes_ksp_ew_conv -ksp_gmres_cgs_refinement_type > REFINE_IFNEEDED -ksp_gmres_classicalgramschmidt > > Also, I use coloring to compute the jacobian of the system. Any > suggestions? > > > This is not enough information to do more than guess. What equations > are you solving, what methods have you tried, and how do they perform > (show convergence history, "very slowly" means very different things > to different people)? > > To speed up Newton, you can (a) more accurate linear solve, (b) better > initial guess, e.g. provided by stable extrapolation or grid > sequencing, (c) more exotic things like nonlinear Schwarz. For the > linear solve, you usually have to improve the preconditioner (unless > something is being done "wrong" like failing to acknowledge a > low-dimensional null space). Jed, I am using a Discontinuous Galerkin method for the Euler equations of gas dynamics. I noticed that snes iterations do not drop the function norm and i need to set ksp_rtol to very low values in order to get converged solution. But this takes time. I use a matrix-free method with coloring for computing the jacobian as a preconditioner. For instance Timestep 0: step size = 0.1, time = 0, 2-norm residual = 0, CFL = 2910.23 Stage 1 0 SNES Function norm 1.027079922974e-01 0 KSP Residual norm 6.927237709481e+00 1 KSP Residual norm 6.418924686154e-01 2 KSP Residual norm 3.848527585443e-01 3 KSP Residual norm 3.125092840762e-01 4 KSP Residual norm 1.723224124866e-01 5 KSP Residual norm 1.593049965355e-01 6 KSP Residual norm 1.316596985033e-01 7 KSP Residual norm 1.103893708040e-01 8 KSP Residual norm 7.067294579947e-02 9 KSP Residual norm 6.780500381016e-02 10 KSP Residual norm 5.681755604855e-02 11 KSP Residual norm 5.080542537215e-02 12 KSP Residual norm 4.491096057365e-02 13 KSP Residual norm 3.677544617683e-02 14 KSP Residual norm 3.511275641736e-02 15 KSP Residual norm 2.950583865595e-02 16 KSP Residual norm 2.907564423196e-02 17 KSP Residual norm 2.503394162059e-02 18 KSP Residual norm 2.404208792763e-02 19 KSP Residual norm 2.384167488597e-02 20 KSP Residual norm 2.152623807336e-02 21 KSP Residual norm 2.140123445364e-02 22 KSP Residual norm 1.868992128733e-02 23 KSP Residual norm 1.828641368024e-02 24 KSP Residual norm 1.490016051590e-02 25 KSP Residual norm 1.460258473433e-02 26 KSP Residual norm 1.297026817438e-02 27 KSP Residual norm 1.287620709307e-02 28 KSP Residual norm 1.227961630095e-02 29 KSP Residual norm 1.195861330978e-02 30 KSP Residual norm 1.194779545098e-02 31 KSP Residual norm 1.194557385421e-02 32 KSP Residual norm 1.179010880493e-02 33 KSP Residual norm 1.149738273328e-02 34 KSP Residual norm 1.145048206086e-02 35 KSP Residual norm 1.085469098538e-02 36 KSP Residual norm 1.074679906620e-02 37 KSP Residual norm 9.663878491019e-03 38 KSP Residual norm 9.616289395896e-03 39 KSP Residual norm 8.856279445189e-03 40 KSP Residual norm 8.843663029957e-03 41 KSP Residual norm 8.499163748714e-03 42 KSP Residual norm 8.390473347232e-03 43 KSP Residual norm 8.308302849856e-03 44 KSP Residual norm 7.657464451533e-03 45 KSP Residual norm 7.633733629069e-03 46 KSP Residual norm 6.791549457705e-03 47 KSP Residual norm 
6.761807932440e-03 48 KSP Residual norm 6.310159420349e-03 49 KSP Residual norm 6.308135765891e-03 50 KSP Residual norm 6.178469644420e-03 51 KSP Residual norm 6.092441106798e-03 52 KSP Residual norm 6.060022864473e-03 53 KSP Residual norm 5.690961052139e-03 54 KSP Residual norm 5.643775917490e-03 55 KSP Residual norm 5.203641551515e-03 56 KSP Residual norm 5.146470256961e-03 57 KSP Residual norm 4.903702702525e-03 58 KSP Residual norm 4.898824722874e-03 59 KSP Residual norm 4.813924377178e-03 60 KSP Residual norm 4.723526162768e-03 61 KSP Residual norm 4.711592750967e-03 62 KSP Residual norm 4.631580873260e-03 63 KSP Residual norm 4.616608546291e-03 64 KSP Residual norm 4.600837802917e-03 65 KSP Residual norm 4.474343564228e-03 66 KSP Residual norm 4.443094833339e-03 67 KSP Residual norm 4.224165899917e-03 68 KSP Residual norm 4.200220155408e-03 69 KSP Residual norm 3.982639561746e-03 70 KSP Residual norm 3.982638871843e-03 71 KSP Residual norm 3.922944438857e-03 72 KSP Residual norm 3.874158019578e-03 73 KSP Residual norm 3.846756027386e-03 74 KSP Residual norm 3.602959352657e-03 75 KSP Residual norm 3.555118913445e-03 76 KSP Residual norm 3.177203273748e-03 77 KSP Residual norm 3.156576767081e-03 78 KSP Residual norm 3.001743691226e-03 79 KSP Residual norm 2.993215617549e-03 80 KSP Residual norm 2.958723177572e-03 81 KSP Residual norm 2.756494447464e-03 82 KSP Residual norm 2.748309056765e-03 83 KSP Residual norm 2.334551378904e-03 84 KSP Residual norm 2.330135779280e-03 85 KSP Residual norm 2.158060720380e-03 86 KSP Residual norm 2.142593286155e-03 87 KSP Residual norm 2.117828962844e-03 88 KSP Residual norm 1.994027737245e-03 89 KSP Residual norm 1.992614508152e-03 90 KSP Residual norm 1.846186979481e-03 91 KSP Residual norm 1.845847460736e-03 92 KSP Residual norm 1.750049894784e-03 93 KSP Residual norm 1.749884975326e-03 94 KSP Residual norm 1.718243135806e-03 95 KSP Residual norm 1.692788497273e-03 96 KSP Residual norm 1.690541479527e-03 97 KSP Residual norm 1.606671137740e-03 98 KSP Residual norm 1.604648007071e-03 99 KSP Residual norm 1.491359345392e-03 100 KSP Residual norm 1.487500193779e-03 101 KSP Residual norm 1.431517768459e-03 102 KSP Residual norm 1.413892181036e-03 103 KSP Residual norm 1.411244613047e-03 104 KSP Residual norm 1.358895518150e-03 105 KSP Residual norm 1.352907680814e-03 106 KSP Residual norm 1.290825812795e-03 107 KSP Residual norm 1.277895797362e-03 108 KSP Residual norm 1.239827948290e-03 109 KSP Residual norm 1.231707202118e-03 110 KSP Residual norm 1.226844439539e-03 111 KSP Residual norm 1.207141785145e-03 112 KSP Residual norm 1.205237619380e-03 113 KSP Residual norm 1.174105568044e-03 114 KSP Residual norm 1.161571710906e-03 115 KSP Residual norm 1.127339527466e-03 116 KSP Residual norm 1.118975806161e-03 117 KSP Residual norm 1.108813712733e-03 118 KSP Residual norm 1.097495827387e-03 119 KSP Residual norm 1.095458651348e-03 120 KSP Residual norm 1.099354754200e-03 121 KSP Residual norm 1.097690891747e-03 122 KSP Residual norm 1.080236637641e-03 123 KSP Residual norm 1.075249519522e-03 124 KSP Residual norm 1.072337341208e-03 125 KSP Residual norm 1.061274156366e-03 126 KSP Residual norm 1.061069171322e-03 127 KSP Residual norm 1.034869646780e-03 128 KSP Residual norm 1.033331884671e-03 129 KSP Residual norm 1.008920875186e-03 130 KSP Residual norm 1.004952247105e-03 131 KSP Residual norm 9.995273012222e-04 132 KSP Residual norm 9.912110212877e-04 133 KSP Residual norm 9.911763000203e-04 134 KSP Residual norm 9.706007632685e-04 135 KSP Residual 
norm 9.684837666814e-04 136 KSP Residual norm 9.322763749700e-04 137 KSP Residual norm 9.298333190027e-04 138 KSP Residual norm 9.127255619624e-04 139 KSP Residual norm 9.021675689666e-04 140 KSP Residual norm 9.021675667879e-04 141 KSP Residual norm 8.785893989448e-04 142 KSP Residual norm 8.723618462518e-04 143 KSP Residual norm 8.321111011021e-04 144 KSP Residual norm 8.231111137986e-04 145 KSP Residual norm 7.991951376136e-04 146 KSP Residual norm 7.918718558159e-04 147 KSP Residual norm 7.910248570583e-04 148 KSP Residual norm 7.797518457299e-04 149 KSP Residual norm 7.740621090192e-04 150 KSP Residual norm 7.996865400922e-04 151 KSP Residual norm 7.985981734959e-04 152 KSP Residual norm 7.629754272125e-04 153 KSP Residual norm 7.621465465035e-04 154 KSP Residual norm 7.494704585166e-04 155 KSP Residual norm 7.416807470073e-04 156 KSP Residual norm 7.397837511960e-04 157 KSP Residual norm 7.160474295065e-04 158 KSP Residual norm 7.153230970945e-04 159 KSP Residual norm 6.833108294843e-04 160 KSP Residual norm 6.825301707967e-04 161 KSP Residual norm 6.689873131620e-04 162 KSP Residual norm 6.665860331512e-04 163 KSP Residual norm 6.657084022499e-04 164 KSP Residual norm 6.576302936674e-04 165 KSP Residual norm 6.551257109061e-04 166 KSP Residual norm 6.432495386190e-04 167 KSP Residual norm 6.393490617663e-04 168 KSP Residual norm 6.289563544548e-04 169 KSP Residual norm 6.248514077557e-04 170 KSP Residual norm 6.241333036068e-04 171 KSP Residual norm 6.172289972591e-04 172 KSP Residual norm 6.161332336230e-04 173 KSP Residual norm 6.013433202566e-04 174 KSP Residual norm 5.948389748744e-04 175 KSP Residual norm 5.820459960548e-04 176 KSP Residual norm 5.780888500630e-04 177 KSP Residual norm 5.765550402812e-04 178 KSP Residual norm 5.680617497540e-04 179 KSP Residual norm 5.653573865953e-04 180 KSP Residual norm 5.941451965319e-04 181 KSP Residual norm 5.940284187867e-04 182 KSP Residual norm 5.629093879046e-04 183 KSP Residual norm 5.616029219408e-04 184 KSP Residual norm 5.534251512246e-04 185 KSP Residual norm 5.503051759591e-04 186 KSP Residual norm 5.494615294258e-04 187 KSP Residual norm 5.360903464690e-04 188 KSP Residual norm 5.357596839925e-04 189 KSP Residual norm 5.200595645635e-04 190 KSP Residual norm 5.193081989440e-04 191 KSP Residual norm 5.128888843140e-04 192 KSP Residual norm 5.113184826859e-04 193 KSP Residual norm 5.108224858562e-04 194 KSP Residual norm 5.032655916038e-04 195 KSP Residual norm 5.023522053806e-04 196 KSP Residual norm 4.899188744381e-04 197 KSP Residual norm 4.883553377255e-04 198 KSP Residual norm 4.800289474213e-04 199 KSP Residual norm 4.777279100041e-04 200 KSP Residual norm 4.759559094663e-04 201 KSP Residual norm 4.694818511931e-04 202 KSP Residual norm 4.689973302866e-04 203 KSP Residual norm 4.548454768507e-04 204 KSP Residual norm 4.520367591463e-04 205 KSP Residual norm 4.368451030442e-04 206 KSP Residual norm 4.351974523374e-04 207 KSP Residual norm 4.287065645698e-04 208 KSP Residual norm 4.260330488789e-04 209 KSP Residual norm 4.260155248925e-04 210 KSP Residual norm 4.840531342466e-04 211 KSP Residual norm 4.825477592344e-04 212 KSP Residual norm 4.548435246139e-04 213 KSP Residual norm 4.544995360569e-04 214 KSP Residual norm 4.467888907692e-04 215 KSP Residual norm 4.408942282764e-04 216 KSP Residual norm 4.370393567572e-04 217 KSP Residual norm 4.221207630715e-04 218 KSP Residual norm 4.210509669927e-04 219 KSP Residual norm 4.054854033659e-04 220 KSP Residual norm 4.054421274470e-04 221 KSP Residual norm 3.997032630117e-04 222 
KSP Residual norm 3.993282795883e-04 223 KSP Residual norm 3.986988207685e-04 224 KSP Residual norm 3.925181315143e-04 225 KSP Residual norm 3.921926168323e-04 226 KSP Residual norm 3.821204190940e-04 227 KSP Residual norm 3.814133566930e-04 228 KSP Residual norm 3.766128637251e-04 229 KSP Residual norm 3.757253419248e-04 230 KSP Residual norm 3.744322716702e-04 231 KSP Residual norm 3.701160966981e-04 232 KSP Residual norm 3.701152921823e-04 233 KSP Residual norm 3.631617558377e-04 234 KSP Residual norm 3.625708411480e-04 235 KSP Residual norm 3.553808760610e-04 236 KSP Residual norm 3.546034916825e-04 237 KSP Residual norm 3.540786524554e-04 238 KSP Residual norm 3.511344181564e-04 239 KSP Residual norm 3.505042720512e-04 240 KSP Residual norm 4.132290626555e-04 241 KSP Residual norm 4.082857512512e-04 242 KSP Residual norm 3.711356050665e-04 243 KSP Residual norm 3.704069640625e-04 244 KSP Residual norm 3.547568213329e-04 245 KSP Residual norm 3.546708808789e-04 246 KSP Residual norm 3.500496433773e-04 247 KSP Residual norm 3.470649802709e-04 248 KSP Residual norm 3.468562003147e-04 249 KSP Residual norm 3.373236757208e-04 250 KSP Residual norm 3.372785835705e-04 251 KSP Residual norm 3.289650260262e-04 252 KSP Residual norm 3.286262513859e-04 253 KSP Residual norm 3.258404684367e-04 254 KSP Residual norm 3.233806739275e-04 255 KSP Residual norm 3.232848418577e-04 256 KSP Residual norm 3.160390660748e-04 257 KSP Residual norm 3.160023434928e-04 258 KSP Residual norm 3.076800563449e-04 259 KSP Residual norm 3.075950769044e-04 260 KSP Residual norm 3.017997123649e-04 261 KSP Residual norm 3.010370755465e-04 262 KSP Residual norm 3.007348493909e-04 263 KSP Residual norm 2.968283761627e-04 264 KSP Residual norm 2.965667647944e-04 265 KSP Residual norm 2.875262248594e-04 266 KSP Residual norm 2.864872321977e-04 267 KSP Residual norm 2.796799864284e-04 268 KSP Residual norm 2.779800721903e-04 269 KSP Residual norm 2.767755518062e-04 270 KSP Residual norm 3.701476508480e-04 271 KSP Residual norm 3.550531831491e-04 272 KSP Residual norm 3.237961295410e-04 273 KSP Residual norm 3.212659135171e-04 274 KSP Residual norm 3.107416795714e-04 275 KSP Residual norm 3.107404695290e-04 276 KSP Residual norm 3.079497903315e-04 277 KSP Residual norm 3.007874486462e-04 278 KSP Residual norm 3.002891498945e-04 279 KSP Residual norm 2.879889328174e-04 280 KSP Residual norm 2.879415924827e-04 281 KSP Residual norm 2.815601989921e-04 282 KSP Residual norm 2.806654362436e-04 283 KSP Residual norm 2.792915793473e-04 284 KSP Residual norm 2.751905583679e-04 285 KSP Residual norm 2.751559506503e-04 286 KSP Residual norm 2.665087440439e-04 287 KSP Residual norm 2.664771447418e-04 288 KSP Residual norm 2.617548480595e-04 289 KSP Residual norm 2.617541578606e-04 290 KSP Residual norm 2.605161966159e-04 291 KSP Residual norm 2.594628541708e-04 292 KSP Residual norm 2.594611806903e-04 293 KSP Residual norm 2.551674006816e-04 294 KSP Residual norm 2.549606997571e-04 295 KSP Residual norm 2.496835632981e-04 296 KSP Residual norm 2.494290948044e-04 297 KSP Residual norm 2.466008586027e-04 298 KSP Residual norm 2.449860389047e-04 299 KSP Residual norm 2.448908502370e-04 300 KSP Residual norm 3.252504105699e-04 301 KSP Residual norm 3.128088862424e-04 302 KSP Residual norm 2.825197019556e-04 303 KSP Residual norm 2.797432079325e-04 304 KSP Residual norm 2.629301571560e-04 305 KSP Residual norm 2.624809613098e-04 306 KSP Residual norm 2.574770828907e-04 307 KSP Residual norm 2.561990302939e-04 308 KSP Residual norm 
2.550132639501e-04 309 KSP Residual norm 2.479276320314e-04 310 KSP Residual norm 2.475314075344e-04 311 KSP Residual norm 2.399863564606e-04 312 KSP Residual norm 2.399843817621e-04 313 KSP Residual norm 2.370399439375e-04 314 KSP Residual norm 2.357745430699e-04 315 KSP Residual norm 2.352348621885e-04 316 KSP Residual norm 2.309261579390e-04 317 KSP Residual norm 2.307545694928e-04 318 KSP Residual norm 2.257800220221e-04 319 KSP Residual norm 2.257254923503e-04 320 KSP Residual norm 2.229018664838e-04 321 KSP Residual norm 2.227391409799e-04 322 KSP Residual norm 2.223152932089e-04 323 KSP Residual norm 2.202129550826e-04 324 KSP Residual norm 2.202120254316e-04 325 KSP Residual norm 2.140858308118e-04 326 KSP Residual norm 2.139462657614e-04 327 KSP Residual norm 2.089236399059e-04 328 KSP Residual norm 2.087138619354e-04 329 KSP Residual norm 2.076200058793e-04 330 KSP Residual norm 3.105137354000e-04 331 KSP Residual norm 2.971618102980e-04 332 KSP Residual norm 2.632509491662e-04 333 KSP Residual norm 2.606782931874e-04 334 KSP Residual norm 2.420375352578e-04 335 KSP Residual norm 2.420149875537e-04 336 KSP Residual norm 2.365092764887e-04 337 KSP Residual norm 2.327611491860e-04 338 KSP Residual norm 2.309994170994e-04 339 KSP Residual norm 2.207932064590e-04 340 KSP Residual norm 2.206876586929e-04 341 KSP Residual norm 2.138312153956e-04 342 KSP Residual norm 2.138302439214e-04 343 KSP Residual norm 2.110520083920e-04 344 KSP Residual norm 2.098418044530e-04 345 KSP Residual norm 2.096235899540e-04 346 KSP Residual norm 2.034185757316e-04 347 KSP Residual norm 2.034184620635e-04 348 KSP Residual norm 1.976488405192e-04 349 KSP Residual norm 1.976115434786e-04 350 KSP Residual norm 1.945816030084e-04 351 KSP Residual norm 1.943065435743e-04 352 KSP Residual norm 1.937314638157e-04 353 KSP Residual norm 1.914843655114e-04 354 KSP Residual norm 1.914699026461e-04 355 KSP Residual norm 1.874847304454e-04 356 KSP Residual norm 1.872309478602e-04 357 KSP Residual norm 1.848831376039e-04 358 KSP Residual norm 1.841538952700e-04 359 KSP Residual norm 1.840508058060e-04 360 KSP Residual norm 2.955683800624e-04 361 KSP Residual norm 2.787092713017e-04 362 KSP Residual norm 2.375973954952e-04 363 KSP Residual norm 2.362837002318e-04 364 KSP Residual norm 2.168814309608e-04 365 KSP Residual norm 2.168107429718e-04 366 KSP Residual norm 2.100451614964e-04 367 KSP Residual norm 2.076551935282e-04 368 KSP Residual norm 2.055029282370e-04 369 KSP Residual norm 1.984878777061e-04 370 KSP Residual norm 1.982729424399e-04 371 KSP Residual norm 1.897882681691e-04 372 KSP Residual norm 1.897059804945e-04 373 KSP Residual norm 1.874194039890e-04 374 KSP Residual norm 1.863443914354e-04 375 KSP Residual norm 1.862854861081e-04 376 KSP Residual norm 1.823365401806e-04 377 KSP Residual norm 1.822124001004e-04 378 KSP Residual norm 1.768066383954e-04 379 KSP Residual norm 1.767749820797e-04 380 KSP Residual norm 1.732546065272e-04 381 KSP Residual norm 1.729449336935e-04 382 KSP Residual norm 1.722785014035e-04 383 KSP Residual norm 1.708647196037e-04 384 KSP Residual norm 1.708139978897e-04 385 KSP Residual norm 1.669271053730e-04 386 KSP Residual norm 1.664487742887e-04 387 KSP Residual norm 1.633713872445e-04 388 KSP Residual norm 1.623611873959e-04 389 KSP Residual norm 1.619434626999e-04 390 KSP Residual norm 2.704606714739e-04 391 KSP Residual norm 2.508430425343e-04 392 KSP Residual norm 2.188049406578e-04 393 KSP Residual norm 2.149849069956e-04 394 KSP Residual norm 1.986339539905e-04 395 KSP 
Residual norm 1.982077076504e-04 396 KSP Residual norm 1.920881700711e-04 397 KSP Residual norm 1.895113714247e-04 398 KSP Residual norm 1.880286374771e-04 399 KSP Residual norm 1.800051766499e-04 400 KSP Residual norm 1.791342664207e-04 401 KSP Residual norm 1.723021103007e-04 402 KSP Residual norm 1.722983364536e-04 403 KSP Residual norm 1.699546057741e-04 404 KSP Residual norm 1.688016004024e-04 405 KSP Residual norm 1.684647860428e-04 406 KSP Residual norm 1.642044267398e-04 407 KSP Residual norm 1.641199945107e-04 408 KSP Residual norm 1.600279143621e-04 409 KSP Residual norm 1.600082345451e-04 410 KSP Residual norm 1.571845001808e-04 411 KSP Residual norm 1.568421146682e-04 412 KSP Residual norm 1.566121984341e-04 413 KSP Residual norm 1.541905989228e-04 414 KSP Residual norm 1.541893814351e-04 415 KSP Residual norm 1.503096486796e-04 416 KSP Residual norm 1.502069549475e-04 417 KSP Residual norm 1.474600334036e-04 418 KSP Residual norm 1.470811856273e-04 419 KSP Residual norm 1.468503446550e-04 420 KSP Residual norm 2.763803512664e-04 421 KSP Residual norm 2.602057695388e-04 422 KSP Residual norm 2.183342925626e-04 423 KSP Residual norm 2.173993739737e-04 424 KSP Residual norm 1.911804707591e-04 425 KSP Residual norm 1.909057097612e-04 426 KSP Residual norm 1.799387525085e-04 427 KSP Residual norm 1.776177027182e-04 428 KSP Residual norm 1.742907904950e-04 429 KSP Residual norm 1.663678496699e-04 430 KSP Residual norm 1.649483630841e-04 431 KSP Residual norm 1.574852643960e-04 432 KSP Residual norm 1.572807748854e-04 433 KSP Residual norm 1.548887544042e-04 434 KSP Residual norm 1.546798467860e-04 435 KSP Residual norm 1.544697005440e-04 436 KSP Residual norm 1.519993025554e-04 437 KSP Residual norm 1.519316487948e-04 438 KSP Residual norm 1.478918963793e-04 439 KSP Residual norm 1.478889661490e-04 440 KSP Residual norm 1.445882234540e-04 441 KSP Residual norm 1.444716701122e-04 442 KSP Residual norm 1.437624306066e-04 443 KSP Residual norm 1.422790872780e-04 444 KSP Residual norm 1.422764539369e-04 445 KSP Residual norm 1.393786944559e-04 446 KSP Residual norm 1.393457143473e-04 447 KSP Residual norm 1.365525969412e-04 448 KSP Residual norm 1.363095456469e-04 449 KSP Residual norm 1.354028782839e-04 450 KSP Residual norm 2.605601751211e-04 451 KSP Residual norm 2.519694861974e-04 452 KSP Residual norm 2.090311698383e-04 453 KSP Residual norm 2.085715879165e-04 454 KSP Residual norm 1.838015375310e-04 455 KSP Residual norm 1.837739523119e-04 456 KSP Residual norm 1.753463622085e-04 457 KSP Residual norm 1.712295128609e-04 458 KSP Residual norm 1.706509763269e-04 459 KSP Residual norm 1.603179612581e-04 460 KSP Residual norm 1.602910110614e-04 461 KSP Residual norm 1.515043047705e-04 462 KSP Residual norm 1.514059184851e-04 463 KSP Residual norm 1.487375893344e-04 464 KSP Residual norm 1.472465439488e-04 465 KSP Residual norm 1.471786946789e-04 466 KSP Residual norm 1.426656793695e-04 467 KSP Residual norm 1.426493614178e-04 468 KSP Residual norm 1.368739431348e-04 469 KSP Residual norm 1.368507658167e-04 470 KSP Residual norm 1.334891298516e-04 471 KSP Residual norm 1.332407426119e-04 472 KSP Residual norm 1.327738486878e-04 473 KSP Residual norm 1.307942849615e-04 474 KSP Residual norm 1.307351952401e-04 475 KSP Residual norm 1.272337855340e-04 476 KSP Residual norm 1.270264025130e-04 477 KSP Residual norm 1.242721163254e-04 478 KSP Residual norm 1.239396247568e-04 479 KSP Residual norm 1.231277348178e-04 480 KSP Residual norm 2.661551792497e-04 481 KSP Residual norm 
2.535113528327e-04 482 KSP Residual norm 2.047238652433e-04 483 KSP Residual norm 2.030655099375e-04 484 KSP Residual norm 1.747312421388e-04 485 KSP Residual norm 1.745992877616e-04 486 KSP Residual norm 1.656341028631e-04 487 KSP Residual norm 1.637510460120e-04 488 KSP Residual norm 1.614472737567e-04 489 KSP Residual norm 1.528805694879e-04 490 KSP Residual norm 1.528597718238e-04 491 KSP Residual norm 1.433703303374e-04 492 KSP Residual norm 1.433634532650e-04 493 KSP Residual norm 1.402208125363e-04 494 KSP Residual norm 1.396024825606e-04 495 KSP Residual norm 1.395185211705e-04 496 KSP Residual norm 1.368721239565e-04 497 KSP Residual norm 1.368721047917e-04 498 KSP Residual norm 1.313099701242e-04 499 KSP Residual norm 1.312558057896e-04 500 KSP Residual norm 1.278276826286e-04 501 KSP Residual norm 1.276445215817e-04 502 KSP Residual norm 1.269142176155e-04 503 KSP Residual norm 1.257668921397e-04 504 KSP Residual norm 1.257526015128e-04 505 KSP Residual norm 1.223245237264e-04 506 KSP Residual norm 1.221042366836e-04 507 KSP Residual norm 1.191294573606e-04 508 KSP Residual norm 1.186779609848e-04 509 KSP Residual norm 1.179260379388e-04 510 KSP Residual norm 2.510412421241e-04 511 KSP Residual norm 2.296629012501e-04 512 KSP Residual norm 1.891291364587e-04 513 KSP Residual norm 1.852147919892e-04 514 KSP Residual norm 1.677771665780e-04 515 KSP Residual norm 1.672450311159e-04 516 KSP Residual norm 1.591298996024e-04 517 KSP Residual norm 1.573813490451e-04 518 KSP Residual norm 1.537945683810e-04 519 KSP Residual norm 1.453012826230e-04 520 KSP Residual norm 1.446639040447e-04 521 KSP Residual norm 1.370962330437e-04 522 KSP Residual norm 1.370622043459e-04 523 KSP Residual norm 1.350810905802e-04 524 KSP Residual norm 1.344406903453e-04 525 KSP Residual norm 1.340256203640e-04 526 KSP Residual norm 1.295916728961e-04 527 KSP Residual norm 1.295055652264e-04 528 KSP Residual norm 1.239274488400e-04 529 KSP Residual norm 1.238972505288e-04 530 KSP Residual norm 1.201152286403e-04 531 KSP Residual norm 1.200306481147e-04 532 KSP Residual norm 1.191679467888e-04 533 KSP Residual norm 1.177359365688e-04 534 KSP Residual norm 1.176440079687e-04 535 KSP Residual norm 1.141371623941e-04 536 KSP Residual norm 1.141353394103e-04 537 KSP Residual norm 1.111902304574e-04 538 KSP Residual norm 1.111435026607e-04 539 KSP Residual norm 1.104373518780e-04 540 KSP Residual norm 2.532065970323e-04 541 KSP Residual norm 2.328558435699e-04 542 KSP Residual norm 1.856863400432e-04 543 KSP Residual norm 1.824052503129e-04 544 KSP Residual norm 1.632727866436e-04 545 KSP Residual norm 1.623079398737e-04 546 KSP Residual norm 1.526844888068e-04 547 KSP Residual norm 1.511265397186e-04 548 KSP Residual norm 1.477189110182e-04 549 KSP Residual norm 1.383359375652e-04 550 KSP Residual norm 1.375702414606e-04 551 KSP Residual norm 1.291115277686e-04 552 KSP Residual norm 1.289805911758e-04 553 KSP Residual norm 1.264565638309e-04 554 KSP Residual norm 1.261764820334e-04 555 KSP Residual norm 1.259971989377e-04 556 KSP Residual norm 1.224124087888e-04 557 KSP Residual norm 1.224106346986e-04 558 KSP Residual norm 1.171149088916e-04 559 KSP Residual norm 1.171056991973e-04 560 KSP Residual norm 1.135465444557e-04 561 KSP Residual norm 1.133273524148e-04 562 KSP Residual norm 1.126341467351e-04 563 KSP Residual norm 1.110537515701e-04 564 KSP Residual norm 1.110037552185e-04 565 KSP Residual norm 1.078890765170e-04 566 KSP Residual norm 1.078787122262e-04 567 KSP Residual norm 1.055011245599e-04 568 KSP 
Residual norm 1.054883845546e-04 569 KSP Residual norm 1.049715929970e-04 570 KSP Residual norm 2.593584803792e-04 571 KSP Residual norm 2.455814515878e-04 572 KSP Residual norm 1.938992453788e-04 573 KSP Residual norm 1.922723159534e-04 574 KSP Residual norm 1.638561052430e-04 575 KSP Residual norm 1.638003809937e-04 576 KSP Residual norm 1.524110974008e-04 577 KSP Residual norm 1.506343440379e-04 578 KSP Residual norm 1.478423845281e-04 579 KSP Residual norm 1.385076232286e-04 580 KSP Residual norm 1.382879555696e-04 581 KSP Residual norm 1.288719967621e-04 582 KSP Residual norm 1.288691781733e-04 583 KSP Residual norm 1.255148505764e-04 584 KSP Residual norm 1.252964998236e-04 585 KSP Residual norm 1.249334077999e-04 586 KSP Residual norm 1.226044863792e-04 587 KSP Residual norm 1.226037028980e-04 588 KSP Residual norm 1.178394232303e-04 589 KSP Residual norm 1.178390359455e-04 590 KSP Residual norm 1.140645774874e-04 591 KSP Residual norm 1.139549138636e-04 592 KSP Residual norm 1.131882577208e-04 593 KSP Residual norm 1.119290884708e-04 594 KSP Residual norm 1.119288058877e-04 595 KSP Residual norm 1.090328205509e-04 596 KSP Residual norm 1.087834210599e-04 597 KSP Residual norm 1.059285536635e-04 598 KSP Residual norm 1.056123757512e-04 and it is still running. I have set for this run ksp_rtol to 1.0e-7. Any ideas? Costas -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Mon Apr 4 06:26:17 2011 From: jed at 59A2.org (Jed Brown) Date: Mon, 4 Apr 2011 13:26:17 +0200 Subject: [petsc-users] help with snes In-Reply-To: <4D998DB1.7000601@lycos.com> References: <4D995E2C.8060106@lycos.com> <4D998DB1.7000601@lycos.com> Message-ID: On Mon, Apr 4, 2011 at 11:21, Kontsantinos Kontzialis wrote: > I am using a Discontinuous Galerkin method for the Euler equations of gas > dynamics. I noticed that snes iterations do not drop the function norm and i > need to set ksp_rtol to very low values in order to get converged solution. > But this takes time. I use a matrix-free method with coloring for computing > the jacobian as a preconditioner. I don't see SNES convergence here because it's still in the linear solve. The restart is apparently too small for use with this preconditioner because you are losing a lot of ground in each restart. For reference, how does -pc_type asm -sub_pc_type lu work? For globalization at moderate to high Mach numbers, I would recommend grid sequencing if possible, otherwise you may be forced to take smaller time steps. For the linear solve, especially at low Mach number, you can precondition using the Schur complement of momentum applied in the pressure space (this contains the fast acoustic waves, see http://epubs.siam.org/sisc/resource/1/sjoce3/v32/i6/p3394_s1 for several examples defining the operator for use with semi-implicit integration). An alternative is to build a custom multigrid, see e.g. http://aero-comlab.stanford.edu/Papers/jameson.aiaa.01-2673.pdf. Both of these options require some work on your part. You should recognize that scalable solvers for implicit Euler at large time steps is still a hard enough problem that "black box" approaches will not give the best performance. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ckontzialis at lycos.com Mon Apr 4 07:15:28 2011 From: ckontzialis at lycos.com (Kontsantinos Kontzialis) Date: Mon, 04 Apr 2011 15:15:28 +0300 Subject: [petsc-users] help with snes In-Reply-To: References: <4D995E2C.8060106@lycos.com> <4D998DB1.7000601@lycos.com> Message-ID: <4D99B660.6000407@lycos.com> On 04/04/2011 02:26 PM, Jed Brown wrote: > -pc_type asm -sub_pc_type lu Jed, I do a run with -pc_type asm -sub_pc_type lu and higher rate of gmres restart. I work on unstructured meshes. I'll let you know as soon as possible. Costas From gaurish108 at gmail.com Mon Apr 4 13:41:51 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Mon, 4 Apr 2011 14:41:51 -0400 Subject: [petsc-users] compile errors with petsc-dev Message-ID: Hi, I have installed two versions of PETSc on my computer, the debug version petsc-3.1.p7 and petsc-dev. However, when I try to compile my codes with petsc-dev(which were successfully compiled and executed under the first version) I get lots of compile errors. Why could this be happening? I have pasted the error message below. gaurish108 at telang:~/Desktop/LSQR_progress$ $PETSC_DIR bash: /home/gaurish108/software/petsc-dev: is a directory gaurish108 at telang:~/Desktop/LSQR_progress$ make main /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -o main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -I/home/gaurish108/software/petsc-dev/include -I/home/gaurish108/software/petsc-dev/linux-gnu-c/include -D__INSDIR__= main.c main.c: In function ?main?: main.c:17: error: ?PetscTruth? undeclared (first use in this function) main.c:17: error: (Each undeclared identifier is reported only once main.c:17: error: for each function it appears in.) main.c:17: error: expected ?;? before ?flg_b? main.c:26: error: ?flg_b? undeclared (first use in this function) main.c:27: error: macro "SETERRQ" requires 3 arguments, but only 2 given main.c:27: error: ?SETERRQ? undeclared (first use in this function) main.c:30: warning: passing argument 1 of ?VecLoad? from incompatible pointer type /home/gaurish108/software/petsc-dev/include/petscvec.h:419: note: expected ?Vec? but argument is of type ?PetscViewer? main.c:30: warning: passing argument 2 of ?VecLoad? from incompatible pointer type /home/gaurish108/software/petsc-dev/include/petscvec.h:419: note: expected ?PetscViewer? but argument is of type ?const char *? main.c:30: error: too many arguments to function ?VecLoad? main.c:37: error: ?flg_A? undeclared (first use in this function) main.c:38: error: macro "SETERRQ" requires 3 arguments, but only 2 given main.c:41: warning: passing argument 1 of ?MatLoad? from incompatible pointer type /home/gaurish108/software/petsc-dev/include/petscmat.h:493: note: expected ?Mat? but argument is of type ?PetscViewer? main.c:41: warning: passing argument 2 of ?MatLoad? from incompatible pointer type /home/gaurish108/software/petsc-dev/include/petscmat.h:493: note: expected ?PetscViewer? but argument is of type ?const char *? main.c:41: error: too many arguments to function ?MatLoad? 
make: [main.o] Error 1 (ignored) /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -o main main.o -L/home/gaurish108/software/petsc-dev/linux-gnu-c/lib -lpetsc -lX11 -Wl,-rpath,/home/gaurish108/software/petsc-dev/linux-gnu-c/lib -lflapack -lfblas -lm -L/usr/lib/gcc/i686-linux-gnu/4.4.5 -L/usr/lib/i686-linux-gnu -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -lmpichf90 -lgfortran -lm -lm -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl gcc: main.o: No such file or directory make: [main] Error 1 (ignored) /bin/rm -f main.o gaurish108 at telang:~/Desktop/LSQR_progress$ $PETSC_DIR bash: /home/gaurish108/software/petsc-dev: is a directory gaurish108 at telang:~/Desktop/LSQR_progress$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Apr 4 13:46:08 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 4 Apr 2011 13:46:08 -0500 Subject: [petsc-users] compile errors with petsc-dev In-Reply-To: References: Message-ID: On Mon, Apr 4, 2011 at 1:41 PM, Gaurish Telang wrote: > Hi, > > I have installed two versions of PETSc on my computer, the debug version > petsc-3.1.p7 and petsc-dev. > > However, when I try to compile my codes with petsc-dev(which were > successfully compiled and executed under the first version) > I get lots of compile errors. > > Why could this be happening? I have pasted the error message below. > http://www.mcs.anl.gov/petsc/petsc-as/documentation/changes/dev.html The first item. Matt > gaurish108 at telang:~/Desktop/LSQR_progress$ $PETSC_DIR > bash: /home/gaurish108/software/petsc-dev: is a directory > > gaurish108 at telang:~/Desktop/LSQR_progress$ make main > /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -o main.o -c > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 > -I/home/gaurish108/software/petsc-dev/include > -I/home/gaurish108/software/petsc-dev/linux-gnu-c/include -D__INSDIR__= > main.c > main.c: In function ?main?: > main.c:17: error: ?PetscTruth? undeclared (first use in this function) > main.c:17: error: (Each undeclared identifier is reported only once > main.c:17: error: for each function it appears in.) > main.c:17: error: expected ?;? before ?flg_b? > main.c:26: error: ?flg_b? undeclared (first use in this function) > main.c:27: error: macro "SETERRQ" requires 3 arguments, but only 2 given > main.c:27: error: ?SETERRQ? undeclared (first use in this function) > main.c:30: warning: passing argument 1 of ?VecLoad? from incompatible > pointer type > /home/gaurish108/software/petsc-dev/include/petscvec.h:419: note: expected > ?Vec? but argument is of type ?PetscViewer? > main.c:30: warning: passing argument 2 of ?VecLoad? from incompatible > pointer type > /home/gaurish108/software/petsc-dev/include/petscvec.h:419: note: expected > ?PetscViewer? but argument is of type ?const char *? > main.c:30: error: too many arguments to function ?VecLoad? > main.c:37: error: ?flg_A? undeclared (first use in this function) > main.c:38: error: macro "SETERRQ" requires 3 arguments, but only 2 given > main.c:41: warning: passing argument 1 of ?MatLoad? from incompatible > pointer type > /home/gaurish108/software/petsc-dev/include/petscmat.h:493: note: expected > ?Mat? but argument is of type ?PetscViewer? > main.c:41: warning: passing argument 2 of ?MatLoad? from incompatible > pointer type > /home/gaurish108/software/petsc-dev/include/petscmat.h:493: note: expected > ?PetscViewer? 
but argument is of type ?const char *? > main.c:41: error: too many arguments to function ?MatLoad? > make: [main.o] Error 1 (ignored) > /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 -o main > main.o -L/home/gaurish108/software/petsc-dev/linux-gnu-c/lib -lpetsc -lX11 > -Wl,-rpath,/home/gaurish108/software/petsc-dev/linux-gnu-c/lib -lflapack > -lfblas -lm -L/usr/lib/gcc/i686-linux-gnu/4.4.5 -L/usr/lib/i686-linux-gnu > -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -lmpichf90 -lgfortran -lm > -lm -ldl -lmpich -lopa -lmpl -lrt -lpthread -lgcc_s -ldl > gcc: main.o: No such file or directory > make: [main] Error 1 (ignored) > /bin/rm -f main.o > gaurish108 at telang:~/Desktop/LSQR_progress$ $PETSC_DIR > bash: /home/gaurish108/software/petsc-dev: is a directory > gaurish108 at telang:~/Desktop/LSQR_progress$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From sean at mcs.anl.gov Mon Apr 4 13:46:43 2011 From: sean at mcs.anl.gov (Sean Farley) Date: Mon, 4 Apr 2011 13:46:43 -0500 Subject: [petsc-users] compile errors with petsc-dev In-Reply-To: References: Message-ID: > > gaurish108 at telang:~/Desktop/LSQR_progress$ make main > /home/gaurish108/software/petsc-dev/linux-gnu-c/bin/mpicc -o main.o -c > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g3 > -I/home/gaurish108/software/petsc-dev/include > -I/home/gaurish108/software/petsc-dev/linux-gnu-c/include -D__INSDIR__= > main.c > main.c: In function ?main?: > main.c:17: error: ?PetscTruth? undeclared (first use in this function) > main.c:17: error: (Each undeclared identifier is reported only once > main.c:17: error: for each function it appears in.) > main.c:17: error: expected ?;? before ?flg_b? > main.c:26: error: ?flg_b? undeclared (first use in this function) > This error is because in petsc-dev PetscTruth changed to PetscBool: http://www.mcs.anl.gov/petsc/petsc-as/documentation/changes/dev.html "Changed PetscTruth to PetscBool, PETSC_TRUTH to PETSC_BOOL, PetscOptionsTruth to PetscOptionsBool, etc." main.c:27: error: macro "SETERRQ" requires 3 arguments, but only 2 given > main.c:27: error: ?SETERRQ? undeclared (first use in this function) Also, SETERRQX changed: "PetscError() and SETERRQX() now take a MPI_Comm as the first argument to indicate where the error is known. If you don't know what communicator use then pass in PETSC_COMM_SELF" Hope that helps, Sean -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaurish108 at gmail.com Mon Apr 4 16:30:27 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Mon, 4 Apr 2011 17:30:27 -0400 Subject: [petsc-users] square root function In-Reply-To: References: Message-ID: Hi, Is there a squareroot function implemented in PETSc which can be applied to a PetscScalar type? I am not sure if the sqrt function of the C math library will work on this datatype. Regards, Gaurish -------------- next part -------------- An HTML attachment was scrubbed... 
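A minimal sketch of the macro Barry points to in the next message; it is available once any PETSc header has been included:

    PetscScalar a = 2.0, r;
    r = PetscSqrtScalar(a);   /* expands to the sqrt variant matching the configured scalar type */

Unlike sqrt() from math.h, PetscSqrtScalar() also does the right thing when PETSc is configured with complex scalars.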
URL: From bsmith at mcs.anl.gov Mon Apr 4 16:42:57 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 16:42:57 -0500 Subject: [petsc-users] square root function In-Reply-To: References: Message-ID: <098AE37E-C0B6-4F66-81D9-C292BFF0AEC7@mcs.anl.gov> PetscSqrtScalar() macro automatically becomes the correct thing. Barry On Apr 4, 2011, at 4:30 PM, Gaurish Telang wrote: > Hi, > > Is there a squareroot function implemented in PETSc which can be applied to a PetscScalar type? I am not sure if the sqrt function of the C math library will work on this datatype. > > Regards, > > Gaurish > From gaurish108 at gmail.com Mon Apr 4 21:54:42 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Mon, 4 Apr 2011 22:54:42 -0400 Subject: [petsc-users] Modification of a routine Message-ID: Hi, I have tried to implement a recent least squares algorithm called LSMR, by making modifications in the file src/ksp/ksp/impls/lsqr/lsqr.c Is it necessary to make any changes in other PETSc files or build the PETSc library again? I solved a simple least squares problem by supplying the flags -ksp_type lsqr -pc_type none and the problem seems to get solved correctly. However, I had placed a couple of PetscPrintf statements inside the main do-while loop of the algorithm in lsqr.c but those statements are not getting printed to standard output. Thanks, Gaurish. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Mon Apr 4 22:07:32 2011 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 5 Apr 2011 11:07:32 +0800 (CST) Subject: [petsc-users] Slepc eigen value solver gives strange behavior Message-ID: <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> Hi, I use slepc eigen value solver to evaluate the eigen values of jacobian matrix on each nonlinear iteration. First, I create the EPS structure and set the operator to jacobian matrix. And then call the EPSSolve in the SNES build jacobian matrix functuion. This method create EPS once, call EPSSolve many times. However, it seems EPSSolve only work at first time, and result eigen value never changes (however, relative error becomes larger and larger) in the following solve procedure. Then I create EPS each time in the SNES build jacobian matrix functuion, do EPSSolve and delete EPS at the end of function. This method gives eigen value for the jacobian matrix with small relative error ~1e-8. Of course, create and destroy the EPS solver each time is not efficient. Does something get wrong in the first method? 
The code I used is attached here // create the EPS solver for smallest and largest eigen value EPS eps_s; EPS eps_l; FVM_NonlinearSolver & nonlinear_solver = dynamic_cast(_solver); Mat & J = nonlinear_solver.jacobian_matrix(); // create eigen value problem solver EPSCreate(PETSC_COMM_WORLD, &eps_s); EPSCreate(PETSC_COMM_WORLD, &eps_l); // Set operator EPSSetOperators(eps_s, J, PETSC_NULL); EPSSetOperators(eps_l, J, PETSC_NULL); // calculate smallest and largest eigen value EPSSetWhichEigenpairs(eps_s, EPS_SMALLEST_MAGNITUDE); EPSSetWhichEigenpairs(eps_l, EPS_LARGEST_MAGNITUDE); // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero ST st_s; EPSGetST(eps_s, &st_s); STSetType(st_s, STSINVERT); // Set solver parameters at runtime EPSSetFromOptions(eps_s); EPSSetFromOptions(eps_l); /////////////////////////////////////////////////////////////////////////////// // this part is called after jacobian matrix assemly PetscScalar kr_s, ki_s; PetscScalar kr_l, ki_l; PetscReal error_s; PetscReal error_l; PetscInt nconv_s; PetscInt nconv_l; // get the smallest eigen value EPSSolve( eps_s ); EPSGetConverged( eps_s, &nconv_s ); if( nconv_s > 0 ) { EPSGetEigenvalue( eps_s, 0, &kr_s, &ki_s ); EPSComputeRelativeError( eps_s, 0, &error_s ); } // get the largest eigen value EPSSolve( eps_l ); EPSGetConverged( eps_l, &nconv_l ); if( nconv_l > 0 ) { EPSGetEigenvalue( eps_l, 0, &kr_l, &ki_l ); EPSComputeRelativeError( eps_l, 0, &error_l ); } From gdiso at ustc.edu Mon Apr 4 22:16:16 2011 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 5 Apr 2011 11:16:16 +0800 (CST) Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix Message-ID: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> Hi, I'd like to scaling the jacobian matrix as if the condition number can be improved. That is scaling J by Dl*J*Dr. The scaling diagonal matrix will be changed in each nonlinear iteration. Does SNES already exist some interface to do this? From bsmith at mcs.anl.gov Mon Apr 4 22:23:23 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 22:23:23 -0500 Subject: [petsc-users] Modification of a routine In-Reply-To: References: Message-ID: <39419F71-C854-4023-A93F-D9F60B7E0A5E@mcs.anl.gov> Use the debugger and step through the code to verify it is getting to those lines. Barry On Apr 4, 2011, at 9:54 PM, Gaurish Telang wrote: > Hi, > > I have tried to implement a recent least squares algorithm called LSMR, by making modifications in the file src/ksp/ksp/impls/lsqr/lsqr.c > > Is it necessary to make any changes in other PETSc files or build the PETSc library again? > > I solved a simple least squares problem by supplying the flags -ksp_type lsqr -pc_type none and the problem seems to get solved correctly. > > However, I had placed a couple of PetscPrintf statements inside the main do-while loop of the algorithm in lsqr.c but those statements are not getting printed to standard output. > > Thanks, > > Gaurish. From bsmith at mcs.anl.gov Mon Apr 4 22:25:48 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 22:25:48 -0500 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> Message-ID: On Apr 4, 2011, at 10:16 PM, Gong Ding wrote: > Hi, > I'd like to scaling the jacobian matrix as if the condition number can be improved. > That is scaling J by Dl*J*Dr. 
The scaling diagonal matrix will be changed in each nonlinear iteration. > Would this be in addition to building a preconditioner from the resulting scaled matrix? Or do you just want to use a symmetric Jacobi preconditioner? Barry > Does SNES already exist some interface to do this? > > > > > From bsmith at mcs.anl.gov Mon Apr 4 22:52:47 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 22:52:47 -0500 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> Message-ID: <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> If you are looking for something like this: When solving F(x) = 0, I would like to be able to scale both the solution vector x and the residual function vector F, simply by specifying scaling vectors, sx and sf, say. (These vectors would be the diagonal entries of scaling matrices Dx and Df.) I realize this can be achieved, at least in part, within the user residual function. This is what I had been doing, until I looked at Denis and Schnabel (sp?), Brown and Saad, and the KINSOL user guide. It seems one has to take the scaling matrices into account when computing various norms, when applying the preconditioner, and when computing the step size, \sigma. No doubt there are other things I have missed that also need to be done. http://www.mcs.anl.gov/petsc/petsc-as/developers/projects.html we don't have support for this (nor do I understand it). Anyways it has been on the "projects to do list" for a very long time; suspect it would require a good amount of futzing around in the source code to add. Barry On Apr 4, 2011, at 10:16 PM, Gong Ding wrote: > Hi, > I'd like to scaling the jacobian matrix as if the condition number can be improved. > That is scaling J by Dl*J*Dr. The scaling diagonal matrix will be changed in each nonlinear iteration. > > Does SNES already exist some interface to do this? > > > > > From bsmith at mcs.anl.gov Mon Apr 4 23:10:20 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Apr 2011 23:10:20 -0500 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> Message-ID: The literature is unclear to me, but I don't think these scalings are done in this way to improve the conditioning of the matrix. They are done to change the relative importance of different entries in the vector to determine stopping conditions and search directions in Newton's method. For example, if you consider getting the first vector entry in the residual/error small more important than the other entries you would use the scaling vector like [bignumber 1 1 1 1 ....]. In some way the scaling vectors reflect working with a different norm to measure the residual. Since PETSc does not support providing these scaling vectors you can get the same effect if you define your a new function (and hence also new Jacobian) that weights the various entries the way you want based on their importance. In other words newF(x) = diagonalscaling1* oldF( diagonalscaling2 * y) then if x* is the solution to the new problem, y* = inv(diagonalscaling2*x*) is the solution to the original problem. 
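A rough sketch of this rescaling in code, assuming the two diagonal scalings are stored as Vecs; the names ScaleCtx, Dl, Dr, xwork, and oldF are hypothetical, not part of PETSc:

    #include <petscsnes.h>

    /* hypothetical context: left/right diagonal scalings, a work vector,
       and the user's original residual routine */
    typedef struct {
      Vec  Dl, Dr, xwork;
      PetscErrorCode (*oldF)(SNES,Vec,Vec,void*);
      void *userctx;
    } ScaleCtx;

    static PetscErrorCode NewF(SNES snes, Vec y, Vec f, void *ctx)
    {
      ScaleCtx       *s = (ScaleCtx*)ctx;
      PetscErrorCode ierr;

      PetscFunctionBegin;
      ierr = VecPointwiseMult(s->xwork, s->Dr, y);CHKERRQ(ierr);        /* x = Dr*y          */
      ierr = (*s->oldF)(snes, s->xwork, f, s->userctx);CHKERRQ(ierr);   /* f = oldF(x)       */
      ierr = VecPointwiseMult(f, s->Dl, f);CHKERRQ(ierr);               /* f = Dl*oldF(Dr*y) */
      PetscFunctionReturn(0);
    }

Registering NewF with SNESSetFunction() makes SNES solve the scaled problem; the unknowns of the original problem are then recovered by multiplying the converged y pointwise by Dr.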
In some cases this transformation can correspond to working in "dimensionless coordinates" but all that language is over my head. If you just want to scale the matrix to have ones on the diagonal before forming the preconditioner (on the theory that it is better to solve problems with a "well-scaled" matrix) you can use the run time options -ksp_diagonal_scale -ksp_diagonal_scale_fix or in the code with KSPSetDiagonalScale() KSPSetDiagonalScaleFix(). Barry On Apr 4, 2011, at 10:52 PM, Barry Smith wrote: > > If you are looking for something like this: > > When solving F(x) = 0, I would like to be able to scale both the solution > vector x and the residual function vector F, simply by specifying scaling > vectors, sx and sf, say. (These vectors would be the diagonal entries of > scaling matrices Dx and Df.) > I realize this can be achieved, at least in part, within the user residual > function. > This is what I had been doing, until I looked at Denis and Schnabel (sp?), > Brown and Saad, and the KINSOL user guide. It seems one has to take the > scaling matrices into account when computing various norms, when applying the > preconditioner, and when computing the step size, \sigma. No doubt there > are other things I have missed that also need to be done. > > http://www.mcs.anl.gov/petsc/petsc-as/developers/projects.html > > we don't have support for this (nor do I understand it). Anyways it has been on the "projects to do list" for a very long time; suspect it would require a good amount of futzing around in the source code to add. > > Barry > > > On Apr 4, 2011, at 10:16 PM, Gong Ding wrote: > >> Hi, >> I'd like to scaling the jacobian matrix as if the condition number can be improved. >> That is scaling J by Dl*J*Dr. The scaling diagonal matrix will be changed in each nonlinear iteration. >> >> Does SNES already exist some interface to do this? >> >> >> >> >> > From jed at 59A2.org Tue Apr 5 01:24:37 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 5 Apr 2011 08:24:37 +0200 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> Message-ID: On Tue, Apr 5, 2011 at 06:10, Barry Smith wrote: > They are done to change the relative importance of different entries in the > vector to determine stopping conditions and search directions in Newton's > method. Note that weighting fields differently is equivalent to choosing units for the different fields. I think it is generally a good idea to make the units a runtime option anyway since it lets you check that the code is dimensionally correct. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Apr 5 04:53:17 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 5 Apr 2011 11:53:17 +0200 Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> References: <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> Message-ID: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> El 05/04/2011, a las 05:07, Gong Ding escribi?: > Hi, > I use slepc eigen value solver to evaluate the eigen values of jacobian matrix on each nonlinear iteration. > > First, I create the EPS structure and set the operator to jacobian matrix. And then call the EPSSolve in the SNES build > jacobian matrix functuion. 
This method create EPS once, call EPSSolve many times. However, it seems EPSSolve only work at > first time, and result eigen value never changes (however, relative error becomes larger and larger) in the following solve procedure. > > Then I create EPS each time in the SNES build jacobian matrix functuion, do EPSSolve and delete EPS at the end of function. > This method gives eigen value for the jacobian matrix with small relative error ~1e-8. Of course, create and destroy the EPS solver each time > is not efficient. > > Does something get wrong in the first method? You will always solve the same eigenproblem, unless you call EPSSetOperators every time the matrix changes. When EPSSetOperators is called, the EPS object is reset and therefore EPSSetUp will be called in the next EPSSolve. So basically the first approach will have the same cost as the second approach. By the way, you are not using STSINVERT correctly. You should set EPS_TARGET_MAGNITUDE (instead of EPS_SMALLEST_MAGNITUDE) together with target=0.0 with EPSSetTarget. Jose > > The code I used is attached here > > // create the EPS solver for smallest and largest eigen value > EPS eps_s; > EPS eps_l; > > FVM_NonlinearSolver & nonlinear_solver = dynamic_cast(_solver); > Mat & J = nonlinear_solver.jacobian_matrix(); > > // create eigen value problem solver > EPSCreate(PETSC_COMM_WORLD, &eps_s); > EPSCreate(PETSC_COMM_WORLD, &eps_l); > // Set operator > EPSSetOperators(eps_s, J, PETSC_NULL); > EPSSetOperators(eps_l, J, PETSC_NULL); > > // calculate smallest and largest eigen value > EPSSetWhichEigenpairs(eps_s, EPS_SMALLEST_MAGNITUDE); > EPSSetWhichEigenpairs(eps_l, EPS_LARGEST_MAGNITUDE); > > // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero > ST st_s; > EPSGetST(eps_s, &st_s); > STSetType(st_s, STSINVERT); > > > // Set solver parameters at runtime > EPSSetFromOptions(eps_s); > EPSSetFromOptions(eps_l); > > > /////////////////////////////////////////////////////////////////////////////// > > // this part is called after jacobian matrix assemly > > PetscScalar kr_s, ki_s; > PetscScalar kr_l, ki_l; > PetscReal error_s; > PetscReal error_l; > PetscInt nconv_s; > PetscInt nconv_l; > > // get the smallest eigen value > EPSSolve( eps_s ); > EPSGetConverged( eps_s, &nconv_s ); > if( nconv_s > 0 ) > { > EPSGetEigenvalue( eps_s, 0, &kr_s, &ki_s ); > EPSComputeRelativeError( eps_s, 0, &error_s ); > } > > // get the largest eigen value > EPSSolve( eps_l ); > EPSGetConverged( eps_l, &nconv_l ); > if( nconv_l > 0 ) > { > EPSGetEigenvalue( eps_l, 0, &kr_l, &ki_l ); > EPSComputeRelativeError( eps_l, 0, &error_l ); > } > > From gdiso at ustc.edu Tue Apr 5 10:44:03 2011 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 5 Apr 2011 23:44:03 +0800 (CST) Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> References: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> Message-ID: <33536591.37741302018243123.JavaMail.coremail@mail.ustc.edu> Thank you very much. > El 05/04/2011, a las 05:07, Gong Ding escribi?: > > > > > Hi, > > > I use slepc eigen value solver to evaluate the eigen values of jacobian matrix on each nonlinear iteration. > > > > > > First, I create the EPS structure and set the operator to jacobian matrix. And then call the EPSSolve in the SNES build > > > jacobian matrix functuion. This method create EPS once, call EPSSolve many times. 
However, it seems EPSSolve only work at > > > first time, and result eigen value never changes (however, relative error becomes larger and larger) in the following solve procedure. > > > > > > Then I create EPS each time in the SNES build jacobian matrix functuion, do EPSSolve and delete EPS at the end of function. > > > This method gives eigen value for the jacobian matrix with small relative error ~1e-8. Of course, create and destroy the EPS solver each time > > > is not efficient. > > > > > > Does something get wrong in the first method? > > > > You will always solve the same eigenproblem, unless you call EPSSetOperators every time the matrix changes. > > When EPSSetOperators is called, the EPS object is reset and therefore EPSSetUp will be called in the next EPSSolve. So basically the first approach will have the same cost as the second approach. > > > > By the way, you are not using STSINVERT correctly. You should set EPS_TARGET_MAGNITUDE (instead of EPS_SMALLEST_MAGNITUDE) together with target=0.0 with EPSSetTarget. > > > > Jose > > > > > > > > The code I used is attached here > > > > > > // create the EPS solver for smallest and largest eigen value > > > EPS eps_s; > > > EPS eps_l; > > > > > > FVM_NonlinearSolver & nonlinear_solver = dynamic_cast(_solver); > > > Mat & J = nonlinear_solver.jacobian_matrix(); > > > > > > // create eigen value problem solver > > > EPSCreate(PETSC_COMM_WORLD, &eps_s); > > > EPSCreate(PETSC_COMM_WORLD, &eps_l); > > > // Set operator > > > EPSSetOperators(eps_s, J, PETSC_NULL); > > > EPSSetOperators(eps_l, J, PETSC_NULL); > > > > > > // calculate smallest and largest eigen value > > > EPSSetWhichEigenpairs(eps_s, EPS_SMALLEST_MAGNITUDE); > > > EPSSetWhichEigenpairs(eps_l, EPS_LARGEST_MAGNITUDE); > > > > > > // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero > > > ST st_s; > > > EPSGetST(eps_s, &st_s); > > > STSetType(st_s, STSINVERT); > > > > > > > > > // Set solver parameters at runtime > > > EPSSetFromOptions(eps_s); > > > EPSSetFromOptions(eps_l); > > > > > > > > > /////////////////////////////////////////////////////////////////////////////// > > > > > > // this part is called after jacobian matrix assemly > > > > > > PetscScalar kr_s, ki_s; > > > PetscScalar kr_l, ki_l; > > > PetscReal error_s; > > > PetscReal error_l; > > > PetscInt nconv_s; > > > PetscInt nconv_l; > > > > > > // get the smallest eigen value > > > EPSSolve( eps_s ); > > > EPSGetConverged( eps_s, &nconv_s ); > > > if( nconv_s > 0 ) > > > { > > > EPSGetEigenvalue( eps_s, 0, &kr_s, &ki_s ); > > > EPSComputeRelativeError( eps_s, 0, &error_s ); > > > } > > > > > > // get the largest eigen value > > > EPSSolve( eps_l ); > > > EPSGetConverged( eps_l, &nconv_l ); > > > if( nconv_l > 0 ) > > > { > > > EPSGetEigenvalue( eps_l, 0, &kr_l, &ki_l ); > > > EPSComputeRelativeError( eps_l, 0, &error_l ); > > > } > > > > > > > > > > From gdiso at ustc.edu Tue Apr 5 10:58:29 2011 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 5 Apr 2011 23:58:29 +0800 (CST) Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> Message-ID: <13664069.37751302019109451.JavaMail.coremail@mail.ustc.edu> I known the diagonal scaling. And I will try it tomorrow. Thank to slepc, I can monitor the eigen values as an approximation of condition number. 
The original problem as condition number about 1e20, which defeat any iterative solver. I hope I can reduce it as much as possible. Further more, can I use MC64, which permute and scale a sparse unsymmetric matrix to put large entries on the diagonal? > The literature is unclear to me, but I don't think these scalings are done in this way to improve the conditioning of the matrix. They are done to change the relative importance of different entries in the vector to determine stopping conditions and search directions in Newton's method. For example, if you consider getting the first vector entry in the residual/error small more important than the other entries you would use the scaling vector like [bignumber 1 1 1 1 ....]. In some way the scaling vectors reflect working with a different norm to measure the residual. Since PETSc does not support providing these scaling vectors you can get the same effect if you define your a new function (and hence also new Jacobian) that weights the various entries the way you want based on their importance. In other words newF(x) = diagonalscaling1* oldF( diagonalscaling2 * y) then if x* is the solution to the new problem, y* = inv(diagonalscaling2*x*) is the solution to the original problem. In some cases this transformation can correspond to working in "dimensionless coordinates" but all that language is over my head. > > > > > > If you just want to scale the matrix to have ones on the diagonal before forming the preconditioner (on the theory that it is better to solve problems with a "well-scaled" matrix) you can use the run time options -ksp_diagonal_scale -ksp_diagonal_scale_fix or in the code with KSPSetDiagonalScale() KSPSetDiagonalScaleFix(). > > > > Barry > > > > On Apr 4, 2011, at 10:52 PM, Barry Smith wrote: > > > > > > > > If you are looking for something like this: > > > > > > When solving F(x) = 0, I would like to be able to scale both the solution > > > vector x and the residual function vector F, simply by specifying scaling > > > vectors, sx and sf, say. (These vectors would be the diagonal entries of > > > scaling matrices Dx and Df.) > > > I realize this can be achieved, at least in part, within the user residual > > > function. > > > This is what I had been doing, until I looked at Denis and Schnabel (sp?), > > > Brown and Saad, and the KINSOL user guide. It seems one has to take the > > > scaling matrices into account when computing various norms, when applying the > > > preconditioner, and when computing the step size, \sigma. No doubt there > > > are other things I have missed that also need to be done. > > > > > > http://www.mcs.anl.gov/petsc/petsc-as/developers/projects.html > > > > > > we don't have support for this (nor do I understand it). Anyways it has been on the "projects to do list" for a very long time; suspect it would require a good amount of futzing around in the source code to add. > > > > > > Barry > > > > > > > > > On Apr 4, 2011, at 10:16 PM, Gong Ding wrote: > > > > > >> Hi, > > >> I'd like to scaling the jacobian matrix as if the condition number can be improved. > > >> That is scaling J by Dl*J*Dr. The scaling diagonal matrix will be changed in each nonlinear iteration. > > >> > > >> Does SNES already exist some interface to do this? 
> > >> > > >> > > >> > > >> > > >> > > > > > > > From u.tabak at tudelft.nl Tue Apr 5 11:14:02 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Tue, 05 Apr 2011 18:14:02 +0200 Subject: [petsc-users] Is there a way to do row/column scaling of jacobian matrix In-Reply-To: <13664069.37751302019109451.JavaMail.coremail@mail.ustc.edu> References: <18393056.37011301973376995.JavaMail.coremail@mail.ustc.edu> <92ED67A8-57A1-4FF4-8F98-07C816052390@mcs.anl.gov> <13664069.37751302019109451.JavaMail.coremail@mail.ustc.edu> Message-ID: <4D9B3FCA.1020600@tudelft.nl> On 04/05/2011 05:58 PM, Gong Ding wrote: > I known the diagonal scaling. And I will try it tomorrow. > Thank to slepc, I can monitor the eigen values as an approximation of condition number. > The original problem as condition number about 1e20, which defeat any iterative solver. > From personal experience, condition estimates larger than 1e+10 means practically singular and it is almost impossible to really decrease that to a reasonable number unless you use some specific system information which is really really difficult... And again from personal experience, diagonal scaling is the most naive scaling out there(however the first to try if you do now know sth better), and it does not bring much on these kinds of ill-conditioned systems. Trying to reformulate the problem seems like a better option to me. Experts will comment on the above propositions ;) Good luck. U. -- If I have a thousand ideas and only one turns out to be good, I am satisfied. Alfred Nobel From zhaonanavril at gmail.com Tue Apr 5 11:40:36 2011 From: zhaonanavril at gmail.com (NAN ZHAO) Date: Tue, 5 Apr 2011 10:40:36 -0600 Subject: [petsc-users] sequential version of petsc (no mpi) Message-ID: Dear all, I need to build a no mpi version of petsc for some reason. I use the option --with-mpi=0. The build seems to be successful. But when I compile my code with petsc, then it has some errors related to undefined reference to MPI_ABORT, MPI_NUITMP... I just tired to use KSP solver. Is anyone have some suggestions? Thanks, Nan -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Apr 5 11:45:12 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 5 Apr 2011 11:45:12 -0500 (CDT) Subject: [petsc-users] sequential version of petsc (no mpi) In-Reply-To: References: Message-ID: send all build logs [configure.log,make.log,test.log] to petsc-maint at mcs.anl.gov satish On Tue, 5 Apr 2011, NAN ZHAO wrote: > Dear all, > > I need to build a no mpi version of petsc for some reason. I use the option > --with-mpi=0. The build seems to be successful. But when I compile my code > with petsc, then it has some errors related to undefined reference to > MPI_ABORT, MPI_NUITMP... > I just tired to use KSP solver. Is anyone have some suggestions? > > Thanks, > Nan > From panourg at mech.upatras.gr Tue Apr 5 15:17:40 2011 From: panourg at mech.upatras.gr (panourg at mech.upatras.gr) Date: Tue, 5 Apr 2011 23:17:40 +0300 (EEST) Subject: [petsc-users] sequential version of petsc (no mpi) In-Reply-To: References: Message-ID: <2324.94.64.236.148.1302034660.squirrel@mail.mech.upatras.gr> You can run petsc in one process regardeless of mpi setup in your pc. I believe that some routines or other packages of petsc need mpi and therefore you take these errors. Do setup of mpi and make your code as before. K.P > Dear all, > > I need to build a no mpi version of petsc for some reason. I use the > option > --with-mpi=0. 
The build seems to be successful. But when I compile my code > with petsc, then it has some errors related to undefined reference to > MPI_ABORT, MPI_NUITMP... > I just tired to use KSP solver. Is anyone have some suggestions? > > Thanks, > Nan > From vyan2000 at gmail.com Tue Apr 5 21:52:53 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 5 Apr 2011 22:52:53 -0400 Subject: [petsc-users] about pclu Message-ID: Hi, I am wondering is there a way of checking the residual of a direct solver. It should be one shot and very small. I tried -ksp_monitor_true_residual, but no thing shows up. I guess a piece of code $Ax-b$ will do the trick? Thanks, Yan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Apr 5 21:59:23 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Apr 2011 21:59:23 -0500 Subject: [petsc-users] about pclu In-Reply-To: References: Message-ID: On Tue, Apr 5, 2011 at 9:52 PM, Ryan Yan wrote: > Hi, > I am wondering is there a way of checking the residual of a direct solver. > It should > be one shot and very small. I tried -ksp_monitor_true_residual, but no > thing shows up. I guess a piece of code > $Ax-b$ will do the trick? > If you use it through KSPSolve, then -ksp_monitor will give you the residual. Matt > Thanks, > > Yan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Apr 5 22:06:41 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Apr 2011 22:06:41 -0500 Subject: [petsc-users] about pclu In-Reply-To: References: Message-ID: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: > Hi, > I am wondering is there a way of checking the residual of a direct solver. It should > be one shot and very small. I tried -ksp_monitor_true_residual, but no thing shows up. I guess a piece of code > $Ax-b$ will do the trick? The reason that the monitor doesn't display anything is not the direct solver but because you are using LU with KSPType of KSPPREONLY if you run with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual will print the residual as you want. Barry > > Thanks, > > Yan From vyan2000 at gmail.com Tue Apr 5 22:16:05 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 5 Apr 2011 23:16:05 -0400 Subject: [petsc-users] about pclu In-Reply-To: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: Dear Barry and Matt, Thanks for the help, Indeed, the monitor starts to work with "richardson". Yan On Tue, Apr 5, 2011 at 11:06 PM, Barry Smith wrote: > > On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: > > > Hi, > > I am wondering is there a way of checking the residual of a direct > solver. It should > > be one shot and very small. I tried -ksp_monitor_true_residual, but no > thing shows up. I guess a piece of code > > $Ax-b$ will do the trick? > > The reason that the monitor doesn't display anything is not the direct > solver but because you are using LU with KSPType of KSPPREONLY if you run > with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual > will print the residual as you want. > > Barry > > > > > Thanks, > > > > Yan > > -------------- next part -------------- An HTML attachment was scrubbed... 
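For completeness, a small sketch of the explicit check Yan mentions above (forming b - A*x after the solve); ksp, A, b, and x are assumed to exist already:

    Vec       r;
    PetscReal rnorm;

    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
    ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
    ierr = MatMult(A, x, r);CHKERRQ(ierr);       /* r = A*x     */
    ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);    /* r = b - A*x */
    ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "||b - A x|| = %g\n", (double)rnorm);CHKERRQ(ierr);
    ierr = VecDestroy(r);CHKERRQ(ierr);          /* petsc-3.1 calling sequence */

For an LU solve this norm should be tiny (roughly machine precision amplified by the conditioning of A), which is the same number -ksp_monitor_true_residual reports once richardson or gmres is used as the wrapper.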
URL: From vyan2000 at gmail.com Tue Apr 5 22:20:26 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 5 Apr 2011 23:20:26 -0400 Subject: [petsc-users] about pclu In-Reply-To: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: One more ask, :-) Which one is more efficient, richardson, preonly or no difference, if I am going to use direct solver for many times steps. Thanks, Yan On Tue, Apr 5, 2011 at 11:06 PM, Barry Smith wrote: > > On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: > > > Hi, > > I am wondering is there a way of checking the residual of a direct > solver. It should > > be one shot and very small. I tried -ksp_monitor_true_residual, but no > thing shows up. I guess a piece of code > > $Ax-b$ will do the trick? > > The reason that the monitor doesn't display anything is not the direct > solver but because you are using LU with KSPType of KSPPREONLY if you run > with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual > will print the residual as you want. > > Barry > > > > > Thanks, > > > > Yan > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Tue Apr 5 22:21:21 2011 From: gdiso at ustc.edu (Gong Ding) Date: Wed, 6 Apr 2011 11:21:21 +0800 (CST) Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> References: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> Message-ID: <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> Dear Jose, will you please also take a look at the SVD code for smallest singular value? It seems work except the time consuming SVDCyclicSetExplicitMatrix routine. However, I wonder if there exist some more clever method. 
// SVD solver for smallest singular value SVD svd_s; EPS eps_s; ST st_s; KSP ksp_s; PC pc_s; PetscErrorCode ierr; // Create singular value solver context ierr = SVDCreate(PETSC_COMM_WORLD, &svd_s); // Set operator ierr = SVDSetOperator(svd_s, J); // small singular value use eigen value solver on Cyclic Matrix ierr = SVDSetWhichSingularTriplets(svd_s, SVD_SMALLEST); ierr = SVDSetType(svd_s, SVDCYCLIC); ierr = SVDCyclicSetExplicitMatrix(svd_s, PETSC_TRUE); // <-----time consuming // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero ierr = SVDCyclicGetEPS(svd_s, &eps_s); ierr = EPSSetType(eps_s, EPSKRYLOVSCHUR); ierr = EPSGetST(eps_s, &st_s); ierr = STSetType(st_s, STSINVERT); ierr = STGetKSP(st_s, &ksp_s); ierr = KSPGetPC(ksp_s, &pc_s); // since we have to deal with bad conditioned problem, we choose direct solver whenever possible // direct solver as preconditioner ierr = KSPSetType (ksp_s, (char*) KSPGMRES); assert(!ierr); // superlu which use static pivot seems very stable ierr = PCSetType (pc_s, (char*) PCLU); assert(!ierr); ierr = PCFactorSetMatSolverPackage (pc_s, "superlu"); assert(!ierr); // Set solver parameters at runtime ierr = SVDSetFromOptions(svd_s); assert(!ierr); ierr = SVDSetUp(svd_s); assert(!ierr); PetscReal sigma_large=1, sigma_small=1; PetscInt nconv; PetscReal error; // find the smallest singular value SVDSolve(svd_s); From knepley at gmail.com Tue Apr 5 22:27:03 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Apr 2011 22:27:03 -0500 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Tue, Apr 5, 2011 at 10:20 PM, Ryan Yan wrote: > One more ask, :-) > > Which one is more efficient, richardson, preonly or no difference, if I am > going to use direct solver for many times steps. > There should be no difference since direct solves take so long. Matt > Thanks, > > Yan > > On Tue, Apr 5, 2011 at 11:06 PM, Barry Smith wrote: > >> >> On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: >> >> > Hi, >> > I am wondering is there a way of checking the residual of a direct >> solver. It should >> > be one shot and very small. I tried -ksp_monitor_true_residual, but no >> thing shows up. I guess a piece of code >> > $Ax-b$ will do the trick? >> >> The reason that the monitor doesn't display anything is not the direct >> solver but because you are using LU with KSPType of KSPPREONLY if you run >> with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual >> will print the residual as you want. >> >> Barry >> >> > >> > Thanks, >> > >> > Yan >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Tue Apr 5 22:34:05 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Tue, 5 Apr 2011 23:34:05 -0400 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: I see. Thanks, Matt. Yan On Tue, Apr 5, 2011 at 11:27 PM, Matthew Knepley wrote: > On Tue, Apr 5, 2011 at 10:20 PM, Ryan Yan wrote: > >> One more ask, :-) >> >> Which one is more efficient, richardson, preonly or no difference, if I am >> going to use direct solver for many times steps. >> > > There should be no difference since direct solves take so long. 
> > Matt > > >> Thanks, >> >> Yan >> >> On Tue, Apr 5, 2011 at 11:06 PM, Barry Smith wrote: >> >>> >>> On Apr 5, 2011, at 9:52 PM, Ryan Yan wrote: >>> >>> > Hi, >>> > I am wondering is there a way of checking the residual of a direct >>> solver. It should >>> > be one shot and very small. I tried -ksp_monitor_true_residual, but no >>> thing shows up. I guess a piece of code >>> > $Ax-b$ will do the trick? >>> >>> The reason that the monitor doesn't display anything is not the direct >>> solver but because you are using LU with KSPType of KSPPREONLY if you run >>> with -ksp_type richardson or -ksp_type gmres then -ksp_monitor_true_residual >>> will print the residual as you want. >>> >>> Barry >>> >>> > >>> > Thanks, >>> > >>> > Yan >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 6 00:43:51 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 6 Apr 2011 07:43:51 +0200 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Wed, Apr 6, 2011 at 05:27, Matthew Knepley wrote: > There should be no difference since direct solves take so long. That is, solves are very fast compared to factorization. If you just want to check the residual, Richardson is cheaper than GMRES because it will require one fewer preconditioner/matrix applications. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Wed Apr 6 09:24:50 2011 From: gdiso at ustc.edu (Gong Ding) Date: Wed, 6 Apr 2011 22:24:50 +0800 (CST) Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? Message-ID: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> Hi, Can some one gives me advise on how to solve the ill conditioned problem efficiently with iterative method (since the problem size is big). I calculated the smallest eigen values as well as the largest eigen values. There exist one extremely small eigen value, which made the system ill conditioned. I guess method such as Tikhonov regularization may work? Or there are some cheaper method works, if I can endure some inaccuracy in the solution. Smallest 0 eigen value: -2.112144e-15 with error 9.452618e-14 Smallest 1 eigen value: -2.480170e-04 with error 6.150216e-04 Smallest 2 eigen value: -2.787193e-04 with error 2.808614e-04 Smallest 3 eigen value: -2.825241e-04 with error 6.620491e-04 Smallest 4 eigen value: -2.825241e-04 with error 6.620491e-04 Smallest 5 eigen value: -2.833565e-04 with error 2.990142e-04 Smallest 6 eigen value: -3.020135e-04 with error 6.313397e-04 Smallest 7 eigen value: -3.020149e-04 with error 4.939515e-04 Smallest 8 eigen value: -3.083228e-04 with error 1.114806e-03 Largest 0 eigen value: -4.076308e+03 with error 2.403326e-08 Largest 1 eigen value: -3.894209e+03 with error 6.314489e-08 Largest 2 eigen value: -3.893185e+03 with error 3.924167e-08 Largest 3 eigen value: -3.855228e+03 with error 3.504644e-09 Largest 4 eigen value: -3.739288e+03 with error 1.689236e-08 Thanks. From jed at 59A2.org Wed Apr 6 09:32:34 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 06 Apr 2011 17:32:34 +0300 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? 
In-Reply-To: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> Message-ID: <87k4f730xp.fsf@59A2.org> On Wed, 6 Apr 2011 22:24:50 +0800 (CST), "Gong Ding" wrote: > Hi, > Can some one gives me advise on how to solve the ill conditioned problem > efficiently with iterative method (since the problem size is big). > > I calculated the smallest eigen values as well as the largest eigen values. > There exist one extremely small eigen value, which made the system ill conditioned. > I guess method such as Tikhonov regularization may work? > Or there are some cheaper method works, if I can endure some inaccuracy in the solution. > > > Smallest 0 eigen value: -2.112144e-15 with error 9.452618e-14 Your problem has a null space of dimension 1. Determine the eigenvector associated with this eigenvalue. That is the null space, it might just be a constant. Create a MatNullSpace and use KSPSetNullSpace(). (If it is the constant, you can just use -ksp_constant_null_space.) See the section in the users manual on solving singular systems. From u.tabak at tudelft.nl Wed Apr 6 09:37:26 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Wed, 06 Apr 2011 16:37:26 +0200 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <87k4f730xp.fsf@59A2.org> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> Message-ID: <4D9C7AA6.301@tudelft.nl> On 04/06/2011 04:32 PM, Jed Brown wrote: > On Wed, 6 Apr 2011 22:24:50 +0800 (CST), "Gong Ding" wrote: > >> Hi, >> Can some one gives me advise on how to solve the ill conditioned problem >> efficiently with iterative method (since the problem size is big). >> >> I calculated the smallest eigen values as well as the largest eigen values. >> There exist one extremely small eigen value, which made the system ill conditioned. >> I guess method such as Tikhonov regularization may work? >> Or there are some cheaper method works, if I can endure some inaccuracy in the solution. >> >> >> Smallest 0 eigen value: -2.112144e-15 with error 9.452618e-14 >> > Your problem has a null space of dimension 1. Determine the eigenvector associated with this eigenvalue. That is the null space, it might just be a constant. Create a MatNullSpace and use KSPSetNullSpace(). (If it is the constant, you can just use -ksp_constant_null_space.) See the section in the users manual on solving singular systems. > Just curious, are not the other negative eigenvalues problematic as well? From bsmith at mcs.anl.gov Wed Apr 6 09:48:01 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Apr 2011 09:48:01 -0500 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <4D9C7AA6.301@tudelft.nl> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <4D9C7AA6.301@tudelft.nl> Message-ID: On Apr 6, 2011, at 9:37 AM, Umut Tabak wrote: > On 04/06/2011 04:32 PM, Jed Brown wrote: >> On Wed, 6 Apr 2011 22:24:50 +0800 (CST), "Gong Ding" wrote: >> >>> Hi, >>> Can some one gives me advise on how to solve the ill conditioned problem >>> efficiently with iterative method (since the problem size is big). >>> >>> I calculated the smallest eigen values as well as the largest eigen values. >>> There exist one extremely small eigen value, which made the system ill conditioned. >>> I guess method such as Tikhonov regularization may work? 
>>> Or there are some cheaper method works, if I can endure some inaccuracy in the solution. >>> >>> >>> Smallest 0 eigen value: -2.112144e-15 with error 9.452618e-14 >>> >> Your problem has a null space of dimension 1. Determine the eigenvector associated with this eigenvalue. That is the null space, it might just be a constant. Create a MatNullSpace and use KSPSetNullSpace(). (If it is the constant, you can just use -ksp_constant_null_space.) See the section in the users manual on solving singular systems. >> > Just curious, are not the other negative eigenvalues problematic as well? They are not nice that they are not necessarily the end of the world (like the functionally zero one). After removing the functionally zero eigenvalue the condition number of the matrix is around 10^7 which is very large but within the realm of solvable. With that functionally zero one the problem is simply not solvable. Barry > From jed at 59A2.org Wed Apr 6 09:50:17 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 06 Apr 2011 17:50:17 +0300 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <4D9C7AA6.301@tudelft.nl> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <4D9C7AA6.301@tudelft.nl> Message-ID: <87fwpv3046.fsf@59A2.org> On Wed, 06 Apr 2011 16:37:26 +0200, Umut Tabak wrote: > Just curious, are not the other negative eigenvalues problematic > as well? Negative eigenvalues do not pose any particular problem to Krylov methods like GMRES. Conjugate gradients does require that the matrix be SPD, but petsc-dev detects when a matrix is negative definite and still does the right thing. With petsc-3.1, you could simply change the sign of everything. (I prefer to build to formulate my equations with positive matrices when possible, but those other negative eigenvalues are not the problem here.) From u.tabak at tudelft.nl Wed Apr 6 10:24:23 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Wed, 06 Apr 2011 17:24:23 +0200 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <87fwpv3046.fsf@59A2.org> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <4D9C7AA6.301@tudelft.nl> <87fwpv3046.fsf@59A2.org> Message-ID: <4D9C85A7.1030405@tudelft.nl> On 04/06/2011 04:50 PM, Jed Brown wrote: > On Wed, 06 Apr 2011 16:37:26 +0200, Umut Tabak > wrote: >> Just curious, are not the other negative eigenvalues problematic as >> well? > > Negative eigenvalues do not pose any particular problem to Krylov > methods like GMRES. Conjugate gradients does require that the matrix > be SPD, but petsc-dev detects when a matrix is negative definite and > still does the right thing. Also with cg type methods? if yes, how? Because I am dealing with a similar problem in a projection sense which makes some factors that are already available very good preconditioners, completely problem specific, then cg converges incredibly fast, sth like 4 to 8 iterations. However, projection is the key and at every step, in cg, I should make sure that the search directions in cg are orthogonal to the previous ones by cgs/mgs, otherwise I bump into the well know orthogonality issues of Lanczos type methods... why I am digging is to see some better options if there are any. 
Greetz, Umut From jed at 59A2.org Wed Apr 6 10:30:20 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 06 Apr 2011 18:30:20 +0300 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <4D9C8180.5030901@tudelft.nl> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <4D9C7AA6.301@tudelft.nl> <87fwpv3046.fsf@59A2.org> <4D9C8180.5030901@tudelft.nl> Message-ID: <87d3kz2y9f.fsf@59A2.org> On Wed, 06 Apr 2011 17:06:40 +0200, Umut Tabak wrote: > On 04/06/2011 04:50 PM, Jed Brown wrote: > > On Wed, 06 Apr 2011 16:37:26 +0200, Umut Tabak > > wrote: > >> Just curious, are not the other negative eigenvalues problematic as > >> well? > > > > Negative eigenvalues do not pose any particular problem to Krylov > > methods like GMRES. Conjugate gradients does require that the matrix > > be SPD, but petsc-dev detects when a matrix is negative definite and > > still does the right thing. > > Also with cg type methods? if yes, how? http://petsc.cs.iit.edu/petsc/petsc-dev/rev/cae94ca39fcb From jroman at dsic.upv.es Wed Apr 6 13:33:27 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 6 Apr 2011 20:33:27 +0200 Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> References: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> Message-ID: El 06/04/2011, a las 05:21, Gong Ding escribi?: > Dear Jose, > will you please also take a look at the SVD code for smallest singular value? > It seems work except the time consuming SVDCyclicSetExplicitMatrix routine. > However, I wonder if there exist some more clever method. I think it is correct. 
Jose > > // SVD solver for smallest singular value > SVD svd_s; > EPS eps_s; > ST st_s; > KSP ksp_s; > PC pc_s; > > PetscErrorCode ierr; > > // Create singular value solver context > ierr = SVDCreate(PETSC_COMM_WORLD, &svd_s); > > // Set operator > ierr = SVDSetOperator(svd_s, J); > > > // small singular value use eigen value solver on Cyclic Matrix > ierr = SVDSetWhichSingularTriplets(svd_s, SVD_SMALLEST); > ierr = SVDSetType(svd_s, SVDCYCLIC); > ierr = SVDCyclicSetExplicitMatrix(svd_s, PETSC_TRUE); // <-----time consuming > // shift-and-invert spectral transformation to enhance convergence of eigenvalues near zero > ierr = SVDCyclicGetEPS(svd_s, &eps_s); > ierr = EPSSetType(eps_s, EPSKRYLOVSCHUR); > ierr = EPSGetST(eps_s, &st_s); > ierr = STSetType(st_s, STSINVERT); > > ierr = STGetKSP(st_s, &ksp_s); > ierr = KSPGetPC(ksp_s, &pc_s); > // since we have to deal with bad conditioned problem, we choose direct solver whenever possible > > // direct solver as preconditioner > ierr = KSPSetType (ksp_s, (char*) KSPGMRES); assert(!ierr); > // superlu which use static pivot seems very stable > ierr = PCSetType (pc_s, (char*) PCLU); assert(!ierr); > ierr = PCFactorSetMatSolverPackage (pc_s, "superlu"); assert(!ierr); > > // Set solver parameters at runtime > ierr = SVDSetFromOptions(svd_s); assert(!ierr); > > ierr = SVDSetUp(svd_s); assert(!ierr); > > PetscReal sigma_large=1, sigma_small=1; > PetscInt nconv; > PetscReal error; > > // find the smallest singular value > SVDSolve(svd_s); From jed at 59A2.org Wed Apr 6 14:01:37 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 6 Apr 2011 21:01:37 +0200 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: <32A929AB3D7C460BA6F517A9A29EE0D6@cogendaeda> References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <32A929AB3D7C460BA6F517A9A29EE0D6@cogendaeda> Message-ID: 2011/4/6 Gong Ding > Ok, I will investigate matrix null space problem. > The matrix comes from nonlinear problem, I wonder if I need to calculate > the eigenvector each time. > Possibly, but it is more likely that the null space is something simple like a constant. > > Several months ago, some one committed DGMRES implementation, which also > dropped smallest eigen value. > It it possible to use (slightly modified) DGMRES as flexable tool for > sigular problem? > I'm not familiar with DGMRES. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Apr 6 14:25:21 2011 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Apr 2011 14:25:21 -0500 Subject: [petsc-users] Is there efficeint method for matrix with one extremely small eigen value? In-Reply-To: References: <3243394.39741302099890114.JavaMail.coremail@mail.ustc.edu> <87k4f730xp.fsf@59A2.org> <32A929AB3D7C460BA6F517A9A29EE0D6@cogendaeda> Message-ID: On Wed, Apr 6, 2011 at 2:01 PM, Jed Brown wrote: > 2011/4/6 Gong Ding > >> Ok, I will investigate matrix null space problem. >> The matrix comes from nonlinear problem, I wonder if I need to calculate >> the eigenvector each time. >> > > Possibly, but it is more likely that the null space is something simple > like a constant. > > >> >> Several months ago, some one committed DGMRES implementation, which also >> dropped smallest eigen value. >> It it possible to use (slightly modified) DGMRES as flexable tool for >> sigular problem? >> > > I'm not familiar with DGMRES. > Deflated GMRES will not help here. 
This is just the power method, and thus gets the large eigenvalues first. You will not get the null space vector. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dharmareddy84 at gmail.com Wed Apr 6 15:11:37 2011 From: dharmareddy84 at gmail.com (Dharmendar Reddy) Date: Wed, 6 Apr 2011 15:11:37 -0500 Subject: [petsc-users] PETSc Mesh and Fortran Message-ID: Hello, Are there any examples of PETSc mesh usage in a Fortran code. The Examples link on PETSc mesh man page ( http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mesh/index.html) redirects to page not found. I would like to learn usage of PETSc mesh utilities. I have an exodus file for the mesh. Thanks Reddy -- ----------------------------------------------------- Dharmendar Reddy Palle Graduate Student Microelectronics Research center, University of Texas at Austin, 10100 Burnet Road, Bldg. 160 MER 2.608F, TX 78758-4445 e-mail: dharmareddy84 at gmail.com Phone: +1-512-350-9082 United States of America. -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhaonanavril at gmail.com Wed Apr 6 18:53:20 2011 From: zhaonanavril at gmail.com (NAN ZHAO) Date: Wed, 6 Apr 2011 17:53:20 -0600 Subject: [petsc-users] help on ksp Message-ID: Dear all, I tried to use ksp to solve some problem. I got some Segmentation Violation error. And I got result from the solver as below. I am wondering if the ksp matrix has some error, cause I got the nonzeros allocated wrong. Can anyone dig out some valuable information from the ksp output? Thanks. -------------------------------------------------------------------------------------------------- total: nonzeros=38449, allocated nonzeros=52103 reason code = 2, its = 2546 KSP Object: type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=5000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning PC Object: type: ilu ILU: 0 levels of fill ILU: factor fill ratio allocated 1 ILU: tolerance for zero pivot 1e-12 out-of-place factorization matrix ordering: natural ILU: factor fill ratio needed 0 Factored matrix follows Matrix Object: type=seqbaij, rows=2903, cols=2903 total: nonzeros=38449, allocated nonzeros=52103 block size is 1 linear system matrix = precond matrix: Matrix Object: type=seqbaij, rows=2903, cols=2903 total: nonzeros=38449, allocated nonzeros=52103 block size is 1 Nan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Apr 6 19:04:26 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Apr 2011 19:04:26 -0500 Subject: [petsc-users] help on ksp In-Reply-To: References: Message-ID: <1C5BEBEA-3789-4BAF-B84D-7BE612BD64C8@mcs.anl.gov> Incorrect preallocation should never cause a crash (just possibly slower code). You need to run valgrind http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#valgrind Barry On Apr 6, 2011, at 6:53 PM, NAN ZHAO wrote: > Dear all, > > I tried to use ksp to solve some problem. I got some Segmentation Violation error. And I got result from the solver as below. I am wondering if the ksp matrix has some error, cause I got the nonzeros allocated wrong. 
Can anyone dig out some valuable information from the ksp output? Thanks. > -------------------------------------------------------------------------------------------------- > total: nonzeros=38449, allocated nonzeros=52103 > reason code = 2, its = 2546 > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=5000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > PC Object: > type: ilu > ILU: 0 levels of fill > ILU: factor fill ratio allocated 1 > ILU: tolerance for zero pivot 1e-12 > out-of-place factorization > matrix ordering: natural > ILU: factor fill ratio needed 0 > Factored matrix follows > Matrix Object: > type=seqbaij, rows=2903, cols=2903 > total: nonzeros=38449, allocated nonzeros=52103 > block size is 1 > linear system matrix = precond matrix: > Matrix Object: > type=seqbaij, rows=2903, cols=2903 > total: nonzeros=38449, allocated nonzeros=52103 > block size is 1 > > Nan From bartlomiej.wach at yahoo.pl Thu Apr 7 08:04:43 2011 From: bartlomiej.wach at yahoo.pl (=?utf-8?B?QmFydMWCb21pZWogVw==?=) Date: Thu, 7 Apr 2011 14:04:43 +0100 (BST) Subject: [petsc-users] Sparse Matrix preallocation and performance In-Reply-To: <87k4f730xp.fsf@59A2.org> Message-ID: <81155.5596.qm@web28304.mail.ukl.yahoo.com> Hello, Wheather I use ? ierr = MatCreate(PETSC_COMM_WORLD,&L);CHKERRQ(ierr); ? ierr = MatSetSizes(L,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr); ???????????? MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); ? ierr = MatSetFromOptions(L);CHKERRQ(ierr); or ? ? ierr = MatCreateSeqAIJ(PETSC_COMM_WORLD,n,n,PETSC_DEFAULT,nnz,&L);CHKERRQ(ierr); ? ierr = MatSetFromOptions(L);CHKERRQ(ierr); ? Gives me ?? Number of mallocs during MatSetValues() is? X On matrix assembly, where X is positive Is this indicating the preallocation or should it be zero and I'm missing something? Moreover, using MatCreateSeqAIJ lowers the performance of MatSetValues Is my code improper? Regards Bart?omiej Wach -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Apr 7 08:15:40 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 7 Apr 2011 08:15:40 -0500 Subject: [petsc-users] Sparse Matrix preallocation and performance In-Reply-To: <81155.5596.qm@web28304.mail.ukl.yahoo.com> References: <87k4f730xp.fsf@59A2.org> <81155.5596.qm@web28304.mail.ukl.yahoo.com> Message-ID: On Thu, Apr 7, 2011 at 8:04 AM, Bart?omiej W wrote: > Hello, > > Wheather I use > > ierr = MatCreate(PETSC_COMM_WORLD,&L);CHKERRQ(ierr); > ierr = MatSetSizes(L,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr); > MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); > ierr = MatSetFromOptions(L);CHKERRQ(ierr); > Move SetPreallocation() after SetFromOptions(). Here is what happens: SetFromOptions() will give the matrix a type, since it does not have one already Since the matrix had no type before SetPreallocation(), the call was ignored Matt > or > > > ierr = > MatCreateSeqAIJ(PETSC_COMM_WORLD,n,n,PETSC_DEFAULT,nnz,&L);CHKERRQ(ierr); > ierr = MatSetFromOptions(L);CHKERRQ(ierr); > > Gives me > > Number of mallocs during MatSetValues() is X > > On matrix assembly, where X is positive > Is this indicating the preallocation or should it be zero and I'm missing > something? > > Moreover, using MatCreateSeqAIJ lowers the performance of MatSetValues > > Is my code improper? 
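For reference, a sketch of the ordering Matt describes, with n and nnz standing for the same matrix size and per-row counts as in the original snippet:

Mat            L;
PetscErrorCode ierr;

ierr = MatCreate(PETSC_COMM_WORLD,&L);CHKERRQ(ierr);
ierr = MatSetSizes(L,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
ierr = MatSetFromOptions(L);CHKERRQ(ierr);                /* the matrix gets its type here (aij by default) */
ierr = MatSeqAIJSetPreallocation(L,0,nnz);CHKERRQ(ierr);  /* now honored; the scalar nz is ignored when nnz[] is given */

With the preallocation taking effect, the "Number of mallocs during MatSetValues()" reported at assembly should drop to 0; for a parallel aij matrix the corresponding call is MatMPIAIJSetPreallocation().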
> > Regards > Bart?omiej Wach > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Thu Apr 7 08:49:35 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Thu, 7 Apr 2011 09:49:35 -0400 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: Dear Jed, Sorry for late reply. This email box gets piled up... :-) I agree that solves are faster than factorization and GMRES use more MatVecMult, even with exact preconditioner. So I guess PREONLY is meaning direct solve without forming any sub-space and just one forward and backward substitutions? Thanks, Yan On Wed, Apr 6, 2011 at 1:43 AM, Jed Brown wrote: > On Wed, Apr 6, 2011 at 05:27, Matthew Knepley wrote: > >> There should be no difference since direct solves take so long. > > > That is, solves are very fast compared to factorization. > > If you just want to check the residual, Richardson is cheaper than GMRES > because it will require one fewer preconditioner/matrix applications. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From PRaeth at hpti.com Thu Apr 7 08:52:44 2011 From: PRaeth at hpti.com (Raeth, Peter) Date: Thu, 7 Apr 2011 13:52:44 +0000 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> , Message-ID: <3474F869C1954540B771FD9CAEBCB65722B80D98@CORTINA.HPTI.COM> "This email box gets piled up..." As a point of encouragement and appreciation, this is a sign of a much-used product that people are applying to increasingly-sophisticated research. Best, Peter. Peter G. Raeth, Ph.D. Senior Staff Scientist Signal and Image Processing High Performance Technologies, Inc 937-904-5147 praeth at hpti.com ________________________________ From: petsc-users-bounces at mcs.anl.gov [petsc-users-bounces at mcs.anl.gov] on behalf of Ryan Yan [vyan2000 at gmail.com] Sent: Thursday, April 07, 2011 9:49 AM To: PETSc users list Subject: Re: [petsc-users] about pclu Dear Jed, Sorry for late reply. This email box gets piled up... :-) I agree that solves are faster than factorization and GMRES use more MatVecMult, even with exact preconditioner. So I guess PREONLY is meaning direct solve without forming any sub-space and just one forward and backward substitutions? Thanks, Yan On Wed, Apr 6, 2011 at 1:43 AM, Jed Brown > wrote: On Wed, Apr 6, 2011 at 05:27, Matthew Knepley > wrote: There should be no difference since direct solves take so long. That is, solves are very fast compared to factorization. If you just want to check the residual, Richardson is cheaper than GMRES because it will require one fewer preconditioner/matrix applications. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 7 08:55:41 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 7 Apr 2011 15:55:41 +0200 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Thu, Apr 7, 2011 at 15:49, Ryan Yan wrote: > I agree that solves are faster than factorization and GMRES > use more MatVecMult, even with exact preconditioner. So I guess > PREONLY is meaning direct solve without forming any sub-space and just > one forward and backward substitutions? > Yes. 
Because of the way GMRES works with zero initial guess, there will be two MatSolve and one MatMult even when using a direct solver. PREONLY does one MatSolve and zero MatMult. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Thu Apr 7 09:03:11 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Thu, 7 Apr 2011 10:03:11 -0400 Subject: [petsc-users] about pclu In-Reply-To: <3474F869C1954540B771FD9CAEBCB65722B80D98@CORTINA.HPTI.COM> References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> <3474F869C1954540B771FD9CAEBCB65722B80D98@CORTINA.HPTI.COM> Message-ID: Nice Observation. :-) Y > As a point of encouragement and appreciation, this is a sign of a much-used > product that people are applying to increasingly-sophisticated research. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 7 09:08:28 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 7 Apr 2011 16:08:28 +0200 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Thu, Apr 7, 2011 at 16:06, Ryan Yan wrote: > Hi Jed, > How is going? :-) > > PREONLY maybe also use two MatSolve? Or is there any magic I did not see. > :-) > Why do you say that? $ ./ex2 -pc_type lu -ksp_type preonly -log_summary |g '^MatSolve' MatSolve 1 1.0 1.1921e-05 1.0 1.22e+03 1.0 0.0e+00 0.0e+00 0.0e+00 0 22 0 0 0 0 22 0 0 0 102 -------------- next part -------------- An HTML attachment was scrubbed... URL: From vyan2000 at gmail.com Thu Apr 7 09:21:13 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Thu, 7 Apr 2011 10:21:13 -0400 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: Wait, so is it for PREONLY? And what Matsolve does, solving a triangular system with both L and U available? The reason that I previously guess the count be two is that I think there is a back-substituiton and a forward-substitution involved in solving a linear system using factorization. If a pair of back-substitution and forward-substitution counts 1 MatSolve. Then I think we mean the same thing. Thanks, Yan On Thu, Apr 7, 2011 at 10:08 AM, Jed Brown wrote: > On Thu, Apr 7, 2011 at 16:06, Ryan Yan wrote: > >> Hi Jed, >> How is going? :-) >> >> PREONLY maybe also use two MatSolve? Or is there any magic I did not see. >> :-) >> > > Why do you say that? > > $ ./ex2 -pc_type lu -ksp_type preonly -log_summary |g '^MatSolve' > MatSolve 1 1.0 1.1921e-05 1.0 1.22e+03 1.0 0.0e+00 0.0e+00 > 0.0e+00 0 22 0 0 0 0 22 0 0 0 102 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 7 09:23:41 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 7 Apr 2011 16:23:41 +0200 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: On Thu, Apr 7, 2011 at 16:21, Ryan Yan wrote: > Wait, so is it for PREONLY? And what Matsolve does, solving a triangular > system with both L and U available? > Yup, forward- and back-solves are both done in one "MatSolve". > The reason that I previously guess the count be two is that I think there > is a back-substituiton > and a forward-substitution involved in solving a linear system using > factorization. If a pair of > back-substitution and forward-substitution counts 1 MatSolve. Then I think > we mean the same thing. > -------------- next part -------------- An HTML attachment was scrubbed... 
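For readers following along, the configuration being discussed looks roughly like this in code (ksp, b and x are assumed to be set up as in the standard KSP examples); it is the same as running with -ksp_type preonly -pc_type lu:

PC             pc;
PetscErrorCode ierr;

ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);  /* no Krylov iterations: apply the preconditioner exactly once */
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);          /* the "preconditioner" is a complete LU factorization */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);           /* one MatSolve: forward and back substitution together */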
URL: From vyan2000 at gmail.com Thu Apr 7 09:25:40 2011 From: vyan2000 at gmail.com (Ryan Yan) Date: Thu, 7 Apr 2011 10:25:40 -0400 Subject: [petsc-users] about pclu In-Reply-To: References: <0E3D511E-B363-47F3-80A2-422EA61F2274@mcs.anl.gov> Message-ID: It is wonderful to reach a agreement. :-) Cheers, Yan On Thu, Apr 7, 2011 at 10:23 AM, Jed Brown wrote: > On Thu, Apr 7, 2011 at 16:21, Ryan Yan wrote: > >> Wait, so is it for PREONLY? And what Matsolve does, solving a triangular >> system with both L and U available? >> > > Yup, forward- and back-solves are both done in one "MatSolve". > > >> The reason that I previously guess the count be two is that I think there >> is a back-substituiton >> and a forward-substitution involved in solving a linear system using >> factorization. If a pair of >> back-substitution and forward-substitution counts 1 MatSolve. Then I think >> we mean the same thing. >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ram at ibrae.ac.ru Thu Apr 7 18:17:32 2011 From: ram at ibrae.ac.ru (=?KOI8-R?B?4czFy9PFyiDy0drBzs/X?=) Date: Fri, 8 Apr 2011 03:17:32 +0400 Subject: [petsc-users] How to create and assemble matrices for DA vectors?? Message-ID: Hello. When I create vectors using VecCreate(PETSC_COMM_WORLD,&u); VecSetSizes(u,PETSC_DECIDE, VecSize); VecSetFromOptions(u); VecDuplicate(u,&b); and matrix using MatCreate(PETSC_COMM_WORLD,&A); MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,VecSize,VecSize); MatSetFromOptions(A); PETSc distributes their elements in a proper identical way among processors, so I can use procedures like MatMult(A,u,b); and KSPSolve(ksp,b,x); Ofcourse after matrix assembling and initialization of KSP and PC KSPCreate(PETSC_COMM_WORLD,&ksp); KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); And thats great and works amazingly! But now I've created DA vectors "u" and "b" and assembled them through the natural grid indexing. And I need to solve the same SLE Au=b, where A is a Laplacian. How should I create and assemble the A matrix according to my DA vector to use the same functionality? Thank you! Alexey Ryazanov ______________________________________ Nuclear Safety Institute of Russian Academy of Sciences -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Apr 7 18:32:47 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 7 Apr 2011 18:32:47 -0500 (CDT) Subject: [petsc-users] How to create and assemble matrices for DA vectors?? In-Reply-To: References: Message-ID: On Fri, 8 Apr 2011, ??????? ??????? wrote: > Hello. > > When I create vectors using > > VecCreate(PETSC_COMM_WORLD,&u); > VecSetSizes(u,PETSC_DECIDE, VecSize); > VecSetFromOptions(u); > VecDuplicate(u,&b); > > and matrix using > > MatCreate(PETSC_COMM_WORLD,&A); > MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,VecSize,VecSize); > MatSetFromOptions(A); > > PETSc distributes their elements in a proper identical way among processors, > so I can use procedures like > > MatMult(A,u,b); > > and > KSPSolve(ksp,b,x); > Ofcourse after matrix assembling and initialization of KSP and PC > > KSPCreate(PETSC_COMM_WORLD,&ksp); > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); > > And thats great and works amazingly! > > > > > But now I've created DA vectors "u" and "b" and assembled them through the > natural grid indexing. > > And I need to solve the same SLE Au=b, where A is a Laplacian. > > How should I create and assemble the A matrix according to my DA vector to > use the same functionality? 
Create u,b with DAGetGlobalVector() and A with DAGetMatrix() and they will match the DA. For eg: check: src/snes/examples/tutorials/ex5.c [or some of the examples in src/dm/da/examples] Satish > > Thank you! > > Alexey Ryazanov > ______________________________________ > Nuclear Safety Institute of Russian Academy of Sciences > > From domenico.borzacchiello at univ-st-etienne.fr Fri Apr 8 02:59:52 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Fri, 8 Apr 2011 09:59:52 +0200 (CEST) Subject: [petsc-users] [DMMG] Stokes Solver Message-ID: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> Hi, I'm trying to implement a 3d stokes solver on a simple cartesian staggered grid using the a MAC FV Discretisation. I'm following the example given in /snes/examples/tutorials/ex30.c. As the stokes solver is ready I'll add some more complex constitutive laws. For the moment I'm testing the solver with just 1-level MG and I'd like a few clarification to know if what I'm doing is correct. - The solver solves for u v w p. I'm using a single DA with 4 DOFs and due to the MAC arrangement I have some "extraboundary" nodes for u ( in the y and z directions) v (x & z dir) w ( x & y dir) and p (x,y & z dir). If what I'm getting from ex30.c is right I have to write a simple identity for each of these nodes (i.e. p_extra = anyvalue) as they are not coupled with the rest of the system. I'm doing the same for Dirichlet BCs nodes (i.e. u = Ubound). Is this correct? - How does Petsc deal with the pressure-velocity coupling? Is it correct to try to solve the whole coupled system with DMMG as in ex30.c? At present time I'm getting no convergence by running dmmgsolve (snes) with all the default options on a very small system. 0 SNES Function norm 7.128085632250e+00 Number of Newton iterations = 0 Number of Linear iterations = 0 Average Linear its / Newton = -nan Converged Reason = -3 If I run the same case with a direct solver (pc_type lu) I'm basically getting the same error: RINFO(1) (local estimated flops for the elimination after analysis): [0] 5.42609e+08 RINFO(2) (local estimated flops for the assembly after factorization): [0] 3.83582e+06 RINFO(3) (local estimated flops for the elimination after factorization): [0] 5.44162e+08 INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): [0] 17 INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): [0] 17 INFO(23) (num of pivots eliminated on this processor after factorization): [0] 1372 Number of Newton iterations = 7 Number of Linear iterations = 88 Average Linear its / Newton = 1.257143e+01 Converged Reason = -3 Would you suggest anything to fix the problem? I'm double-checking the user provided function in DMMGSetSNESLocal to see if I made any mistake there. Thank you in advance, Domenico From jed at 59A2.org Fri Apr 8 03:26:07 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 8 Apr 2011 10:26:07 +0200 Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Fri, Apr 8, 2011 at 09:59, wrote: > Hi, > I'm trying to implement a 3d stokes solver on a simple cartesian staggered > grid using the a MAC FV Discretisation. I'm following the example given in > /snes/examples/tutorials/ex30.c. As the stokes solver is ready I'll add > some more complex constitutive laws. 
For the moment I'm testing the solver > with just 1-level MG and I'd like a few clarification to know if what I'm > doing is correct. > > - The solver solves for u v w p. I'm using a single DA with 4 DOFs and due > to the MAC arrangement I have some "extraboundary" nodes for u ( in the y > and z directions) v (x & z dir) w ( x & y dir) and p (x,y & z dir). If > what I'm getting from ex30.c is right I have to write a simple identity > for each of these nodes (i.e. p_extra = anyvalue) as they are not coupled > with the rest of the system. I'm doing the same for Dirichlet BCs nodes > (i.e. u = Ubound). Is this correct? > yes > > - How does Petsc deal with the pressure-velocity coupling? Is it correct > to try to solve the whole coupled system with DMMG as in ex30.c? At > present time I'm getting no convergence by running dmmgsolve (snes) with > all the default options on a very small system. > > 0 SNES Function norm 7.128085632250e+00 > Number of Newton iterations = 0 > Number of Linear iterations = 0 > Average Linear its / Newton = -nan > Converged Reason = -3 > You may as well check why the linear solve failed by running with -ksp_converged_reason. There are two challenges for solving Stokes in this way. First, the interpolation operators that the DA gives you are probably not what you want (pressure and velocity should be interpolated differently) and second, the standard smoother of ILU is not expected to work with interlaced velocity and pressure (you would want to either use a "Vanka smoother" that solves small problems associated with each pressure cell and all adjacent velocities, use field-split as a smoother, or (tricky and not guaranteed to work) order the pressures last in each subdomain with ILU as a smoother). Vanka smoothers are very problem-dependent so you would need to write that part yourself. It would be nice to have an example for Stokes. An alternative to coupled multigrid is to use PCFieldSplit at the top level and multigrid for the viscous part separately (see e.g. Elman's many papers on this approach). We don't currently have an example doing PCFieldSplit with geometric multigrid inside the splits, but it should work with petsc-dev if you follow the approach in src/ksp/ksp/examples/tutorials/ex45.c (which does not use DMMG, we are working to fold DMMG's functionality into the existing solver objects). -------------- next part -------------- An HTML attachment was scrubbed... 
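As a rough illustration of the PCFieldSplit route mentioned above: for a single interlaced DA with 4 dof (u,v,w,p), a run could be configured along the following lines. The application name is hypothetical and the option spellings are those of petsc-dev from that period, so treat this as a sketch to be checked against -help rather than a recipe:

mpiexec -n 4 ./stokes3d -ksp_type fgmres -pc_type fieldsplit \
    -pc_fieldsplit_block_size 4 \
    -pc_fieldsplit_0_fields 0,1,2 -pc_fieldsplit_1_fields 3 \
    -pc_fieldsplit_type schur \
    -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type ml \
    -fieldsplit_1_ksp_type gmres -fieldsplit_1_pc_type none \
    -ksp_converged_reason

Here ml is only a stand-in for whatever multigrid is available for the viscous block (geometric multigrid inside the splits is what the ex45.c-style setup enables). A Vanka-type smoother, by contrast, has to be written by hand, e.g. as a PCSHELL that loops over the pressure cells and solves the small local velocity-pressure problems.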
URL: From domenico.borzacchiello at univ-st-etienne.fr Fri Apr 8 04:40:13 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Fri, 8 Apr 2011 11:40:13 +0200 (CEST) Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> Message-ID: <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> Hi Jed Thank you for the very quick reply, running with -ksp_converged_reason and lu preconditioner gives out Linear solve converged due to CONVERGED_RTOL iterations 14 Linear solve converged due to CONVERGED_RTOL iterations 12 Linear solve converged due to CONVERGED_RTOL iterations 13 Linear solve converged due to CONVERGED_RTOL iterations 12 Linear solve converged due to CONVERGED_RTOL iterations 13 Linear solve converged due to CONVERGED_RTOL iterations 12 Linear solve converged due to CONVERGED_RTOL iterations 12 Linear solve did not converge due to DIVERGED_DTOL iterations 1080 RINFO(1) (local estimated flops for the elimination after analysis): [0] 5.42609e+08 RINFO(2) (local estimated flops for the assembly after factorization): [0] 3.83582e+06 RINFO(3) (local estimated flops for the elimination after factorization): [0] 5.44162e+08 INFO(15) (estimated size of (in MB) MUMPS internal data for running numerical factorization): [0] 17 INFO(16) (size of (in MB) MUMPS internal data used during numerical factorization): [0] 17 INFO(23) (num of pivots eliminated on this processor after factorization): [0] 1372 Number of Newton iterations = 7 Number of Linear iterations = 88 Average Linear its / Newton = 1.257143e+01 Converged Reason = -3 I can't get why it takes more than one Newton iteration even if my system is linear and the linear solver is direct. As for the Vanka smoother would you suggest to implement it with PCSHELL or by defining a new PC type? thanks, Domenico. > On Fri, Apr 8, 2011 at 09:59, > wrote: > >> Hi, >> I'm trying to implement a 3d stokes solver on a simple cartesian >> staggered >> grid using the a MAC FV Discretisation. I'm following the example given >> in >> /snes/examples/tutorials/ex30.c. As the stokes solver is ready I'll add >> some more complex constitutive laws. For the moment I'm testing the >> solver >> with just 1-level MG and I'd like a few clarification to know if what >> I'm >> doing is correct. >> >> - The solver solves for u v w p. I'm using a single DA with 4 DOFs and >> due >> to the MAC arrangement I have some "extraboundary" nodes for u ( in the >> y >> and z directions) v (x & z dir) w ( x & y dir) and p (x,y & z dir). If >> what I'm getting from ex30.c is right I have to write a simple identity >> for each of these nodes (i.e. p_extra = anyvalue) as they are not >> coupled >> with the rest of the system. I'm doing the same for Dirichlet BCs nodes >> (i.e. u = Ubound). Is this correct? >> > > yes > > >> >> - How does Petsc deal with the pressure-velocity coupling? Is it correct >> to try to solve the whole coupled system with DMMG as in ex30.c? At >> present time I'm getting no convergence by running dmmgsolve (snes) with >> all the default options on a very small system. >> >> 0 SNES Function norm 7.128085632250e+00 >> Number of Newton iterations = 0 >> Number of Linear iterations = 0 >> Average Linear its / Newton = -nan >> Converged Reason = -3 >> > > You may as well check why the linear solve failed by running with > -ksp_converged_reason. > > There are two challenges for solving Stokes in this way. 
First, the > interpolation operators that the DA gives you are probably not what you > want > (pressure and velocity should be interpolated differently) and second, the > standard smoother of ILU is not expected to work with interlaced velocity > and pressure (you would want to either use a "Vanka smoother" that solves > small problems associated with each pressure cell and all adjacent > velocities, use field-split as a smoother, or (tricky and not guaranteed > to > work) order the pressures last in each subdomain with ILU as a smoother). > > Vanka smoothers are very problem-dependent so you would need to write that > part yourself. It would be nice to have an example for Stokes. An > alternative to coupled multigrid is to use PCFieldSplit at the top level > and > multigrid for the viscous part separately (see e.g. Elman's many papers on > this approach). We don't currently have an example doing PCFieldSplit with > geometric multigrid inside the splits, but it should work with petsc-dev > if > you follow the approach in src/ksp/ksp/examples/tutorials/ex45.c (which > does > not use DMMG, we are working to fold DMMG's functionality into the > existing > solver objects). > From jed at 59A2.org Fri Apr 8 04:47:35 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 8 Apr 2011 11:47:35 +0200 Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Fri, Apr 8, 2011 at 11:40, wrote: > I can't get why it takes more than one Newton iteration even if my system > is linear and the linear solver is direct. > The direct solver should also converge in one iteration. Are you only assembling an approximation of the Jacobian (e.g. using -snes_mf_operator)? If using MFFD, is the system poorly scaled such that the step size is very low accuracy (maybe try -mat_mffd_type ds)? Are the equations singular? Is both the Jacobian and residual evaluation correct? > > As for the Vanka smoother would you suggest to implement it with PCSHELL > or by defining a new PC type? > That is up to you. Defining a new PC type makes it more reusable, but PCShell is a bit quicker to develop. You can start with PCShell and convert it later. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Apr 8 05:36:13 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 8 Apr 2011 12:36:13 +0200 Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> References: <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> Message-ID: <35A4F802-C36B-48AB-B6A6-2C7D806F68AD@dsic.upv.es> El 06/04/2011, a las 05:21, Gong Ding escribi?: > Dear Jose, > will you please also take a look at the SVD code for smallest singular value? > It seems work except the time consuming SVDCyclicSetExplicitMatrix routine. Now in slepc-dev this routine does preallocation, so it should be efficient. Let us know if problems arise. Jose From ram at ibrae.ac.ru Fri Apr 8 06:10:29 2011 From: ram at ibrae.ac.ru (=?KOI8-R?B?4czFy9PFyiDy0drBzs/X?=) Date: Fri, 8 Apr 2011 15:10:29 +0400 Subject: [petsc-users] How to create and assemble matrices for DA vectors?? 
In-Reply-To: References: Message-ID: Thank you very much, Satish! Ill try it Alexey 8 ?????? 2011 ?. 3:32 ???????????? Satish Balay ???????: > On Fri, 8 Apr 2011, ??????? ??????? wrote: > > > Hello. > > > > When I create vectors using > > > > VecCreate(PETSC_COMM_WORLD,&u); > > VecSetSizes(u,PETSC_DECIDE, VecSize); > > VecSetFromOptions(u); > > VecDuplicate(u,&b); > > > > and matrix using > > > > MatCreate(PETSC_COMM_WORLD,&A); > > MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,VecSize,VecSize); > > MatSetFromOptions(A); > > > > PETSc distributes their elements in a proper identical way among > processors, > > so I can use procedures like > > > > MatMult(A,u,b); > > > > and > > KSPSolve(ksp,b,x); > > Ofcourse after matrix assembling and initialization of KSP and PC > > > > KSPCreate(PETSC_COMM_WORLD,&ksp); > > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN); > > > > And thats great and works amazingly! > > > > > > > > > > But now I've created DA vectors "u" and "b" and assembled them through > the > > natural grid indexing. > > > > And I need to solve the same SLE Au=b, where A is a Laplacian. > > > > How should I create and assemble the A matrix according to my DA vector > to > > use the same functionality? > > Create u,b with DAGetGlobalVector() and A with DAGetMatrix() and they > will match the DA. For eg: check: src/snes/examples/tutorials/ex5.c > [or some of the examples in src/dm/da/examples] > > Satish > > > > > Thank you! > > > > Alexey Ryazanov > > ______________________________________ > > Nuclear Safety Institute of Russian Academy of Sciences > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Fri Apr 8 07:28:50 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Fri, 8 Apr 2011 14:28:50 +0200 (CEST) Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> Message-ID: <3e4914211fab32695db94a823cde828b.squirrel@arcon.univ-st-etienne.fr> > The direct solver should also converge in one iteration. Are you only > assembling an approximation of the Jacobian (e.g. using > -snes_mf_operator)? > If using MFFD, is the system poorly scaled such that the step size is very > low accuracy (maybe try -mat_mffd_type ds)? Are the equations singular? Is > both the Jacobian and residual evaluation correct? Apparently the equations were singular. I modified the equations describing the Outflow BC by explicitly writing the open boundary condition -mu*(dw/dz)+p=0 (instead of including it in the momentum equation as I was doing before) and the linear solver is now converging within 1 iteration, snes is still diverging though. 0 SNES Function norm 7.128085632250e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 1 SNES Function norm 7.068552744365e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 2 SNES Function norm 7.068535930605e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 3 SNES Function norm 7.068535930605e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 . . (some mumps output here) . 
Number of Newton iterations = 3 Number of Linear iterations = 4 Average Linear its / Newton = 1.333333e+00 Converged Reason = -6 if I run it with -snes_type tr instead I get 0 SNES Function norm 7.128085632250e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 1 SNES Function norm 7.081751494639e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 2 SNES Function norm 7.068482944794e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 3 SNES Function norm 7.067980457052e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 4 SNES Function norm 7.067979237888e+00 Linear solve converged due to CONVERGED_STEP_LENGTH iterations 1 5 SNES Function norm 7.067979237888e+00 . . . Number of Newton iterations = 4 Number of Linear iterations = 5 Average Linear its / Newton = 1.250000e+00 Converged Reason = 4 I don't define the Jacobian myself I'm just calling DMMGSetSNESLocal(dmmg,FormFunctionLocal,0,0,0) I Assumed that the FD evaluation of Jacobian would be exact since the the Function is linear. From jed at 59A2.org Fri Apr 8 12:28:11 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 8 Apr 2011 19:28:11 +0200 Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: <3e4914211fab32695db94a823cde828b.squirrel@arcon.univ-st-etienne.fr> References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> <3e4914211fab32695db94a823cde828b.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Fri, Apr 8, 2011 at 14:28, wrote: > I Assumed that the FD evaluation of Jacobian would be exact since the the > Function is linear. > Sort of, it's still a problem if your function looks like f(x) = huge + epsilon * x. I suspect either a memory bug (try valgrind) or that your FormFunctionLocal is using internal state or otherwise nonlinear. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gdiso at ustc.edu Sat Apr 9 23:17:39 2011 From: gdiso at ustc.edu (Gong Ding) Date: Sun, 10 Apr 2011 12:17:39 +0800 (CST) Subject: [petsc-users] Slepc eigen value solver gives strange behavior In-Reply-To: <35A4F802-C36B-48AB-B6A6-2C7D806F68AD@dsic.upv.es> References: <35A4F802-C36B-48AB-B6A6-2C7D806F68AD@dsic.upv.es> <133CE1D3-2E96-4749-83B4-76F61DBC06F0@dsic.upv.es> <32393050.36991301972852982.JavaMail.coremail@mail.ustc.edu> <9020416.38421302060081374.JavaMail.coremail@mail.ustc.edu> Message-ID: <20302444.45221302409059408.JavaMail.coremail@mail.ustc.edu> > El 06/04/2011, a las 05:21, Gong Ding escribi?: > > > > > Dear Jose, > > > will you please also take a look at the SVD code for smallest singular value? > > > It seems work except the time consuming SVDCyclicSetExplicitMatrix routine. > > > > Now in slepc-dev this routine does preallocation, so it should be efficient. Let us know if problems arise. > > Jose > It works well. Thanks! From gdiso at ustc.edu Sat Apr 9 23:53:04 2011 From: gdiso at ustc.edu (Gong Ding) Date: Sun, 10 Apr 2011 12:53:04 +0800 (CST) Subject: [petsc-users] Finally, pseudo time method eliminated singular problem Message-ID: <33088327.45281302411184479.JavaMail.coremail@mail.ustc.edu> Hi all, In the past several weeks, I am dealing with the nearly singular problem. The structure has a metal connected two semiconductor devices. when two devices are both shutdown with high resistance, the metal connector is floating. 
This singular problem finally be shifted to well conditioned by simple pseudo time method -- just introducing pseudo time to the nearly floating domain (as a capacity to ground). Now iterative method works well. Here, I must give thanks to Jose Roman, the slepc package gives quick evaluation of eigen value of the jacobian matrix. I can easliy target where the singular arising. And it helps to determine the pseudo time step. And to Matt and Jed, thank you for the idea of null sapce. Pseudo time method is not as efficient as null space dropping. I guess the algorithm to nonlinear singular problem should 1) drop null space within krylov iteration 2) SNES should know the null vector and do a search in the direction of null vector to find the root. Do you think I am in the right way? Gong Ding From gdiso at ustc.edu Sun Apr 10 00:01:16 2011 From: gdiso at ustc.edu (Gong Ding) Date: Sun, 10 Apr 2011 13:01:16 +0800 (CST) Subject: [petsc-users] Patch to release extra memoty to aij matrix Message-ID: <26209988.45291302411676528.JavaMail.coremail@mail.ustc.edu> Hi, This is the patch file to aij.c, which will release excessively preallocated memory at MatAssemblyEnd. It had been tested in the past several months, both for serial and parallel. Hope this patch can be accepted. Gong Ding -------------- next part -------------- A non-text attachment was scrubbed... Name: aij.diff Type: application/octet-stream Size: 1424 bytes Desc: not available URL: From zonexo at gmail.com Sun Apr 10 02:13:17 2011 From: zonexo at gmail.com (TAY wee-beng) Date: Sun, 10 Apr 2011 09:13:17 +0200 Subject: [petsc-users] Minimum size of sparse or dense matrix to use PETSc Message-ID: <4DA1588D.5010406@gmail.com> Hi, I am already using PETSc to solve my momentum and poisson equations. However in some parts of my code, I need to solve a dense (usually) or sparse matrix, which arises from the radial basis function interpolation. Depending on the problem, it can be a big or small matrix. I am thinking whether to use PETSc or just a simple solver. Can you recommend the minimum size of sparse or dense matrix to use PETSc? Thank you. -- Yours sincerely, TAY wee-beng From knepley at gmail.com Sun Apr 10 06:56:50 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Apr 2011 06:56:50 -0500 Subject: [petsc-users] Minimum size of sparse or dense matrix to use PETSc In-Reply-To: <4DA1588D.5010406@gmail.com> References: <4DA1588D.5010406@gmail.com> Message-ID: On Sun, Apr 10, 2011 at 2:13 AM, TAY wee-beng wrote: > Hi, > > I am already using PETSc to solve my momentum and poisson equations. > However in some parts of my code, I need to solve a dense (usually) or > sparse matrix, which arises from the radial basis function interpolation. > Depending on the problem, it can be a big or small matrix. > > I am thinking whether to use PETSc or just a simple solver. > > Can you recommend the minimum size of sparse or dense matrix to use PETSc? > For a dense matrix, we just call LAPACK in serial. And you can can change from dense to sparse if it is sparse enough by changing the matrix type. The only regime to worry about is very large, dense matrices, but I do not think you have those. Matt > Thank you. > > -- > Yours sincerely, > > TAY wee-beng > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
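A small sketch of the dense path Matt describes, with n, b and x standing for the RBF system size and vectors from the original question; PCLU on a seqdense matrix ends up in LAPACK, and switching the creation call (or using -mat_type aij together with MatSetFromOptions()) moves the same solve to a sparse factorization:

Mat            A;
KSP            ksp;
PC             pc;
PetscErrorCode ierr;

ierr = MatCreateSeqDense(PETSC_COMM_SELF,n,n,PETSC_NULL,&A);CHKERRQ(ierr);
/* ... fill with MatSetValues(), then MatAssemblyBegin()/MatAssemblyEnd() ... */
ierr = KSPCreate(PETSC_COMM_SELF,&ksp);CHKERRQ(ierr);
ierr = KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN);CHKERRQ(ierr);
ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);   /* LU of a seqdense matrix is done by LAPACK */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);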
URL: From knepley at gmail.com Sun Apr 10 07:28:00 2011 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 10 Apr 2011 07:28:00 -0500 Subject: [petsc-users] Finally, pseudo time method eliminated singular problem In-Reply-To: <33088327.45281302411184479.JavaMail.coremail@mail.ustc.edu> References: <33088327.45281302411184479.JavaMail.coremail@mail.ustc.edu> Message-ID: 2011/4/9 Gong Ding > Hi all, > In the past several weeks, I am dealing with the nearly singular problem. > The structure has a metal connected two semiconductor devices. when two > devices are both shutdown with high resistance, > the metal connector is floating. > This singular problem finally be shifted to well conditioned by simple > pseudo time method -- > just introducing pseudo time to the nearly floating domain (as a capacity > to ground). > Now iterative method works well. > > Here, I must give thanks to Jose Roman, the slepc package gives quick > evaluation > of eigen value of the jacobian matrix. I can easliy target where the > singular arising. > And it helps to determine the pseudo time step. > > And to Matt and Jed, thank you for the idea of null sapce. > Pseudo time method is not as efficient as null space dropping. > I guess the algorithm to nonlinear singular problem should > 1) drop null space within krylov iteration > Yes, you can do this using KSPSetNullSpace() http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPSetNullSpace.html > 2) SNES should know the null vector and do a search in the direction of > null vector to find the root. > I am not sure why this is necessary. Since KSP will project out the nullspace in each Newton solve, it should not appear in the update. Unless it is a component of the solution (which would be strange since the Jacobian gives no information about it), in which case you can add that as the initial guess. Matt > Do you think I am in the right way? > > Gong Ding > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Mon Apr 11 06:43:01 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Mon, 11 Apr 2011 13:43:01 +0200 (CEST) Subject: [petsc-users] [DMMG] Stokes Solver In-Reply-To: References: <6bdbc1590eb5051807915df3ffcd4b08.squirrel@arcon.univ-st-etienne.fr> <6cac91955a500fb6af06f51bb6b42bf0.squirrel@arcon.univ-st-etienne.fr> <3e4914211fab32695db94a823cde828b.squirrel@arcon.univ-st-etienne.fr> <04494be3dc8dc0af4b98e63eff5661f4.squirrel@arcon.univ-st-etienne.fr> Message-ID: Ok, I'll keep that in mind. thank you very much for the explanations. Domenico. > On Mon, Apr 11, 2011 at 13:32, > wrote: > >> Still it's unclear to me why this happens and whether or not defining >> the >> Jacobian instead of computing it by FD may possibly fix it. I'd like to >> stick with FD approximation of Jacobian because I'll add complex >> rheology >> models for which computing the Jacobian analitically won't be an easy >> task. >> > > To use FD, you have to make sure that your equations are well scaled. You > should be able to volume-scale the residual (most PETSc examples do this) > and choose units such that the system is well-scaled independent of grid > resolution. 
You should do this regardless of whether you use FD, but with > FD, you have half the number of digits to work with before running into > rounding error problems. > From fd.kong at siat.ac.cn Tue Apr 12 01:22:52 2011 From: fd.kong at siat.ac.cn (=?ISO-8859-1?B?ZmRrb25n?=) Date: Tue, 12 Apr 2011 14:22:52 +0800 Subject: [petsc-users] time spent on each level of the solver for multigrid preconditioner Message-ID: Hi every one I uses multigrid preconditioner for my application. Running the code with "Options Database Keys" -pc_mg_log, but can not get time spent on each level of the solver. I want to know time spent on each level respectively. VecMDot 30 1.0 2.8007e-03 2.5 1.61e+05 1.1 0.0e+00 0.0e+00 3.0e+01 0 4 0 0 8 0 4 0 0 9 217 VecNorm 48 1.0 2.3482e-03 2.1 1.07e+05 1.1 0.0e+00 0.0e+00 4.8e+01 0 3 0 0 12 0 3 0 0 15 173 VecScale 39 1.0 3.2115e-04 1.2 4.36e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 513 VecCopy 17 1.0 1.6999e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 125 1.0 5.7936e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 17 1.0 2.6035e-04 1.6 3.80e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 552 VecAYPX 4 1.0 8.7976e-05 1.3 4.47e+03 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 192 VecMAXPY 38 1.0 7.0500e-04 1.1 2.28e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 1223 VecAssemblyBegin 3 1.0 4.5705e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 2 0 0 0 0 3 0 VecAssemblyEnd 3 1.0 4.1962e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 147 1.0 1.8594e-03 1.1 0.00e+00 0.0 1.0e+03 3.8e+02 0.0e+00 0 0 55 19 0 0 0 55 19 0 0 VecScatterEnd 147 1.0 1.4102e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 34 1.0 1.9138e-03 1.5 1.14e+05 1.1 0.0e+00 0.0e+00 3.4e+01 0 3 0 0 9 0 3 0 0 11 225 MatMult 47 1.0 2.6152e-02 1.1 1.34e+06 1.1 4.7e+02 4.0e+02 0.0e+00 0 35 25 9 0 0 35 25 9 0 191 MatMultAdd 4 1.0 2.1584e-03 1.1 5.67e+04 1.2 4.0e+01 2.2e+02 0.0e+00 0 1 2 0 0 0 1 2 0 0 96 MatMultTranspose 8 1.0 4.4453e-03 1.0 1.13e+05 1.2 8.0e+01 2.2e+02 1.6e+01 0 3 4 1 4 0 3 4 1 5 94 MatSolve 50 1.0 3.1454e-02 1.0 1.41e+06 1.2 0.0e+00 0.0e+00 0.0e+00 0 36 0 0 0 0 36 0 0 0 164 MatLUFactorSym 1 1.0 7.4482e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 2 1.0 4.4755e-02 1.0 1.84e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 15 MatILUFactorSym 1 1.0 1.3239e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatAssemblyBegin 4 1.0 2.4263e-0243.8 0.00e+00 0.0 4.5e+01 3.1e+03 6.0e+00 0 0 2 7 2 0 0 2 7 2 0 MatAssemblyEnd 4 1.0 7.4661e-03 1.1 0.00e+00 0.0 6.0e+01 7.7e+01 2.8e+01 0 0 3 0 7 0 0 3 0 9 0 MatGetRowIJ 1 1.0 3.0994e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrice 1 1.0 3.2248e-03 1.1 0.00e+00 0.0 5.0e+01 1.9e+03 5.0e+00 0 0 3 5 1 0 0 3 5 2 0 MatGetOrdering 1 1.0 1.2500e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatIncreaseOvrlp 1 1.0 1.0622e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 1 0 0 0 0 1 0 MatZeroEntries 2 1.0 2.8491e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatView 6 1.0 1.5023e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 MeshView 6 1.0 1.0528e+00 1.0 0.00e+00 0.0 9.0e+01 2.9e+03 0.0e+00 9 0 5 12 0 9 0 5 12 0 0 MeshGetGlobalScatter 3 1.0 1.6958e-02 1.0 0.00e+00 0.0 3.0e+01 8.8e+01 1.8e+01 0 0 2 0 5 0 0 2 0 6 0 MeshAssembleMatrix 1572 1.0 3.6974e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MeshUpdateOperator 2131 1.0 8.2520e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 2 1 0 0 0 2 0 SectionRealView 2 1.0 6.4932e-0216.4 0.00e+00 0.0 1.2e+01 4.1e+03 0.0e+00 0 0 1 2 0 0 0 1 2 0 0 PCSetUp 3 1.0 5.6296e-02 1.0 1.84e+05 1.2 7.0e+01 1.4e+03 3.0e+01 0 5 4 5 8 0 5 4 5 9 12 PCSetUpOnBlocks 8 1.0 6.5680e-03 1.1 1.84e+05 1.2 0.0e+00 0.0e+00 7.0e+00 0 5 0 0 2 0 5 0 0 2 102 PCApply 4 1.0 1.0816e-01 1.0 3.45e+06 1.2 9.6e+02 3.8e+02 2.0e+02 1 89 52 17 51 1 89 52 17 61 118 KSPGMRESOrthog 30 1.0 3.6988e-03 1.7 3.22e+05 1.1 0.0e+00 0.0e+00 3.0e+01 0 8 0 0 8 0 8 0 0 9 329 KSPSetup 4 1.0 1.2448e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 1 1.0 1.1271e-01 1.0 3.65e+06 1.2 1.0e+03 3.8e+02 2.1e+02 1 94 54 18 54 1 94 54 18 65 120 MeshDestroy 5 1.0 3.2269e-0236.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DistributeMesh 1 1.0 2.0238e-01 1.1 0.00e+00 0.0 2.4e+01 2.3e+03 0.0e+00 2 0 1 3 0 2 0 1 3 0 0 PartitionCreate 2 1.0 4.0964e-0234.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PartitionClosure 2 1.0 8.7453e-024366.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 DistributeCoords 2 1.0 4.6407e-02 2.4 0.00e+00 0.0 2.4e+01 3.0e+03 0.0e+00 0 0 1 3 0 0 0 1 3 0 0 DistributeLabels 2 1.0 8.7246e-02 3.1 0.00e+00 0.0 1.8e+01 7.6e+02 0.0e+00 0 0 1 1 0 0 0 1 1 0 0 CreateOverlap 2 1.0 2.5038e-02 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 3 0 0 0 0 3 0 0 DistributeMeshByFineMesh 1 1.0 2.0225e+00 1.0 3.18e+05 0.0 2.4e+01 9.5e+03 0.0e+00 17 2 1 11 0 17 2 1 11 0 0 PartitionByFineMesh 1 1.0 1.2465e+0036561.8 3.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 0 CreatCoarseCellToFineCell 1 1.0 1.1892e+0099754.2 3.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 0 ConstructInterpolation 1 1.0 1.7860e-01 1.0 7.53e+04 1.2 3.5e+01 6.3e+02 1.8e+01 2 2 2 1 5 2 2 2 1 6 2 creatMapFromFinePointToCoarseCell 1 1.0 8.4537e-02 1.1 6.63e+04 1.2 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 3 MGSetup Level 1 2 1.0 4.3158e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 2 0 0 0 0 2 0 MGSmooth Level 1 16 1.0 9.6284e-02 1.0 3.11e+06 1.2 7.6e+02 4.1e+02 1.7e+02 1 80 41 15 45 1 80 41 15 54 120 MGResid Level 1 4 1.0 2.4343e-03 1.1 1.24e+05 1.1 4.0e+01 4.1e+02 0.0e+00 0 3 2 1 0 0 3 2 1 0 191 MGInterp Level 1 16 1.0 9.3703e-03 1.0 2.22e+05 1.2 1.6e+02 2.2e+02 1.6e+01 0 6 9 2 4 0 6 9 2 5 87 ------------------ Fande Kong ShenZhen Institutes of Advanced Technology Chinese Academy of Sciences -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59a2.org Tue Apr 12 01:43:50 2011 From: jed at 59a2.org (Jed Brown) Date: Tue, 12 Apr 2011 08:43:50 +0200 Subject: [petsc-users] time spent on each level of the solver for multigrid preconditioner In-Reply-To: References: Message-ID: I think -pc_mg_log does what you want in petsc-dev. $ cd petsc/src/snes/examples/tutorials $ ./ex48 -thi_nlevels 3 -log_summary -pc_mg_log [...] 
MGSetup Level 0 7 1.0 1.2443e-02 1.0 9.84e+04 1.0 0.0e+00 0.0e+00 5.0e+00 4 0 0 0 2 4 0 0 0 4 8 MGSmooth Level 0 78 1.0 9.3937e-04 1.0 1.99e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 212 MGSetup Level 1 7 1.0 4.4374e-03 1.0 9.37e+05 1.0 0.0e+00 0.0e+00 1.5e+01 1 1 0 0 6 1 1 0 0 11 211 MGSmooth Level 1 104 1.0 1.1142e-02 1.0 8.80e+06 1.0 0.0e+00 0.0e+00 1.0e+00 3 14 0 0 0 3 14 0 0 1 789 MGResid Level 1 52 1.0 9.5296e-04 1.0 9.49e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 996 MGInterp Level 1 156 1.0 6.0565e-03 1.0 1.97e+05 1.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 32 MGSetup Level 2 7 1.0 5.4920e-03 1.0 7.14e+06 1.0 0.0e+00 0.0e+00 1.5e+01 2 11 0 0 6 2 11 0 0 11 1299 MGSmooth Level 2 52 1.0 2.6678e-02 1.0 3.61e+07 1.0 0.0e+00 0.0e+00 1.0e+00 8 56 0 0 0 8 56 0 0 1 1353 MGResid Level 2 26 1.0 2.5156e-03 1.0 3.52e+06 1.0 0.0e+00 0.0e+00 0.0e+00 1 6 0 0 0 1 6 0 0 0 1401 MGInterp Level 2 104 1.0 1.4493e-03 1.0 9.06e+05 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 625 -------------- next part -------------- An HTML attachment was scrubbed... URL: From juhaj at iki.fi Tue Apr 12 05:02:47 2011 From: juhaj at iki.fi (Juha =?iso-8859-1?q?J=E4ykk=E4?=) Date: Tue, 12 Apr 2011 10:02:47 +0000 Subject: [petsc-users] TS problem: runge-kutta gives 0 step-length Message-ID: <201104121102.47347.juhaj@iki.fi> Hi list! I have a small problem with running a TS program with -ts_type runge-kutta. It keeps telling me Very small steps: 0.000000 from the very beginning and never gets anywhere. The programs works fine for other TS types (well, at least euler, beuler, cn and gl). I am out of ideas as to why this happens. I even checked the RK source code. Any ideas? Cheers, Juha From knepley at gmail.com Tue Apr 12 05:29:56 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Apr 2011 05:29:56 -0500 Subject: [petsc-users] TS problem: runge-kutta gives 0 step-length In-Reply-To: <201104121102.47347.juhaj@iki.fi> References: <201104121102.47347.juhaj@iki.fi> Message-ID: On Tue, Apr 12, 2011 at 5:02 AM, Juha J?ykk? wrote: > Hi list! > > I have a small problem with running a TS program with -ts_type runge-kutta. > It > keeps telling me > > Very small steps: 0.000000 > > from the very beginning and never gets anywhere. The programs works fine > for > other TS types (well, at least euler, beuler, cn and gl). > > I am out of ideas as to why this happens. I even checked the RK source > code. > Any ideas? > Yes, the debugger to look at what happens when it chooses the new timestep. This is dependent on parameters you pass in (rk->maxerror, rk->p, ts->max_time). Matt > Cheers, > Juha > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
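A hedged sketch of the settings that feed into that step-size choice, written against the 3.1-era TS interface on an already created ts (ierr declared as usual); TSRKSetTolerance() is the RK-specific call of that era that sets the error tolerance the adaptive step selection uses (rk->maxerror), but treat the exact names and values as assumptions to be checked against your PETSc version:

ierr = TSSetInitialTimeStep(ts,0.0,1.0e-3);CHKERRQ(ierr);  /* start time and first step size */
ierr = TSSetDuration(ts,10000,1.0);CHKERRQ(ierr);          /* maximum steps and final time (ts->max_time) */
ierr = TSRKSetTolerance(ts,1.0e-4);CHKERRQ(ierr);          /* error tolerance used by the adaptive RK */
ierr = TSSetFromOptions(ts);CHKERRQ(ierr);                 /* picks up -ts_type runge-kutta and friends */

Running with -start_in_debugger and breaking in the RK step routine, as Matt suggests, shows which of these quantities is driving the computed step to zero.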
URL: From bsmith at mcs.anl.gov Tue Apr 12 08:40:33 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Apr 2011 08:40:33 -0500 Subject: [petsc-users] time spent on each level of the solver for multigrid preconditioner In-Reply-To: References: Message-ID: <74C217CD-9FCE-46A5-9F41-86D27AF96D45@mcs.anl.gov> It is right there: MGSetup Level 1 2 1.0 4.3158e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 2 0 0 0 0 2 0 MGSmooth Level 1 16 1.0 9.6284e-02 1.0 3.11e+06 1.2 7.6e+02 4.1e+02 1.7e+02 1 80 41 15 45 1 80 41 15 54 120 MGResid Level 1 4 1.0 2.4343e-03 1.1 1.24e+05 1.1 4.0e+01 4.1e+02 0.0e+00 0 3 2 1 0 0 3 2 1 0 191 MGInterp Level 1 16 1.0 9.3703e-03 1.0 2.22e+05 1.2 1.6e+02 2.2e+02 1.6e+01 0 6 9 2 4 0 6 9 2 5 87 perhaps you are only running with one level and hence only getting one level or information. Or perhaps there is a bug/issue and we don't report for the coarsest level. If it is missing a level please send a bug report to petsc-maint at mcs.anl.gov using a PETSc example for example src/ksp/ksp/examples/tutorials/ex22.c Barry On Apr 12, 2011, at 1:22 AM, fdkong wrote: > Hi every one > I uses multigrid preconditioner for my application. Running the code with "Options Database Keys" -pc_mg_log, but can not get time spent on each level of the solver. I want to know time spent on each level respectively. > > > VecMDot 30 1.0 2.8007e-03 2.5 1.61e+05 1.1 0.0e+00 0.0e+00 3.0e+01 0 4 0 0 8 0 4 0 0 9 217 > VecNorm 48 1.0 2.3482e-03 2.1 1.07e+05 1.1 0.0e+00 0.0e+00 4.8e+01 0 3 0 0 12 0 3 0 0 15 173 > VecScale 39 1.0 3.2115e-04 1.2 4.36e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 513 > VecCopy 17 1.0 1.6999e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 125 1.0 5.7936e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 17 1.0 2.6035e-04 1.6 3.80e+04 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 552 > VecAYPX 4 1.0 8.7976e-05 1.3 4.47e+03 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 192 > VecMAXPY 38 1.0 7.0500e-04 1.1 2.28e+05 1.1 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 0 6 0 0 0 1223 > VecAssemblyBegin 3 1.0 4.5705e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 9.0e+00 0 0 0 0 2 0 0 0 0 3 0 > VecAssemblyEnd 3 1.0 4.1962e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 147 1.0 1.8594e-03 1.1 0.00e+00 0.0 1.0e+03 3.8e+02 0.0e+00 0 0 55 19 0 0 0 55 19 0 0 > VecScatterEnd 147 1.0 1.4102e-03 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecNormalize 34 1.0 1.9138e-03 1.5 1.14e+05 1.1 0.0e+00 0.0e+00 3.4e+01 0 3 0 0 9 0 3 0 0 11 225 > MatMult 47 1.0 2.6152e-02 1.1 1.34e+06 1.1 4.7e+02 4.0e+02 0.0e+00 0 35 25 9 0 0 35 25 9 0 191 > MatMultAdd 4 1.0 2.1584e-03 1.1 5.67e+04 1.2 4.0e+01 2.2e+02 0.0e+00 0 1 2 0 0 0 1 2 0 0 96 > MatMultTranspose 8 1.0 4.4453e-03 1.0 1.13e+05 1.2 8.0e+01 2.2e+02 1.6e+01 0 3 4 1 4 0 3 4 1 5 94 > MatSolve 50 1.0 3.1454e-02 1.0 1.41e+06 1.2 0.0e+00 0.0e+00 0.0e+00 0 36 0 0 0 0 36 0 0 0 164 > MatLUFactorSym 1 1.0 7.4482e-04 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 2 1.0 4.4755e-02 1.0 1.84e+05 1.2 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 15 > MatILUFactorSym 1 1.0 1.3239e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 1 0 0 0 0 1 0 > MatAssemblyBegin 4 1.0 2.4263e-0243.8 0.00e+00 0.0 4.5e+01 3.1e+03 6.0e+00 0 0 2 7 2 0 0 2 7 2 0 > MatAssemblyEnd 4 1.0 7.4661e-03 1.1 0.00e+00 0.0 6.0e+01 7.7e+01 2.8e+01 0 0 3 0 7 0 0 3 0 9 0 > MatGetRowIJ 1 1.0 3.0994e-05 1.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 
0 0 > MatGetSubMatrice 1 1.0 3.2248e-03 1.1 0.00e+00 0.0 5.0e+01 1.9e+03 5.0e+00 0 0 3 5 1 0 0 3 5 2 0 > MatGetOrdering 1 1.0 1.2500e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 > MatIncreaseOvrlp 1 1.0 1.0622e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 1 0 0 0 0 1 0 > MatZeroEntries 2 1.0 2.8491e-04 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatView 6 1.0 1.5023e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 1 0 0 0 0 1 0 > MeshView 6 1.0 1.0528e+00 1.0 0.00e+00 0.0 9.0e+01 2.9e+03 0.0e+00 9 0 5 12 0 9 0 5 12 0 0 > MeshGetGlobalScatter 3 1.0 1.6958e-02 1.0 0.00e+00 0.0 3.0e+01 8.8e+01 1.8e+01 0 0 2 0 5 0 0 2 0 6 0 > MeshAssembleMatrix 1572 1.0 3.6974e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MeshUpdateOperator 2131 1.0 8.2520e-02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 1 0 0 0 2 1 0 0 0 2 0 > SectionRealView 2 1.0 6.4932e-0216.4 0.00e+00 0.0 1.2e+01 4.1e+03 0.0e+00 0 0 1 2 0 0 0 1 2 0 0 > PCSetUp 3 1.0 5.6296e-02 1.0 1.84e+05 1.2 7.0e+01 1.4e+03 3.0e+01 0 5 4 5 8 0 5 4 5 9 12 > PCSetUpOnBlocks 8 1.0 6.5680e-03 1.1 1.84e+05 1.2 0.0e+00 0.0e+00 7.0e+00 0 5 0 0 2 0 5 0 0 2 102 > PCApply 4 1.0 1.0816e-01 1.0 3.45e+06 1.2 9.6e+02 3.8e+02 2.0e+02 1 89 52 17 51 1 89 52 17 61 118 > KSPGMRESOrthog 30 1.0 3.6988e-03 1.7 3.22e+05 1.1 0.0e+00 0.0e+00 3.0e+01 0 8 0 0 8 0 8 0 0 9 329 > KSPSetup 4 1.0 1.2448e-03 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 1 1.0 1.1271e-01 1.0 3.65e+06 1.2 1.0e+03 3.8e+02 2.1e+02 1 94 54 18 54 1 94 54 18 65 120 > MeshDestroy 5 1.0 3.2269e-0236.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DistributeMesh 1 1.0 2.0238e-01 1.1 0.00e+00 0.0 2.4e+01 2.3e+03 0.0e+00 2 0 1 3 0 2 0 1 3 0 0 > PartitionCreate 2 1.0 4.0964e-0234.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > PartitionClosure 2 1.0 8.7453e-024366.7 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > DistributeCoords 2 1.0 4.6407e-02 2.4 0.00e+00 0.0 2.4e+01 3.0e+03 0.0e+00 0 0 1 3 0 0 0 1 3 0 0 > DistributeLabels 2 1.0 8.7246e-02 3.1 0.00e+00 0.0 1.8e+01 7.6e+02 0.0e+00 0 0 1 1 0 0 0 1 1 0 0 > CreateOverlap 2 1.0 2.5038e-02 3.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 3 0 0 0 0 3 0 0 > DistributeMeshByFineMesh 1 1.0 2.0225e+00 1.0 3.18e+05 0.0 2.4e+01 9.5e+03 0.0e+00 17 2 1 11 0 17 2 1 11 0 0 > PartitionByFineMesh 1 1.0 1.2465e+0036561.8 3.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 0 > CreatCoarseCellToFineCell 1 1.0 1.1892e+0099754.2 3.18e+05 0.0 0.0e+00 0.0e+00 0.0e+00 3 2 0 0 0 3 2 0 0 0 0 > ConstructInterpolation 1 1.0 1.7860e-01 1.0 7.53e+04 1.2 3.5e+01 6.3e+02 1.8e+01 2 2 2 1 5 2 2 2 1 6 2 > creatMapFromFinePointToCoarseCell 1 1.0 8.4537e-02 1.1 6.63e+04 1.2 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 3 > MGSetup Level 1 2 1.0 4.3158e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00 0 0 0 0 2 0 0 0 0 2 0 > MGSmooth Level 1 16 1.0 9.6284e-02 1.0 3.11e+06 1.2 7.6e+02 4.1e+02 1.7e+02 1 80 41 15 45 1 80 41 15 54 120 > MGResid Level 1 4 1.0 2.4343e-03 1.1 1.24e+05 1.1 4.0e+01 4.1e+02 0.0e+00 0 3 2 1 0 0 3 2 1 0 191 > MGInterp Level 1 16 1.0 9.3703e-03 1.0 2.22e+05 1.2 1.6e+02 2.2e+02 1.6e+01 0 6 9 2 4 0 6 9 2 5 87 > > ------------------ > Fande Kong > ShenZhen Institutes of Advanced Technology > Chinese Academy of Sciences > From jed at 59A2.org Tue Apr 12 08:45:23 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 12 Apr 2011 15:45:23 +0200 Subject: [petsc-users] time spent on each level of the solver for multigrid preconditioner In-Reply-To: 
<74C217CD-9FCE-46A5-9F41-86D27AF96D45@mcs.anl.gov> References: <74C217CD-9FCE-46A5-9F41-86D27AF96D45@mcs.anl.gov> Message-ID: On Tue, Apr 12, 2011 at 15:40, Barry Smith wrote: > perhaps you are only running with one level and hence only getting one > level or information. Or perhaps there is a bug/issue and we don't report > for the coarsest level. If it is missing a level please send a bug report to > petsc-maint at mcs.anl.gov using a PETSc example > -pc_mg_log was totally broken in 3.1, I fixed it here changeset: 17148:1ab456826813 user: Jed Brown date: Fri Sep 24 11:39:46 2010 +0200 files: src/ksp/pc/impls/mg/fmg.c src/ksp/pc/impls/mg/mg.c src/ksp/pc/impls/mg/mgimpl.h src/ksp/pc/impls/mg/smg.c description: Make PC_MG logging (-pc_mg_log) log each level separately. The old code clearly intended to do this, but the events were in PC_MG, not PC_MG_Levels so all but the finest was leaked (and time from all levels was attributed to the finest level). http://petsc.cs.iit.edu/petsc/petsc-dev/rev/1ab456826813 -------------- next part -------------- An HTML attachment was scrubbed... URL: From khalid_eee at yahoo.com Tue Apr 12 20:33:23 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Tue, 12 Apr 2011 18:33:23 -0700 (PDT) Subject: [petsc-users] DMMG KSP Solve time with random initialization Message-ID: <831692.28168.qm@web112615.mail.gq1.yahoo.com> Hi, I use the DMMG to solve the Ax=b. At the initialization part, I either assign a predetermined value to the vectors or a random value as shown in the code below. With the same system size, and no of processors, the random initialization takes significantly more time than the predetermined value. I am attaching the laog summary in both cases. Could you please suggest why the time requirement is so huge (specially KSP Solve) in the random initialization and how I can improve it ? Thanks in advance. 
###Code without random value assignment to vectors: u_localptr[k][j][i] = 0.7e-0; v_localptr[k][j][i] = 0.81e-0; w_localptr[k][j][i] = -54e-1; ###Code with random value assignment to vectors: /* PetscRandomCreate(PETSC_COMM_WORLD,&pRandom); PetscRandomSetFromOptions(pRandom); PetscRandomSetType(pRandom,PETSCRAND); PetscRandomSetInterval(pRandom,0.1e-8,1.0e-8); VecSetRandom(u,pRandom); PetscRandomSetInterval(pRandom,-1.e-8,-0.1e-8); VecSetRandom(v,pRandom); //VecSetRandom(w,pRandom); PetscRandomDestroy(pRandom);*/ ###log_summary without random value assignment to vectors: Max Max/Min Avg Total Time (sec): 6.210e-01 1.00071 6.208e-01 Objects: 1.060e+02 1.00000 1.060e+02 Flops: 5.325e+04 1.00000 5.325e+04 1.065e+05 Flops/sec: 8.581e+04 1.00071 8.578e+04 1.716e+05 Memory: 1.412e+06 1.00582 2.815e+06 MPI Messages: 7.600e+01 1.00000 7.600e+01 1.520e+02 MPI Message Lengths: 3.078e+05 1.00000 4.051e+03 6.157e+05 MPI Reductions: 1.250e+02 1.00000 VecView 16 1.0 1.9195e-01 1.0 0.00e+00 0.0 3.6e+01 8.2e+03 7.0e+00 30 0 24 48 6 30 0 24 48 8 0 VecNorm 4 1.0 5.9933e-05 1.4 1.64e+04 1.0 0.0e+00 0.0e+00 4.0e+00 0 31 0 0 3 0 31 0 0 4 547 VecScale 9 1.0 4.0106e-06 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecCopy 12 1.0 5.3243e-05 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 7 1.0 3.0005e-05 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 9 1.0 1.3322e-04 1.3 3.69e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 69 0 0 0 0 69 0 0 0 553 VecScatterBegin 53 1.0 1.1030e-03 1.1 0.00e+00 0.0 7.4e+01 4.1e+03 0.0e+00 0 0 49 49 0 0 0 49 49 0 0 VecScatterEnd 53 1.0 3.6766e-0310.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 2.6521e-04 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 3 0 0 0 0 4 0 MatAssemblyEnd 2 1.0 1.0279e-03 1.0 0.00e+00 0.0 4.0e+00 1.0e+03 1.1e+01 0 0 3 1 9 0 0 3 1 12 0 KSPSetup 2 1.0 5.8801e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 4 1.0 4.1139e-03 1.0 1.64e+04 1.0 0.0e+00 0.0e+00 1.0e+01 1 31 0 0 8 1 31 0 0 11 8 PCSetUp 1 1.0 3.5669e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 1 0 0 0 2 1 0 0 0 2 0 ----------------------------------------------------------------------------------------------------------------------- ###log_summary with random initialization to vectors: Time (sec): 4.456e+01 1.00002 4.456e+01 Objects: 1.690e+02 1.00000 1.690e+02 Flops: 1.086e+10 1.00000 1.086e+10 2.172e+10 Flops/sec: 2.437e+08 1.00002 2.437e+08 4.875e+08 Memory: 2.709e+06 1.00302 5.410e+06 MPI Messages: 8.141e+04 1.00000 8.141e+04 1.628e+05 MPI Message Lengths: 3.335e+08 1.00000 4.096e+03 6.669e+08 MPI Reductions: 4.028e+05 1.00000 VecView 16 1.0 2.0461e-01 1.0 0.00e+00 0.0 3.6e+01 8.2e+03 7.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecMDot 80000 1.0 2.7737e+00 1.0 2.70e+09 1.0 0.0e+00 0.0e+00 8.0e+04 6 25 0 0 20 6 25 0 0 20 1948 VecNorm 121336 1.0 3.8669e+00 1.0 4.97e+08 1.0 0.0e+00 0.0e+00 1.2e+05 9 5 0 0 30 9 5 0 0 30 257 VecScale 121345 1.0 8.6525e-01 1.0 2.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00 2 2 0 0 0 2 2 0 0 0 574 VecCopy 40012 1.0 9.5324e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 201343 1.0 3.3050e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 VecAXPY 41345 1.0 4.0391e-01 1.0 1.69e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 2 0 0 0 1 2 0 0 0 839 VecWAXPY 1336 1.0 1.5288e-02 1.0 2.74e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 358 VecMAXPY 121336 1.0 4.6148e+00 1.0 3.03e+09 1.0 0.0e+00 0.0e+00 0.0e+00 10 28 0 0 0 10 28 0 0 0 1313 VecScatterBegin 
81389 1.0 6.2763e-01 1.0 0.00e+00 0.0 1.6e+05 4.1e+03 0.0e+00 1 0100100 0 1 0100100 0 0 VecScatterEnd 81389 1.0 7.0998e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 VecSetRandom 2 1.0 2.7497e-04 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecNormalize 80000 1.0 3.2196e+00 1.0 4.92e+08 1.0 0.0e+00 0.0e+00 8.0e+04 7 5 0 0 20 7 5 0 0 20 305 MatMult 81336 1.0 1.8218e+01 1.0 2.17e+09 1.0 1.6e+05 4.1e+03 0.0e+00 41 20100100 0 41 20100100 0 238 MatSolve 80000 1.0 9.1123e+00 1.0 2.05e+09 1.0 0.0e+00 0.0e+00 0.0e+00 20 19 0 0 0 20 19 0 0 0 450 MatLUFactorNum 1 1.0 7.6804e-04 1.0 3.74e+04 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 97 MatILUFactorSym 1 1.0 7.0408e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 3.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyBegin 2 1.0 5.7212e-04 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 4.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 2 1.0 1.1452e-03 1.0 0.00e+00 0.0 4.0e+00 1.0e+03 1.1e+01 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 1 1.0 1.0453e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetOrdering 1 1.0 5.4405e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 80000 1.0 6.8689e+00 1.0 5.40e+09 1.0 0.0e+00 0.0e+00 8.0e+04 15 50 0 0 20 15 50 0 0 20 1573 KSPSetup 3 1.0 5.9501e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 4 1.0 4.3836e+01 1.0 1.09e+10 1.0 1.6e+05 4.1e+03 4.0e+05 98100100100100 98100100100100 496 PCSetUp 2 1.0 5.8231e-03 1.0 3.74e+04 1.0 0.0e+00 0.0e+00 7.0e+00 0 0 0 0 0 0 0 0 0 0 13 PCSetUpOnBlocks 40000 1.0 2.4762e-02 1.0 3.74e+04 1.0 0.0e+00 0.0e+00 5.0e+00 0 0 0 0 0 0 0 0 0 0 3 PCApply 40000 1.0 2.6536e+01 1.0 4.26e+09 1.0 8.0e+04 4.1e+03 3.2e+05 60 39 49 49 79 60 39 49 49 79 321 ------------------------------------------------------------------------------------------------------------------------ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 13 03:43:26 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 10:43:26 +0200 Subject: [petsc-users] DMMG KSP Solve time with random initialization In-Reply-To: <831692.28168.qm@web112615.mail.gq1.yahoo.com> References: <831692.28168.qm@web112615.mail.gq1.yahoo.com> Message-ID: On Wed, Apr 13, 2011 at 03:33, khalid ashraf wrote: > I use the DMMG to solve the Ax=b. At the initialization part, I either > assign a predetermined value to the vectors or a random value as shown in > the code below. With the same system size, and no of processors, the random > initialization takes significantly more time than the predetermined value. I > am attaching the laog summary in both cases. Could you please suggest why > the time requirement is so huge (specially KSP Solve) in the random > initialization and how I can improve it ? The solve with constant initial state does zero iterations, the solve with random initial state does not converge. You haven't explained what you are solving or what calling sequence you are using, so I'm just going to take a wild guess what's happening. Your matrix is singular, probably because you didn't include boundary conditions, and the right hand side vector is zero. The constant solution is in the null space, therefore the residual is zero to begin with so no iterations are ever done. The null space never gets projected out of the random vector, therefore nothing ever converges and it takes a long time. -------------- next part -------------- An HTML attachment was scrubbed... 
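In code, attaching the null space Jed describes takes only a few lines. A minimal sketch, assuming the operator really is singular with only the constant vector in its null space; ksp is a placeholder for whatever KSP object performs the solve (with DMMG the same thing can be done through DMMGSetNullSpace()), and the calls are written with the petsc-3.1 signatures:

   MatNullSpace nullsp;
   ierr = MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_TRUE,0,PETSC_NULL,&nullsp);CHKERRQ(ierr);
   ierr = KSPSetNullSpace(ksp,nullsp);CHKERRQ(ierr);   /* later releases attach the null space to the Mat instead */
   ierr = MatNullSpaceDestroy(nullsp);CHKERRQ(ierr);   /* later releases take &nullsp */

With the constant component projected out at every iteration, the solve from a random initial guess can actually converge instead of iterating indefinitely.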
URL: From domenico.borzacchiello at univ-st-etienne.fr Wed Apr 13 04:08:25 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Wed, 13 Apr 2011 11:08:25 +0200 (CEST) Subject: [petsc-users] Matrix Sparsity Message-ID: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> Hi, I'm using the DMMG interface for my code so I'm not directly calling the matrix assembly routines. I tried to retrieve the matrices through the commands + . DMMGGetSNES(dmmg); . SNESGetKSP(snes,&ksp); . KSPGetOperators(ksp,&Amat,&Pmat,&flag); + ,saved them (in Matlab format) and I noticed that a large number of zeros entries were also saved. If I run with -info i get [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 500 How can I make sure that the correct sparsity patter is formed while still using the DMMG interface? The same thing happens either if I use the FD approximation of the jacobian or a specific FormJacobianFunction routine. Thank you, Domenico. From jed at 59A2.org Wed Apr 13 04:23:22 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 11:23:22 +0200 Subject: [petsc-users] Matrix Sparsity In-Reply-To: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> References: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Wed, Apr 13, 2011 at 11:08, wrote: > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 500 > What discretization are you using? > > How can I make sure that the correct sparsity patter is formed while still > using the DMMG interface? The same thing happens either if I use the FD > approximation of the jacobian or a specific FormJacobianFunction routine. > Did you the stencil width and shape (box or star) correctly? -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Wed Apr 13 05:02:31 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Wed, 13 Apr 2011 12:02:31 +0200 (CEST) Subject: [petsc-users] Matrix Sparsity In-Reply-To: References: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> Message-ID: > On Wed, Apr 13, 2011 at 11:08, > wrote: > >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 500 >> > > What discretization are you using? > > I'm still testing the code over small grids so I'm using a 9x9x9 grid (1 lev) at present time. >> >> How can I make sure that the correct sparsity patter is formed while >> still >> using the DMMG interface? The same thing happens either if I use the FD >> approximation of the jacobian or a specific FormJacobianFunction >> routine. >> > > Did you the stencil width and shape (box or star) correctly? > I checked it and noticed that I was using an unnecessary stencil width of 2 (Box) and having 4 DOFs gave me a 4x5^3 = 500 non zero entries per row. But even if I set it to 1 and Star it'll result in 4x7 = 28 nnz while I need 17 at most. This means my matrices will always be double the needed size (roughly). How can I control this? 
From jed at 59A2.org Wed Apr 13 05:11:39 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 12:11:39 +0200 Subject: [petsc-users] Matrix Sparsity In-Reply-To: References: <405131c721a99b01dd7900c2290691a5.squirrel@arcon.univ-st-etienne.fr> Message-ID: On Wed, Apr 13, 2011 at 12:02, wrote: > I checked it and noticed that I was using an unnecessary stencil width of > 2 (Box) and having 4 DOFs gave me a 4x5^3 = 500 non zero entries per row. > But even if I set it to 1 and Star it'll result in 4x7 = 28 nnz while I > need 17 at most. This means my matrices will always be double the needed > size (roughly). How can I control this? > You can use http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/DM/DMDASetBlockFills.html (named DASetBlockFills in petsc-3.1) with the AIJ matrix format. But the storage costs are quite similar if you store those extra few nonzeros and use the BAIJ format, and then you benefit from faster sparse matrix kernels so the actual run time could be less than using the less regular nonzero structure in AIJ. Also, BAIJ smooths all the components together which makes the smoother stronger, thus you may converge in fewer iterations. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaurish108 at gmail.com Wed Apr 13 10:32:41 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Wed, 13 Apr 2011 11:32:41 -0400 Subject: [petsc-users] format specifiers Message-ID: What are the format specifiers for data types PetscScalar and PetscScalar? Gaurish -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 13 10:40:17 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 17:40:17 +0200 Subject: [petsc-users] format specifiers In-Reply-To: References: Message-ID: On Wed, Apr 13, 2011 at 17:32, Gaurish Telang wrote: > What are the format specifiers for data types PetscScalar and PetscScalar? > Unfortunately there is no specifier. You can use PetscPrintf(comm,"%G + %Gi\n",PetscRealPart(v),PetscImaginaryPart(v)); -------------- next part -------------- An HTML attachment was scrubbed... URL: From gaurish108 at gmail.com Wed Apr 13 12:24:10 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Wed, 13 Apr 2011 13:24:10 -0400 Subject: [petsc-users] format specifiers In-Reply-To: References: Message-ID: Hmm, %f seems to be working fine for these data types. Is there any harm in using it though? On Wed, Apr 13, 2011 at 11:32 AM, Gaurish Telang wrote: > What are the format specifiers for data types PetscScalar and PetscScalar? > > > Gaurish > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 13 12:59:13 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 13 Apr 2011 19:59:13 +0200 Subject: [petsc-users] format specifiers In-Reply-To: References: Message-ID: On Wed, Apr 13, 2011 at 19:24, Gaurish Telang wrote: > Hmm, %f seems to be working fine for these data types. Is there any harm in > using it though? If you are referring to just using plain "%f" (or any single % specifier) to show a PetscScalar, that's a matter of whether you have configured so that PetscScalar is real or compex valued. It will not work when you use complex. If you meant using %f instead of %G in my example above, that is a matter of precision: PetscPrintf converts %G (but not currently variants like %12.5G) to a representation that works for any choice of precision. 
For example, '%g' for double and float [1], %Lg for long double, %Qe for __float128. Similarly, %[1-9]D is converted to %d or %lld depending the use of 64-bit indices. If you only ever use double and native ints (usually 32-bit), then you don't have to worry about these conversions and you can use whatever you want. [1] The standard specifies that float is promoted to double when calling a function with no prototype or a variadic function. The same rule promotes char and short int to int. -------------- next part -------------- An HTML attachment was scrubbed... URL: From gshy2014 at gmail.com Wed Apr 13 22:48:23 2011 From: gshy2014 at gmail.com (Shiyuan) Date: Wed, 13 Apr 2011 22:48:23 -0500 Subject: [petsc-users] sparse matrix addition Message-ID: Hi, I have two matrices A, B of different nonzero-pattern. Their size is about 60k*60k. I notices that MATAXPY() is extremely slow. However, In matlab, addition of the same two matrices is done in no time. Why is so? Any strategies to speed up the sparse matrices additions? Thanks Shiyuan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 14 03:08:45 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 10:08:45 +0200 Subject: [petsc-users] sparse matrix addition In-Reply-To: References: Message-ID: On Thu, Apr 14, 2011 at 05:48, Shiyuan wrote: > I have two matrices A, B of different nonzero-pattern. Their size is about > 60k*60k. I notices that MATAXPY() is extremely slow. What matrix format and what are you passing for MatStructure? With DIFFERENT_NONZERO_STRUCTURE, petsc-3.1 was not doing preallocation. This is fixed in petsc-dev. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Apr 14 07:46:20 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Apr 2011 07:46:20 -0500 Subject: [petsc-users] sparse matrix addition In-Reply-To: References: Message-ID: On Apr 14, 2011, at 3:08 AM, Jed Brown wrote: > On Thu, Apr 14, 2011 at 05:48, Shiyuan wrote: > I have two matrices A, B of different nonzero-pattern. Their size is about 60k*60k. I notices that MATAXPY() is extremely slow. > > What matrix format and what are you passing for MatStructure? With DIFFERENT_NONZERO_STRUCTURE, petsc-3.1 was not doing preallocation. This is fixed in petsc-dev. In other words, switch to the development version of PETSc http://www.mcs.anl.gov/petsc/petsc-as/developers/index.html and it should be much faster. If it is not much faster than please send mail to petsc-maint at mcs.anl.gov with details of the matrix type AIJ? and ideally sample code and we'll see why it is so slow. Barry From thomas.witkowski at tu-dresden.de Thu Apr 14 08:18:56 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Thu, 14 Apr 2011 15:18:56 +0200 Subject: [petsc-users] FETI-DP Message-ID: <4DA6F440.4000204@tu-dresden.de> Has anybody of you implemented the FETI-DP method in PETSc? I think about to do this for my FEM code, but first I want to evaluate the effort of the implementation. So if some of you could give some comments on it or if there is some code I could reuse, I would be thankful for a short answer! 
Thomas From jed at 59A2.org Thu Apr 14 09:19:14 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 16:19:14 +0200 Subject: [petsc-users] FETI-DP In-Reply-To: <4DA6F440.4000204@tu-dresden.de> References: <4DA6F440.4000204@tu-dresden.de> Message-ID: On Thu, Apr 14, 2011 at 15:18, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Has anybody of you implemented the FETI-DP method in PETSc? I think about > to do this for my FEM code, but first I want to evaluate the effort of the > implementation. There are a few implementations out there. Probably most notable is Axel Klawonn and Oliver Rheinbach's implementation which has been scaled up to very large problems and computers. My understanding is that Xuemin Tu did some work on BDDC (equivalent to FETI-DP) using PETSc. I am not aware of anyone releasing a working FETI-DP implementation using PETSc, but of course you're welcome to ask these people if they would share code with you. What sort of problems do you want it for (physics and mesh)? How are you currently assembling your systems? A fully general FETI-DP implementation is a lot of work. For a specific class of problems and variant of FETI-DP, it will still take some effort, but should not be too much. There was a start to a FETI-DP implementation in PETSc quite a while ago, but it died due to bitrot and different ideas of how we would like to implement. You can get that code from mercurial: http://petsc.cs.iit.edu/petsc/petsc-dev/rev/021f379b5eea The fundamental ingredient of these methods is a "partially assembled" matrix. For a library implementation, the challenges are 1. How does the user provide the information necessary to decide what the coarse space looks like? (It's different for scalar problems, compressible elasticity, and Stokes, and tricky to do with no geometric information from the user.) The coefficient structure in the problem matters a lot when deciding which coarse basis functions to use, see http://dx.doi.org/10.1016/j.cma.2006.03.023 2. How do you handle primal basis functions with large support (e.g. rigid body modes of a face)? Two choices here: http://www.cs.nyu.edu/cs/faculty/widlund/FETI-DP-elasticity_TR.pdf . 3. How do you make it easy for the user to provide the required matrix? Ideally, the user would just use plain MatSetValuesLocal() and run with -mat_type partially-assembled -pc_type fetidp instead of, say -mat_type baij -pc_type asm. It should work for multiple subdomains per process and subdomains spanning multiple processes. This can now be done by implementing MatGetLocalSubMatrix(). The local blocks of the partially assembled system should be able to use different formats (e.g. SBAIJ). 4. How do you handle more than two levels? This is very important to use more than about 1000 subdomains in 3D because the coarse problem just gets too big (unless the coarse problem happens to be well-conditioned enough that you can use algebraic multigrid). I've wanted to implement FETI-DP in PETSc for almost two years, but it's never been a high priority. I think I now know how to get enough flexibility to make it worthwhile to me. I'd be happy to discuss implementation issues with you. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From desire.nuentsa_wakam at inria.fr Thu Apr 14 09:39:43 2011 From: desire.nuentsa_wakam at inria.fr (Desire NUENTSA WAKAM) Date: Thu, 14 Apr 2011 16:39:43 +0200 Subject: [petsc-users] -info filename Message-ID: <4DA7072F.6020102@inria.fr> Hi, help on -info says : *-info : print informative messages about the calculations* but the optional filename expected is actually a logical value. Is this a known behaviour ?? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 14 09:51:09 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 16:51:09 +0200 Subject: [petsc-users] -info filename In-Reply-To: <4DA7072F.6020102@inria.fr> References: <4DA7072F.6020102@inria.fr> Message-ID: On Thu, Apr 14, 2011 at 16:39, Desire NUENTSA WAKAM < desire.nuentsa_wakam at inria.fr> wrote: > *-info : print informative messages about the > calculations* > but the optional filename expected is actually a logical value. > Is this a known behaviour ?? $ cd petsc-3.1/src/ksp/ksp/examples/tutorials/ $ make ex2 $ ./ex2 -info info.log $ wc info.log.0 49 355 3240 info.log.0 What do you expect? -------------- next part -------------- An HTML attachment was scrubbed... URL: From desire.nuentsa_wakam at inria.fr Thu Apr 14 10:40:33 2011 From: desire.nuentsa_wakam at inria.fr (Desire NUENTSA WAKAM) Date: Thu, 14 Apr 2011 17:40:33 +0200 Subject: [petsc-users] -info filename In-Reply-To: References: <4DA7072F.6020102@inria.fr> Message-ID: <4DA71571.8070102@inria.fr> Sorry, it has surely been corrected in current releases. I have this in 3.1.p5 %./ex2 -info info.log [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Unknown logical value: info.log! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 5 On 04/14/2011 04:51 PM, Jed Brown wrote: > On Thu, Apr 14, 2011 at 16:39, Desire NUENTSA WAKAM > > > wrote: > > *-info : print informative messages about the > calculations* > but the optional filename expected is actually a logical value. > Is this a known behaviour ?? > > > $ cd petsc-3.1/src/ksp/ksp/examples/tutorials/ > $ make ex2 > $ ./ex2 -info info.log > $ wc info.log.0 > 49 355 3240 info.log.0 > > What do you expect? -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Apr 14 10:56:00 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 14 Apr 2011 10:56:00 -0500 (CDT) Subject: [petsc-users] -info filename In-Reply-To: <4DA71571.8070102@inria.fr> References: <4DA7072F.6020102@inria.fr> <4DA71571.8070102@inria.fr> Message-ID: yes - this is fixed in one of the post- 3.1.p5 patches [so upgrading to 3.1.p8 should get rid of this problem] satish On Thu, 14 Apr 2011, Desire NUENTSA WAKAM wrote: > Sorry, it has surely been corrected in current releases. > I have this in 3.1.p5 > %./ex2 -info info.log > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Unknown logical value: info.log! 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 5 > > > On 04/14/2011 04:51 PM, Jed Brown wrote: > > On Thu, Apr 14, 2011 at 16:39, Desire NUENTSA WAKAM > > > > > wrote: > > > > *-info : print informative messages about the > > calculations* > > but the optional filename expected is actually a logical value. > > Is this a known behaviour ?? > > > > > > $ cd petsc-3.1/src/ksp/ksp/examples/tutorials/ > > $ make ex2 > > $ ./ex2 -info info.log > > $ wc info.log.0 > > 49 355 3240 info.log.0 > > > > What do you expect? > From f.denner09 at imperial.ac.uk Thu Apr 14 12:20:48 2011 From: f.denner09 at imperial.ac.uk (Denner, Fabian) Date: Thu, 14 Apr 2011 18:20:48 +0100 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: Hi, I have a question concerning pre-conditioner/solver pairs for CFD. I'm using the BiCGStab solver with a Jacobi pre-conditioner at present to perform parallel simulations of fluid flow on unstructured grids. It works, however, for large meshes (>100k elements) the solver doesn't scale very well in terms of necessary iterations to reach a certain tolerance. Does anybody have experience on which pre-conditioner works best for parallel CFD simulations using the BiCGStab solver? How is the convergence and stability of the multigrid solver compared to BiCGStab? Best regards, Fabian From jed at 59A2.org Thu Apr 14 12:28:54 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 19:28:54 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: On Thu, Apr 14, 2011 at 19:20, Denner, Fabian wrote: > I have a question concerning pre-conditioner/solver pairs for CFD. I'm > using the BiCGStab solver with a Jacobi pre-conditioner at present to > perform parallel simulations of fluid flow on unstructured grids. > There are lots of methods for CFD. Maybe you could be more specific about what you're solving (laminar, RANS, LES, DNS; compressible?; fully implicit, v-p split implicit, explicit v/implicit p). The mesh quality is also relevant. Do you have aspect ratio 10^6 elements as for wall-resolved LES? > It works, however, for large meshes (>100k elements) the solver doesn't > scale very well in terms of necessary iterations to reach a certain > tolerance. > Does anybody have experience on which pre-conditioner works best for > parallel CFD simulations using the BiCGStab solver? How is the convergence > and stability of the multigrid solver compared to BiCGStab? > 1/(\Delta x) iterations versus 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.denner09 at imperial.ac.uk Thu Apr 14 13:05:21 2011 From: f.denner09 at imperial.ac.uk (Denner, Fabian) Date: Thu, 14 Apr 2011 19:05:21 +0100 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: , Message-ID: I solve fully implicit, incompressible DNS of low/moderate Reynolds number (Re = 1 - 1000) flows. The hexahedral meshes to test the code have an aspect ration of 1, the tetrahedral meshes are Voronoi-based, do not have high aspect ratios and a good quality. 
________________________________________ From: five9a2 at gmail.com [five9a2 at gmail.com] On Behalf Of Jed Brown [jed at 59A2.org] Sent: 14 April 2011 18:28 To: PETSc users list Cc: Denner, Fabian Subject: Re: [petsc-users] Pre-conditioner for parallel CFD simulation On Thu, Apr 14, 2011 at 19:20, Denner, Fabian > wrote: I have a question concerning pre-conditioner/solver pairs for CFD. I'm using the BiCGStab solver with a Jacobi pre-conditioner at present to perform parallel simulations of fluid flow on unstructured grids. There are lots of methods for CFD. Maybe you could be more specific about what you're solving (laminar, RANS, LES, DNS; compressible?; fully implicit, v-p split implicit, explicit v/implicit p). The mesh quality is also relevant. Do you have aspect ratio 10^6 elements as for wall-resolved LES? It works, however, for large meshes (>100k elements) the solver doesn't scale very well in terms of necessary iterations to reach a certain tolerance. Does anybody have experience on which pre-conditioner works best for parallel CFD simulations using the BiCGStab solver? How is the convergence and stability of the multigrid solver compared to BiCGStab? 1/(\Delta x) iterations versus 1 From jed at 59A2.org Thu Apr 14 13:16:19 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 20:16:19 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: On Thu, Apr 14, 2011 at 20:05, Denner, Fabian wrote: > I solve fully implicit, incompressible DNS of low/moderate Reynolds number > (Re = 1 - 1000) flows. The hexahedral meshes to test the code have an aspect > ration of 1, the tetrahedral meshes are Voronoi-based, do not have high > aspect ratios and a good quality. Finite element? Inf-sup stable or stabilized? Continuous or discontinuous pressure? This email from last week is relevant http://lists.mcs.anl.gov/pipermail/petsc-users/2011-April/008475.html Coupled multigrid takes some work, you can do algebraic multigrid with PCFieldSplit with relatively little effort. What CFL are you running at? -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.denner09 at imperial.ac.uk Thu Apr 14 13:22:36 2011 From: f.denner09 at imperial.ac.uk (Denner, Fabian) Date: Thu, 14 Apr 2011 19:22:36 +0100 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: , Message-ID: It's Finite Volume, co-located grid arrangement, stabilized, with a continuous pressure field at CFL numbers < 1 (typically 0.3-0.7). Which pre-conditioner would you recommend with CG solvers (BiCGStab) for that sort of problem? ________________________________________ From: five9a2 at gmail.com [five9a2 at gmail.com] On Behalf Of Jed Brown [jed at 59A2.org] Sent: 14 April 2011 19:16 To: Denner, Fabian Cc: PETSc users list Subject: Re: [petsc-users] Pre-conditioner for parallel CFD simulation On Thu, Apr 14, 2011 at 20:05, Denner, Fabian > wrote: I solve fully implicit, incompressible DNS of low/moderate Reynolds number (Re = 1 - 1000) flows. The hexahedral meshes to test the code have an aspect ration of 1, the tetrahedral meshes are Voronoi-based, do not have high aspect ratios and a good quality. Finite element? Inf-sup stable or stabilized? Continuous or discontinuous pressure? This email from last week is relevant http://lists.mcs.anl.gov/pipermail/petsc-users/2011-April/008475.html Coupled multigrid takes some work, you can do algebraic multigrid with PCFieldSplit with relatively little effort. 
What CFL are you running at? From jed at 59A2.org Thu Apr 14 13:49:18 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 14 Apr 2011 20:49:18 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: On Thu, Apr 14, 2011 at 20:22, Denner, Fabian wrote: > It's Finite Volume, co-located grid arrangement, stabilized, with a > continuous pressure field at CFL numbers < 1 (typically 0.3-0.7). > Much easier, it is likely that a relatively standard coupled multigrid will work. If you order unknowns so they are interlaced (u0,v0,w0,p0,u1,v1,...) and MatSetBlockSize(A,4) and/or use the BAIJ format, you stand a good chance with -pc_type ml and reasonable smoothers. Or you may have access to a geometric hierarchy? Preconditioning with SIMPLE, or (stronger) using SIMPLE as a smoother on multigrid levels should work well. With the CFL number so low (and even lower on coarse levels), you can also skip the SIMPLE procedure and just use the "pressure Poisson" operator from the usual semi-implicit method as a preconditioner (or as a smoother for coupled multigrid, with Jacobi applied to the velocity part). Any of these variants should converge in a small number of iterations independent of resolution. The following does not do any coupled multigrid (which should converge faster, but is more expensive per V-cycle), but should give you a good methods intro. All these algorithms are straightforward to implement using PCFieldSplit. http://dx.doi.org/10.1016/S0021-9991(03)00121-9 -------------- next part -------------- An HTML attachment was scrubbed... URL: From f.denner09 at imperial.ac.uk Thu Apr 14 14:01:34 2011 From: f.denner09 at imperial.ac.uk (Denner, Fabian) Date: Thu, 14 Apr 2011 20:01:34 +0100 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: , Message-ID: Thanks Jed, I'll have a look on it and see if it works. Best regards, Fabian. ________________________________________ From: five9a2 at gmail.com [five9a2 at gmail.com] On Behalf Of Jed Brown [jed at 59A2.org] Sent: 14 April 2011 19:49 To: Denner, Fabian Cc: PETSc users list Subject: Re: [petsc-users] Pre-conditioner for parallel CFD simulation On Thu, Apr 14, 2011 at 20:22, Denner, Fabian > wrote: It's Finite Volume, co-located grid arrangement, stabilized, with a continuous pressure field at CFL numbers < 1 (typically 0.3-0.7). Much easier, it is likely that a relatively standard coupled multigrid will work. If you order unknowns so they are interlaced (u0,v0,w0,p0,u1,v1,...) and MatSetBlockSize(A,4) and/or use the BAIJ format, you stand a good chance with -pc_type ml and reasonable smoothers. Or you may have access to a geometric hierarchy? Preconditioning with SIMPLE, or (stronger) using SIMPLE as a smoother on multigrid levels should work well. With the CFL number so low (and even lower on coarse levels), you can also skip the SIMPLE procedure and just use the "pressure Poisson" operator from the usual semi-implicit method as a preconditioner (or as a smoother for coupled multigrid, with Jacobi applied to the velocity part). Any of these variants should converge in a small number of iterations independent of resolution. The following does not do any coupled multigrid (which should converge faster, but is more expensive per V-cycle), but should give you a good methods intro. All these algorithms are straightforward to implement using PCFieldSplit. 
http://dx.doi.org/10.1016/S0021-9991(03)00121-9 From khalid_eee at yahoo.com Fri Apr 15 04:28:10 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Fri, 15 Apr 2011 02:28:10 -0700 (PDT) Subject: [petsc-users] DMMG with PBC Message-ID: <442878.22477.qm@web112607.mail.gq1.yahoo.com> Hi, I am running src/ksp/ksp/examples/tutorials/ex22.c I matched the output of single processor and multiprocessor results and it works fine. But I want to use a periodic boundary condition. I make the following changes in the main function and this works fine with this change as well: ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); ierr = DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,2,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); However, when I comment out these following lines since I am using a PBC, then the result of 1 proc and multi-proc are not the same. They vary within 5 decimal points and the difference increases with increasing number of processors. /* if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); ierr = MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); } else */ Could you please tell me what is going wrong here. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Apr 15 05:06:38 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 15 Apr 2011 12:06:38 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: Message-ID: <4DA818AE.80900@tu-dresden.de> Jed Brown wrote: > On Thu, Apr 14, 2011 at 20:22, Denner, Fabian > > wrote: > > It's Finite Volume, co-located grid arrangement, stabilized, with > a continuous pressure field at CFL numbers < 1 (typically 0.3-0.7). > > > Much easier, it is likely that a relatively standard coupled multigrid > will work. If you order unknowns so they are interlaced > (u0,v0,w0,p0,u1,v1,...) and MatSetBlockSize(A,4) and/or use the BAIJ > format, you stand a good chance with -pc_type ml and reasonable > smoothers. Or you may have access to a geometric hierarchy? Which package must be installed to make use of "ml" (algebraic multigrid?) as a preconditioner? Thomas From jed at 59A2.org Fri Apr 15 05:08:09 2011 From: jed at 59A2.org (Jed Brown) Date: Fri, 15 Apr 2011 12:08:09 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: <4DA818AE.80900@tu-dresden.de> References: <4DA818AE.80900@tu-dresden.de> Message-ID: On Fri, Apr 15, 2011 at 12:06, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Which package must be installed to make use of "ml" (algebraic multigrid?) > as a preconditioner? ML, --download-ml also -pc_type hypre (BoomerAMG is default) from, you guessed it, Hypre, --download-hypre -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Fri Apr 15 05:38:56 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 15 Apr 2011 12:38:56 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: <4DA818AE.80900@tu-dresden.de> Message-ID: <4DA82040.3010100@tu-dresden.de> Jed Brown wrote: > On Fri, Apr 15, 2011 at 12:06, Thomas Witkowski > > wrote: > > Which package must be installed to make use of "ml" (algebraic > multigrid?) as a preconditioner? 
> > > ML, --download-ml > also -pc_type hypre (BoomerAMG is default) from, you guessed it, > Hypre, --download-hypre I tried it, but when I make use of BAIJ matrix format, as you have proposed, I get the following error: Invalid matrix type for ML. ML can only handle AIJ matrices.! So can I make use of algebraic multigrid on block size matrices with this or one of the other packages? Thomas From thomas.witkowski at tu-dresden.de Fri Apr 15 05:40:37 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 15 Apr 2011 12:40:37 +0200 Subject: [petsc-users] FETI-DP In-Reply-To: References: <4DA6F440.4000204@tu-dresden.de> Message-ID: <4DA820A5.8020705@tu-dresden.de> Jed Brown wrote: > On Thu, Apr 14, 2011 at 15:18, Thomas Witkowski > > wrote: > > Has anybody of you implemented the FETI-DP method in PETSc? I > think about to do this for my FEM code, but first I want to > evaluate the effort of the implementation. > > > There are a few implementations out there. Probably most notable is > Axel Klawonn and Oliver Rheinbach's implementation which has been > scaled up to very large problems and computers. My understanding is > that Xuemin Tu did some work on BDDC (equivalent to FETI-DP) using > PETSc. I am not aware of anyone releasing a working FETI-DP > implementation using PETSc, but of course you're welcome to ask these > people if they would share code with you. I know the works of Klawonn and Rheinbach, but was not aware that they have implemented their algorithms with PETSc. > > What sort of problems do you want it for (physics and mesh)? How are > you currently assembling your systems? A fully general FETI-DP > implementation is a lot of work. For a specific class of problems and > variant of FETI-DP, it will still take some effort, but should not be > too much. My work is on a very general finite element toolbox (AMDiS) that solves a broad class of PDEs. The code is already parallelized, i.e., we have real distributed 2D (triangles) and 3D (tetrahedrons) adaptive meshes, mesh partitioning for load balancing with ParMETiS and Zoltan and a PETSc interface. For PETSc, there are two different modes at the moment. Either a so called global matrix solver or a Schur complement approach. The first one assembles one global parallel matrix, which we make most use of for using MUMPs or SuperLU on small and mid size problems. I would like to implement a broad class of different domain decomposition approaches into AMDiS, so that the user can make use of the method that is most appropriate for the problem. > > There was a start to a FETI-DP implementation in PETSc quite a while > ago, but it died due to bitrot and different ideas of how we would > like to implement. You can get that code from mercurial: > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/021f379b5eea Okay, good to know! I will have a look on it, may be I can extract some ideas for my own implementation. > > > The fundamental ingredient of these methods is a "partially assembled" > matrix. For a library implementation, the challenges are What do you mean by "partially assembled"? Do you mean that only a subset of the subdomain nodes must be assembled to a parallel distributed matrix and the most one can be put to a local matrix? > > 1. How does the user provide the information necessary to decide what > the coarse space looks like? (It's different for scalar problems, > compressible elasticity, and Stokes, and tricky to do with no > geometric information from the user.) 
The coefficient structure in the > problem matters a lot when deciding which coarse basis functions to > use, see http://dx.doi.org/10.1016/j.cma.2006.03.023 Do you think that this is really possible without providing at least some geometric information? At least in my code I can provide arbitrary geometrical information about the nodes to other libraries on very low computation costs. > > 2. How do you handle primal basis functions with large support (e.g. > rigid body modes of a face)? Two choices here: > http://www.cs.nyu.edu/cs/faculty/widlund/FETI-DP-elasticity_TR.pdf . > > 3. How do you make it easy for the user to provide the required > matrix? Ideally, the user would just use plain MatSetValuesLocal() and > run with -mat_type partially-assembled -pc_type fetidp instead of, say > -mat_type baij -pc_type asm. It should work for multiple subdomains > per process and subdomains spanning multiple processes. This can now > be done by implementing MatGetLocalSubMatrix(). The local blocks of > the partially assembled system should be able to use different formats > (e.g. SBAIJ). I like this idea, but it's somehow the same as with PCFieldSplit. To make use of it, I have to provide at least the splits, before I can run this preconditioner. This will be same for FETI-DP. Somehow the user will need to specify the coarse space. To make this in a general way is a very challenging task, from my point of view. > > 4. How do you handle more than two levels? This is very important to > use more than about 1000 subdomains in 3D because the coarse problem > just gets too big (unless the coarse problem happens to be > well-conditioned enough that you can use algebraic multigrid). Good question. Eventually, me code should run on definitely more then 1000 nodes in 3D. We have some PDE's which we would like to run on O(10^5) nodes (phase field crystal equation, which is a 6th order nonlinear parabolic PDE). > > I've wanted to implement FETI-DP in PETSc for almost two years, but > it's never been a high priority. I think I now know how to get enough > flexibility to make it worthwhile to me. I'd be happy to discuss > implementation issues with you. To implement FETI-DP in PETSc in a general way is very challenging but would be a feature of interest for most people how want to run their codes on real large number of nodes. If there are already some guys who have implemented it in PETSc, it would be the best to contact them to discuss these things. Thomas From thomas.witkowski at tu-dresden.de Fri Apr 15 05:51:24 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Fri, 15 Apr 2011 12:51:24 +0200 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: References: <4DA818AE.80900@tu-dresden.de> Message-ID: <4DA8232C.6070007@tu-dresden.de> Jed Brown wrote: > On Fri, Apr 15, 2011 at 12:06, Thomas Witkowski > > wrote: > > Which package must be installed to make use of "ml" (algebraic > multigrid?) as a preconditioner? > > > ML, --download-ml > also -pc_type hypre (BoomerAMG is default) from, you guessed it, > Hypre, --download-hypre But it works with MatAIJ with MatSetBlockSize(x). What is the difference between using MatBAIJ and MatAIJ with setting the block size directly? 
From knepley at gmail.com Fri Apr 15 06:34:18 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Apr 2011 06:34:18 -0500 Subject: [petsc-users] Pre-conditioner for parallel CFD simulation In-Reply-To: <4DA8232C.6070007@tu-dresden.de> References: <4DA818AE.80900@tu-dresden.de> <4DA8232C.6070007@tu-dresden.de> Message-ID: On Fri, Apr 15, 2011 at 5:51 AM, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > Jed Brown wrote: > > On Fri, Apr 15, 2011 at 12:06, Thomas Witkowski < >> thomas.witkowski at tu-dresden.de > >> wrote: >> >> Which package must be installed to make use of "ml" (algebraic >> multigrid?) as a preconditioner? >> >> >> ML, --download-ml >> also -pc_type hypre (BoomerAMG is default) from, you guessed it, Hypre, >> --download-hypre >> > But it works with MatAIJ with MatSetBlockSize(x). What is the difference > between using MatBAIJ and MatAIJ with setting the block size directly? > BAIJ changes the internal storage format to make things run faster. ML does not handle this. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Apr 15 06:38:25 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Apr 2011 06:38:25 -0500 Subject: [petsc-users] DMMG with PBC In-Reply-To: <442878.22477.qm@web112607.mail.gq1.yahoo.com> References: <442878.22477.qm@web112607.mail.gq1.yahoo.com> Message-ID: On Fri, Apr 15, 2011 at 4:28 AM, khalid ashraf wrote: > Hi, > I am running src/ksp/ksp/examples/tutorials/ex22.c > I matched the output of single processor and multiprocessor results and it > works fine. > But I want to use a periodic boundary condition. I make the following > changes in the main function and this works fine with this change as well: > > ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); > ierr = > DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,2,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); > > > However, when I comment out these following lines since I am using a PBC, > then the result of 1 proc and multi-proc are not the same. They vary within > 5 decimal points and the difference increases with increasing number of > processors. > The periodic operator has a null space. You must put that in the solver DMMGSetNullSpace(), so that it is projected out at each step. Matt > /* if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ > v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); > ierr = > MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); > } else */ > > Could you please tell me what is going wrong here. > > Thanks. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
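A minimal sketch of Matt's suggestion, following the DMMG setup from the original post; the argument list shown is the petsc-3.1 one, so check the DMMGSetNullSpace() manual page for your version:

   /* the periodic operator is singular: the constant vector lies in its
      null space, so ask DMMG to project it out during the solve */
   ierr = DMMGSetNullSpace(dmmg,PETSC_TRUE,0,PETSC_NULL);CHKERRQ(ierr);

Once the null space is projected out, the serial and parallel runs should agree to within the solver tolerance rather than drifting further apart as more processes are added.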
URL: From Debao.Shao at brion.com Fri Apr 15 02:16:04 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Fri, 15 Apr 2011 00:16:04 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> Dear Petsc: I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. My libpetsc.a is built as follows: 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 2, make all; It's very appreciated to get your reply. Thanks a lot, Debao ________________________________ -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Apr 15 08:24:48 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Apr 2011 08:24:48 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> Message-ID: <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> Debao, Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. Barry On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > Dear Petsc: > > I?m trying on Petsc iterative solver(KSPCG & PCJACOBI), but it?s strange that the two functions ?MatCopy? and ?MatSetValue? consume most of runtime, and the functions were not called frequently, just several times. > > My libpetsc.a is built as follows: > 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 > 2, make all; > > It?s very appreciated to get your reply. > > Thanks a lot, > Debao > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. 
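The FAQ entry Barry points to comes down to preallocating the matrix before the MatSetValue() loop. A minimal sketch for a sequential AIJ matrix; n, nz_per_row, A and B are placeholders, and the parallel analogue is MatMPIAIJSetPreallocation():

   Mat A;
   /* reserve room for the expected nonzeros per row up front, so that
      MatSetValue()/MatSetValues() never has to reallocate and copy */
   ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,nz_per_row,PETSC_NULL,&A);CHKERRQ(ierr);
   /* ... MatSetValue()/MatSetValues() loop ... */
   ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
   ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

In a build where -info has not been disabled, running with -info reports the number of mallocs performed during MatSetValues(); it should be zero when the preallocation is right. If MatCopy() also dominates the profile, MatDuplicate(A,MAT_COPY_VALUES,&B) is usually the cheaper way to obtain a copy, since the duplicate inherits the nonzero pattern and preallocation of A.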
From J.Jaykka at leeds.ac.uk Fri Apr 15 09:17:44 2011 From: J.Jaykka at leeds.ac.uk (Juha =?iso-8859-1?q?J=E4ykk=E4?=) Date: Fri, 15 Apr 2011 14:17:44 +0000 Subject: [petsc-users] strange error Message-ID: <201104151517.44294.J.Jaykka@leeds.ac.uk> Hi list! I keep getting strange errors when running a PETSc code: [6]PETSC ERROR: --------------------- Error Message ------------------------------------ [6]PETSC ERROR: Object is in wrong state! [6]PETSC ERROR: Matrix must be set first! [6]PETSC ERROR: ------------------------------------------------------------------------ This happens on one machine only, but it has OpenMPI just as the others, where it works correctly. More specifically, the error comes from [6]PETSC ERROR: PCSetUp() line 775 in src/ksp/pc/interface/precon.c [6]PETSC ERROR: PCApply() line 353 in src/ksp/pc/interface/precon.c [6]PETSC ERROR: KSPSolve_PREONLY() line 29 in src/ksp/ksp/impls/preonly/preonly.c [6]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c [6]PETSC ERROR: PCApply_BJacobi_Singleblock() line 777 in src/ksp/pc/impls/bjacobi/bjacobi.c [6]PETSC ERROR: PCApply() line 357 in src/ksp/pc/interface/precon.c [6]PETSC ERROR: KSPInitialResidual() line 65 in src/ksp/ksp/interface/itres.c [6]PETSC ERROR: KSPSolve_GMRES() line 240 in src/ksp/ksp/impls/gmres/gmres.c [6]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c [6]PETSC ERROR: SNES_KSPSolve() line 2944 in src/snes/interface/snes.c [6]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c [6]PETSC ERROR: SNESSolve() line 2255 in src/snes/interface/snes.c [6]PETSC ERROR: TSStep_BEuler_Nonlinear() line 176 in src/ts/impls/implicit/beuler/beuler.c [6]PETSC ERROR: TSStep() line 1693 in src/ts/interface/ts.c [6]PETSC ERROR: TSSolve() line 1731 in src/ts/interface/ts.c Now, I do realize that the Jacobian and preconditioner matrices must be properly created before calling TSSolve, but, to the best of my knowledge, they are. If they were not, the code should never work. And this always comes from just one MPI rank, as if somehow its memory is corrupt or something. Funny thing is, it only happens under the batch queue system: interactively, on the same machine, it works fine. Any help is appreciated... -Juha From knepley at gmail.com Fri Apr 15 10:13:31 2011 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Apr 2011 10:13:31 -0500 Subject: [petsc-users] strange error In-Reply-To: <201104151517.44294.J.Jaykka@leeds.ac.uk> References: <201104151517.44294.J.Jaykka@leeds.ac.uk> Message-ID: On Fri, Apr 15, 2011 at 9:17 AM, Juha J?ykk? wrote: > Hi list! > > I keep getting strange errors when running a PETSc code: > > [6]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [6]PETSC ERROR: Object is in wrong state! > [6]PETSC ERROR: Matrix must be set first! > [6]PETSC ERROR: > ------------------------------------------------------------------------ > > This happens on one machine only, but it has OpenMPI just as the others, > where > it works correctly. 
> > More specifically, the error comes from > > [6]PETSC ERROR: PCSetUp() line 775 in src/ksp/pc/interface/precon.c > [6]PETSC ERROR: PCApply() line 353 in src/ksp/pc/interface/precon.c > [6]PETSC ERROR: KSPSolve_PREONLY() line 29 in > src/ksp/ksp/impls/preonly/preonly.c > [6]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c > [6]PETSC ERROR: PCApply_BJacobi_Singleblock() line 777 in > src/ksp/pc/impls/bjacobi/bjacobi.c > [6]PETSC ERROR: PCApply() line 357 in src/ksp/pc/interface/precon.c > [6]PETSC ERROR: KSPInitialResidual() line 65 in > src/ksp/ksp/interface/itres.c > [6]PETSC ERROR: KSPSolve_GMRES() line 240 in > src/ksp/ksp/impls/gmres/gmres.c > [6]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c > [6]PETSC ERROR: SNES_KSPSolve() line 2944 in src/snes/interface/snes.c > [6]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [6]PETSC ERROR: SNESSolve() line 2255 in src/snes/interface/snes.c > [6]PETSC ERROR: TSStep_BEuler_Nonlinear() line 176 in > src/ts/impls/implicit/beuler/beuler.c > [6]PETSC ERROR: TSStep() line 1693 in src/ts/interface/ts.c > [6]PETSC ERROR: TSSolve() line 1731 in src/ts/interface/ts.c > > Now, I do realize that the Jacobian and preconditioner matrices must be > properly created before calling TSSolve, but, to the best of my knowledge, > they are. If they were not, the code should never work. > > And this always comes from just one MPI rank, as if somehow its memory is > corrupt or something. Funny thing is, it only happens under the batch queue > system: interactively, on the same machine, it works fine. > I would try valgrind on it first. Matt > Any help is appreciated... > > -Juha > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From u.tabak at tudelft.nl Fri Apr 15 12:33:49 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Fri, 15 Apr 2011 19:33:49 +0200 Subject: [petsc-users] difference on MATLAB backslash and PETSc/external solvers Message-ID: <4DA8817D.8040702@tudelft.nl> Dear all, I have been testing the factors of a symmetric matrix of size 4225 by 4225 as a preconditioner for a pcg type iteration in MATLAB(Since I have these factors from an eigenvalue extraction process). I extract the main matrix from a commercial code and the sparsity pattern is not that optimum for the moment. Using the built in profiler, I tried to check the performance of the implementation, most of my time was spent in the forward-backward substitutions resulting from the preconditioner usage, namely, the p = M^{-1} r operation in the pcg algorithm, which was expected. Moreover, I conducted a series of simple tests with the same operator matrix and right hand side in PETSc and with the external direct solver interfaces, I ended up some differences in the solution phases. I timed the process with PetscGetTime function. The related part of the code is given as std::cout << "First solve ... " << std::endl; ierr = PetscGetTime(&t1);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); ierr = PetscGetTime(&t2);CHKERRQ(ierr); std::cout << "Code took " << t2-t1 << " seconds .. " << std::endl; std::cout << "Second solve ... " << std::endl; ierr = PetscGetTime(&t3);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,y);CHKERRQ(ierr); ierr = PetscGetTime(&t4);CHKERRQ(ierr); std::cout << "Code took " << t4-t3 << " seconds .. 
" << std::endl; std::cout << "Third solve ... " << std::endl; ierr = PetscGetTime(&t5);CHKERRQ(ierr); ierr = KSPSolve(ksp,b,z);CHKERRQ(ierr); ierr = PetscGetTime(&t6);CHKERRQ(ierr); std::cout << "Code took " << t6-t5 << " seconds .. " << std::endl; Basically, I do a first solve where the factorization is done( just to be sure, KSPSolve is the step where the factorization is done, right?) Then, the results for different factorization and forward backward substitutions are given below, when I use the ksp object the second time, it basically does a forward-backward solve, right? Then I compare the results with the MATLAB backslash which is also using UMFPACK as far as I can read from the documentation. PETSc is built in the default mode, which is the debug mode. Can this be the reason of this? Any explanations are welcome on this. Actually the difference on the umfpack result was interesting to me. Here are the results: umfpack: --------------------------------------------------- First solve ... Code took 1.94039 seconds .. Second solve ... Code took 0.0264909 seconds .. Third solve ... Code took 0.0264909 seconds .. mumps: --------------------------------------------------- First solve ... Code took 1.40669 seconds .. Second solve ... Code took 0.0235441 seconds .. Third solve ... Code took 0.023541 seconds .. superlu: --------------------------------------------------- First solve ... Code took 3.20602 seconds .. Second solve ... Code took 0.0487978 seconds .. Third solve ... Code took 0.048856 seconds .. spooles: ---------------------------------------------------- First solve ... Code took 1.43427 seconds .. Second solve ... Code took 0.0536189 seconds .. Third solve ... Code took 0.053726 seconds .. PETSc ---------------------------------------------------- First solve ... Code took 1.3292 seconds .. Second solve ... Code took 0.0116079 seconds .. Third solve ... Code took 0.011915 seconds .. MATLAB by cputime function ----------------------------------------------------- A \ b (native backslash) 3.800000000000006e-01 (with a bit fluctuation ) and by using the Factorize package also written by Timothy Davis; t = cputime; factorOpA =factorize(OpA); factorOpA \ rhsA; cputime-t 6.300000000000026e-01 and a forward backward substitution using factorOpA t = cputime; factorOpA \ rhsA; cputime-t 9.999999999998010e-03 Best, Umut -- If I have a thousand ideas and only one turns out to be good, I am satisfied. Alfred Nobel From bsmith at mcs.anl.gov Fri Apr 15 13:00:46 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Apr 2011 13:00:46 -0500 Subject: [petsc-users] difference on MATLAB backslash and PETSc/external solvers In-Reply-To: <4DA8817D.8040702@tudelft.nl> References: <4DA8817D.8040702@tudelft.nl> Message-ID: > Then I compare the results with the MATLAB backslash which is also using UMFPACK as far as I can read from the documentation. PETSc is built in the default mode, which is the debug mode. What do you mean in the debug mode? To do any fair comparisons of times you should have ./configure PETSc with --with-debugging=0 > Can this be the reason of this? Reason of what? Different solvers will give different solution times, this is perfectly normal. In fact sometimes for different matrices different solvers will actually be best for different matrices? Are you asking why UMFPack in Matlab does much faster (it seems) than the one used by PETSc? 
The likely answer is that the Matlab folks tweak the hell out of it to get good performance and when Tim Davis says "UMFPACK is in Matlab" actually means he gave something to Matlab and they improved it. It is unlikely that Matlab just uses the downloadable open source UMFPACK that PETSc uses. If this does not answer your question please rephrase. Barry On Apr 15, 2011, at 12:33 PM, Umut Tabak wrote: > Dear all, > > I have been testing the factors of a symmetric matrix of size 4225 by 4225 as a preconditioner for a pcg type iteration in MATLAB(Since I have these factors from an eigenvalue extraction process). I extract the main matrix from a commercial code and the sparsity pattern is not that optimum for the moment. > > Using the built in profiler, I tried to check the performance of the implementation, most of my time was spent in the forward-backward substitutions resulting from the preconditioner usage, namely, the p = M^{-1} r operation in the pcg algorithm, which was expected. > > Moreover, I conducted a series of simple tests with the same operator matrix and right hand side in PETSc and with the external direct solver interfaces, I ended up some differences in the solution phases. I timed the process with PetscGetTime function. The related part of the code is given as > > std::cout << "First solve ... " << std::endl; > ierr = PetscGetTime(&t1);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr); > ierr = PetscGetTime(&t2);CHKERRQ(ierr); > std::cout << "Code took " << t2-t1 << " seconds .. " << std::endl; > std::cout << "Second solve ... " << std::endl; > ierr = PetscGetTime(&t3);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,y);CHKERRQ(ierr); > ierr = PetscGetTime(&t4);CHKERRQ(ierr); > std::cout << "Code took " << t4-t3 << " seconds .. " << std::endl; > std::cout << "Third solve ... " << std::endl; > ierr = PetscGetTime(&t5);CHKERRQ(ierr); > ierr = KSPSolve(ksp,b,z);CHKERRQ(ierr); > ierr = PetscGetTime(&t6);CHKERRQ(ierr); > std::cout << "Code took " << t6-t5 << " seconds .. " << std::endl; > > Basically, I do a first solve where the factorization is done( just to be sure, KSPSolve is the step where the factorization is done, right?) Then, the results for different factorization and forward backward substitutions are given below, when I use the ksp object the second time, it basically does a forward-backward solve, right? Then I compare the results with the MATLAB backslash which is also using UMFPACK as far as I can read from the documentation. PETSc is built in the default mode, which is the debug mode. Can this be the reason of this? Any explanations are welcome on this. > Actually the difference on the umfpack result was interesting to me. > > Here are the results: > > umfpack: > --------------------------------------------------- > First solve ... > Code took 1.94039 seconds .. > Second solve ... > Code took 0.0264909 seconds .. > Third solve ... > Code took 0.0264909 seconds .. > > mumps: > --------------------------------------------------- > First solve ... > Code took 1.40669 seconds .. > Second solve ... > Code took 0.0235441 seconds .. > Third solve ... > Code took 0.023541 seconds .. > > superlu: > --------------------------------------------------- > First solve ... > Code took 3.20602 seconds .. > Second solve ... > Code took 0.0487978 seconds .. > Third solve ... > Code took 0.048856 seconds .. > > spooles: > ---------------------------------------------------- > First solve ... > Code took 1.43427 seconds .. > Second solve ... > Code took 0.0536189 seconds .. 
> Third solve ... > Code took 0.053726 seconds .. > > PETSc > ---------------------------------------------------- > First solve ... > Code took 1.3292 seconds .. > Second solve ... > Code took 0.0116079 seconds .. > Third solve ... > Code took 0.011915 seconds .. > > MATLAB by cputime function > ----------------------------------------------------- > A \ b (native backslash) > 3.800000000000006e-01 (with a bit fluctuation ) > > and by using the Factorize package also written by Timothy Davis; > > t = cputime; factorOpA =factorize(OpA); factorOpA \ rhsA; cputime-t > 6.300000000000026e-01 > > and a forward backward substitution using factorOpA > > t = cputime; factorOpA \ rhsA; cputime-t > 9.999999999998010e-03 > > Best, > Umut > > -- > If I have a thousand ideas and only one turns out to be good, > I am satisfied. > Alfred Nobel > From u.tabak at tudelft.nl Fri Apr 15 13:02:56 2011 From: u.tabak at tudelft.nl (Umut Tabak) Date: Fri, 15 Apr 2011 20:02:56 +0200 Subject: [petsc-users] difference on MATLAB backslash and PETSc/external solvers In-Reply-To: References: <4DA8817D.8040702@tudelft.nl> Message-ID: <4DA88850.5000900@tudelft.nl> On 04/15/2011 08:00 PM, Barry Smith wrote: > What do you mean in the debug mode? To do any fair comparisons of times you should have ./configure PETSc with --with-debugging=0 > > Dear Barry, Thx. Ok, this is what I meant. > Reason of what? > > The difference... > Different solvers will give different solution times, this is perfectly normal. In fact sometimes for different matrices different solvers will actually be best for different matrices? > > Are you asking why UMFPack in Matlab does much faster (it seems) than the one used by PETSc? The likely answer is that the Matlab folks tweak the hell out of it to get good performance and when Tim Davis says "UMFPACK is in Matlab" actually means he gave something to Matlab and they improved it. ok, fine, this answers my question. > It is unlikely that Matlab just uses the downloadable open source UMFPACK that PETSc uses. > From ram at ibrae.ac.ru Sat Apr 16 15:26:47 2011 From: ram at ibrae.ac.ru (=?KOI8-R?B?4czFy9PFyiDy0drBzs/X?=) Date: Sun, 17 Apr 2011 00:26:47 +0400 Subject: [petsc-users] How to create and assemble matrices for DA vectors?? In-Reply-To: References: Message-ID: > > >> Create u,b with DAGetGlobalVector() and A with DAGetMatrix() and they >> will match the DA. For eg: check: src/snes/examples/tutorials/ex5.c >> [or some of the examples in src/dm/da/examples] >> >> Satish >> >> Hello again! 1. Please tell me, what's the principal difference between procedures DAGetGlobalVector and DACreateGlobalVector? I cant catch it from man pages. 2. As I can read from DAGetMatrix man page, this procedure: Creates a matrix with the correct parallel layout and nonzero structure required for computing the Jacobian on a function defined using the stencil set in the DA Notes: This properly preallocates the number of nonzeros in the sparse matrix so you do not need to do it yourself. By default it also sets the nonzero structure and puts in the zero entries. To prevent setting the nonzero pattern call DASetMatPreallocateOnly<../DA/DASetMatPreallocateOnly.html#DASetMatPreallocateOnly> () So I use DASetMatPreallocateOnly. But I dont need a Jacobian. I need a matrix of my linear system with its original number of nonzeros per row and its original nonzero pattern. So I use MatSetValues and MatAsseblyBegin/End to assemble it. And -info key on runtime tells me that there were additional mallocs during runtime. 
As it said in manual, this is very expensive to allocate memory dynamically. MatMPIAIJSetPreallocation doesnt help me. How should I preallocate memory for DAMatrix? Thank you! Alexey Ryazanov ______________________________________ Nuclear Safety Institute of Russian Academy of Sciences -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sat Apr 16 15:47:19 2011 From: jed at 59A2.org (Jed Brown) Date: Sat, 16 Apr 2011 22:47:19 +0200 Subject: [petsc-users] How to create and assemble matrices for DA vectors?? In-Reply-To: References: Message-ID: On Sat, Apr 16, 2011 at 22:26, ??????? ??????? wrote: > 1. Please tell me, what's the principal difference between procedures > DAGetGlobalVector and DACreateGlobalVector? I cant catch it from man pages. > Use DACreateGlobalVector() if you ownership of the vector, you call VecDestroy() when you are done with it. Use DAGetGlobalVector() to let the DA manage the lifetime of the vector, call DARestoreGlobalVector() when you are done with it. The DA does not actually destroy the vector, it keeps it around and just gives it back next time you call DAGetGlobalVector(). This is usually what you want for "work" vectors. > > 2. As I can read from DAGetMatrix man page, this procedure: > > Creates a matrix with the correct parallel layout and nonzero structure > required for computing the Jacobian on a function defined using the stencil > set in the > DA > > Notes: This properly preallocates the number of nonzeros in the sparse > matrix so you do not need to do it yourself. > > By default it also sets the nonzero structure and puts in the zero entries. > To prevent setting the nonzero pattern call DASetMatPreallocateOnly > () > > So I use DASetMatPreallocateOnly. > Why would you want to do that? > But I dont need a Jacobian. I need a matrix of my linear system > But that _is_ the Jacobian of the residual function (f(x) = A*x - b for linear problems). This language is used frequently in PETSc. > with its original number of nonzeros per row and its original nonzero > pattern. > What do you mean by "original"? You are setting values that have not been preallocated, perhaps because the stencil you defined for the DA is different from the one you are using during assembly. -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico.borzacchiello at univ-st-etienne.fr Sun Apr 17 16:16:38 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Sun, 17 Apr 2011 23:16:38 +0200 (CEST) Subject: [petsc-users] Setting intepolation and restriction operators in DMMG Message-ID: <9a75fdd120436247a0adcf63f0fe3e08.squirrel@arcon.univ-st-etienne.fr> Hi, I'm still in the process of coding my Stokes solver (FV+MAC Discretisation) with DMMG. So far the I've been testing it with direct solvers to check that everything was right for the functional and jacobian creation and it runs fine both in parallel and sequential mode. I now need to implement an iterative solver that will most likely be multigrid, and since I am using fully staggered arrangement I need to define my own grid transfer operators because the interpolation/restriction will be different for u v w p. How can I do this? Would you suggest another solution strategy instead? thank you Domenico. 
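For reference, a minimal sketch of the two vector-lifetime patterns from the DAGetGlobalVector/DACreateGlobalVector reply above, plus DAGetMatrix(), assuming the PETSc 3.1 DA interface used in this thread (da is an already-created DA; error handling abbreviated):

    Vec            g, work;
    Mat            A;
    PetscErrorCode ierr;

    /* owned: the caller creates it and must destroy it */
    ierr = DACreateGlobalVector(da,&g);CHKERRQ(ierr);
    /* ... g lives as long as the application needs it ... */
    ierr = VecDestroy(g);CHKERRQ(ierr);

    /* borrowed: the DA caches the vector and hands the same one back next time */
    ierr = DAGetGlobalVector(da,&work);CHKERRQ(ierr);
    /* ... temporary work-vector use ... */
    ierr = DARestoreGlobalVector(da,&work);CHKERRQ(ierr);

    /* matrix already preallocated for the stencil the DA was created with;
       entries set inside that stencil (e.g. via MatSetValuesStencil) need no new mallocs */
    ierr = DAGetMatrix(da,MATAIJ,&A);CHKERRQ(ierr);

If -info still reports mallocs during assembly, the entries being set fall outside the stencil type or width the DA was created with, which is the mismatch described in the DAGetMatrix reply above.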
From ilyascfd at gmail.com Mon Apr 18 08:34:16 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Mon, 18 Apr 2011 16:34:16 +0300 Subject: [petsc-users] local row calculation in 3D Message-ID: Hi, In ex14f.F in KSP, "row" variable is calculated either 349: do 30 j=ys,ys+ym-1 350: ... 351: do 40 i=xs,xs+xm-1 352: row = i - gxs + (j - gys)*gxm + 1 or 442: do 50 j=ys,ys+ym-1 443: ... 444: row = (j - gys)*gxm + xs - gxs 445: do 60 i=xs,xs+xm-1 446: row = row + 1 How can I calculate "row" in 3D ? I tried this; do k=zs,zs+zm-1 do j=ys,ys+ym-1 do i=xs,xs+xm-1 row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 It does not work for certain number of processors. Thanks, Ilyas -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Apr 18 08:40:17 2011 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Apr 2011 08:40:17 -0500 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: > Hi, > > In ex14f.F in KSP, "row" variable is calculated either > These are very old. I suggest you use the FormFunctionLocal() approach in ex5f.F which does not calculate global row numbers when using a DA. Matt > 349: do 30 j=ys,ys+ym-1 > 350: ... > 351: do 40 i=xs,xs+xm-1 > 352: row = i - gxs + (j - gys)*gxm + 1 > > or > > 442: do 50 j=ys,ys+ym-1 > 443: ... > 444: row = (j - gys)*gxm + xs - gxs > 445: do 60 i=xs,xs+xm-1 > 446: row = row + 1 > > How can I calculate "row" in 3D ? > > I tried this; > > do k=zs,zs+zm-1 > do j=ys,ys+ym-1 > do i=xs,xs+xm-1 > > row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 > > It does not work for certain number of processors. > > > Thanks, > > Ilyas > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Mon Apr 18 08:54:19 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Mon, 18 Apr 2011 16:54:19 +0300 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: Hi, Thank you for your suggestion. I will take it into account. Since changing this structure in my "massive" code may take too much time, I would like to know that how "row" is calculated in 3D, independently from processor numbers. Regards, Ilyas 2011/4/18 Matthew Knepley > On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: > >> Hi, >> >> In ex14f.F in KSP, "row" variable is calculated either >> > > These are very old. I suggest you use the FormFunctionLocal() approach in > ex5f.F which > does not calculate global row numbers when using a DA. > > Matt > > >> 349: do 30 j=ys,ys+ym-1 >> 350: ... >> 351: do 40 i=xs,xs+xm-1 >> 352: row = i - gxs + (j - gys)*gxm + 1 >> >> or >> >> 442: do 50 j=ys,ys+ym-1 >> 443: ... >> 444: row = (j - gys)*gxm + xs - gxs >> 445: do 60 i=xs,xs+xm-1 >> 446: row = row + 1 >> >> How can I calculate "row" in 3D ? >> >> I tried this; >> >> do k=zs,zs+zm-1 >> do j=ys,ys+ym-1 >> do i=xs,xs+xm-1 >> >> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >> >> It does not work for certain number of processors. >> >> >> Thanks, >> >> Ilyas >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Mon Apr 18 09:25:19 2011 From: rlmackie862 at gmail.com (Randall Mackie) Date: Mon, 18 Apr 2011 07:25:19 -0700 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: Here's how I do it: do kk=zs,zs+zm-1 do jj=ys,ys+ym-1 do ii=xs,xs+xm-1 row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym Good luck, Randy M. On Mon, Apr 18, 2011 at 6:54 AM, ilyas ilyas wrote: > Hi, > Thank you for your suggestion. I will take it into account. > Since changing this structure in my "massive" code may take too much time, > I would like to know that how "row" is calculated in 3D, independently from > processor numbers. > > Regards, > Ilyas > > 2011/4/18 Matthew Knepley > >> On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: >> >>> Hi, >>> >>> In ex14f.F in KSP, "row" variable is calculated either >>> >> >> These are very old. I suggest you use the FormFunctionLocal() approach in >> ex5f.F which >> does not calculate global row numbers when using a DA. >> >> Matt >> >> >>> 349: do 30 j=ys,ys+ym-1 >>> 350: ... >>> 351: do 40 i=xs,xs+xm-1 >>> 352: row = i - gxs + (j - gys)*gxm + 1 >>> >>> or >>> >>> 442: do 50 j=ys,ys+ym-1 >>> 443: ... >>> 444: row = (j - gys)*gxm + xs - gxs >>> 445: do 60 i=xs,xs+xm-1 >>> 446: row = row + 1 >>> >>> How can I calculate "row" in 3D ? >>> >>> I tried this; >>> >>> do k=zs,zs+zm-1 >>> do j=ys,ys+ym-1 >>> do i=xs,xs+xm-1 >>> >>> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >>> >>> It does not work for certain number of processors. >>> >>> >>> Thanks, >>> >>> Ilyas >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Mon Apr 18 21:42:48 2011 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Mon, 18 Apr 2011 20:42:48 -0600 Subject: [petsc-users] Problem in PCShellSetApply() Message-ID: Hi, I faced a problem when I used the function PCShellSetApply to set a composite PC in Petsc 3.1. There was no problem when I compiled my code but I got the following message when I run the code. ------------------------------------------------------------------------------------------------------------ [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message ------------------------------------ [1]PETSC ERROR: No support for this operation for this object type! [1]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct solver! [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 ............................... 
[7]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c [7]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c [7]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c [7]PETSC ERROR: PCApply() line 353 in src/ksp/pc/interface/precon.c [7]PETSC ERROR: PCApply_Composite_Additive() line 102 in src/ksp/pc/impls/composite/composite.c [7]PETSC ERROR: PCApply() line 357 in src/ksp/pc/interface/precon.c [7]PETSC ERROR: PCApplyBAorAB() line 582 in src/ksp/pc/interface/precon.c [7]PETSC ERROR: GMREScycle() line 161 in src/ksp/ksp/impls/gmres/gmres.c [7]PETSC ERROR: KSPSolve_GMRES() line 241 in src/ksp/ksp/impls/gmres/gmres.c [7]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c [1]PETSC ERROR: SNES_KSPSolve() line 2944 in src/snes/interface/snes.c [1]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c [1]PETSC ERROR: SNESSolve() line 2255 in src/snes/interface/snes.c .............................. -------------------------------------------------------------------------------------------------------------- I found that the function PCShellSetApply did not call the user defined function CoarseSolvePCApply (I called PCShellSetApply like this: ierr = PCShellSetApply(coarsesolve, CoarseSolvePCApply);CHKERRQ(ierr); ). Can anyone tell me what may be the problem? The related code attached. -----------------------The related code----------------------------------- PetscErrorCode SetupPreconditioner(JoabCtx* ctx, SNES snes) { JoabParameters *parameters = &ctx->parameters; JoabView *view = &ctx->view; JoabGrid *grid = &ctx->grid; JoabGrid *coarsegrid = &ctx->coarsegrid; JoabAlgebra *algebra = &ctx->algebra; JoabAlgebra *coarsealgebra = &ctx->coarsealgebra; KSP fineksp; PC finepc, coarsepc; PC asmpc; PC coarsesolve; Vec fineones; PetscInt veclength; PetscScalar *values; int i; PetscErrorCode ierr; PetscFunctionBegin; ierr = SNESGetKSP(snes,&fineksp);CHKERRQ(ierr); ierr = KSPGetPC(fineksp,&finepc);CHKERRQ(ierr); ierr = KSPCreate(PETSC_COMM_WORLD,&ctx->coarseksp);CHKERRQ(ierr); ierr = SetMyKSPDefaults(ctx->coarseksp);CHKERRQ(ierr); ierr = KSPSetFromOptions(ctx->coarseksp);CHKERRQ(ierr); ierr = KSPSetOperators(ctx->coarseksp, coarsealgebra->H, coarsealgebra->H, SAME_NONZERO_PATTERN);CHKERRQ(ierr); if (parameters->geometric_asm) { ierr = KSPGetPC(ctx->coarseksp, &coarsepc);CHKERRQ(ierr); ierr = PCASMSetOverlap(coarsepc,0);CHKERRQ(ierr); ierr = PCASMSetLocalSubdomains(coarsepc,1,&coarsegrid->df_global_asm, PETSC_NULL);CHKERRQ(ierr); } ierr = PCSetType(finepc,PCCOMPOSITE);CHKERRQ(ierr); ierr = PCCompositeAddPC(finepc,PCSHELL);CHKERRQ(ierr); ierr = PCCompositeAddPC(finepc,PCASM);CHKERRQ(ierr); /* set up asm (fine) part of two-level preconditioner */ ierr = PCCompositeGetPC(finepc,1,&asmpc);CHKERRQ(ierr); if (parameters->geometric_asm) { ierr = PCSetType(asmpc,PCASM);CHKERRQ(ierr); ierr = PCASMSetOverlap(asmpc,0);CHKERRQ(ierr); ierr = PCASMSetLocalSubdomains(asmpc,1,&grid->df_global_asm, PETSC_NULL);CHKERRQ(ierr); } ierr = SetMyPCDefaults(asmpc);CHKERRQ(ierr); ierr = PCSetFromOptions(asmpc);CHKERRQ(ierr); /* set up coarse solve part of two-level preconditioner */ ierr = PCCompositeGetPC(finepc,0,&coarsesolve);CHKERRQ(ierr); ierr = PCShellSetContext(coarsesolve,ctx);CHKERRQ(ierr); ierr = PCShellSetApply(coarsesolve, CoarseSolvePCApply);CHKERRQ(ierr); PetscFunctionReturn(0); } PetscErrorCode CoarseSolvePCApply(PC pc, Vec xin, Vec xout) { JoabCtx *ctx; PetscErrorCode ierr; JoabParameters *parameters; JoabView *view; JoabGrid *finegrid; JoabGrid 
*coarsegrid; JoabAlgebra *finealgebra; JoabAlgebra *coarsealgebra; PetscInt its; PetscLogDouble t1,t2,v1,v2; KSPConvergedReason reason; PetscFunctionBegin; ierr = PetscPrintf(PETSC_COMM_WORLD,"Setup coarse level preconditioner.....\n");CHKERRQ(ierr); ierr = PCShellGetContext(pc,(void**)&ctx);CHKERRQ(ierr); parameters = &ctx->parameters; view = &ctx->view; finegrid = &ctx->grid; coarsegrid = &ctx->coarsegrid; finealgebra = &ctx->algebra; coarsealgebra = &ctx->coarsealgebra; ierr = PetscGetTime(&t1);CHKERRQ(ierr); parameters->whichlevel = COARSE_GRID; /* restrict fine grid to coarse grid */ ierr = PetscGetTime(&v1);CHKERRQ(ierr); ierr = VecSet(coarsealgebra->predictedShape_opt,0.0);CHKERRQ(ierr);CHKERRQ(ierr); ierr = ApplyRestriction(ctx,xin,coarsealgebra->predictedShape_opt);CHKERRQ(ierr); ierr = VecSet(coarsealgebra->solutionShape_opt,0.0);CHKERRQ(ierr); ierr = KSPSetTolerances(ctx->coarseksp,1e-6,1e-14,PETSC_DEFAULT,1000);CHKERRQ(ierr); ierr = KSPSolve(ctx->coarseksp, coarsealgebra->predictedShape_opt, coarsealgebra->solutionShape_opt);CHKERRQ(ierr); /* interpolate coarse grid to fine grid */ ierr = MatInterpolate(ctx->Interp,coarsealgebra->solutionShape_opt,xout);CHKERRQ(ierr); PetscFunctionReturn(0); } Regards, Rongliang -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Apr 18 22:03:54 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 18 Apr 2011 22:03:54 -0500 Subject: [petsc-users] Problem in PCShellSetApply() In-Reply-To: References: Message-ID: On Apr 18, 2011, at 9:42 PM, Rongliang Chen wrote: > Hi, > > I faced a problem when I used the function PCShellSetApply to set a composite PC in Petsc 3.1. There was no problem when I compiled my code but I got the following message when I run the code. > ------------------------------------------------------------------------------------------------------------ > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message ------------------------------------ > [1]PETSC ERROR: No support for this operation for this object type! > [1]PETSC ERROR: Matrix format mpiaij does not have a built-in PETSc direct solver! > [1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Petsc Release Version 3.1.0, Patch 8, Thu Mar 17 13:37:48 CDT 2011 > ............................... > [7]PETSC ERROR: MatGetFactor() line 3644 in src/mat/interface/matrix.c > [7]PETSC ERROR: PCSetUp_LU() line 133 in src/ksp/pc/impls/factor/lu/lu.c > [7]PETSC ERROR: PCSetUp() line 795 in src/ksp/pc/interface/precon.c > [7]PETSC ERROR: PCApply() line 353 in src/ksp/pc/interface/precon.c > [7]PETSC ERROR: PCApply_Composite_Additive() line 102 in src/ksp/pc/impls/composite/composite.c > [7]PETSC ERROR: PCApply() line 357 in src/ksp/pc/interface/precon.c > [7]PETSC ERROR: PCApplyBAorAB() line 582 in src/ksp/pc/interface/precon.c > [7]PETSC ERROR: GMREScycle() line 161 in src/ksp/ksp/impls/gmres/gmres.c > [7]PETSC ERROR: KSPSolve_GMRES() line 241 in src/ksp/ksp/impls/gmres/gmres.c > [7]PETSC ERROR: KSPSolve() line 396 in src/ksp/ksp/interface/itfunc.c > [1]PETSC ERROR: SNES_KSPSolve() line 2944 in src/snes/interface/snes.c > [1]PETSC ERROR: SNESSolve_LS() line 191 in src/snes/impls/ls/ls.c > [1]PETSC ERROR: SNESSolve() line 2255 in src/snes/interface/snes.c > .............................. 
> -------------------------------------------------------------------------------------------------------------- > > I found that the function PCShellSetApply did not call the user defined function CoarseSolvePCApply (I called PCShellSetApply like this: ierr = PCShellSetApply(coarsesolve, CoarseSolvePCApply);CHKERRQ(ierr); ). Can anyone tell me what may be the problem? The related code attached. > > -----------------------The related code----------------------------------- > PetscErrorCode SetupPreconditioner(JoabCtx* ctx, SNES snes) > { > JoabParameters *parameters = &ctx->parameters; > JoabView *view = &ctx->view; > JoabGrid *grid = &ctx->grid; > JoabGrid *coarsegrid = &ctx->coarsegrid; > JoabAlgebra *algebra = &ctx->algebra; > JoabAlgebra *coarsealgebra = &ctx->coarsealgebra; > > KSP fineksp; > PC finepc, coarsepc; > PC asmpc; > PC coarsesolve; > Vec fineones; > PetscInt veclength; > PetscScalar *values; > int i; > > PetscErrorCode ierr; > > PetscFunctionBegin; > ierr = SNESGetKSP(snes,&fineksp);CHKERRQ(ierr); > ierr = KSPGetPC(fineksp,&finepc);CHKERRQ(ierr); > ierr = KSPCreate(PETSC_COMM_WORLD,&ctx->coarseksp);CHKERRQ(ierr); > > ierr = SetMyKSPDefaults(ctx->coarseksp);CHKERRQ(ierr); > ierr = KSPSetFromOptions(ctx->coarseksp);CHKERRQ(ierr); > ierr = KSPSetOperators(ctx->coarseksp, coarsealgebra->H, coarsealgebra->H, > SAME_NONZERO_PATTERN);CHKERRQ(ierr); > if (parameters->geometric_asm) { > ierr = KSPGetPC(ctx->coarseksp, &coarsepc);CHKERRQ(ierr); > ierr = PCASMSetOverlap(coarsepc,0);CHKERRQ(ierr); > ierr = PCASMSetLocalSubdomains(coarsepc,1,&coarsegrid->df_global_asm, PETSC_NULL);CHKERRQ(ierr); > } > > ierr = PCSetType(finepc,PCCOMPOSITE);CHKERRQ(ierr); > ierr = PCCompositeAddPC(finepc,PCSHELL);CHKERRQ(ierr); > ierr = PCCompositeAddPC(finepc,PCASM);CHKERRQ(ierr); > > /* set up asm (fine) part of two-level preconditioner */ > ierr = PCCompositeGetPC(finepc,1,&asmpc);CHKERRQ(ierr); > if (parameters->geometric_asm) { Is this flag set so it actually sets the type to PCASM > ierr = PCSetType(asmpc,PCASM);CHKERRQ(ierr); > ierr = PCASMSetOverlap(asmpc,0);CHKERRQ(ierr); > ierr = PCASMSetLocalSubdomains(asmpc,1,&grid->df_global_asm, PETSC_NULL);CHKERRQ(ierr); > } > ierr = SetMyPCDefaults(asmpc);CHKERRQ(ierr); What is this setting the solver to? One of the two composite PC's is being set to LU and it is parallel so it fails. My guess is that SetMyPCDefaults() is setting the the PCType to LU. If not you'll need to track through your code, perhaps put a break point in PCSetType() and see where the type is being set to LU. 
Barry > ierr = PCSetFromOptions(asmpc);CHKERRQ(ierr); > > /* set up coarse solve part of two-level preconditioner */ > ierr = PCCompositeGetPC(finepc,0,&coarsesolve);CHKERRQ(ierr); > ierr = PCShellSetContext(coarsesolve,ctx);CHKERRQ(ierr); > ierr = PCShellSetApply(coarsesolve, CoarseSolvePCApply);CHKERRQ(ierr); > > PetscFunctionReturn(0); > } > > PetscErrorCode CoarseSolvePCApply(PC pc, Vec xin, Vec xout) > { > JoabCtx *ctx; > PetscErrorCode ierr; > JoabParameters *parameters; > JoabView *view; > JoabGrid *finegrid; > JoabGrid *coarsegrid; > JoabAlgebra *finealgebra; > JoabAlgebra *coarsealgebra; > PetscInt its; > PetscLogDouble t1,t2,v1,v2; > KSPConvergedReason reason; > > PetscFunctionBegin; > ierr = PetscPrintf(PETSC_COMM_WORLD,"Setup coarse level preconditioner.....\n");CHKERRQ(ierr); > ierr = PCShellGetContext(pc,(void**)&ctx);CHKERRQ(ierr); > parameters = &ctx->parameters; > view = &ctx->view; > finegrid = &ctx->grid; > coarsegrid = &ctx->coarsegrid; > finealgebra = &ctx->algebra; > coarsealgebra = &ctx->coarsealgebra; > > ierr = PetscGetTime(&t1);CHKERRQ(ierr); > > parameters->whichlevel = COARSE_GRID; > > /* restrict fine grid to coarse grid */ > ierr = PetscGetTime(&v1);CHKERRQ(ierr); > ierr = VecSet(coarsealgebra->predictedShape_opt,0.0);CHKERRQ(ierr);CHKERRQ(ierr); > ierr = ApplyRestriction(ctx,xin,coarsealgebra->predictedShape_opt);CHKERRQ(ierr); > > ierr = VecSet(coarsealgebra->solutionShape_opt,0.0);CHKERRQ(ierr); > ierr = KSPSetTolerances(ctx->coarseksp,1e-6,1e-14,PETSC_DEFAULT,1000);CHKERRQ(ierr); > ierr = KSPSolve(ctx->coarseksp, coarsealgebra->predictedShape_opt, coarsealgebra->solutionShape_opt);CHKERRQ(ierr); > > /* interpolate coarse grid to fine grid */ > ierr = MatInterpolate(ctx->Interp,coarsealgebra->solutionShape_opt,xout);CHKERRQ(ierr); > > PetscFunctionReturn(0); > } > > > Regards, > Rongliang > > From khalid_eee at yahoo.com Tue Apr 19 01:34:13 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Mon, 18 Apr 2011 23:34:13 -0700 (PDT) Subject: [petsc-users] DMMG with PBC In-Reply-To: References: <442878.22477.qm@web112607.mail.gq1.yahoo.com> Message-ID: <846336.71604.qm@web112609.mail.gq1.yahoo.com> Hi Matt, I wrote the following line after DMMGSetKSP() in ex22.c ierr =DMMGSetNullSpace(dmmg,PETSC_TRUE,0,PETSC_NULL); Still I get the difference in the values calculated by single processor and the multiple processors. The two input values for b that I used are 1 and 10 for all the elements in the vector. I am using this code in one of my programs where I assign a random number to b. I get the discrepancy between single and multiple processors there as well. Thanks. Khalid ________________________________ From: Matthew Knepley To: PETSc users list Cc: khalid ashraf Sent: Fri, April 15, 2011 4:38:25 AM Subject: Re: [petsc-users] DMMG with PBC On Fri, Apr 15, 2011 at 4:28 AM, khalid ashraf wrote: Hi, >I am running src/ksp/ksp/examples/tutorials/ex22.c >I matched the output of single processor and multiprocessor results and it works >fine. >But I want to use a periodic boundary condition. I make the following changes in >the main function and this works fine with this change as well: > > > ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); > ierr = >DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,2,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); > > > > > >However, when I comment out these following lines since I am using a PBC, then >the result of 1 proc and multi-proc are not the same. 
They vary within 5 decimal >points and the difference increases with increasing number of processors. The periodic operator has a null space. You must put that in the solver DMMGSetNullSpace(), so that it is projected out at each step. Matt /* if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ > v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); > ierr = >MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); > } else */ > > >Could you please tell me what is going wrong here. > > >Thanks. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Tue Apr 19 02:00:14 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Tue, 19 Apr 2011 10:00:14 +0300 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: Hi Randy, Thank you for your answer. I have already done it. You can see it in my first e-mail. It does not work properly for all number of processors. For certain number of processors, it works correctly, not for all number of processors. For example, for 1,2,or 3 processors, it's ok. For 4 processors, it gives wrong location, so on. "Problem" occurs in 3rd dimension ( (kk-gzs)*gxm*gym ) Here is another suggestion (I have not tried yet) ; do kk=zs,zs+zm-1 do jj=ys,ys+ym-1 do ii=xs,xs+xm-1 row=ii-gxs + (jj-gys)*MX + (kk-gzs)*MX*MY MX,MY,MZ are global dimensions.This is also what I do serially Do you think that it is correct or any other suggestions? Regards, Ilyas. 2011/4/18 Randall Mackie > Here's how I do it: > > do kk=zs,zs+zm-1 > do jj=ys,ys+ym-1 > do ii=xs,xs+xm-1 > > row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym > > > Good luck, > > Randy M. > > > > On Mon, Apr 18, 2011 at 6:54 AM, ilyas ilyas wrote: > >> Hi, >> Thank you for your suggestion. I will take it into account. >> Since changing this structure in my "massive" code may take too much >> time, >> I would like to know that how "row" is calculated in 3D, independently >> from processor numbers. >> >> Regards, >> Ilyas >> >> 2011/4/18 Matthew Knepley >> >>> On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: >>> >>>> Hi, >>>> >>>> In ex14f.F in KSP, "row" variable is calculated either >>>> >>> >>> These are very old. I suggest you use the FormFunctionLocal() approach in >>> ex5f.F which >>> does not calculate global row numbers when using a DA. >>> >>> Matt >>> >>> >>>> 349: do 30 j=ys,ys+ym-1 >>>> 350: ... >>>> 351: do 40 i=xs,xs+xm-1 >>>> 352: row = i - gxs + (j - gys)*gxm + 1 >>>> >>>> or >>>> >>>> 442: do 50 j=ys,ys+ym-1 >>>> 443: ... >>>> 444: row = (j - gys)*gxm + xs - gxs >>>> 445: do 60 i=xs,xs+xm-1 >>>> 446: row = row + 1 >>>> >>>> How can I calculate "row" in 3D ? >>>> >>>> I tried this; >>>> >>>> do k=zs,zs+zm-1 >>>> do j=ys,ys+ym-1 >>>> do i=xs,xs+xm-1 >>>> >>>> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >>>> >>>> It does not work for certain number of processors. >>>> >>>> >>>> Thanks, >>>> >>>> Ilyas >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at 59A2.org Tue Apr 19 04:40:18 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 19 Apr 2011 11:40:18 +0200 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: On Tue, Apr 19, 2011 at 09:00, ilyas ilyas wrote: > It does not work properly for all number of processors. The "row" in ex14f.F is a local row, not a global row. The ComputeJacobian in that file manually translates local rows to global rows using the map returned by DAGetGlobalIndices(). Now you should just call MatSetValuesLocal() with the local indices, or even easier, MatSetValuesStencil(). There is no easy way to determine the global index on your own (it is hundreds of lines of code). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Tue Apr 19 04:42:40 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 19 Apr 2011 11:42:40 +0200 Subject: [petsc-users] DMMG with PBC In-Reply-To: <846336.71604.qm@web112609.mail.gq1.yahoo.com> References: <442878.22477.qm@web112607.mail.gq1.yahoo.com> <846336.71604.qm@web112609.mail.gq1.yahoo.com> Message-ID: On Tue, Apr 19, 2011 at 08:34, khalid ashraf wrote: > Still I get the difference in the values calculated by single processor and > the multiple processors. How much of a difference? http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/faq.html#different -------------- next part -------------- An HTML attachment was scrubbed... URL: From rlmackie862 at gmail.com Tue Apr 19 09:09:52 2011 From: rlmackie862 at gmail.com (Randall Mackie) Date: Tue, 19 Apr 2011 07:09:52 -0700 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: You are right! I just didn't read all the way to the end of your email. Sorry about that. So here is a little more code that does it correctly: PetscInt, pointer :: ltog(:) call DAGetGlobalIndicesF90(da,nloc,ltog,ierr); CHKERRQ(ierr) do kk=zs,zs+zm-1 do jj=ys,ys+ym-1 do ii=xs,xs+xm-1 row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym grow=ltog(3*row + 1) [all your code here] call MatSetValues(A,i1,grow,ic,col,v,INSERT_VALUES, . ierr); CHKERRQ(ierr) [more code here] call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) Hope this is a little more helpful. As Jed points out, there are other ways to do the same thing (and probably more efficiently than what I've outlined here). Randy M. On Tue, Apr 19, 2011 at 12:00 AM, ilyas ilyas wrote: > Hi Randy, > > Thank you for your answer. > > I have already done it. You can see it in my first e-mail. > > It does not work properly for all number of processors. > For certain number of processors, it works correctly, > not for all number of processors. > For example, for 1,2,or 3 processors, it's ok. > For 4 processors, it gives wrong location, so on. > "Problem" occurs in 3rd dimension ( (kk-gzs)*gxm*gym ) > > Here is another suggestion (I have not tried yet) ; > > do kk=zs,zs+zm-1 > do jj=ys,ys+ym-1 > do ii=xs,xs+xm-1 > > row=ii-gxs + (jj-gys)*MX + (kk-gzs)*MX*MY > > MX,MY,MZ are global dimensions.This is also what I do serially > > Do you think that it is correct or any other suggestions? > > Regards, > Ilyas. > > 2011/4/18 Randall Mackie > >> Here's how I do it: >> >> do kk=zs,zs+zm-1 >> do jj=ys,ys+ym-1 >> do ii=xs,xs+xm-1 >> >> row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym >> >> >> Good luck, >> >> Randy M. >> >> >> >> On Mon, Apr 18, 2011 at 6:54 AM, ilyas ilyas wrote: >> >>> Hi, >>> Thank you for your suggestion. 
I will take it into account. >>> Since changing this structure in my "massive" code may take too much >>> time, >>> I would like to know that how "row" is calculated in 3D, independently >>> from processor numbers. >>> >>> Regards, >>> Ilyas >>> >>> 2011/4/18 Matthew Knepley >>> >>>> On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: >>>> >>>>> Hi, >>>>> >>>>> In ex14f.F in KSP, "row" variable is calculated either >>>>> >>>> >>>> These are very old. I suggest you use the FormFunctionLocal() approach >>>> in ex5f.F which >>>> does not calculate global row numbers when using a DA. >>>> >>>> Matt >>>> >>>> >>>>> 349: do 30 j=ys,ys+ym-1 >>>>> 350: ... >>>>> 351: do 40 i=xs,xs+xm-1 >>>>> 352: row = i - gxs + (j - gys)*gxm + 1 >>>>> >>>>> or >>>>> >>>>> 442: do 50 j=ys,ys+ym-1 >>>>> 443: ... >>>>> 444: row = (j - gys)*gxm + xs - gxs >>>>> 445: do 60 i=xs,xs+xm-1 >>>>> 446: row = row + 1 >>>>> >>>>> How can I calculate "row" in 3D ? >>>>> >>>>> I tried this; >>>>> >>>>> do k=zs,zs+zm-1 >>>>> do j=ys,ys+ym-1 >>>>> do i=xs,xs+xm-1 >>>>> >>>>> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >>>>> >>>>> It does not work for certain number of processors. >>>>> >>>>> >>>>> Thanks, >>>>> >>>>> Ilyas >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From khalid_eee at yahoo.com Tue Apr 19 18:26:00 2011 From: khalid_eee at yahoo.com (khalid ashraf) Date: Tue, 19 Apr 2011 16:26:00 -0700 (PDT) Subject: [petsc-users] DMMG with PBC In-Reply-To: References: Message-ID: <369201.93548.qm@web112604.mail.gq1.yahoo.com> >>How much of a difference? With the applied XYZPeriodic, if I keep the follwoing lines if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); ierr = MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); } else Then the error between 1proc and 4 procs is only after 5th decimal point. However, if I comment out the above lines then the results are completely different for 1 and 4 procs. I am attaching the output of some last data points of a 10X10X8 grid. 1 proc output: -4.40214 -4.39202 -4.38693 -4.38547 -4.38687 -4.39047 4 proc output: 0.000188031 0.000169784 0.000157229 0.000178713 0.000179637 0.000188031 0.000169784 0.000157229 0.000178713 0.000179637 I am attaching the faulty code here for your review. Thanks. 
Khalid static char help[] = "Solves 3D Laplacian using multigrid.\n\n"; #include "petscda.h" #include "petscksp.h" #include "petscdmmg.h" #include "myHeaderfile.h" extern PetscErrorCode ComputeMatrix(DMMG,Mat,Mat); extern PetscErrorCode ComputeRHS(DMMG,Vec); #undef __FUNCT__ #define __FUNCT__ "main" int main(int argc,char **argv) { PetscErrorCode ierr; DMMG *dmmg; PetscReal norm; DA da; PetscInitialize(&argc,&argv,(char *)0,help); ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); ierr = DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,8,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); ierr = DMMGSetDM(dmmg,(DM)da);CHKERRQ(ierr); // ierr = DADestroy(da);CHKERRQ(ierr); ierr = DMMGSetKSP(dmmg,ComputeRHS,ComputeMatrix);CHKERRQ(ierr); ierr =DMMGSetNullSpace(dmmg,PETSC_TRUE,0,PETSC_NULL); ierr = DMMGSetUp(dmmg);CHKERRQ(ierr); ierr = DMMGSolve(dmmg);CHKERRQ(ierr); ierr = MatMult(DMMGGetJ(dmmg),DMMGGetx(dmmg),DMMGGetr(dmmg));CHKERRQ(ierr); ierr = VecAXPY(DMMGGetr(dmmg),-1.0,DMMGGetRHS(dmmg));CHKERRQ(ierr); ierr = VecNorm(DMMGGetr(dmmg),NORM_2,&norm);CHKERRQ(ierr); /* ierr = PetscPrintf(PETSC_COMM_WORLD,"Residual norm %G\n",norm);CHKERRQ(ierr); */ ierr=VecView_VTK(DMMGGetx(dmmg),"X",&appctx); ierr = DMMGDestroy(dmmg);CHKERRQ(ierr); ierr = PetscFinalize();CHKERRQ(ierr); return 0; } #undef __FUNCT__ #define __FUNCT__ "ComputeRHS" PetscErrorCode ComputeRHS(DMMG dmmg,Vec b) { PetscErrorCode ierr; PetscInt mx,my,mz; PetscScalar h; PetscFunctionBegin; ierr = DAGetInfo((DA)dmmg->dm,0,&mx,&my,&mz,0,0,0,0,0,0,0);CHKERRQ(ierr); h = 10.0/((mx-1)*(my-1)*(mz-1)); ierr = VecSet(b,h);CHKERRQ(ierr); PetscFunctionReturn(0); } #undef __FUNCT__ #define __FUNCT__ "ComputeMatrix" PetscErrorCode ComputeMatrix(DMMG dmmg,Mat jac,Mat B) { DA da = (DA)dmmg->dm; PetscErrorCode ierr; PetscInt i,j,k,mx,my,mz,xm,ym,zm,xs,ys,zs; PetscScalar v[7],Hx,Hy,Hz,HxHydHz,HyHzdHx,HxHzdHy; MatStencil row,col[7]; ierr = DAGetInfo(da,0,&mx,&my,&mz,0,0,0,0,0,0,0);CHKERRQ(ierr); Hx = 1.0 / (PetscReal)(mx-1); Hy = 1.0 / (PetscReal)(my-1); Hz = 1.0 / (PetscReal)(mz-1); HxHydHz = Hx*Hy/Hz; HxHzdHy = Hx*Hz/Hy; HyHzdHx = Hy*Hz/Hx; ierr = DAGetCorners(da,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr); for (k=zs; k From knepley at gmail.com Tue Apr 19 18:31:17 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Apr 2011 18:31:17 -0500 Subject: [petsc-users] DMMG with PBC In-Reply-To: <369201.93548.qm@web112604.mail.gq1.yahoo.com> References: <369201.93548.qm@web112604.mail.gq1.yahoo.com> Message-ID: On Tue, Apr 19, 2011 at 6:26 PM, khalid ashraf wrote: > >>How much of a difference? > With the applied XYZPeriodic, if I keep the follwoing lines > > if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ > v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); > ierr = > MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); > } else > > Then the error between 1proc and 4 procs is only after 5th decimal point. > However, if I comment out the > above lines then the results are completely different for 1 and 4 procs. > Yes, without this the problem is rank deficient. I suspect that your 5th decimal place difference comes from either a) different partitions b) different convergence (stop on a different iterate) or c) parallel reordering (but it seems big). Matt > I am attaching the output of some last data points of a 10X10X8 grid. 
> 1 proc output: > -4.40214 > -4.39202 > -4.38693 > -4.38547 > -4.38687 > -4.39047 > > 4 proc output: > 0.000188031 > 0.000169784 > 0.000157229 > 0.000178713 > 0.000179637 > 0.000188031 > 0.000169784 > 0.000157229 > 0.000178713 > 0.000179637 > > I am attaching the faulty code here for your review. > > Thanks. > > Khalid > > static char help[] = "Solves 3D Laplacian using multigrid.\n\n"; > > #include "petscda.h" > #include "petscksp.h" > #include "petscdmmg.h" > #include "myHeaderfile.h" > > extern PetscErrorCode ComputeMatrix(DMMG,Mat,Mat); > extern PetscErrorCode ComputeRHS(DMMG,Vec); > > #undef __FUNCT__ > #define __FUNCT__ "main" > int main(int argc,char **argv) > { > PetscErrorCode ierr; > DMMG *dmmg; > PetscReal norm; > DA da; > > PetscInitialize(&argc,&argv,(char *)0,help); > > ierr = DMMGCreate(PETSC_COMM_WORLD,1,PETSC_NULL,&dmmg);CHKERRQ(ierr); > ierr = > DACreate3d(PETSC_COMM_WORLD,DA_XYZPERIODIC,DA_STENCIL_STAR,10,10,8,PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,1,1,0,0,0,&da);CHKERRQ(ierr); > ierr = DMMGSetDM(dmmg,(DM)da);CHKERRQ(ierr); > // ierr = DADestroy(da);CHKERRQ(ierr); > > ierr = DMMGSetKSP(dmmg,ComputeRHS,ComputeMatrix);CHKERRQ(ierr); > ierr =DMMGSetNullSpace(dmmg,PETSC_TRUE,0,PETSC_NULL); > > ierr = DMMGSetUp(dmmg);CHKERRQ(ierr); > ierr = DMMGSolve(dmmg);CHKERRQ(ierr); > > ierr = > MatMult(DMMGGetJ(dmmg),DMMGGetx(dmmg),DMMGGetr(dmmg));CHKERRQ(ierr); > ierr = VecAXPY(DMMGGetr(dmmg),-1.0,DMMGGetRHS(dmmg));CHKERRQ(ierr); > ierr = VecNorm(DMMGGetr(dmmg),NORM_2,&norm);CHKERRQ(ierr); > /* ierr = PetscPrintf(PETSC_COMM_WORLD,"Residual norm > %G\n",norm);CHKERRQ(ierr); */ > ierr=VecView_VTK(DMMGGetx(dmmg),"X",&appctx); > > ierr = DMMGDestroy(dmmg);CHKERRQ(ierr); > ierr = PetscFinalize();CHKERRQ(ierr); > > return 0; > } > > #undef __FUNCT__ > #define __FUNCT__ "ComputeRHS" > PetscErrorCode ComputeRHS(DMMG dmmg,Vec b) > { > PetscErrorCode ierr; > PetscInt mx,my,mz; > PetscScalar h; > > PetscFunctionBegin; > ierr = DAGetInfo((DA)dmmg->dm,0,&mx,&my,&mz,0,0,0,0,0,0,0);CHKERRQ(ierr); > h = 10.0/((mx-1)*(my-1)*(mz-1)); > ierr = VecSet(b,h);CHKERRQ(ierr); > PetscFunctionReturn(0); > } > > #undef __FUNCT__ > #define __FUNCT__ "ComputeMatrix" > PetscErrorCode ComputeMatrix(DMMG dmmg,Mat jac,Mat B) > { > DA da = (DA)dmmg->dm; > PetscErrorCode ierr; > PetscInt i,j,k,mx,my,mz,xm,ym,zm,xs,ys,zs; > PetscScalar v[7],Hx,Hy,Hz,HxHydHz,HyHzdHx,HxHzdHy; > MatStencil row,col[7]; > > ierr = DAGetInfo(da,0,&mx,&my,&mz,0,0,0,0,0,0,0);CHKERRQ(ierr); > Hx = 1.0 / (PetscReal)(mx-1); Hy = 1.0 / (PetscReal)(my-1); Hz = 1.0 / > (PetscReal)(mz-1); > HxHydHz = Hx*Hy/Hz; HxHzdHy = Hx*Hz/Hy; HyHzdHx = Hy*Hz/Hx; > ierr = DAGetCorners(da,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr); > > for (k=zs; k for (j=ys; j for(i=xs; i row.i = i; row.j = j; row.k = k; > /* if (i==0 || j==0 || k==0 || i==mx-1 || j==my-1 || k==mz-1){ > v[0] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx); > ierr = > MatSetValuesStencil(B,1,&row,1,&row,v,INSERT_VALUES);CHKERRQ(ierr); > } else */ > { > v[0] = -HxHydHz;col[0].i = i; col[0].j = j; col[0].k = k-1; > v[1] = -HxHzdHy;col[1].i = i; col[1].j = j-1; col[1].k = k; > v[2] = -HyHzdHx;col[2].i = i-1; col[2].j = j; col[2].k = k; > v[3] = 2.0*(HxHydHz + HxHzdHy + HyHzdHx);col[3].i = row.i; > col[3].j = row.j; col[3].k = row.k; > v[4] = -HyHzdHx;col[4].i = i+1; col[4].j = j; col[4].k = k; > v[5] = -HxHzdHy;col[5].i = i; col[5].j = j+1; col[5].k = k; > v[6] = -HxHydHz;col[6].i = i; col[6].j = j; col[6].k = k+1; > ierr = > MatSetValuesStencil(B,1,&row,7,col,v,INSERT_VALUES);CHKERRQ(ierr); > } > } > } > } > ierr = 
MatAssemblyBegin(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(B,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > return 0; > } > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Debao.Shao at brion.com Tue Apr 19 20:31:11 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 19 Apr 2011 18:31:11 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> Hi, Barry: Thanks for the reply. I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. Any suggestions? Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Friday, April 15, 2011 9:25 PM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime Debao, Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. Barry On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > Dear Petsc: > > I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. > > My libpetsc.a is built as follows: > 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 > 2, make all; > > It's very appreciated to get your reply. > > Thanks a lot, > Debao > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From bsmith at mcs.anl.gov Tue Apr 19 20:40:05 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Apr 2011 20:40:05 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> Message-ID: <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > Hi, Barry: > > Thanks for the reply. > > I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). Barry PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > > Any suggestions? > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Friday, April 15, 2011 9:25 PM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > Debao, > > Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. > > Barry > > On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > >> Dear Petsc: >> >> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >> >> My libpetsc.a is built as follows: >> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >> 2, make all; >> >> It's very appreciated to get your reply. >> >> Thanks a lot, >> Debao >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From Debao.Shao at brion.com Tue Apr 19 20:50:14 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 19 Apr 2011 18:50:14 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> Here is my sample code: ierr = MatZeroEntries( M ); assert( ierr == 0); ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, April 20, 2011 9:40 AM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > Hi, Barry: > > Thanks for the reply. > > I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). Barry PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > > Any suggestions? 
> > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Friday, April 15, 2011 9:25 PM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > Debao, > > Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. > > Barry > > On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > >> Dear Petsc: >> >> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >> >> My libpetsc.a is built as follows: >> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >> 2, make all; >> >> It's very appreciated to get your reply. >> >> Thanks a lot, >> Debao >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. 
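A minimal sketch of the assembly pattern Barry describes above, written against the PETSc 3.1-era C API used in this thread. It preallocates, inserts an explicit zero into every location that may later hold a value, and only then calls MatAssemblyBegin/End, so the assembled nonzero pattern is retained and later MatSetValues() calls into those locations do not trigger reallocation; MAT_KEEP_NONZERO_PATTERN is set in case MatZeroRows() is used afterwards. The matrix size, the tridiagonal pattern, and all variable names are invented for illustration and are not taken from Debao's code.

/* Sketch: preallocate, then fill every potentially nonzero location with
   an explicit zero before the first MatAssemblyEnd(), so the pattern is
   kept and subsequent MatSetValues() calls do not reallocate. */
#include "petscmat.h"

int main(int argc,char **argv)
{
  Mat            A;
  PetscInt       i,ncols,cols[3],n = 10;
  PetscScalar    zeros[3] = {0.0,0.0,0.0};
  PetscErrorCode ierr;

  PetscInitialize(&argc,&argv,(char *)0,0);
  /* preallocate at most 3 nonzeros per row */
  ierr = MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,3,PETSC_NULL,&A);CHKERRQ(ierr);
  /* keep the assembled pattern if MatZeroRows() is called later */
  ierr = MatSetOption(A,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE);CHKERRQ(ierr);
  for (i=0; i<n; i++) {
    ncols = 0;
    if (i > 0)   cols[ncols++] = i-1;
    cols[ncols++] = i;
    if (i < n-1) cols[ncols++] = i+1;
    /* explicit zeros lock these locations into the nonzero pattern */
    ierr = MatSetValues(A,1,&i,ncols,cols,zeros,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* from here on, writing into (i,i-1), (i,i), (i,i+1) should be malloc-free */
  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}

With this in place, repeated assembly passes that write into the same locations should no longer show MatSeqXAIJReallocateAIJ in the profile.
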
From Debao.Shao at brion.com Tue Apr 19 21:08:49 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 19 Apr 2011 19:08:49 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E69C8@EX03> Dear Barry: If I add "MatSetOption(C,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE)" before " MatZeroEntries", need I reset it back when doing MatCopy? I'm a freshman to PETSC, your reply is very appreciated. Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Debao Shao Sent: Wednesday, April 20, 2011 9:50 AM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime Here is my sample code: ierr = MatZeroEntries( M ); assert( ierr == 0); ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? Thanks, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, April 20, 2011 9:40 AM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > Hi, Barry: > > Thanks for the reply. > > I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). Barry PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > > Any suggestions? > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Friday, April 15, 2011 9:25 PM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > Debao, > > Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. 
> > Barry > > On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > >> Dear Petsc: >> >> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >> >> My libpetsc.a is built as follows: >> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >> 2, make all; >> >> It's very appreciated to get your reply. >> >> Thanks a lot, >> Debao >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. 
To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From knepley at gmail.com Tue Apr 19 21:10:29 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Apr 2011 21:10:29 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E69C8@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <384FF55F15E3E447802DC8CCA85696980AAB3E69C8@EX03> Message-ID: On Tue, Apr 19, 2011 at 9:08 PM, Debao Shao wrote: > Dear Barry: > > If I add "MatSetOption(C,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE)" before " > MatZeroEntries", need I reset it back when doing MatCopy? > No. Matt > I'm a freshman to PETSC, your reply is very appreciated. > > Thanks, > Debao > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] On Behalf Of Debao Shao > Sent: Wednesday, April 20, 2011 9:50 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 > percentage of runtime > > Here is my sample code: > ierr = MatZeroEntries( M ); assert( ierr == 0); > ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); > ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); > > I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called > MatAssembly***, Is the usage wrong, or, how to deal with the problem? > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 9:40 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 > percentage of runtime > > > On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > > > Hi, Barry: > > > > Thanks for the reply. > > > > I preallocated enough space for the sparse matrix, but I found > mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax > less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is > called frequently when doing MatSetValues again to the matrix. > > Are you using MatZeroRows()? If so call MatSetOption(mat, > MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that > structure. > > If you are not using MatZeroRows() then apparently the first time you > set values in there and call MatAssemblyEnd() you have left many locations > that later will be filled unfilled and so they are eliminated at MatAssembly > time. You must make sure that all potentially nonzero locations get a value > put in initially (put zero for the locations that you don't yet have a > value for) before you first call MatAssemblyEnd(). > > Barry > > PETSc matrices have no way of retaining extra locations you preallocated > for unless you put something (like 0) in there. > > > > > Any suggestions? 
> > > > Thanks, > > Debao > > -----Original Message----- > > From: petsc-users-bounces at mcs.anl.gov [mailto: > petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > > Sent: Friday, April 15, 2011 9:25 PM > > To: PETSc users list > > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 > percentage of runtime > > > > > > Debao, > > > > Please see > http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assemblyIt should resolve the difficulties. > > > > Barry > > > > On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: > > > >> Dear Petsc: > >> > >> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange > that the two functions "MatCopy" and "MatSetValue" consume most of runtime, > and the functions were not called frequently, just several times. > >> > >> My libpetsc.a is built as follows: > >> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 > -with-info=0 > >> 2, make all; > >> > >> It's very appreciated to get your reply. > >> > >> Thanks a lot, > >> Debao > >> > >> -- The information contained in this communication and any attachments > is confidential and may be privileged, and is for the sole use of the > intended recipient(s). Any unauthorized review, use, disclosure or > distribution is prohibited. Unless explicitly stated otherwise in the body > of this communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > > > > > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. 
ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > > -- The information contained in this communication and any attachments is > confidential and may be privileged, and is for the sole use of the intended > recipient(s). Any unauthorized review, use, disclosure or distribution is > prohibited. Unless explicitly stated otherwise in the body of this > communication or the attachment thereto (if any), the information is > provided on an AS-IS basis without any express or implied warranties or > liabilities. To the extent you are relying on this information, you are > doing so at your own risk. If you are not the intended recipient, please > notify the sender immediately by replying to this message and destroy all > copies of this message and any attachments. ASML is neither liable for the > proper and complete transmission of the information contained in this > communication, nor for any delay in its receipt. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Apr 19 21:58:30 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Apr 2011 21:58:30 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> Message-ID: <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > Here is my sample code: > ierr = MatZeroEntries( M ); assert( ierr == 0); > ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); > ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); > > I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. Barry I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 9:40 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > >> Hi, Barry: >> >> Thanks for the reply. 
>> >> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. > > Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. > > If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). > > Barry > > PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > >> >> Any suggestions? >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Friday, April 15, 2011 9:25 PM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> Debao, >> >> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. >> >> Barry >> >> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >> >>> Dear Petsc: >>> >>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>> >>> My libpetsc.a is built as follows: >>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>> 2, make all; >>> >>> It's very appreciated to get your reply. >>> >>> Thanks a lot, >>> Debao >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From vijay.m at gmail.com Tue Apr 19 22:08:46 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Tue, 19 Apr 2011 22:08:46 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: > ? Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. Barry, just to confirm, are you saying that MatZeroEntries would nullify the preallocation completely if called before AssmeblyBegin/End ? I have been doing this quite often before a linear system assembly and have not noticed extra mallocs during the process. Is there something that I am misunderstanding in the above statement ? I would much appreciate if you can clarify. Thanks, Vijay On Tue, Apr 19, 2011 at 9:58 PM, Barry Smith wrote: > > On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > >> Here is my sample code: >> ?ierr = MatZeroEntries( M ); assert( ierr ?== 0); >> ?ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); >> ?ierr = MatCopy( ms->M, mStorage->M, ?DIFFERENT_NONZERO_PATTERN ); >> >> I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? > > ? Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. > > ? 
?Barry > > I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. ?Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. > >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Wednesday, April 20, 2011 9:40 AM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: >> >>> Hi, Barry: >>> >>> Thanks for the reply. >>> >>> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. >> >> ? Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. >> >> ? ?If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. ?You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have ?a value for) before you first call MatAssemblyEnd(). >> >> ? Barry >> >> PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. >> >>> >>> Any suggestions? >>> >>> Thanks, >>> Debao >>> -----Original Message----- >>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>> Sent: Friday, April 15, 2011 9:25 PM >>> To: PETSc users list >>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>> >>> >>> ?Debao, >>> >>> ? ? Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. >>> >>> ? Barry >>> >>> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >>> >>>> Dear Petsc: >>>> >>>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>>> >>>> My libpetsc.a is built as follows: >>>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>>> 2, make all; >>>> >>>> It's very appreciated to get your reply. >>>> >>>> Thanks a lot, >>>> Debao >>>> >>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. 
If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >>> >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > From Debao.Shao at brion.com Tue Apr 19 22:21:34 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Tue, 19 Apr 2011 20:21:34 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E6A28@EX03> Hi, Barry: I'm confused, 1), if we can't use "MatZeroEntries" before MatAssembly, then, how do we do initialization for M? 2), if we can't use "MatCopy" before MatAssembly, then, how to fill up M from another matrix? Can you give a sample code for the right usage? Thanks very much. 
Regards, Debao -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, April 20, 2011 10:59 AM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > Here is my sample code: > ierr = MatZeroEntries( M ); assert( ierr == 0); > ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); > ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); > > I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. Barry I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. > > Thanks, > Debao > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 9:40 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: > >> Hi, Barry: >> >> Thanks for the reply. >> >> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. > > Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. > > If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). > > Barry > > PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. > >> >> Any suggestions? >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Friday, April 15, 2011 9:25 PM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> Debao, >> >> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. 
>> >> Barry >> >> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >> >>> Dear Petsc: >>> >>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>> >>> My libpetsc.a is built as follows: >>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>> 2, make all; >>> >>> It's very appreciated to get your reply. >>> >>> Thanks a lot, >>> Debao >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. 
To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From jed at 59A2.org Wed Apr 20 05:05:54 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 12:05:54 +0200 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: On Wed, Apr 20, 2011 at 05:08, Vijay S. Mahadevan wrote: > Barry, just to confirm, are you saying that MatZeroEntries would > nullify the preallocation completely if called before > AssmeblyBegin/End ? > I can't think of a way that would happen. It may zero more entries than necessary, but it shouldn't forget the preallocation. Note that the preallocation in DMGetMatrix() and many other libraries actually insert explicit zeros in the locations it has preallocated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 20 05:07:15 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 12:07:15 +0200 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> Message-ID: On Wed, Apr 20, 2011 at 03:50, Debao Shao wrote: > ierr = MatZeroEntries( M ); assert( ierr == 0); > ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); > ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); > Is M somehow related to ms->M or mStorage->M? What do you actually want to do? -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Wed Apr 20 06:46:35 2011 From: vijay.m at gmail.com (Vijay S. Mahadevan) Date: Wed, 20 Apr 2011 06:46:35 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: Thanks for the clarification Jed. But zeroing more entries than necessary would still trigger malloc calls. Would it not ? Vijay On Apr 20, 2011 5:05 AM, "Jed Brown" wrote: > On Wed, Apr 20, 2011 at 05:08, Vijay S. Mahadevan wrote: > >> Barry, just to confirm, are you saying that MatZeroEntries would >> nullify the preallocation completely if called before >> AssmeblyBegin/End ? >> > > I can't think of a way that would happen. It may zero more entries than > necessary, but it shouldn't forget the preallocation. 
Note that the > preallocation in DMGetMatrix() and many other libraries actually insert > explicit zeros in the locations it has preallocated. -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomas.witkowski at tu-dresden.de Wed Apr 20 06:55:25 2011 From: thomas.witkowski at tu-dresden.de (Thomas Witkowski) Date: Wed, 20 Apr 2011 13:55:25 +0200 Subject: [petsc-users] FETI-DP In-Reply-To: References: <4DA6F440.4000204@tu-dresden.de> Message-ID: <4DAEC9AD.2010509@tu-dresden.de> There one small thing on the implementation details of the FETI-DP, I cannot figure out. Maybe some of you could help me to understand it, though it is not directly related to PETSc. Non of the publications says something about how to distribute the Lagrange multipliers over the processors. Is there any good way to do it or can it done arbitrarily? And should be the jump operators B^i be directly assembled or should they be implemented in a matrix-free way? I'm confuse because in the work of Klawoon/Rheinbach, it is claimed that the following operator can be solved in a pure local way: F = \sum_{i=1}^{N} B^i inv(K_BB^i) trans(B^i) With B^i the jump operators and K_BB^i the discretization of the sub domains with the primal nodes. From the notation it follows that EACH local solve takes the whole vector of Lagrange multipliers. But this is not applicable for a good parallel implementation. Any hint on this topic would be helpful for me to understand this problem. Thomas Jed Brown wrote: > On Thu, Apr 14, 2011 at 15:18, Thomas Witkowski > > wrote: > > Has anybody of you implemented the FETI-DP method in PETSc? I > think about to do this for my FEM code, but first I want to > evaluate the effort of the implementation. > > > There are a few implementations out there. Probably most notable is > Axel Klawonn and Oliver Rheinbach's implementation which has been > scaled up to very large problems and computers. My understanding is > that Xuemin Tu did some work on BDDC (equivalent to FETI-DP) using > PETSc. I am not aware of anyone releasing a working FETI-DP > implementation using PETSc, but of course you're welcome to ask these > people if they would share code with you. > > > What sort of problems do you want it for (physics and mesh)? How are > you currently assembling your systems? A fully general FETI-DP > implementation is a lot of work. For a specific class of problems and > variant of FETI-DP, it will still take some effort, but should not be > too much. > > There was a start to a FETI-DP implementation in PETSc quite a while > ago, but it died due to bitrot and different ideas of how we would > like to implement. You can get that code from mercurial: > > http://petsc.cs.iit.edu/petsc/petsc-dev/rev/021f379b5eea > > > The fundamental ingredient of these methods is a "partially assembled" > matrix. For a library implementation, the challenges are > > 1. How does the user provide the information necessary to decide what > the coarse space looks like? (It's different for scalar problems, > compressible elasticity, and Stokes, and tricky to do with no > geometric information from the user.) The coefficient structure in the > problem matters a lot when deciding which coarse basis functions to > use, see http://dx.doi.org/10.1016/j.cma.2006.03.023 > > 2. How do you handle primal basis functions with large support (e.g. > rigid body modes of a face)? Two choices here: > http://www.cs.nyu.edu/cs/faculty/widlund/FETI-DP-elasticity_TR.pdf . > > 3. 
How do you make it easy for the user to provide the required > matrix? Ideally, the user would just use plain MatSetValuesLocal() and > run with -mat_type partially-assembled -pc_type fetidp instead of, say > -mat_type baij -pc_type asm. It should work for multiple subdomains > per process and subdomains spanning multiple processes. This can now > be done by implementing MatGetLocalSubMatrix(). The local blocks of > the partially assembled system should be able to use different formats > (e.g. SBAIJ). > > 4. How do you handle more than two levels? This is very important to > use more than about 1000 subdomains in 3D because the coarse problem > just gets too big (unless the coarse problem happens to be > well-conditioned enough that you can use algebraic multigrid). > > > I've wanted to implement FETI-DP in PETSc for almost two years, but > it's never been a high priority. I think I now know how to get enough > flexibility to make it worthwhile to me. I'd be happy to discuss > implementation issues with you. From jed at 59A2.org Wed Apr 20 07:02:43 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 14:02:43 +0200 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: On Wed, Apr 20, 2011 at 13:46, Vijay S. Mahadevan wrote: > Thanks for the clarification Jed. But zeroing more entries than necessary > would still trigger malloc calls. Would it not ? It won't zero more than you allocated but it might zero more than you will actually insert. It doesn't matter. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 20 07:43:46 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 14:43:46 +0200 Subject: [petsc-users] FETI-DP In-Reply-To: <4DAEC9AD.2010509@tu-dresden.de> References: <4DA6F440.4000204@tu-dresden.de> <4DAEC9AD.2010509@tu-dresden.de> Message-ID: Thomas, we should move this discussion to petsc-dev, are you subscribed to that list? On Wed, Apr 20, 2011 at 13:55, Thomas Witkowski < thomas.witkowski at tu-dresden.de> wrote: > There one small thing on the implementation details of the FETI-DP, I > cannot figure out. Maybe some of you could help me to understand it, though > it is not directly related to PETSc. Non of the publications says something > about how to distribute the Lagrange multipliers over the processors. Is > there any good way to do it or can it done arbitrarily? > All their work that I have seen assumes a fully redundant set of Lagrange multipliers. In that context, each Lagrange multiplier only ever couples two subdomains together. Either process can then take ownership of that single Lagrange multiplier. > And should be the jump operators B^i be directly assembled or should they > be implemented in a matrix-free way? > Usually these constraints are sparse so I think it is no problem to assume that they are always assembled. > I'm confuse because in the work of Klawoon/Rheinbach, it is claimed that > the following operator can be solved in a pure local way: > > F = \sum_{i=1}^{N} B^i inv(K_BB^i) trans(B^i) > Did they use "F" for this thing? 
Usually F is the FETI-DP operator which involves a Schur complement of the entire partially assembled operator in the dual space. In any case, this thing is not purely local since the jump operators B^i need neighboring values so it has the same communication as a MatMult. > With B^i the jump operators and K_BB^i the discretization of the sub > domains with the primal nodes. > I think you mean "with the primal nodes removed". > From the notation it follows that EACH local solve takes the whole vector > of Lagrange multipliers. But this is not applicable for a good parallel > implementation. Any hint on this topic would be helpful for me to understand > this problem. > I can't tell from their papers how B is stored. It would be natural to simply store B as a normal assembled matrix with a standard row partition of the Lagrange multipliers. Then you would apply the subdomain solve operator using MatMultTranspose(B,XLambdaGlobal,XGlobal); for (i=0; i From bsmith at mcs.anl.gov Wed Apr 20 08:13:36 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Apr 2011 08:13:36 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> Message-ID: On Apr 19, 2011, at 10:08 PM, Vijay S. Mahadevan wrote: >> Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. > > Barry, just to confirm, are you saying that MatZeroEntries would > nullify the preallocation completely if called before > AssmeblyBegin/End ? I have been doing this quite often before a linear > system assembly and have not noticed extra mallocs during the process. > Is there something that I am misunderstanding in the above statement ? My mistake. Yes if you call MatZeroEntries() on a matrix you have not started putting values in it will not destroy the preallocation information. Barry > I would much appreciate if you can clarify. > > Thanks, > Vijay > > On Tue, Apr 19, 2011 at 9:58 PM, Barry Smith wrote: >> >> On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: >> >>> Here is my sample code: >>> ierr = MatZeroEntries( M ); assert( ierr == 0); >>> ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); >>> ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); >>> >>> I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? >> >> Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. >> >> Barry >> >> I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. 
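For the application of F itself, a purely illustrative sketch of the pattern Jed outlines above might look like the function below. It is not code from this thread: it assumes one subdomain per process, that B is an ordinary assembled MPIAIJ matrix whose column ownership matches the subdomain interior unknowns, and that kspLocal is a KSP on PETSC_COMM_SELF that factors the local K_BB block; all names are made up, and the calls follow the petsc-3.1 interfaces used elsewhere in this archive.

#include "petscksp.h"

/* Apply F = sum_i B^i inv(K_BB^i) trans(B^i) to a global multiplier vector.
   xGlobal and yGlobal are work vectors laid out like the columns of B. */
PetscErrorCode ApplyF(Mat B,KSP kspLocal,Vec lambda,Vec Flambda,Vec xGlobal,Vec yGlobal)
{
  Vec            xLocal,yLocal;
  PetscScalar    *xa,*ya;
  PetscInt       nlocal;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  /* trans(B) lambda: the only communication, same pattern as a MatMult() */
  ierr = MatMultTranspose(B,lambda,xGlobal);CHKERRQ(ierr);

  /* wrap the local pieces of the parallel work vectors as sequential Vecs */
  ierr = VecGetLocalSize(xGlobal,&nlocal);CHKERRQ(ierr);
  ierr = VecGetArray(xGlobal,&xa);CHKERRQ(ierr);
  ierr = VecGetArray(yGlobal,&ya);CHKERRQ(ierr);
  ierr = VecCreateSeqWithArray(PETSC_COMM_SELF,nlocal,xa,&xLocal);CHKERRQ(ierr);
  ierr = VecCreateSeqWithArray(PETSC_COMM_SELF,nlocal,ya,&yLocal);CHKERRQ(ierr);

  /* inv(K_BB^i): an entirely local subdomain solve, no communication */
  ierr = KSPSolve(kspLocal,xLocal,yLocal);CHKERRQ(ierr);

  ierr = VecDestroy(xLocal);CHKERRQ(ierr);
  ierr = VecDestroy(yLocal);CHKERRQ(ierr);
  ierr = VecRestoreArray(xGlobal,&xa);CHKERRQ(ierr);
  ierr = VecRestoreArray(yGlobal,&ya);CHKERRQ(ierr);

  /* B y: again the same communication pattern as a MatMult() */
  ierr = MatMult(B,yGlobal,Flambda);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Only the two multiplications with B communicate; the KSPSolve() on PETSC_COMM_SELF is purely local, which is why no single process ever needs the whole vector of Lagrange multipliers.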
Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. >> >>> >>> Thanks, >>> Debao >>> -----Original Message----- >>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>> Sent: Wednesday, April 20, 2011 9:40 AM >>> To: PETSc users list >>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>> >>> >>> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: >>> >>>> Hi, Barry: >>>> >>>> Thanks for the reply. >>>> >>>> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. >>> >>> Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. >>> >>> If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). >>> >>> Barry >>> >>> PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. >>> >>>> >>>> Any suggestions? >>>> >>>> Thanks, >>>> Debao >>>> -----Original Message----- >>>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>>> Sent: Friday, April 15, 2011 9:25 PM >>>> To: PETSc users list >>>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>>> >>>> >>>> Debao, >>>> >>>> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. >>>> >>>> Barry >>>> >>>> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >>>> >>>>> Dear Petsc: >>>>> >>>>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>>>> >>>>> My libpetsc.a is built as follows: >>>>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>>>> 2, make all; >>>>> >>>>> It's very appreciated to get your reply. >>>>> >>>>> Thanks a lot, >>>>> Debao >>>>> >>>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. 
>>>> >>>> >>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >>> >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> From bsmith at mcs.anl.gov Wed Apr 20 08:18:56 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Apr 2011 08:18:56 -0500 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: <384FF55F15E3E447802DC8CCA85696980AAB3E6A28@EX03> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E6A28@EX03> Message-ID: <3FAF21D8-DC2B-40EE-A79B-1FCF4DCAABD6@mcs.anl.gov> On Apr 19, 2011, at 10:21 PM, Debao Shao wrote: > Hi, Barry: > > I'm confused, > 1), if we can't use "MatZeroEntries" before MatAssembly, then, how do we do initialization for M? When you create a sparse matrix it automatically has no non-zero values in it so there is no reason to call MatZeroEntries() on it. But I was wrong it is ok to call MatZeroEntries() on it and it will not destroy the preallocation > 2), if we can't use "MatCopy" before MatAssembly, then, how to fill up M from another matrix? You can copy, say A, to M with MatCopy() but M will get the same nonzero structure as A, if you provided "extra" preallocation information in M that will be lost in the copy. So it is not efficient to copy into a matrix M and then start putting as bunch of new nonzero locations into M. Barry > > Can you give a sample code for the right usage? Thanks very much. 
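As a rough illustration of the ordering Barry describes (this is not code from the thread: the sizes, the tridiagonal fill pattern, and all variable names are made up, and the calls follow the petsc-3.1 interfaces used in this discussion, where MatDestroy()/VecDestroy() still take the object itself):

#include "petscmat.h"

int main(int argc,char **argv)
{
  Mat            A,M;
  Vec            d;
  PetscInt       i,j,n = 100,Istart,Iend;
  PetscScalar    zero = 0.0;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);

  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  /* 1. Preallocate: here at most 3 nonzeros per row, on- and off-diagonal. */
  ierr = MatMPIAIJSetPreallocation(A,3,PETSC_NULL,3,PETSC_NULL);CHKERRQ(ierr);
  ierr = MatSeqAIJSetPreallocation(A,3,PETSC_NULL);CHKERRQ(ierr);

  /* 2. Put an explicit 0.0 into EVERY location that may later hold a value,
        so the first assembly does not squeeze those locations out. */
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart; i<Iend; i++) {
    for (j=PetscMax(i-1,0); j<=PetscMin(i+1,n-1); j++) {
      ierr = MatSetValues(A,1,&i,1,&j,&zero,INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* 3. Only now operate on the matrix: the nonzero pattern is fixed. */
  ierr = MatGetVecs(A,&d,PETSC_NULL);CHKERRQ(ierr);
  ierr = VecSet(d,1.0);CHKERRQ(ierr);
  ierr = MatZeroEntries(A);CHKERRQ(ierr);
  ierr = MatDiagonalSet(A,d,INSERT_VALUES);CHKERRQ(ierr);

  /* 4. Duplicating keeps the (already final) pattern, so later copies can
        use SAME_NONZERO_PATTERN instead of rebuilding the structure. */
  ierr = MatDuplicate(A,MAT_COPY_VALUES,&M);CHKERRQ(ierr);
  ierr = MatCopy(A,M,SAME_NONZERO_PATTERN);CHKERRQ(ierr);

  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = MatDestroy(M);CHKERRQ(ierr);
  ierr = VecDestroy(d);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}

The two key points are that every potentially nonzero location receives an explicit zero before the first MatAssemblyEnd(), and that MatZeroEntries(), MatDiagonalSet() and MatCopy() are only applied once the pattern has been fixed by that assembly.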
> > Regards, > Debao > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 10:59 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > >> Here is my sample code: >> ierr = MatZeroEntries( M ); assert( ierr == 0); >> ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); >> ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); >> >> I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? > > Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. > > Barry > > I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. > >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Wednesday, April 20, 2011 9:40 AM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: >> >>> Hi, Barry: >>> >>> Thanks for the reply. >>> >>> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. >> >> Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. >> >> If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). >> >> Barry >> >> PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. >> >>> >>> Any suggestions? >>> >>> Thanks, >>> Debao >>> -----Original Message----- >>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>> Sent: Friday, April 15, 2011 9:25 PM >>> To: PETSc users list >>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>> >>> >>> Debao, >>> >>> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. 
>>> >>> Barry >>> >>> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >>> >>>> Dear Petsc: >>>> >>>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>>> >>>> My libpetsc.a is built as follows: >>>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>>> 2, make all; >>>> >>>> It's very appreciated to get your reply. >>>> >>>> Thanks a lot, >>>> Debao >>>> >>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >>> >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >> >> >> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. > > > -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. 
Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. From domenico.borzacchiello at univ-st-etienne.fr Wed Apr 20 10:32:31 2011 From: domenico.borzacchiello at univ-st-etienne.fr (domenico.borzacchiello at univ-st-etienne.fr) Date: Wed, 20 Apr 2011 17:32:31 +0200 (CEST) Subject: [petsc-users] DaSetGetMatrix Message-ID: <02985683a2f41fe8ced5b122db8f4a1c.squirrel@arcon.univ-st-etienne.fr> Hi, I'm running my code (3D Stokes Solver with MAC arrangement) pretty fine so far with a FieldSplit/Schur Preconditioning. I had to write my own DAGetMatrix routine cause it was using too much memory (5 times the required size) for the matrices. The code works with 1 2 3 5 etc procs (I presume with any number of procs for which are only possible 1D cartesian topologies of communicators i.e. any prime numbers) then if I run with 4 procs for example it stops when assembling the MPIAIJ matrix with the following error: [1]PETSC ERROR: Nonconforming object sizes! [1]PETSC ERROR: Local scatter sizes don't match! What could be causing the error? Thank you, Domenico here's the getmatrix function I'm using #undef __FUNCT__ #define __FUNCT__ "DAGetMatrix_User_2" PetscErrorCode DAGetMatrix_User_2(DA da,const MatType mtype,Mat *J) { PetscErrorCode ierr; Mat A; PetscInt xm,ym,zm,dim,dof,starts[3],dims[3]; const MatType Atype; void (*aij)(void)=PETSC_NULL,(*baij)(void)=PETSC_NULL,(*sbaij)(void)=PETSC_NULL; ISLocalToGlobalMapping ltog,ltogb; PetscFunctionBegin; ierr = DAGetInfo(da,&dim, 0,0,0, 0,0,0,&dof,0,0,0);CHKERRQ(ierr); if (dim != 3) SETERRQ(PETSC_ERR_ARG_WRONG,"Expected DA to be 3D"); ierr = DAGetCorners(da,0,0,0,&zm,&ym,&xm);CHKERRQ(ierr); ierr = DAGetISLocalToGlobalMapping(da,<og);CHKERRQ(ierr); ierr = DAGetISLocalToGlobalMappingBlck(da,<ogb);CHKERRQ(ierr); ierr = MatCreate(((PetscObject)da)->comm,&A);CHKERRQ(ierr); ierr = MatSetSizes(A,dof*xm*ym*zm,dof*xm*ym*zm,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr); ierr = MatSetType(A,mtype);CHKERRQ(ierr); ierr = MatSetFromOptions(A);CHKERRQ(ierr); ierr = MatSeqAIJSetPreallocation(A,17,PETSC_NULL);CHKERRQ(ierr); ierr = MatMPIAIJSetPreallocation(A,17,PETSC_NULL,12,PETSC_NULL);CHKERRQ(ierr); ierr = MatSeqBAIJSetPreallocation(A,dof,7,PETSC_NULL);CHKERRQ(ierr); ierr = MatMPIBAIJSetPreallocation(A,dof,7,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr); ierr = MatSeqSBAIJSetPreallocation(A,dof,4,PETSC_NULL);CHKERRQ(ierr); ierr = MatMPISBAIJSetPreallocation(A,dof,4,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr); ierr = MatSetDA(A,da); ierr = MatSetFromOptions(A); ierr = MatGetType(A,&Atype); ierr = MatSetBlockSize(A,dof);CHKERRQ(ierr); ierr = MatSetLocalToGlobalMapping(A,ltog);CHKERRQ(ierr); ierr = MatSetLocalToGlobalMappingBlock(A,ltogb);CHKERRQ(ierr); ierr = DAGetGhostCorners(da,&starts[0],&starts[1],&starts[2],&dims[0],&dims[1],&dims[2]);CHKERRQ(ierr); ierr = MatSetStencil(A,dim,dims,starts,dof);CHKERRQ(ierr); *J = A; PetscFunctionReturn(0); } From agrayver at gfz-potsdam.de Wed Apr 20 10:31:56 2011 From: agrayver at gfz-potsdam.de (Alexander 
Grayver) Date: Wed, 20 Apr 2011 17:31:56 +0200 Subject: [petsc-users] complexity of solvers Message-ID: <4DAEFC6C.30906@gfz-potsdam.de> Hello, Probably my question might seem stupid, but I don't know better place to ask. I came across with paper in one of the referenced journal where authors claim that LU decomposition has complexity of O(n^1.5) and one solution using factorized matrix can be calculated in O(n*logn). They have sparse matrix with 13 nnz per row. What I've thought so far is that the complexity of the LU decomposition depends on the sparsity of the matrix and in worst case of dense matrix can be estimated as O(n^3). I have not seen any estimates of the LU decomposition complexity for sparse matrices. Is that possible at all? I also always assume the same situation for iterative solvers with the worst case of O(n^2) when the matrix is dense. Regards, Alexander From jed at 59A2.org Wed Apr 20 10:45:05 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 20 Apr 2011 17:45:05 +0200 Subject: [petsc-users] complexity of solvers In-Reply-To: <4DAEFC6C.30906@gfz-potsdam.de> References: <4DAEFC6C.30906@gfz-potsdam.de> Message-ID: On Wed, Apr 20, 2011 at 17:31, Alexander Grayver wrote: > I came across with paper in one of the referenced journal where authors > claim that LU decomposition has complexity of > O(n^1.5) and one solution using factorized matrix can be calculated in > O(n*logn). > These are the bounds for 2D problems with optimal ordering. For 3D, the bounds are O(n^2) time and O(n^{4/3}) space. Alan George, Joseph Liu, Computer Solution of Large Sparse Positive Definite Systems, Prentice-Hall, Englewood Cliffs, NJ, 1981. S.C. Eisenstat, M.H. Schultz, A.H. Sherman, Applications of an element model for Gaussian elimination, in: Sparse Matrix Computations (Proc. Symp., Argonne Nat. Lab., Lemont, Ill., 1975), Academic Press, New York, 1976, pp. 85?96. -------------- next part -------------- An HTML attachment was scrubbed... URL: From agrayver at gfz-potsdam.de Wed Apr 20 11:05:49 2011 From: agrayver at gfz-potsdam.de (Alexander Grayver) Date: Wed, 20 Apr 2011 18:05:49 +0200 Subject: [petsc-users] complexity of solvers In-Reply-To: References: <4DAEFC6C.30906@gfz-potsdam.de> Message-ID: <4DAF045D.7060609@gfz-potsdam.de> Thanks for references, Jed! Yes, they have 2D problem. Regards, Alexander On 20.04.2011 17:45, Jed Brown wrote: > On Wed, Apr 20, 2011 at 17:31, Alexander Grayver > > wrote: > > I came across with paper in one of the referenced journal where > authors claim that LU decomposition has complexity of > O(n^1.5) and one solution using factorized matrix can be > calculated in O(n*logn). > > > These are the bounds for 2D problems with optimal ordering. For 3D, > the bounds are O(n^2) time and O(n^{4/3}) space. > > Alan George, Joseph Liu, Computer Solution of Large Sparse Positive > Definite Systems, Prentice-Hall, Englewood Cliffs, NJ, 1981. > > S.C. Eisenstat, M.H. Schultz, A.H. Sherman, Applications of an element > model for Gaussian elimination, in: Sparse Matrix Computations (Proc. > Symp., Argonne Nat. Lab., Lemont, Ill., 1975), Academic Press, New > York, 1976, pp. 85?96. > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Wed Apr 20 11:31:03 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Apr 2011 11:31:03 -0500 Subject: [petsc-users] DaSetGetMatrix In-Reply-To: <02985683a2f41fe8ced5b122db8f4a1c.squirrel@arcon.univ-st-etienne.fr> References: <02985683a2f41fe8ced5b122db8f4a1c.squirrel@arcon.univ-st-etienne.fr> Message-ID: <8F411CF4-69B6-47E3-B563-75AF52F77BBA@mcs.anl.gov> Please send a complete error report with the entire error message to petsc-maint at mcs.anl.gov without the information we cannot even begin to guess what the issue is. Barry On Apr 20, 2011, at 10:32 AM, domenico.borzacchiello at univ-st-etienne.fr wrote: > Hi, > > I'm running my code (3D Stokes Solver with MAC arrangement) pretty fine so > far with a FieldSplit/Schur Preconditioning. > > I had to write my own DAGetMatrix routine cause it was using too much > memory (5 times the required size) for the matrices. The code works with > 1 2 3 5 etc procs (I presume with any number of procs for which are only > possible 1D cartesian topologies of communicators i.e. any prime numbers) > then if I run with 4 procs for example it stops when assembling the MPIAIJ > matrix with the following error: > > [1]PETSC ERROR: Nonconforming object sizes! > [1]PETSC ERROR: Local scatter sizes don't match! > > What could be causing the error? > > Thank you, > Domenico > > here's the getmatrix function I'm using > > #undef __FUNCT__ > #define __FUNCT__ "DAGetMatrix_User_2" > PetscErrorCode DAGetMatrix_User_2(DA da,const MatType mtype,Mat *J) > { > PetscErrorCode ierr; > Mat A; > PetscInt xm,ym,zm,dim,dof,starts[3],dims[3]; > const MatType Atype; > void > (*aij)(void)=PETSC_NULL,(*baij)(void)=PETSC_NULL,(*sbaij)(void)=PETSC_NULL; > ISLocalToGlobalMapping ltog,ltogb; > > PetscFunctionBegin; > ierr = DAGetInfo(da,&dim, 0,0,0, 0,0,0,&dof,0,0,0);CHKERRQ(ierr); > if (dim != 3) SETERRQ(PETSC_ERR_ARG_WRONG,"Expected DA to be 3D"); > > ierr = DAGetCorners(da,0,0,0,&zm,&ym,&xm);CHKERRQ(ierr); > ierr = DAGetISLocalToGlobalMapping(da,<og);CHKERRQ(ierr); > ierr = DAGetISLocalToGlobalMappingBlck(da,<ogb);CHKERRQ(ierr); > ierr = MatCreate(((PetscObject)da)->comm,&A);CHKERRQ(ierr); > ierr = > MatSetSizes(A,dof*xm*ym*zm,dof*xm*ym*zm,PETSC_DETERMINE,PETSC_DETERMINE);CHKERRQ(ierr); > ierr = MatSetType(A,mtype);CHKERRQ(ierr); > ierr = MatSetFromOptions(A);CHKERRQ(ierr); > ierr = MatSeqAIJSetPreallocation(A,17,PETSC_NULL);CHKERRQ(ierr); > ierr = > MatMPIAIJSetPreallocation(A,17,PETSC_NULL,12,PETSC_NULL);CHKERRQ(ierr); > ierr = MatSeqBAIJSetPreallocation(A,dof,7,PETSC_NULL);CHKERRQ(ierr); > ierr = > MatMPIBAIJSetPreallocation(A,dof,7,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr); > ierr = MatSeqSBAIJSetPreallocation(A,dof,4,PETSC_NULL);CHKERRQ(ierr); > ierr = > MatMPISBAIJSetPreallocation(A,dof,4,PETSC_NULL,0,PETSC_NULL);CHKERRQ(ierr); > ierr = MatSetDA(A,da); > ierr = MatSetFromOptions(A); > ierr = MatGetType(A,&Atype); > ierr = MatSetBlockSize(A,dof);CHKERRQ(ierr); > ierr = MatSetLocalToGlobalMapping(A,ltog);CHKERRQ(ierr); > ierr = MatSetLocalToGlobalMappingBlock(A,ltogb);CHKERRQ(ierr); > ierr = > DAGetGhostCorners(da,&starts[0],&starts[1],&starts[2],&dims[0],&dims[1],&dims[2]);CHKERRQ(ierr); > ierr = MatSetStencil(A,dim,dims,starts,dof);CHKERRQ(ierr); > *J = A; > PetscFunctionReturn(0); > } > From Debao.Shao at brion.com Wed Apr 20 20:23:43 2011 From: Debao.Shao at brion.com (Debao Shao) Date: Wed, 20 Apr 2011 18:23:43 -0700 Subject: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime In-Reply-To: 
<3FAF21D8-DC2B-40EE-A79B-1FCF4DCAABD6@mcs.anl.gov> References: <384FF55F15E3E447802DC8CCA85696980AAB2E2358@EX03> <4C61B97A-F095-41BC-992C-2A0A040A0838@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E699A@EX03> <39D3147F-560C-4EBF-A861-9B175E4D4783@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E69B6@EX03> <7E6BBB75-2C7D-4CE7-86BF-124DBFCE280E@mcs.anl.gov> <384FF55F15E3E447802DC8CCA85696980AAB3E6A28@EX03> <3FAF21D8-DC2B-40EE-A79B-1FCF4DCAABD6@mcs.anl.gov> Message-ID: <384FF55F15E3E447802DC8CCA85696980AAB3E6BE4@EX03> Understand, Barry, Thanks a lot. -----Original Message----- From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith Sent: Wednesday, April 20, 2011 9:19 PM To: PETSc users list Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime On Apr 19, 2011, at 10:21 PM, Debao Shao wrote: > Hi, Barry: > > I'm confused, > 1), if we can't use "MatZeroEntries" before MatAssembly, then, how do we do initialization for M? When you create a sparse matrix it automatically has no non-zero values in it so there is no reason to call MatZeroEntries() on it. But I was wrong it is ok to call MatZeroEntries() on it and it will not destroy the preallocation > 2), if we can't use "MatCopy" before MatAssembly, then, how to fill up M from another matrix? You can copy, say A, to M with MatCopy() but M will get the same nonzero structure as A, if you provided "extra" preallocation information in M that will be lost in the copy. So it is not efficient to copy into a matrix M and then start putting as bunch of new nonzero locations into M. Barry > > Can you give a sample code for the right usage? Thanks very much. > > Regards, > Debao > > -----Original Message----- > From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Wednesday, April 20, 2011 10:59 AM > To: PETSc users list > Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime > > > On Apr 19, 2011, at 8:50 PM, Debao Shao wrote: > >> Here is my sample code: >> ierr = MatZeroEntries( M ); assert( ierr == 0); >> ierr = MatDiagonalSet( M, vec.pv, INSERT_VALUES ); >> ierr = MatCopy( ms->M, mStorage->M, DIFFERENT_NONZERO_PATTERN ); >> >> I checked PETSC api, both "MatDiagonalSet" and "MatCopy" called MatAssembly***, Is the usage wrong, or, how to deal with the problem? > > Don't call MatCopy() on anything that you have NOT fully filled up, same with MatDiagonalSet() and MatZeroEntries(). You should only do those operations on matrices that you have filled up and called MatAssembly on already. You can use MatDiagonalSet() directly on a naked matrix but only if you are not putting other values in. > > Barry > > I know these "rules" may seem strange but the problem is that these operations need to know the nonzero pattern of the sparse matrix and since you haven't set anything much in them yet they have to assume it is empty and blast away the preallocation information. Basically you shouldn't do much in the way of operations on matrices until you've fully assembled them. 
> >> >> Thanks, >> Debao >> -----Original Message----- >> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >> Sent: Wednesday, April 20, 2011 9:40 AM >> To: PETSc users list >> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >> >> >> On Apr 19, 2011, at 8:31 PM, Debao Shao wrote: >> >>> Hi, Barry: >>> >>> Thanks for the reply. >>> >>> I preallocated enough space for the sparse matrix, but I found mat->data->imax changed after "MatAssemblyEnd", and it caused many rowmax less than the number of nonzeros per row, then "MatSeqXAIJReallocateAIJ" is called frequently when doing MatSetValues again to the matrix. >> >> Are you using MatZeroRows()? If so call MatSetOption(mat, MAT_KEEP_NONZERO_PATTERN) before calling the zero rows to retain that structure. >> >> If you are not using MatZeroRows() then apparently the first time you set values in there and call MatAssemblyEnd() you have left many locations that later will be filled unfilled and so they are eliminated at MatAssembly time. You must make sure that all potentially nonzero locations get a value put in initially (put zero for the locations that you don't yet have a value for) before you first call MatAssemblyEnd(). >> >> Barry >> >> PETSc matrices have no way of retaining extra locations you preallocated for unless you put something (like 0) in there. >> >>> >>> Any suggestions? >>> >>> Thanks, >>> Debao >>> -----Original Message----- >>> From: petsc-users-bounces at mcs.anl.gov [mailto:petsc-users-bounces at mcs.anl.gov] On Behalf Of Barry Smith >>> Sent: Friday, April 15, 2011 9:25 PM >>> To: PETSc users list >>> Subject: Re: [petsc-users] MatCopy and MatSetValue consume most 99 percentage of runtime >>> >>> >>> Debao, >>> >>> Please see http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#efficient-assembly It should resolve the difficulties. >>> >>> Barry >>> >>> On Apr 15, 2011, at 2:16 AM, Debao Shao wrote: >>> >>>> Dear Petsc: >>>> >>>> I'm trying on Petsc iterative solver(KSPCG & PCJACOBI), but it's strange that the two functions "MatCopy" and "MatSetValue" consume most of runtime, and the functions were not called frequently, just several times. >>>> >>>> My libpetsc.a is built as follows: >>>> 1, /config/configure.py --with-mpi=0 --with-debugging=0 -with-log=0 -with-info=0 >>>> 2, make all; >>>> >>>> It's very appreciated to get your reply. >>>> >>>> Thanks a lot, >>>> Debao >>>> >>>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). Any unauthorized review, use, disclosure or distribution is prohibited. Unless explicitly stated otherwise in the body of this communication or the attachment thereto (if any), the information is provided on an AS-IS basis without any express or implied warranties or liabilities. To the extent you are relying on this information, you are doing so at your own risk. If you are not the intended recipient, please notify the sender immediately by replying to this message and destroy all copies of this message and any attachments. ASML is neither liable for the proper and complete transmission of the information contained in this communication, nor for any delay in its receipt. >>> >>> >>> -- The information contained in this communication and any attachments is confidential and may be privileged, and is for the sole use of the intended recipient(s). 
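A short sketch of the MatZeroRows() advice quoted in this thread (illustrative only: A is assumed to be already preallocated, filled with explicit zeros where values will arrive later, and assembled; the call signatures are those of petsc-3.1, where MatZeroRows() takes only the diagonal value):

#include "petscmat.h"

/* Zero a set of rows (e.g. Dirichlet rows) without losing the preallocated
   nonzero pattern, so the rows can be refilled later with no new mallocs. */
PetscErrorCode ZeroRowsKeepPattern(Mat A,PetscInt nrows,const PetscInt rows[])
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = MatSetOption(A,MAT_KEEP_NONZERO_PATTERN,PETSC_TRUE);CHKERRQ(ierr);
  /* rows are zeroed and 1.0 is placed on their diagonal; the off-diagonal
     locations stay in the pattern as explicit zeros instead of being removed */
  ierr = MatZeroRows(A,nrows,rows,1.0);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

With the option set, refilling those rows on a later MatSetValues()/MatAssemblyEnd() pass should no longer trigger the MatSeqXAIJReallocateAIJ() calls mentioned earlier in the thread.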
From zonexo at gmail.com Thu Apr 21 07:07:06 2011 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 21 Apr 2011 14:07:06 +0200 Subject: [petsc-users] Improving performance for parallel CFD simulation In-Reply-To: References: <4DA818AE.80900@tu-dresden.de> <4DA8232C.6070007@tu-dresden.de> Message-ID: <4DB01DEA.5050009@gmail.com> Hi, Since there is a similar topic on improving performance of CFD using PETSc earlier, I hope to improve my current CFD solver performance too. I wrote it in Fortran90 with MPI. It is a Immersed Boundary Method (IBM) Navier-Stokes Cartesian grid solver in 2D, although I hope to extend it to 3D in the future. I solve the NS equations using fractional step which results in 2 equations - the momentum and Poisson equations. They are then linearized into systems of equations. I currently solve the momentum solver using PETSc with KSPBCGS. For the Poisson equation, I was using hypre's BoomerAMG. But I have changed to using the geometric multigrid solver from hypre since it's slightly faster. Currently, I am dividing my grid along the y direction for MPI into equal size for each processor. I guess this is not very efficient since beyond 4 processors, the scaling factor drops. I think implementing the distributed array should increase performance, is that so? I wonder how difficult it is because most examples are in C and I am not so used to that. I am also using staggered grid but I will most likely changed to a collocated grid arrangement. What other suggestions do you have to improve the solver's performance using PETSc? Thank you very much. Yours sincerely, TAY wee-beng From knepley at gmail.com Thu Apr 21 07:28:45 2011 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Apr 2011 07:28:45 -0500 Subject: [petsc-users] Improving performance for parallel CFD simulation In-Reply-To: <4DB01DEA.5050009@gmail.com> References: <4DA818AE.80900@tu-dresden.de> <4DA8232C.6070007@tu-dresden.de> <4DB01DEA.5050009@gmail.com> Message-ID: On Thu, Apr 21, 2011 at 7:07 AM, TAY wee-beng wrote: > Hi, > > Since there is a similar topic on improving performance of CFD using PETSc > earlier, I hope to improve my current CFD solver performance too. > > I wrote it in Fortran90 with MPI. It is a Immersed Boundary Method (IBM) > Navier-Stokes Cartesian grid solver in 2D, although I hope to extend it to > 3D in the future. I solve the NS equations using fractional step which > results in 2 equations - the momentum and Poisson equations. They are then > linearized into systems of equations. > > I currently solve the momentum solver using PETSc with KSPBCGS. For the > Poisson equation, I was using hypre's BoomerAMG. But I have changed to using > the geometric multigrid solver from hypre since it's slightly faster. > > Currently, I am dividing my grid along the y direction for MPI into equal > size for each processor. I guess this is not very efficient since beyond 4 > processors, the scaling factor drops. > > I think implementing the distributed array should increase performance, is > that so? I wonder how difficult it is because most examples are in C and I > am not so used to that. I am also using staggered grid but I will most > likely changed to a collocated grid arrangement. > It will definitely improve scalability. I don't think conversion should be that hard. Matt What other suggestions do you have to improve the solver's performance using > PETSc? > > Thank you very much. 
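A bare-bones sketch of the distributed-array layout being suggested (C shown; the Fortran90 interface is analogous; the grid size, dof count and stencil width are placeholders rather than values from the poster's solver, and the calls are the DA interface of petsc-3.1):

#include "petscda.h"
#include "petscksp.h"

int main(int argc,char **argv)
{
  DA             da;
  Mat            A;
  Vec            x,b;
  KSP            ksp;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL);CHKERRQ(ierr);
  /* 2D grid, 1 dof per node (e.g. the pressure Poisson equation), stencil width 1;
     PETSC_DECIDE lets PETSc pick a 2D processor grid instead of slicing along y only */
  ierr = DACreate2d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_STAR,
                    128,128,PETSC_DECIDE,PETSC_DECIDE,1,1,
                    PETSC_NULL,PETSC_NULL,&da);CHKERRQ(ierr);
  ierr = DAGetMatrix(da,MATAIJ,&A);CHKERRQ(ierr);   /* already preallocated for the stencil */
  ierr = DACreateGlobalVector(da,&x);CHKERRQ(ierr);
  ierr = VecDuplicate(x,&b);CHKERRQ(ierr);

  /* ... fill A and b over the local patch from DAGetCorners(), e.g. with
     MatSetValuesStencil() ... */

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* e.g. -pc_type hypre on the command line */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(ksp);CHKERRQ(ierr);
  ierr = MatDestroy(A);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = VecDestroy(b);CHKERRQ(ierr);
  ierr = DADestroy(da);CHKERRQ(ierr);
  ierr = PetscFinalize();CHKERRQ(ierr);
  return 0;
}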
> > Yours sincerely, > > TAY wee-beng > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From vishy at stat.purdue.edu Thu Apr 21 09:39:40 2011 From: vishy at stat.purdue.edu (S V N Vishwanathan) Date: Thu, 21 Apr 2011 10:39:40 -0400 Subject: [petsc-users] Question on writing a large matrix Message-ID: <87k4ensm6r.wl%vishy@stat.purdue.edu> Hi I am using the attached code to convert a matrix from a rather inefficient ascii format (each line is a row and contains a series of idx:val pairs) to the PETSc binary format. Some of the matrices that I am working with are rather huge (50GB ascii file) and cannot be assembled on a single processor. When I use the attached code the matrix assembly across machines seems to be fairly fast. However, dumping the assembled matrix out to disk seems to be painfully slow. Any suggestions on how to speed things up will be deeply appreciated. vishy -------------- next part -------------- A non-text attachment was scrubbed... Name: libsvm-to-binary.cpp Type: application/octet-stream Size: 15762 bytes Desc: not available URL: From aron.ahmadia at kaust.edu.sa Thu Apr 21 09:44:35 2011 From: aron.ahmadia at kaust.edu.sa (Aron Ahmadia) Date: Thu, 21 Apr 2011 17:44:35 +0300 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: <87k4ensm6r.wl%vishy@stat.purdue.edu> References: <87k4ensm6r.wl%vishy@stat.purdue.edu> Message-ID: Hi Vish, What is 'painfully slow'. Do you have a profile or an estimate in terms of GB/s? Have you taken a look at your process's memory allocation and checked to see if it is swapping? My first guess would be that you are exceeding RAM and your program is thrashing as parts of the page table get swapped to and from disk mid-run. A On Thu, Apr 21, 2011 at 5:39 PM, S V N Vishwanathan wrote: > Hi > > I am using the attached code to convert a matrix from a rather > inefficient ascii format (each line is a row and contains a series of > idx:val pairs) to the PETSc binary format. Some of the matrices that I > am working with are rather huge (50GB ascii file) and cannot be > assembled on a single processor. When I use the attached code the matrix > assembly across machines seems to be fairly fast. However, dumping the > assembled matrix out to disk seems to be painfully slow. Any suggestions > on how to speed things up will be deeply appreciated. > > vishy > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vishy at stat.purdue.edu Thu Apr 21 11:59:10 2011 From: vishy at stat.purdue.edu (S V N Vishwanathan) Date: Thu, 21 Apr 2011 12:59:10 -0400 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: References: <87k4ensm6r.wl%vishy@stat.purdue.edu> Message-ID: <87hb9rsfq9.wl%vishy@stat.purdue.edu> > What is 'painfully slow'. ?Do you have a profile or an estimate in > terms of GB/s? ?Have you taken a look at your process's memory > allocation and checked to see if it is swapping? ?My first guess would > be that you are exceeding RAM and your program is thrashing as parts > of the page table get swapped to and from disk mid-run. A single machine does not have enough memory to hold the entire matrix. That is why I have to assemble it in parallel. When distributed across 8 machines the assembly seemed to finish in under an hr. 
However, my program tried to write the matrix to file since yesterday night and eventually crashed. The log just indicated [1]PETSC ERROR: Caught signal number 1 Hang up: Some other process (or the batch system) has told this process to end Most likely because it tried to allocate a large chunk of memory and failed. I investigated using a smaller matrix and ran the code with the -info flag (see below). What worries me are these lines: Writing data in binary format to adult9.train.x .... >>>> I call MatView in my code here [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281 Is MatView reconstructing the matrix at the root node? In that case the program will definitely fail due to lack of memory. Please let me know if I you need any other information or if I can run any other tests to help investigate. vishy mpiexec -n 2 ./libsvm-to-binary -in ../LibSVM/biclass/adult9/adult9.train.txt -data adult9.train.x -labels adult9.train.y -info [0] PetscInitialize(): PETSc successfully started: number of processors = 2 [1] PetscInitialize(): PETSc successfully started: number of processors = 2 [1] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu [0] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu No libsvm test file specified! Reading libsvm train file at ../LibSVM/biclass/adult9/adult9.train.txt [0] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt [1] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374780 max tags = 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374782 max tags = 2147483647 [1] PetscCommDuplicate(): returning tag 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 [0] PetscCommDuplicate(): returning tag 2147483642 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 [1] PetscCommDuplicate(): returning tag 2147483642 [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374777 max tags = 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374780 max tags = 2147483647 [1] PetscCommDuplicate(): returning tag 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [0] PetscCommDuplicate(): returning tag 2147483646 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 [1] PetscCommDuplicate(): returning tag 2147483646 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 0 [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 124; storage space: 225806 unneeded,0 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 [1] Mat_CheckInode(): Found 3257 nodes of 16281. Limit used: 5. 
Using Inode routines [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 124; storage space: 0 unneeded,225786 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14 [0] Mat_CheckInode(): Found 16280 nodes out of 16280 rows. Not using Inode routines [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [0] PetscCommDuplicate(): returning tag 2147483645 [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter [0] PetscCommDuplicate(): returning tag 2147483638 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 [1] PetscCommDuplicate(): returning tag 2147483645 [1] PetscCommDuplicate(): returning tag 2147483638 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 [1] PetscCommDuplicate(): returning tag 2147483644 [1] PetscCommDuplicate(): returning tag 2147483637 [0] PetscCommDuplicate(): returning tag 2147483644 [0] PetscCommDuplicate(): returning tag 2147483637 [1] PetscCommDuplicate(): returning tag 2147483632 [0] PetscCommDuplicate(): returning tag 2147483632 [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter [0] VecScatterCreate(): General case: MPI to Seq [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 0; storage space: 0 unneeded,0 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 Writing data in binary format to adult9.train.x [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 [0] PetscCommDuplicate(): returning tag 2147483628 [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 123; storage space: 18409 unneeded,225806 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 [1] PetscCommDuplicate(): returning tag 2147483628 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689 [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374780 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780 [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 [1] PetscCommDuplicate(): returning tag 2147483627 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689 [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374777 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374777 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374777 Writing labels in binary format to adult9.train.y [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 [0] PetscCommDuplicate(): returning tag 2147483627 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374782 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374782 [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374782 [1] PetscFinalize(): PetscFinalize() called [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [0] PetscCommDestroy(): Deleting PETSc 
MPI_Comm -2080374780 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780 [0] PetscFinalize(): PetscFinalize() called > > On Thu, Apr 21, 2011 at 5:39 PM, S V N Vishwanathan wrote: > > Hi > > I am using the attached code to convert a matrix from a rather > inefficient ascii format (each line is a row and contains a series of > idx:val pairs) to the PETSc binary format. Some of the matrices that I > am working with are rather huge (50GB ascii file) and cannot be > assembled on a single processor. When I use the attached code the matrix > assembly across machines seems to be fairly fast. However, dumping the > assembled matrix out to disk seems to be painfully slow. Any suggestions > on how to speed things up will be deeply appreciated. > > vishy > > From bsmith at mcs.anl.gov Thu Apr 21 12:25:33 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 21 Apr 2011 12:25:33 -0500 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: <87hb9rsfq9.wl%vishy@stat.purdue.edu> References: <87k4ensm6r.wl%vishy@stat.purdue.edu> <87hb9rsfq9.wl%vishy@stat.purdue.edu> Message-ID: <96C5FD2E-7EB1-458A-9DB6-1B8B2353829E@mcs.anl.gov> On Apr 21, 2011, at 11:59 AM, S V N Vishwanathan wrote: > >> What is 'painfully slow'. Do you have a profile or an estimate in >> terms of GB/s? Have you taken a look at your process's memory >> allocation and checked to see if it is swapping? My first guess would >> be that you are exceeding RAM and your program is thrashing as parts >> of the page table get swapped to and from disk mid-run. > > A single machine does not have enough memory to hold the entire > matrix. That is why I have to assemble it in parallel. When distributed > across 8 machines the assembly seemed to finish in under an hr. It has not assembled the matrix in an hour. It is working all night to assemble the matrix, the problem is that you are not preallocating the nonzeros per row with MatMPIAIJSetPreallocation() when pre allocation is correct it will always print 0 for Number of mallocs. The actual writing of the parallel matrix to the binary file will take at most minutes. Barry > However, > my program tried to write the matrix to file since yesterday night and > eventually crashed. The log just indicated > > [1]PETSC ERROR: Caught signal number 1 Hang up: Some other process (or the batch system) has told this process to end > > Most likely because it tried to allocate a large chunk of memory and > failed. > > I investigated using a smaller matrix and ran the code with the -info > flag (see below). What worries me are these lines: > > Writing data in binary format to adult9.train.x > .... >>>> I call MatView in my code here > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281 > > Is MatView reconstructing the matrix at the root node? In that case the > program will definitely fail due to lack of memory. > > Please let me know if I you need any other information or if I can run > any other tests to help investigate. 
> > vishy > > > > > mpiexec -n 2 ./libsvm-to-binary -in ../LibSVM/biclass/adult9/adult9.train.txt -data adult9.train.x -labels adult9.train.y -info > > [0] PetscInitialize(): PETSc successfully started: number of processors = 2 > [1] PetscInitialize(): PETSc successfully started: number of processors = 2 > [1] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu > [0] PetscInitialize(): Running on machine: rossmann-fe03.rcac.purdue.edu > No libsvm test file specified! > > Reading libsvm train file at ../LibSVM/biclass/adult9/adult9.train.txt > [0] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt > [1] PetscFOpen(): Opening file ../LibSVM/biclass/adult9/adult9.train.txt > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374780 max tags = 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374782 max tags = 2147483647 > [1] PetscCommDuplicate(): returning tag 2147483647 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 > [0] PetscCommDuplicate(): returning tag 2147483642 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 > [1] PetscCommDuplicate(): returning tag 2147483642 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374777 max tags = 2147483647 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374780 max tags = 2147483647 > [1] PetscCommDuplicate(): returning tag 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 > [0] PetscCommDuplicate(): returning tag 2147483646 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 > [1] PetscCommDuplicate(): returning tag 2147483646 > [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [0] MatStashScatterBegin_Private(): No of messages: 0 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 124; storage space: 225806 unneeded,0 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 > [1] Mat_CheckInode(): Found 3257 nodes of 16281. Limit used: 5. Using Inode routines > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 124; storage space: 0 unneeded,225786 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14 > [0] Mat_CheckInode(): Found 16280 nodes out of 16280 rows. 
Not using Inode routines > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 > [0] PetscCommDuplicate(): returning tag 2147483645 > [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter > [0] PetscCommDuplicate(): returning tag 2147483638 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 > [1] PetscCommDuplicate(): returning tag 2147483645 > [1] PetscCommDuplicate(): returning tag 2147483638 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374777 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374780 > [1] PetscCommDuplicate(): returning tag 2147483644 > [1] PetscCommDuplicate(): returning tag 2147483637 > [0] PetscCommDuplicate(): returning tag 2147483644 > [0] PetscCommDuplicate(): returning tag 2147483637 > [1] PetscCommDuplicate(): returning tag 2147483632 > [0] PetscCommDuplicate(): returning tag 2147483632 > [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter > [0] VecScatterCreate(): General case: MPI to Seq > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 16280 X 0; storage space: 0 unneeded,0 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 > > Writing data in binary format to adult9.train.x > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 > [0] PetscCommDuplicate(): returning tag 2147483628 > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 16281 X 123; storage space: 18409 unneeded,225806 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 16281 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 14 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 > [1] PetscCommDuplicate(): returning tag 2147483628 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689 > [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374780 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780 > [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374782 > [1] PetscCommDuplicate(): returning tag 2147483627 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850689 > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374777 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374777 > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374777 > > Writing labels in binary format to adult9.train.y > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374780 > [0] PetscCommDuplicate(): returning tag 2147483627 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374782 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374782 > [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374782 > [1] PetscFinalize(): PetscFinalize() called > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374780 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374780 > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374780 > [0] PetscFinalize(): 
PetscFinalize() called > > >> >> On Thu, Apr 21, 2011 at 5:39 PM, S V N Vishwanathan wrote: >> >> Hi >> >> I am using the attached code to convert a matrix from a rather >> inefficient ascii format (each line is a row and contains a series of >> idx:val pairs) to the PETSc binary format. Some of the matrices that I >> am working with are rather huge (50GB ascii file) and cannot be >> assembled on a single processor. When I use the attached code the matrix >> assembly across machines seems to be fairly fast. However, dumping the >> assembled matrix out to disk seems to be painfully slow. Any suggestions >> on how to speed things up will be deeply appreciated. >> >> vishy >> >> > From longmin.ran at gmail.com Fri Apr 22 05:31:03 2011 From: longmin.ran at gmail.com (Longmin RAN) Date: Fri, 22 Apr 2011 12:31:03 +0200 Subject: [petsc-users] "-mat_superlu_colperm MMD_AT_PLUS_A" causes the program to hang Message-ID: Dear all, I'm using superlu within petsc to solve systems with symmetric sparse matrix. In superlu manual I read that MMD_AT_PLUS_A column permutation, together with little diagonal pivot threshold, should be used for symmetric mode. So I launch my program with the following options: -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu -mat_superlu_diagpivotthresh 0.001 -mat_superlu_symmetricmode TRUE -mat_superlu_colperm MMD_AT_PLUS_A It seems that "-mat_superlu_colperm MMD_AT_PLUS_A" causes the program to hang: when I deleted this option, my calculation is executed correctly. But It's always interesting to be able to use the column permutation option. Do you guys have any ideas ? Cheers, Longmin From hzhang at mcs.anl.gov Fri Apr 22 09:56:11 2011 From: hzhang at mcs.anl.gov (Hong Zhang) Date: Fri, 22 Apr 2011 09:56:11 -0500 Subject: [petsc-users] "-mat_superlu_colperm MMD_AT_PLUS_A" causes the program to hang In-Reply-To: References: Message-ID: Longmin : I cannot reproduce it with petsc example: petsc-dev/src/ksp/ksp/examples/tutorials>./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu -mat_superlu_diagpivotthresh 0.001 -mat_superlu_symmetricmode TRUE -mat_superlu_colperm MMD_AT_PLUS_A Norm of error < 1.e-12 iterations 1 Please use a debugger to check where it hangs. Then use valgrind to check posible memory corruption. Hong > > I'm using superlu within petsc to solve systems with symmetric sparse > matrix. In superlu manual I read that MMD_AT_PLUS_A column > permutation, together with little diagonal pivot threshold, should be > used for symmetric mode. So I launch my program with the following > options: > ?-ksp_type preonly > ?-pc_type lu > ?-pc_factor_mat_solver_package superlu > ?-mat_superlu_diagpivotthresh 0.001 > ?-mat_superlu_symmetricmode TRUE > ?-mat_superlu_colperm MMD_AT_PLUS_A > > It seems that "-mat_superlu_colperm MMD_AT_PLUS_A" causes the program > to hang: when I deleted this option, my calculation is executed > correctly. But It's always interesting to be able to use the column > permutation option. Do you guys have any ideas ? > > > Cheers, > > Longmin > From gaurish108 at gmail.com Fri Apr 22 15:44:55 2011 From: gaurish108 at gmail.com (Gaurish Telang) Date: Fri, 22 Apr 2011 16:44:55 -0400 Subject: [petsc-users] how good is PETSc+GPU's ? Message-ID: I would like to know how well PETSc works with GPU's and the kind of Speed-ups one can get if one uses PETSc along with GPU's. Has it been used for scientific studies so far? 
[ I think this bit of information ( http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#gpus ) has been that way for a long time, and hence the above question. This 2010 article( http://www.mcs.anl.gov/petsc/petsc-2/features/gpus.pdf ) does not mention any comparative studies of the preliminary implementation of PETSc for use with GPU's with other software libraries. ] Sincere thanks, Gaurish -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Apr 22 16:15:52 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 22 Apr 2011 16:15:52 -0500 Subject: [petsc-users] how good is PETSc+GPU's ? In-Reply-To: References: Message-ID: On Apr 22, 2011, at 3:44 PM, Gaurish Telang wrote: > I would like to know how well PETSc works with GPU's and the kind of Speed-ups one can get if one uses PETSc along with GPU's. > > Has it been used for scientific studies so far? > > [ I think this bit of information ( http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#gpus ) has been that way for a long time, and hence the above question. > > This 2010 article( http://www.mcs.anl.gov/petsc/petsc-2/features/gpus.pdf ) does n > ot mention any comparative studies of the preliminary implementation of PETSc for use with GPU's with other software libraries. ] > There are no such studies. PETSc uses the CUSP and THRUST libraries of Nvidia on the GPU therefor the performance will be the same as using CUSP and THRUST directly or of any other library that uses CUSP and THRUST. Just like with regular CPUs the performance of sparse matrix iterative methods (floating point speedwise) is determined by the hardware so there won't be much difference between different libraries that do the "right thing". If you are trying to decide between two packages to use for solving some algebraic systems you need to compare them yourself, you cannot rely on what people say. If you are deciding between using a package and doing it yourself you might as well use the package since you can always add whatever custom stuff yourself if you think it is better, so there is really no downside to using a package. Barry > Sincere thanks, > > Gaurish From vishy at stat.purdue.edu Sat Apr 23 12:36:35 2011 From: vishy at stat.purdue.edu (S V N Vishwanathan) Date: Sat, 23 Apr 2011 13:36:35 -0400 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: <96C5FD2E-7EB1-458A-9DB6-1B8B2353829E@mcs.anl.gov> References: <87k4ensm6r.wl%vishy@stat.purdue.edu> <87hb9rsfq9.wl%vishy@stat.purdue.edu> <96C5FD2E-7EB1-458A-9DB6-1B8B2353829E@mcs.anl.gov> Message-ID: <87vcy4g998.wl%vishy@stat.purdue.edu> Barry, > It has not assembled the matrix in an hour. It is working all night > to assemble the matrix, the problem is that you are not > preallocating the nonzeros per row with MatMPIAIJSetPreallocation() > when pre allocation is correct it will always print 0 for Number of > mallocs. The actual writing of the parallel matrix to the binary > file will take at most minutes. You were absolutely right! I had not set the preallocation properly and hence the code was painfully slow. I fixed that issue (see attached code) and now it runs much faster. However, I am having a different problem now. When I run the code for smaller matrices (less than a million rows) everything works well. However, when working with large matrices (e.g. 
2.8 million rows x 1157 columns) writing the matrix to file dies with the following message: Fatal error in MPI_Recv: Other MPI error Any hints on how to solve this problem or are deeply appreciated. vishy The output of running the code with the -info flag is as follows: [0] PetscInitialize(): PETSc successfully started: number of processors = 4 [0] PetscInitialize(): Running on machine: rossmann-b001.rcac.purdue.edu [3] PetscInitialize(): PETSc successfully started: number of processors = 4 [3] PetscInitialize(): Running on machine: rossmann-b004.rcac.purdue.edu No libsvm test file specified! Reading libsvm train file at /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt [2] PetscInitialize(): PETSc successfully started: number of processors = 4 [2] PetscInitialize(): Running on machine: rossmann-b003.rcac.purdue.edu [3] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt [2] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt [0] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt [1] PetscInitialize(): PETSc successfully started: number of processors = 4 [1] PetscInitialize(): Running on machine: rossmann-b002.rcac.purdue.edu [1] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt m=100000 m=200000 m=300000 m=400000 m=500000 m=600000 m=700000 m=800000 m=900000 m=1000000 m=1100000 m=1200000 m=1300000 m=1400000 m=1500000 m=1600000 m=1700000 m=1800000 m=1900000 m=2000000 m=2100000 m=2200000 m=2300000 m=2400000 m=2500000 m=2600000 m=2700000 m=2800000 user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 [2] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [2] PetscCommDuplicate(): returning tag 2147483647 [2] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [2] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 [2] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 [2] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 [3] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [3] PetscCommDuplicate(): returning tag 2147483647 [3] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [3] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 [3] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 [3] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [1] 
PetscCommDuplicate(): returning tag 2147483647 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [1] PetscCommDuplicate(): returning tag 2147483647 [2] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [2] PetscCommDuplicate(): returning tag 2147483647 [3] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 [3] PetscCommDuplicate(): returning tag 2147483647 [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [3] PetscCommDuplicate(): returning tag 2147483642 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [2] PetscCommDuplicate(): returning tag 2147483642 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [1] PetscCommDuplicate(): returning tag 2147483642 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): returning tag 2147483642 [0] MatSetUpPreallocation(): Warning not preallocating matrix storage [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 [0] PetscCommDuplicate(): returning tag 2147483647 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [0] PetscCommDuplicate(): returning tag 2147483646 [2] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 [2] PetscCommDuplicate(): returning tag 2147483647 [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 [1] PetscCommDuplicate(): returning tag 2147483647 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [2] PetscCommDuplicate(): returning tag 2147483646 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [1] PetscCommDuplicate(): returning tag 2147483646 [3] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 [3] PetscCommDuplicate(): returning tag 2147483647 [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [3] PetscCommDuplicate(): returning tag 2147483646 [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. [0] MatStashScatterBegin_Private(): No of messages: 0 [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. 
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 290; storage space: 0 unneeded,202300000 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 [1] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines [3] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines [2] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines [0] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [3] PetscCommDuplicate(): returning tag 2147483645 [3] PetscCommDuplicate(): returning tag 2147483638 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [2] PetscCommDuplicate(): returning tag 2147483645 [2] PetscCommDuplicate(): returning tag 2147483638 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [0] PetscCommDuplicate(): returning tag 2147483645 [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter [0] PetscCommDuplicate(): returning tag 2147483638 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [1] PetscCommDuplicate(): returning tag 2147483645 [1] PetscCommDuplicate(): returning tag 2147483638 [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [0] PetscCommDuplicate(): returning tag 2147483644 [0] PetscCommDuplicate(): returning tag 2147483637 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [1] PetscCommDuplicate(): returning tag 2147483644 [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [3] PetscCommDuplicate(): returning tag 2147483644 [3] PetscCommDuplicate(): returning tag 2147483637 [1] PetscCommDuplicate(): returning tag 2147483637 [0] PetscCommDuplicate(): returning tag 2147483632 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 [2] PetscCommDuplicate(): returning tag 2147483644 [2] PetscCommDuplicate(): returning tag 2147483637 [1] PetscCommDuplicate(): returning tag 2147483632 [2] PetscCommDuplicate(): returning tag 2147483632 [3] PetscCommDuplicate(): returning tag 2147483632 [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter [0] VecScatterCreate(): General case: MPI to Seq [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [2] PetscCommDuplicate(): returning tag 2147483628 [3] MatAssemblyEnd_SeqAIJ(): 
Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [3] PetscCommDuplicate(): returning tag 2147483628 [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [1] PetscCommDuplicate(): returning tag 2147483628 [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 Writing data in binary format to /scratch/lustreA/v/vishy/biclass/ocr.train.x [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 [0] PetscCommDuplicate(): returning tag 2147483628 APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1) -------------- next part -------------- A non-text attachment was scrubbed... Name: libsvm-to-binary.cpp Type: application/octet-stream Size: 15449 bytes Desc: not available URL:
From bsmith at mcs.anl.gov Sat Apr 23 13:39:35 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Apr 2011 13:39:35 -0500 Subject: [petsc-users] Question on writing a large matrix In-Reply-To: <87vcy4g998.wl%vishy@stat.purdue.edu> References: <87k4ensm6r.wl%vishy@stat.purdue.edu> <87hb9rsfq9.wl%vishy@stat.purdue.edu> <96C5FD2E-7EB1-458A-9DB6-1B8B2353829E@mcs.anl.gov> <87vcy4g998.wl%vishy@stat.purdue.edu> Message-ID: <92BAD5B4-F576-4FA5-B4DD-6A3666851CAD@mcs.anl.gov> http://www.mcs.anl.gov/petsc/petsc-as/documentation/faq.html#64-bit-indices On Apr 23, 2011, at 12:36 PM, S V N Vishwanathan wrote: > Barry, > >> It has not assembled the matrix in an hour. It is working all night >> to assemble the matrix, the problem is that you are not >> preallocating the nonzeros per row with MatMPIAIJSetPreallocation() >> when pre allocation is correct it will always print 0 for Number of >> mallocs. The actual writing of the parallel matrix to the binary >> file will take at most minutes. > > You were absolutely right! I had not set the preallocation properly and > hence the code was painfully slow. I fixed that issue (see attached > code) and now it runs much faster. However, I am having a different > problem now. When I run the code for smaller matrices (less than a > million rows) everything works well. However, when working with large > matrices (e.g. 2.8 million rows x 1157 columns) writing the matrix to > file dies with the following message: > > Fatal error in MPI_Recv: Other MPI error > > Any hints on how to solve this problem or are deeply appreciated. > > vishy > > The output of running the code with the -info flag is as follows: > > [0] PetscInitialize(): PETSc successfully started: number of processors = 4 > [0] PetscInitialize(): Running on machine: rossmann-b001.rcac.purdue.edu > [3] PetscInitialize(): PETSc successfully started: number of processors = 4 > [3] PetscInitialize(): Running on machine: rossmann-b004.rcac.purdue.edu > No libsvm test file specified!
> > Reading libsvm train file at /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > [2] PetscInitialize(): PETSc successfully started: number of processors = 4 > [2] PetscInitialize(): Running on machine: rossmann-b003.rcac.purdue.edu > [3] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > [2] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > [0] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > [1] PetscInitialize(): PETSc successfully started: number of processors = 4 > [1] PetscInitialize(): Running on machine: rossmann-b002.rcac.purdue.edu > [1] PetscFOpen(): Opening file /scratch/lustreA/v/vishy/LibSVM/biclass/ocr/ocr.train.txt > m=100000 > m=200000 > m=300000 > m=400000 > m=500000 > m=600000 > m=700000 > m=800000 > m=900000 > m=1000000 > m=1100000 > m=1200000 > m=1300000 > m=1400000 > m=1500000 > m=1600000 > m=1700000 > m=1800000 > m=1900000 > m=2000000 > m=2100000 > m=2200000 > m=2300000 > m=2400000 > m=2500000 > m=2600000 > m=2700000 > m=2800000 > user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 > user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 > user.dim=1157 user.m=2800000 user.maxnnz=1156 user.maxlen=32768 user.flg=1 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [0] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > [0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 > [0] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > [2] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [2] PetscCommDuplicate(): returning tag 2147483647 > [2] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [2] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > [2] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 > [2] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > [3] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [3] PetscCommDuplicate(): returning tag 2147483647 > [3] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [3] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > [3] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 > [3] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > [0] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [1] PetscCommDuplicate(): returning tag 2147483647 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 1140850688 > [1] PetscCommDestroy(): Deleting PETSc MPI_Comm -2080374784 > [1] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm -2080374784 > [1] Petsc_DelCounter(): Deleting counter data in an MPI_Comm -2080374784 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [1] 
PetscCommDuplicate(): returning tag 2147483647 > [2] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [2] PetscCommDuplicate(): returning tag 2147483647 > [3] PetscCommDuplicate(): Duplicating a communicator 1140850688 -2080374784 max tags = 2147483647 > [3] PetscCommDuplicate(): returning tag 2147483647 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [3] PetscCommDuplicate(): returning tag 2147483642 > [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [2] PetscCommDuplicate(): returning tag 2147483642 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [1] PetscCommDuplicate(): returning tag 2147483642 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): returning tag 2147483642 > [0] MatSetUpPreallocation(): Warning not preallocating matrix storage > [0] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 > [0] PetscCommDuplicate(): returning tag 2147483647 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [0] PetscCommDuplicate(): returning tag 2147483646 > [2] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 > [2] PetscCommDuplicate(): returning tag 2147483647 > [1] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 > [1] PetscCommDuplicate(): returning tag 2147483647 > [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [2] PetscCommDuplicate(): returning tag 2147483646 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [1] PetscCommDuplicate(): returning tag 2147483646 > [3] PetscCommDuplicate(): Duplicating a communicator 1140850689 -2080374783 max tags = 2147483647 > [3] PetscCommDuplicate(): returning tag 2147483647 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [3] PetscCommDuplicate(): returning tag 2147483646 > [0] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs. > [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs. > [0] MatStashScatterBegin_Private(): No of messages: 0 > [0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [3] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. > [2] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs. 
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 290; storage space: 0 unneeded,202300000 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 289; storage space: 0 unneeded,202300000 used > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 289 > [1] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines > [3] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines > [2] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines > [0] Mat_CheckInode(): Found 140000 nodes of 700000. Limit used: 5. Using Inode routines > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [3] PetscCommDuplicate(): returning tag 2147483645 > [3] PetscCommDuplicate(): returning tag 2147483638 > [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [2] PetscCommDuplicate(): returning tag 2147483645 > [2] PetscCommDuplicate(): returning tag 2147483638 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [0] PetscCommDuplicate(): returning tag 2147483645 > [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter > [0] PetscCommDuplicate(): returning tag 2147483638 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [1] PetscCommDuplicate(): returning tag 2147483645 > [1] PetscCommDuplicate(): returning tag 2147483638 > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [0] PetscCommDuplicate(): returning tag 2147483644 > [0] PetscCommDuplicate(): returning tag 2147483637 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [1] PetscCommDuplicate(): returning tag 2147483644 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [3] PetscCommDuplicate(): returning tag 2147483644 > [3] PetscCommDuplicate(): returning tag 2147483637 > [1] PetscCommDuplicate(): returning tag 2147483637 > [0] PetscCommDuplicate(): returning tag 2147483632 > [2] PetscCommDuplicate(): Using internal PETSc communicator 1140850689 -2080374783 > [2] PetscCommDuplicate(): returning tag 2147483644 > [2] PetscCommDuplicate(): returning tag 2147483637 > [1] PetscCommDuplicate(): returning tag 2147483632 > [2] PetscCommDuplicate(): returning tag 2147483632 > [3] PetscCommDuplicate(): returning tag 2147483632 > [0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter > [0] VecScatterCreate(): General case: MPI to Seq > [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [2] PetscCommDuplicate(): Using internal PETSc communicator 
1140850688 -2080374784 > [2] PetscCommDuplicate(): returning tag 2147483628 > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [3] PetscCommDuplicate(): returning tag 2147483628 > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [1] PetscCommDuplicate(): returning tag 2147483628 > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > > Writing data in binary format to /scratch/lustreA/v/vishy/biclass/ocr.train.x > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): returning tag 2147483628 > APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1) > >
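The FAQ entry linked above is the relevant one for the failure reported in this thread: according to the -info output, each of the four processes stores 202,300,000 + 606,900,000 = 809,200,000 nonzeros, i.e. about 3.24 billion in total, which exceeds the largest 32-bit PetscInt (2,147,483,647), so the nonzero counts and offsets overflow once the assembled matrix is gathered and written as a single object. The remedy that FAQ describes is rebuilding PETSc with 64-bit indices and recompiling the application against the new build, roughly (the bracketed part stands for whatever configure options were used originally and is not a real flag):

  ./configure --with-64-bit-indices [original configure options]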
1140850688 -2080374784 > [2] PetscCommDuplicate(): returning tag 2147483628 > [3] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [3] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [3] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [3] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [3] PetscCommDuplicate(): returning tag 2147483628 > [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > [1] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [1] PetscCommDuplicate(): returning tag 2147483628 > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 700000 X 867; storage space: 0 unneeded,606900000 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 867 > > Writing data in binary format to /scratch/lustreA/v/vishy/biclass/ocr.train.x > [0] PetscCommDuplicate(): Using internal PETSc communicator 1140850688 -2080374784 > [0] PetscCommDuplicate(): returning tag 2147483628 > APPLICATION TERMINATED WITH THE EXIT STRING: Hangup (signal 1) > > From bartlomiej.wach at yahoo.pl Sat Apr 23 18:33:27 2011 From: bartlomiej.wach at yahoo.pl (=?utf-8?B?QmFydMWCb21pZWogVw==?=) Date: Sun, 24 Apr 2011 00:33:27 +0100 (BST) Subject: [petsc-users] Identifying processes In-Reply-To: Message-ID: <497548.87377.qm@web28309.mail.ukl.yahoo.com> Hello, I was wondering if anyone could help me to identify processes in parallel execution. I run my app with mpiexec -n 2 and would like to be able to pick a single core to perform a task and be the only one to print instead of having n cores repeat the same thing. PETSC_COMM_WORLD and PETSC_COMM_SELF both cause all processes to print for me, like there is no difference. Thank you Bartholomew -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Apr 23 18:37:59 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Apr 2011 18:37:59 -0500 Subject: [petsc-users] Identifying processes In-Reply-To: <497548.87377.qm@web28309.mail.ukl.yahoo.com> References: <497548.87377.qm@web28309.mail.ukl.yahoo.com> Message-ID: <749EB116-6F30-4C97-99D1-8903539324C3@mcs.anl.gov> PetscMPIInt rank; MPI_Comm_rank(PETSC_COMM_WORLD,&rank); if (!rank) { do something } If both cores do something there there is a mismatch with the mpiexec that you are are running, it may not be the right mpiexec for the MPI includes and library you are using. Barry On Apr 23, 2011, at 6:33 PM, Bart?omiej W wrote: > Hello, > > I was wondering if anyone could help me to identify processes in parallel execution. I run my app with mpiexec -n 2 and would like to be able to pick a single core to perform a task and be the only one to print instead of having n cores repeat the same thing. > > PETSC_COMM_WORLD and PETSC_COMM_SELF both cause all processes to print for me, like there is no difference. 
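A minimal compilable sketch of the rank check Barry describes above (PETSc 3.1-era C API; the variable names and messages are only illustrative). Note also that PetscPrintf() on PETSC_COMM_WORLD already prints from process 0 only, which is often all that is needed:

#include "petsc.h"

int main(int argc, char **argv)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;

  ierr = PetscInitialize(&argc,&argv,(char*)0,(char*)0);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr);
  if (!rank) {
    /* only process 0 enters this block, e.g. to print a header or read a file */
  }
  /* printed exactly once, from process 0, no explicit rank test needed */
  ierr = PetscPrintf(PETSC_COMM_WORLD,"printed once by rank 0\n");CHKERRQ(ierr);
  /* printed once per process, flushed in rank order */
  ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,"hello from rank %d\n",rank);CHKERRQ(ierr);
  ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

If every process still enters the "if (!rank)" branch, that points to the mpiexec mismatch Barry mentions rather than to the code itself.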
> > Thank you > Bartholomew From ilyascfd at gmail.com Sun Apr 24 07:31:53 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Sun, 24 Apr 2011 15:31:53 +0300 Subject: [petsc-users] setting up DA matrix for 3D periodic domain Message-ID: Hi, Manual pages for "MatSetValuesStencil" says that, "For periodic boundary conditions use negative indices for values to the left (below 0; that are to be obtained by wrapping values from right edge). For values to the right of the last entry using that index plus one etc to obtain values that obtained by wrapping the values from the left edge. This does not work for the DA_NONPERIODIC wrap." According to this explanation, If I would set up a matrix for 3D periodic domain using DAs with DA_ XYZPERIODIC, The code segment given below could handle periodicity "without specifying boundary information within the loop" ? - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - " . . . MatStencil row(4),col(4,7) PetscInt i1,i7 PetscScalar val(7) i1 = 1 i7 = 7 . . . call DACreate3d(...,DA_XYZPERIODIC,DA_STENCIL_STAR, ... ) call DAGetMatrix(...,A,...) call DAGetCorners(da,xs,ys,zs,xm,ym,zm,ierr) do k=zs,zs+zm-1 do j=ys,ys+ym-1 do i=xs,xs+xm-1 val(1) = ... col(MatStencil_i,1) = i col(MatStencil_j,1) = j col(MatStencil_k,1) = k-1 val(2) = ... col(MatStencil_i,2) = i col(MatStencil_j,2) = j-1 col(MatStencil_k,2) = k val(3) = ... col(MatStencil_i,3) = i-1 col(MatStencil_j,3) = j col(MatStencil_k,3) = k val(4) = ... col(MatStencil_i,4) = i col(MatStencil_j,4) = j col(MatStencil_k,4) = k val(5) = ... col(MatStencil_i,5) = i+1 col(MatStencil_j,5) = j col(MatStencil_k,5) = k val(6) = ... col(MatStencil_i,6) = i col(MatStencil_j,6) = j+1 col(MatStencil_k,6) = k val(7) = ... col(MatStencil_i,7) = i col(MatStencil_j,7) = j col(MatStencil_k,7) = k+1 call MatSetValuesStencil(A,i1,row,i7,col,val,INSERT_VALUES,ierr) end do end do end do call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr) " If it is so, how PETSc does it ? By inserting cyclic contributions arising from periodicity into the correct locations within PETSc DAs matrix , as it is done serially ? Thank you, Ilyas. -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Sun Apr 24 07:35:25 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Sun, 24 Apr 2011 15:35:25 +0300 Subject: [petsc-users] local row calculation in 3D In-Reply-To: References: Message-ID: Thank you Randall, I guess I will follow the Jed's and Matt's suggestions. Ilyas. 2011/4/19 Randall Mackie > You are right! I just didn't read all the way to the end of your email. > Sorry about that. > So here is a little more code that does it correctly: > > PetscInt, pointer :: ltog(:) > > call DAGetGlobalIndicesF90(da,nloc,ltog,ierr); CHKERRQ(ierr) > > > do kk=zs,zs+zm-1 > do jj=ys,ys+ym-1 > do ii=xs,xs+xm-1 > > row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym > grow=ltog(3*row + 1) > > [all your code here] > > call MatSetValues(A,i1,grow,ic,col,v,INSERT_VALUES, > . ierr); CHKERRQ(ierr) > > [more code here] > > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > > > > Hope this is a little more helpful. As Jed points out, there are other ways > to do the same > thing (and probably more efficiently than what I've outlined here). > > Randy M. > > > > On Tue, Apr 19, 2011 at 12:00 AM, ilyas ilyas wrote: > >> Hi Randy, >> >> Thank you for your answer. 
>> >> I have already done it. You can see it in my first e-mail. >> >> It does not work properly for all number of processors. >> For certain number of processors, it works correctly, >> not for all number of processors. >> For example, for 1,2,or 3 processors, it's ok. >> For 4 processors, it gives wrong location, so on. >> "Problem" occurs in 3rd dimension ( (kk-gzs)*gxm*gym ) >> >> Here is another suggestion (I have not tried yet) ; >> >> do kk=zs,zs+zm-1 >> do jj=ys,ys+ym-1 >> do ii=xs,xs+xm-1 >> >> row=ii-gxs + (jj-gys)*MX + (kk-gzs)*MX*MY >> >> MX,MY,MZ are global dimensions.This is also what I do serially >> >> Do you think that it is correct or any other suggestions? >> >> Regards, >> Ilyas. >> >> 2011/4/18 Randall Mackie >> >>> Here's how I do it: >>> >>> do kk=zs,zs+zm-1 >>> do jj=ys,ys+ym-1 >>> do ii=xs,xs+xm-1 >>> >>> row=ii-gxs + (jj-gys)*gxm + (kk-gzs)*gxm*gym >>> >>> >>> Good luck, >>> >>> Randy M. >>> >>> >>> >>> On Mon, Apr 18, 2011 at 6:54 AM, ilyas ilyas wrote: >>> >>>> Hi, >>>> Thank you for your suggestion. I will take it into account. >>>> Since changing this structure in my "massive" code may take too much >>>> time, >>>> I would like to know that how "row" is calculated in 3D, independently >>>> from processor numbers. >>>> >>>> Regards, >>>> Ilyas >>>> >>>> 2011/4/18 Matthew Knepley >>>> >>>>> On Mon, Apr 18, 2011 at 8:34 AM, ilyas ilyas wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> In ex14f.F in KSP, "row" variable is calculated either >>>>>> >>>>> >>>>> These are very old. I suggest you use the FormFunctionLocal() approach >>>>> in ex5f.F which >>>>> does not calculate global row numbers when using a DA. >>>>> >>>>> Matt >>>>> >>>>> >>>>>> 349: do 30 j=ys,ys+ym-1 >>>>>> 350: ... >>>>>> 351: do 40 i=xs,xs+xm-1 >>>>>> 352: row = i - gxs + (j - gys)*gxm + 1 >>>>>> >>>>>> or >>>>>> >>>>>> 442: do 50 j=ys,ys+ym-1 >>>>>> 443: ... >>>>>> 444: row = (j - gys)*gxm + xs - gxs >>>>>> 445: do 60 i=xs,xs+xm-1 >>>>>> 446: row = row + 1 >>>>>> >>>>>> How can I calculate "row" in 3D ? >>>>>> >>>>>> I tried this; >>>>>> >>>>>> do k=zs,zs+zm-1 >>>>>> do j=ys,ys+ym-1 >>>>>> do i=xs,xs+xm-1 >>>>>> >>>>>> row = i - gxs + (j - gys)*gxm + (k - gzs)*gxm*gym + 1 >>>>>> >>>>>> It does not work for certain number of processors. >>>>>> >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Ilyas >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sun Apr 24 08:02:46 2011 From: jed at 59A2.org (Jed Brown) Date: Sun, 24 Apr 2011 15:02:46 +0200 Subject: [petsc-users] setting up DA matrix for 3D periodic domain In-Reply-To: References: Message-ID: On Sun, Apr 24, 2011 at 14:31, ilyas ilyas wrote: > According to this explanation, If I would set up a matrix for 3D periodic > domain using DAs with DA_ XYZPERIODIC, > The code segment given below could handle periodicity "without specifying > boundary information within the loop" ? > Yes, that code is fine. PETSc translates the periodic contributions. -------------- next part -------------- An HTML attachment was scrubbed... 
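For readers following along in C, here is a rough equivalent of the Fortran fragment above: a 7-point stencil assembled with MatSetValuesStencil() on a DA created with DA_XYZPERIODIC and one degree of freedom per node, with A obtained from DAGetMatrix(). This is only a sketch with placeholder coefficients (PETSc 3.1-era DA API); the point is that off-edge indices such as i-1 at the left boundary or k+1 at the top are wrapped by PETSc because the DA is periodic, so the loop needs no boundary special cases:

#include "petscda.h"
#include "petscmat.h"

PetscErrorCode AssemblePeriodicStencil(DA da, Mat A)
{
  PetscErrorCode ierr;
  PetscInt       i,j,k,xs,ys,zs,xm,ym,zm;
  MatStencil     row,col[7];
  PetscScalar    v[7];

  PetscFunctionBegin;
  ierr = DAGetCorners(da,&xs,&ys,&zs,&xm,&ym,&zm);CHKERRQ(ierr);
  for (k=zs; k<zs+zm; k++) {
    for (j=ys; j<ys+ym; j++) {
      for (i=xs; i<xs+xm; i++) {
        row.i = i; row.j = j; row.k = k;
        /* placeholder values for a Laplacian-like operator */
        v[0] = -1.0; col[0].i = i;   col[0].j = j;   col[0].k = k-1;
        v[1] = -1.0; col[1].i = i;   col[1].j = j-1; col[1].k = k;
        v[2] = -1.0; col[2].i = i-1; col[2].j = j;   col[2].k = k;
        v[3] =  6.0; col[3].i = i;   col[3].j = j;   col[3].k = k;
        v[4] = -1.0; col[4].i = i+1; col[4].j = j;   col[4].k = k;
        v[5] = -1.0; col[5].i = i;   col[5].j = j+1; col[5].k = k;
        v[6] = -1.0; col[6].i = i;   col[6].j = j;   col[6].k = k+1;
        ierr = MatSetValuesStencil(A,1,&row,7,col,v,INSERT_VALUES);CHKERRQ(ierr);
      }
    }
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}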
URL: From xdliang at gmail.com Sun Apr 24 11:22:34 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Sun, 24 Apr 2011 12:22:34 -0400 Subject: [petsc-users] point-wise vector product Message-ID: Hello everyone, I am wondering what function in petsc computes the pointwise product of two vectors. For example, vin1=[1,2], vin2=[3,4]; I need a function to output vout= vin1.*vin2 = [3,8]. I can write my own function, but I am worrying whether the vin1 and vin2 are known by all the processors (I guess not if they are parallel vectors and distributed). More precisely, when one processor computes vout[i] = vin1[i]*vin2[i], is it possible that vin1[i] or vin2[i] are not known by this particular processor and the program output an NaN or other meaningless vout[i]? Thank you very much! Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sun Apr 24 11:24:06 2011 From: jed at 59A2.org (Jed Brown) Date: Sun, 24 Apr 2011 18:24:06 +0200 Subject: [petsc-users] point-wise vector product In-Reply-To: References: Message-ID: On Sun, Apr 24, 2011 at 18:22, Xiangdong Liang wrote: > I am wondering what function in petsc computes the pointwise product of two > vectors. For example, vin1=[1,2], vin2=[3,4]; I need a function to output > vout= vin1.*vin2 = [3,8]. http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Vec/VecPointwiseMult.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From alejandro.aragon at gmail.com Tue Apr 26 01:37:32 2011 From: alejandro.aragon at gmail.com (=?iso-8859-1?Q?Alejandro_Marcos_Arag=F3n?=) Date: Tue, 26 Apr 2011 08:37:32 +0200 Subject: [petsc-users] KSP solver increases the solution time Message-ID: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> Hi all, I'm using the standard configuration of the KSP solver, but the time it takes to solve a large system of equations is increasing (because of the increasing number of iterations?). These are my timing lines and the log from the KSP solver in two consecutive solves: [1] Solving system... 
0.154203 s KSP Object: type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test PC Object: type: bjacobi block Jacobi: number of blocks = 3 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object:(sub_) type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 1e-12 using diagonal shift to prevent zero pivot matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Matrix Object: type=seqaij, rows=2020, cols=2020 package used to perform factorization: petsc total: nonzeros=119396, allocated nonzeros=163620 using I-node routines: found 676 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2020, cols=2020 total: nonzeros=119396, allocated nonzeros=163620 using I-node routines: found 676 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=6058, cols=6058 total: nonzeros=365026, allocated nonzeros=509941 using I-node (on process 0) routines: found 676 nodes, limit used is 5 [1] System solved in 51 iterations... 0.543215 s ... ... ... [1] Solving system... 0.302414 s KSP Object: type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000 tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using nonzero initial guess using PRECONDITIONED norm type for convergence test PC Object: type: bjacobi block Jacobi: number of blocks = 3 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object:(sub_) type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object:(sub_) type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 1e-12 using diagonal shift to prevent zero pivot matrix ordering: natural factor fill ratio given 1, needed 1 Factored matrix follows: Matrix Object: type=seqaij, rows=2020, cols=2020 package used to perform factorization: petsc total: nonzeros=119396, allocated nonzeros=163620 using I-node routines: found 676 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=seqaij, rows=2020, cols=2020 total: nonzeros=119396, allocated nonzeros=163620 using I-node routines: found 676 nodes, limit used is 5 linear system matrix = precond matrix: Matrix Object: type=mpiaij, rows=6058, cols=6058 total: nonzeros=365026, allocated nonzeros=509941 using I-node (on process 0) routines: found 676 nodes, limit used is 5 [1] System solved in 3664 iterations... 42.683 s As you can see, the second iteration takes more than 40 seconds to solve. Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? Thank you all, Alejandro M. Arag?n, Ph.D. 
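Returning briefly to the point-wise product question a few messages up: VecPointwiseMult(w,x,y) computes w[i] = x[i]*y[i] using only locally owned entries, so as long as the three vectors share the same parallel layout no process ever needs values it does not own. A small sketch under that assumption (PETSc 3.1-era API, illustrative names and sizes):

#include "petscvec.h"

PetscErrorCode PointwiseMultDemo(void)
{
  PetscErrorCode ierr;
  Vec            x,y,w;

  PetscFunctionBegin;
  ierr = VecCreateMPI(PETSC_COMM_WORLD,PETSC_DECIDE,2,&x);CHKERRQ(ierr);
  ierr = VecDuplicate(x,&y);CHKERRQ(ierr);   /* same parallel layout as x */
  ierr = VecDuplicate(x,&w);CHKERRQ(ierr);
  ierr = VecSetValue(x,0,1.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(x,1,2.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(y,0,3.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecSetValue(y,1,4.0,INSERT_VALUES);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(x);CHKERRQ(ierr); ierr = VecAssemblyEnd(x);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(y);CHKERRQ(ierr); ierr = VecAssemblyEnd(y);CHKERRQ(ierr);
  ierr = VecPointwiseMult(w,x,y);CHKERRQ(ierr);          /* w = [3, 8] */
  ierr = VecView(w,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = VecDestroy(x);CHKERRQ(ierr);
  ierr = VecDestroy(y);CHKERRQ(ierr);
  ierr = VecDestroy(w);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}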
From jed at 59A2.org Tue Apr 26 02:52:17 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 26 Apr 2011 09:52:17 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> Message-ID: 2011/4/26 Alejandro Marcos Arag?n > As you can see, the second iteration takes more than 40 seconds to solve. > Could some explain why this is happening and why he number of iterations is > increasing dramatically between solves? What has changed between solves? If this is part of a nonlinear problem, it might have just gotten harder to solve. If the linear system is the same, the right hand side for the first problem was probably degenerate (roughly speaking, having significant energy in only a few Krylov modes). -------------- next part -------------- An HTML attachment was scrubbed... URL: From alejandro.aragon at gmail.com Tue Apr 26 04:45:47 2011 From: alejandro.aragon at gmail.com (=?iso-8859-1?Q?Alejandro_Marcos_Arag=F3n?=) Date: Tue, 26 Apr 2011 11:45:47 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> Message-ID: <471EBF6A-D42A-43F8-9BD7-BF4EAEE4FF4E@gmail.com> Hi Jed, thanks for replying. In fact, the problem I sent the results from is a dynamic problem of a simply supported beam subjected to a constant load at the center, so I'm integrating in time. The material is linear elastic so the stiffness matrix doesn't change in non-zero structure. Of course the right hand side changes, but I don't think this is the problem because at some point it takes it goes back to just a few iterations to solve. The behavior is cyclic, but I don't understand the reason for this. I've noticed the same behavior of the solver also in quasi-static problems (increasing the load gradually but not integrating over time). Alejandro M. Arag?n On Apr 26, 2011, at 9:52 AM, Jed Brown wrote: > 2011/4/26 Alejandro Marcos Arag?n > As you can see, the second iteration takes more than 40 seconds to solve. Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? > > What has changed between solves? If this is part of a nonlinear problem, it might have just gotten harder to solve. If the linear system is the same, the right hand side for the first problem was probably degenerate (roughly speaking, having significant energy in only a few Krylov modes). -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Tue Apr 26 04:56:47 2011 From: jed at 59A2.org (Jed Brown) Date: Tue, 26 Apr 2011 11:56:47 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <471EBF6A-D42A-43F8-9BD7-BF4EAEE4FF4E@gmail.com> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <471EBF6A-D42A-43F8-9BD7-BF4EAEE4FF4E@gmail.com> Message-ID: 2011/4/26 Alejandro Marcos Arag?n > Hi Jed, thanks for replying. In fact, the problem I sent the results from > is a dynamic problem of a simply supported beam subjected to a constant load > at the center, so I'm integrating in time. > I take it there is no buckling. > The material is linear elastic so the stiffness matrix doesn't change in > non-zero structure. Of course the right hand side changes, but I don't think > this is the problem because at some point it takes it goes back to just a > few iterations to solve. 
The behavior is cyclic, but I don't understand the > reason for this. I've noticed the same behavior of the solver also in > quasi-static problems (increasing the load gradually but not integrating > over time). > Is the convergence relatively smooth? Are you losing a lot in GMRES restarts (every 30 iterations)? If you have a symmetric formulation (including boundary conditions), you can use -ksp_type cg, otherwise try -ksp_gmres_restart 500. Also, try solving the system with a random right hand side. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stali at purdue.edu Tue Apr 26 01:05:53 2011 From: stali at purdue.edu (Tabrez Ali) Date: Tue, 26 Apr 2011 01:05:53 -0500 Subject: [petsc-users] Error with SBAIJ during KSPSolve Message-ID: <4DB660C1.804@purdue.edu> Hi I am trying to solve a system with constraints (0 on some diagonals). It works fine with AIJ but gives the following error (see below) with SBAIJ Matrices during KSPSolve (sequential). With GMRES it just segfaults. What did I miss? Thanks in advance. Tabrez stali at x61:~/src$ ./a.out References: <4DB660C1.804@purdue.edu> Message-ID: On Tue, Apr 26, 2011 at 08:05, Tabrez Ali wrote: > I am trying to solve a system with constraints (0 on some diagonals). It > works fine with AIJ but gives the following error (see below) with SBAIJ > Matrices during KSPSolve (sequential). With GMRES it just segfaults. You cannot do LU with SBAIJ, try -pc_type cholesky. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Apr 26 07:23:35 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 26 Apr 2011 07:23:35 -0500 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> Message-ID: <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> > System solved in 3664 iterations... 42.683 s The preconditioner is simply not up to the task it has been assigned. This number of iterations is problematic. Have you tried -pc_type asm -sub_pc_type lu If that works well you can try -pc_type asm -sub_pc_type ilu and see if that still works. If the matrix is indeed symmetric positive definite you will want to use -ksp_type cg Barry On Apr 26, 2011, at 1:37 AM, Alejandro Marcos Arag?n wrote: > Hi all, > > I'm using the standard configuration of the KSP solver, but the time it takes to solve a large system of equations is increasing (because of the increasing number of iterations?). These are my timing lines and the log from the KSP solver in two consecutive solves: > > [1] Solving system... 
0.154203 s > > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: > type: bjacobi > block Jacobi: number of blocks = 3 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object:(sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object:(sub_) > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 1e-12 > using diagonal shift to prevent zero pivot > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Matrix Object: > type=seqaij, rows=2020, cols=2020 > package used to perform factorization: petsc > total: nonzeros=119396, allocated nonzeros=163620 > using I-node routines: found 676 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2020, cols=2020 > total: nonzeros=119396, allocated nonzeros=163620 > using I-node routines: found 676 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=6058, cols=6058 > total: nonzeros=365026, allocated nonzeros=509941 > using I-node (on process 0) routines: found 676 nodes, limit used is 5 > > > [1] System solved in 51 iterations... 0.543215 s > ... > ... > ... > > [1] Solving system... 0.302414 s > > KSP Object: > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000 > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using nonzero initial guess > using PRECONDITIONED norm type for convergence test > PC Object: > type: bjacobi > block Jacobi: number of blocks = 3 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object:(sub_) > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object:(sub_) > type: ilu > ILU: out-of-place factorization > 0 levels of fill > tolerance for zero pivot 1e-12 > using diagonal shift to prevent zero pivot > matrix ordering: natural > factor fill ratio given 1, needed 1 > Factored matrix follows: > Matrix Object: > type=seqaij, rows=2020, cols=2020 > package used to perform factorization: petsc > total: nonzeros=119396, allocated nonzeros=163620 > using I-node routines: found 676 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=seqaij, rows=2020, cols=2020 > total: nonzeros=119396, allocated nonzeros=163620 > using I-node routines: found 676 nodes, limit used is 5 > linear system matrix = precond matrix: > Matrix Object: > type=mpiaij, rows=6058, cols=6058 > total: nonzeros=365026, allocated nonzeros=509941 > using I-node (on process 0) routines: found 676 nodes, limit used is 5 > > [1] System solved in 3664 iterations... 42.683 s > > > > As you can see, the second iteration takes more than 40 seconds to solve. 
Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? Thank you all, > > Alejandro M. Arag?n, Ph.D. From alejandro.aragon at gmail.com Tue Apr 26 09:45:53 2011 From: alejandro.aragon at gmail.com (=?iso-8859-1?Q?Alejandro_Marcos_Arag=F3n?=) Date: Tue, 26 Apr 2011 16:45:53 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> Message-ID: Thank you Barry, your suggestion really helped speed up the program. The maximum number of iterations is 48. I still don't know what the asm pre-conditioner is but I guess I just need to read the manual. I'm trying to add code to do what you suggested automatically, and I found that I can add: PC pc; ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); ierr = PCSetType(pc, PCASM); CHKERR(ierr); However, I cannot find the function to replace the -sub_pc_type. Can you point me where to look? My system may not be symmetric so I can't use the cg option. Thanks again for your response. a? On Apr 26, 2011, at 2:23 PM, Barry Smith wrote: > >> System solved in 3664 iterations... 42.683 s > > The preconditioner is simply not up to the task it has been assigned. This number of iterations is problematic. > > Have you tried -pc_type asm -sub_pc_type lu If that works well you can try -pc_type asm -sub_pc_type ilu and see if that still works. > > If the matrix is indeed symmetric positive definite you will want to use -ksp_type cg > > > > Barry > > > On Apr 26, 2011, at 1:37 AM, Alejandro Marcos Arag?n wrote: > >> Hi all, >> >> I'm using the standard configuration of the KSP solver, but the time it takes to solve a large system of equations is increasing (because of the increasing number of iterations?). These are my timing lines and the log from the KSP solver in two consecutive solves: >> >> [1] Solving system... 
0.154203 s >> >> KSP Object: >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using nonzero initial guess >> using PRECONDITIONED norm type for convergence test >> PC Object: >> type: bjacobi >> block Jacobi: number of blocks = 3 >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object:(sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object:(sub_) >> type: ilu >> ILU: out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 1e-12 >> using diagonal shift to prevent zero pivot >> matrix ordering: natural >> factor fill ratio given 1, needed 1 >> Factored matrix follows: >> Matrix Object: >> type=seqaij, rows=2020, cols=2020 >> package used to perform factorization: petsc >> total: nonzeros=119396, allocated nonzeros=163620 >> using I-node routines: found 676 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=2020, cols=2020 >> total: nonzeros=119396, allocated nonzeros=163620 >> using I-node routines: found 676 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=6058, cols=6058 >> total: nonzeros=365026, allocated nonzeros=509941 >> using I-node (on process 0) routines: found 676 nodes, limit used is 5 >> >> >> [1] System solved in 51 iterations... 0.543215 s >> ... >> ... >> ... >> >> [1] Solving system... 0.302414 s >> >> KSP Object: >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000 >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using nonzero initial guess >> using PRECONDITIONED norm type for convergence test >> PC Object: >> type: bjacobi >> block Jacobi: number of blocks = 3 >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object:(sub_) >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object:(sub_) >> type: ilu >> ILU: out-of-place factorization >> 0 levels of fill >> tolerance for zero pivot 1e-12 >> using diagonal shift to prevent zero pivot >> matrix ordering: natural >> factor fill ratio given 1, needed 1 >> Factored matrix follows: >> Matrix Object: >> type=seqaij, rows=2020, cols=2020 >> package used to perform factorization: petsc >> total: nonzeros=119396, allocated nonzeros=163620 >> using I-node routines: found 676 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=seqaij, rows=2020, cols=2020 >> total: nonzeros=119396, allocated nonzeros=163620 >> using I-node routines: found 676 nodes, limit used is 5 >> linear system matrix = precond matrix: >> Matrix Object: >> type=mpiaij, rows=6058, cols=6058 >> total: nonzeros=365026, allocated nonzeros=509941 >> using I-node (on process 0) routines: found 676 nodes, limit used is 5 >> >> [1] System solved in 3664 iterations... 
42.683 s >> >> >> >> As you can see, the second iteration takes more than 40 seconds to solve. Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? Thank you all, >> >> Alejandro M. Arag?n, Ph.D. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Apr 26 10:26:54 2011 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 26 Apr 2011 10:26:54 -0500 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> Message-ID: 2011/4/26 Alejandro Marcos Arag?n > Thank you Barry, your suggestion really helped speed up the program. The > maximum number of iterations is 48. I still don't know what the asm > pre-conditioner is but I guess I just need to read the manual. I'm trying to > add code to do what you suggested automatically, and I found that I can add: > > PC pc; > ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); > ierr = PCSetType(pc, PCASM); CHKERR(ierr); > > However, I cannot find the function to replace the -sub_pc_type. Can you > point me where to look? My system may not be symmetric so I can't use the cg > option. > 1) Hard coding it in your program does not make sense. You gain nothing, and lose a lot of flexibility. 2) You can do this using http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/PC/PCASMGetSubKSP.html Matt > Thanks again for your response. > > a? > > > On Apr 26, 2011, at 2:23 PM, Barry Smith wrote: > > > System solved in 3664 iterations... 42.683 s > > > The preconditioner is simply not up to the task it has been assigned. > This number of iterations is problematic. > > Have you tried -pc_type asm -sub_pc_type lu If that works well you > can try -pc_type asm -sub_pc_type ilu and see if that still works. > > If the matrix is indeed symmetric positive definite you will want to use > -ksp_type cg > > > > Barry > > > On Apr 26, 2011, at 1:37 AM, Alejandro Marcos Arag?n wrote: > > Hi all, > > > I'm using the standard configuration of the KSP solver, but the time it > takes to solve a large system of equations is increasing (because of the > increasing number of iterations?). These are my timing lines and the log > from the KSP solver in two consecutive solves: > > > [1] Solving system... 
0.154203 s > > > KSP Object: > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using nonzero initial guess > > using PRECONDITIONED norm type for convergence test > > PC Object: > > type: bjacobi > > block Jacobi: number of blocks = 3 > > Local solve is same for all blocks, in the following KSP and PC objects: > > KSP Object:(sub_) > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object:(sub_) > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 1e-12 > > using diagonal shift to prevent zero pivot > > matrix ordering: natural > > factor fill ratio given 1, needed 1 > > Factored matrix follows: > > Matrix Object: > > type=seqaij, rows=2020, cols=2020 > > package used to perform factorization: petsc > > total: nonzeros=119396, allocated nonzeros=163620 > > using I-node routines: found 676 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Matrix Object: > > type=seqaij, rows=2020, cols=2020 > > total: nonzeros=119396, allocated nonzeros=163620 > > using I-node routines: found 676 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Matrix Object: > > type=mpiaij, rows=6058, cols=6058 > > total: nonzeros=365026, allocated nonzeros=509941 > > using I-node (on process 0) routines: found 676 nodes, limit used is 5 > > > > [1] System solved in 51 iterations... 0.543215 s > > ... > > ... > > ... > > > [1] Solving system... 
0.302414 s > > > KSP Object: > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000 > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using nonzero initial guess > > using PRECONDITIONED norm type for convergence test > > PC Object: > > type: bjacobi > > block Jacobi: number of blocks = 3 > > Local solve is same for all blocks, in the following KSP and PC objects: > > KSP Object:(sub_) > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object:(sub_) > > type: ilu > > ILU: out-of-place factorization > > 0 levels of fill > > tolerance for zero pivot 1e-12 > > using diagonal shift to prevent zero pivot > > matrix ordering: natural > > factor fill ratio given 1, needed 1 > > Factored matrix follows: > > Matrix Object: > > type=seqaij, rows=2020, cols=2020 > > package used to perform factorization: petsc > > total: nonzeros=119396, allocated nonzeros=163620 > > using I-node routines: found 676 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Matrix Object: > > type=seqaij, rows=2020, cols=2020 > > total: nonzeros=119396, allocated nonzeros=163620 > > using I-node routines: found 676 nodes, limit used is 5 > > linear system matrix = precond matrix: > > Matrix Object: > > type=mpiaij, rows=6058, cols=6058 > > total: nonzeros=365026, allocated nonzeros=509941 > > using I-node (on process 0) routines: found 676 nodes, limit used is 5 > > > [1] System solved in 3664 iterations... 42.683 s > > > > > As you can see, the second iteration takes more than 40 seconds to solve. > Could some explain why this is happening and why he number of iterations is > increasing dramatically between solves? Thank you all, > > > Alejandro M. Arag?n, Ph.D. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From alejandro.aragon at gmail.com Wed Apr 27 02:17:27 2011 From: alejandro.aragon at gmail.com (=?iso-8859-1?Q?Alejandro_Marcos_Arag=F3n?=) Date: Wed, 27 Apr 2011 09:17:27 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> Message-ID: <7D72624F-0B0D-4494-A230-F814C8C0FB28@gmail.com> I understand that, but I'm trying to provide default behavior for the solver because the default one (no parameters) works very bad in my case. However, I'm stuck because I can't set the same parameters that I obtain with command line arguments "-pc_type asm -sub_pc_type lu". Can someone point me where is the error with the following code? ... ... 
PetscInitialize(&argc, &argv,NULL,NULL); PetscErrorCode ierr = MatCreate(PETSC_COMM_WORLD,&A_);CHKERR(ierr); // create linear solver context ierr = KSPCreate(PETSC_COMM_WORLD,&ksp_);CHKERR(ierr); // initial nonzero guess ierr = KSPSetInitialGuessNonzero(ksp_,PETSC_TRUE); CHKERR(ierr); // set runtime options ierr = KSPSetFromOptions(ksp_);CHKERR(ierr); // set the default preconditioner for this program to be ASM PC pc; ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); ierr = PCSetType(pc, PCASM); CHKERR(ierr); KSP *subksp; /* array of KSP contexts for local subblocks */ PetscInt nlocal,first; /* number of local subblocks, first local subblock */ PC subpc; /* PC context for subblock */ /* Call KSPSetUp() to set the block Jacobi data structures (including creation of an internal KSP context for each block). Note: KSPSetUp() MUST be called before PCASMGetSubKSP(). */ ierr = KSPSetUp(ksp_);CHKERR(ierr); /* Extract the array of KSP contexts for the local blocks */ ierr = PCASMGetSubKSP(pc,&nlocal,&first,&subksp);CHKERR(ierr); /* Loop over the local blocks, setting various KSP options for each block. */ for (int i=0; i 2011/4/26 Alejandro Marcos Arag?n > Thank you Barry, your suggestion really helped speed up the program. The maximum number of iterations is 48. I still don't know what the asm pre-conditioner is but I guess I just need to read the manual. I'm trying to add code to do what you suggested automatically, and I found that I can add: > > PC pc; > ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); > ierr = PCSetType(pc, PCASM); CHKERR(ierr); > > However, I cannot find the function to replace the -sub_pc_type. Can you point me where to look? My system may not be symmetric so I can't use the cg option. > > 1) Hard coding it in your program does not make sense. You gain nothing, and lose a lot of flexibility. > > 2) You can do this using > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/PC/PCASMGetSubKSP.html > > Matt > > Thanks again for your response. > > a? > > > On Apr 26, 2011, at 2:23 PM, Barry Smith wrote: > >> >>> System solved in 3664 iterations... 42.683 s >> >> The preconditioner is simply not up to the task it has been assigned. This number of iterations is problematic. >> >> Have you tried -pc_type asm -sub_pc_type lu If that works well you can try -pc_type asm -sub_pc_type ilu and see if that still works. >> >> If the matrix is indeed symmetric positive definite you will want to use -ksp_type cg >> >> >> >> Barry >> >> >> On Apr 26, 2011, at 1:37 AM, Alejandro Marcos Arag?n wrote: >> >>> Hi all, >>> >>> I'm using the standard configuration of the KSP solver, but the time it takes to solve a large system of equations is increasing (because of the increasing number of iterations?). These are my timing lines and the log from the KSP solver in two consecutive solves: >>> >>> [1] Solving system... 
0.154203 s >>> >>> KSP Object: >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000 >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using nonzero initial guess >>> using PRECONDITIONED norm type for convergence test >>> PC Object: >>> type: bjacobi >>> block Jacobi: number of blocks = 3 >>> Local solve is same for all blocks, in the following KSP and PC objects: >>> KSP Object:(sub_) >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object:(sub_) >>> type: ilu >>> ILU: out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 1e-12 >>> using diagonal shift to prevent zero pivot >>> matrix ordering: natural >>> factor fill ratio given 1, needed 1 >>> Factored matrix follows: >>> Matrix Object: >>> type=seqaij, rows=2020, cols=2020 >>> package used to perform factorization: petsc >>> total: nonzeros=119396, allocated nonzeros=163620 >>> using I-node routines: found 676 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Matrix Object: >>> type=seqaij, rows=2020, cols=2020 >>> total: nonzeros=119396, allocated nonzeros=163620 >>> using I-node routines: found 676 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Matrix Object: >>> type=mpiaij, rows=6058, cols=6058 >>> total: nonzeros=365026, allocated nonzeros=509941 >>> using I-node (on process 0) routines: found 676 nodes, limit used is 5 >>> >>> >>> [1] System solved in 51 iterations... 0.543215 s >>> ... >>> ... >>> ... >>> >>> [1] Solving system... 
0.302414 s >>> >>> KSP Object: >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=10000 >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using nonzero initial guess >>> using PRECONDITIONED norm type for convergence test >>> PC Object: >>> type: bjacobi >>> block Jacobi: number of blocks = 3 >>> Local solve is same for all blocks, in the following KSP and PC objects: >>> KSP Object:(sub_) >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> using PRECONDITIONED norm type for convergence test >>> PC Object:(sub_) >>> type: ilu >>> ILU: out-of-place factorization >>> 0 levels of fill >>> tolerance for zero pivot 1e-12 >>> using diagonal shift to prevent zero pivot >>> matrix ordering: natural >>> factor fill ratio given 1, needed 1 >>> Factored matrix follows: >>> Matrix Object: >>> type=seqaij, rows=2020, cols=2020 >>> package used to perform factorization: petsc >>> total: nonzeros=119396, allocated nonzeros=163620 >>> using I-node routines: found 676 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Matrix Object: >>> type=seqaij, rows=2020, cols=2020 >>> total: nonzeros=119396, allocated nonzeros=163620 >>> using I-node routines: found 676 nodes, limit used is 5 >>> linear system matrix = precond matrix: >>> Matrix Object: >>> type=mpiaij, rows=6058, cols=6058 >>> total: nonzeros=365026, allocated nonzeros=509941 >>> using I-node (on process 0) routines: found 676 nodes, limit used is 5 >>> >>> [1] System solved in 3664 iterations... 42.683 s >>> >>> >>> >>> As you can see, the second iteration takes more than 40 seconds to solve. Could some explain why this is happening and why he number of iterations is increasing dramatically between solves? Thank you all, >>> >>> Alejandro M. Arag?n, Ph.D. >> > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Wed Apr 27 04:32:27 2011 From: jroman at dsic.upv.es (Jose E. Roman) Date: Wed, 27 Apr 2011 11:32:27 +0200 Subject: [petsc-users] KSP solver increases the solution time In-Reply-To: <7D72624F-0B0D-4494-A230-F814C8C0FB28@gmail.com> References: <380FD73E-0CDB-4E48-A1AA-5C61E07C10F3@gmail.com> <7E8E9476-A40A-4FA4-95DE-D66D9DF3B695@mcs.anl.gov> <7D72624F-0B0D-4494-A230-F814C8C0FB28@gmail.com> Message-ID: El 27/04/2011, a las 09:17, Alejandro Marcos Arag?n escribi?: > I understand that, but I'm trying to provide default behavior for the solver because the default one (no parameters) works very bad in my case. > However, I'm stuck because I can't set the same parameters that I obtain with command line arguments "-pc_type asm -sub_pc_type lu". > > Can someone point me where is the error with the following code? You should call KSPSetOperators before doing all the setup. Jose > > ... > ... 
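In code, the ordering Jose points out would look roughly like the following (PETSc 3.1-style calls; the function and variable names are illustrative, and matrix assembly plus most of the surrounding setup are omitted). The essential point is that KSPSetOperators() is called before KSPSetUp() and PCASMGetSubKSP():

#include "petscksp.h"

PetscErrorCode SetupASMSolver(Mat A, KSP *outksp)
{
  PetscErrorCode ierr;
  KSP            ksp,*subksp;
  PC             pc,subpc;
  PetscInt       i,nlocal,first;

  PetscFunctionBegin;
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A,SAME_NONZERO_PATTERN);CHKERRQ(ierr); /* before any setup */
  ierr = KSPSetInitialGuessNonzero(ksp,PETSC_TRUE);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCASM);CHKERRQ(ierr);      /* default choice, still overridable ... */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* ... from the command line */
  ierr = KSPSetUp(ksp);CHKERRQ(ierr);            /* now the subdomain KSPs exist */
  ierr = PCASMGetSubKSP(pc,&nlocal,&first,&subksp);CHKERRQ(ierr);
  for (i=0; i<nlocal; i++) {
    ierr = KSPSetType(subksp[i],KSPPREONLY);CHKERRQ(ierr);
    ierr = KSPGetPC(subksp[i],&subpc);CHKERRQ(ierr);
    ierr = PCSetType(subpc,PCLU);CHKERRQ(ierr);  /* the in-code equivalent of -sub_pc_type lu */
  }
  *outksp = ksp;
  PetscFunctionReturn(0);
}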
> PetscInitialize(&argc, &argv,NULL,NULL); > PetscErrorCode ierr = MatCreate(PETSC_COMM_WORLD,&A_);CHKERR(ierr); > > // create linear solver context > ierr = KSPCreate(PETSC_COMM_WORLD,&ksp_);CHKERR(ierr); > > // initial nonzero guess > ierr = KSPSetInitialGuessNonzero(ksp_,PETSC_TRUE); CHKERR(ierr); > > // set runtime options > ierr = KSPSetFromOptions(ksp_);CHKERR(ierr); > > // set the default preconditioner for this program to be ASM > PC pc; > ierr = KSPGetPC(ksp_,&pc); CHKERR(ierr); > ierr = PCSetType(pc, PCASM); CHKERR(ierr); > > KSP *subksp; /* array of KSP contexts for local subblocks */ > PetscInt nlocal,first; /* number of local subblocks, first local subblock */ > PC subpc; /* PC context for subblock */ > > /* > Call KSPSetUp() to set the block Jacobi data structures (including > creation of an internal KSP context for each block). > > Note: KSPSetUp() MUST be called before PCASMGetSubKSP(). > */ > ierr = KSPSetUp(ksp_);CHKERR(ierr); > > /* > Extract the array of KSP contexts for the local blocks > */ > ierr = PCASMGetSubKSP(pc,&nlocal,&first,&subksp);CHKERR(ierr); > > /* > Loop over the local blocks, setting various KSP options > for each block. > */ > for (int i=0; i ierr = KSPGetPC(subksp[i],&subpc);CHKERR(ierr); > ierr = PCSetType(subpc,PCLU);CHKERR(ierr); > } > > This is the error I get: > > User explicitly sets subdomain solvers. > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Null argument, when expecting valid pointer! > [0]PETSC ERROR: Null Object: Parameter # 1! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 7, Mon Dec 20 14:26:37 CST 2010 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: yafeq/a.out on a darwin10. named lsmspc26.epfl.ch by aaragon Wed Apr 27 09:14:25 2011 > [0]PETSC ERROR: Libraries linked from /Users/aaragon/Local/lib > [0]PETSC ERROR: Configure run at Thu Apr 7 17:01:26 2011 > [0]PETSC ERROR: Configure options --prefix=/Users/aaragon/Local --with-mpi-include=/Users/aaragon/Local/include --with-mpi-lib=/Users/aaragon/Local/lib/libmpich.a --with-superlu=1 --with-superlu-include=/Users/aaragon/Local/include/superlu --with-superlu-lib=/Users/aaragon/Local/lib/libsuperlu.a --with-superlu_dist=1 --with-superlu_dist-include=/Users/aaragon/Local/include/superlu_dist --with-superlu_dist-lib=/Users/aaragon/Local/lib/libsuperlu_dist.a --with-parmetis=1 --download-parmetis=ifneeded > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: MatGetVecs() line 7265 in src/mat/interface/matrix.c > [0]PETSC ERROR: KSPGetVecs() line 806 in src/ksp/ksp/interface/iterativ.c > [0]PETSC ERROR: KSPSetUp_GMRES() line 94 in src/ksp/ksp/impls/gmres/gmres.c > [0]PETSC ERROR: KSPSetUp() line 199 in src/ksp/ksp/interface/itfunc.c > [0]PETSC ERROR: User provided function() line 397 in "unknowndirectory/"/Users/aaragon/Local/include/cpputils/solver.hpp > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Operation done in wrong order! > [0]PETSC ERROR: Need to call PCSetUP() on PC (or KSPSetUp() on the outer KSP object) before calling here! 
> [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 7, Mon Dec 20 14:26:37 CST 2010 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: yafeq/a.out on a darwin10. named lsmspc26.epfl.ch by aaragon Wed Apr 27 09:14:25 2011 > [0]PETSC ERROR: Libraries linked from /Users/aaragon/Local/lib > [0]PETSC ERROR: Configure run at Thu Apr 7 17:01:26 2011 > [0]PETSC ERROR: Configure options --prefix=/Users/aaragon/Local --with-mpi-include=/Users/aaragon/Local/include --with-mpi-lib=/Users/aaragon/Local/lib/libmpich.a --with-superlu=1 --with-superlu-include=/Users/aaragon/Local/include/superlu --with-superlu-lib=/Users/aaragon/Local/lib/libsuperlu.a --with-superlu_dist=1 --with-superlu_dist-include=/Users/aaragon/Local/include/superlu_dist --with-superlu_dist-lib=/Users/aaragon/Local/lib/libsuperlu_dist.a --with-parmetis=1 --download-parmetis=ifneeded > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: PCASMGetSubKSP_ASM() line 644 in src/ksp/pc/impls/asm/asm.c > [0]PETSC ERROR: PCASMGetSubKSP() line 926 in src/ksp/pc/impls/asm/asm.c > [0]PETSC ERROR: User provided function() line 402 in "unknowndirectory/"/Users/aaragon/Local/include/cpputils/solver.hpp > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.1.0, Patch 7, Mon Dec 20 14:26:37 CST 2010 > > and the error continues... > From xdliang at gmail.com Wed Apr 27 10:57:13 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Wed, 27 Apr 2011 11:57:13 -0400 Subject: [petsc-users] Changing the diagonal of a matrix via a vector Message-ID: Hello everyone, I am a novice to Petsc and parallel computing. I have created a mpi sparse matrix A (size n-by-n) and a parallel vector b (size n-by-1). Now I want to modify the diagonal of A by adding the values of vector b. Namely, A(i,i) = A(i,i) + b(i) and all the off-diagonal elements remains the same. I am worrying that when I use MatSetValue or MatSetValues, b(i) may not be accessed by some particular processor since VecGetValues can only get values on the same processor. One possible solution I am thinking is converting vector b to a diagonal matrix B and then do the MatAXPY operation. However, using MatSetValue to set diagonal elements of B, B(i,i) = b(i), still faces the similar problem. Can anyone give me some suggestion? Thanks. Best, Xiangdong P.S. When I compiled my program, I get warnings like that: warning: return makes pointer from integer without a cast. Actually, these lines are standard Petsc functions like that: ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); ierr = VecDestroy(x); CHKERRQ(ierr); How can I get rid of these warnings? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at 59A2.org Wed Apr 27 11:03:07 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 27 Apr 2011 18:03:07 +0200 Subject: [petsc-users] Changing the diagonal of a matrix via a vector In-Reply-To: References: Message-ID: On Wed, Apr 27, 2011 at 17:57, Xiangdong Liang wrote: > I am a novice to Petsc and parallel computing. I have created a mpi sparse > matrix A (size n-by-n) and a parallel vector b (size n-by-1). Now I want to > modify the diagonal of A by adding the values of vector b. Namely, A(i,i) = > A(i,i) + b(i) and all the off-diagonal elements remains the same. I am > worrying that when I use MatSetValue or MatSetValues, b(i) may not be > accessed by some particular processor since VecGetValues can only get values > on the same processor. One possible solution I am thinking is converting > vector b to a diagonal matrix B and then do the MatAXPY operation. However, > using MatSetValue to set diagonal elements of B, B(i,i) = b(i), still > faces the similar problem. Can anyone give me some suggestion? Thanks. > MatDiagonalSet(A,b,ADD_VALUES) http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Mat/MatDiagonalSet.html > P.S. When I compiled my program, I get warnings like that: warning: return > makes pointer from integer without a cast. Actually, these lines are > standard Petsc functions like that: > > ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); > ierr = VecDestroy(x); CHKERRQ(ierr); > > How can I get rid of these warnings? > Either make your function return int (or PetscErrorCode), passing "return values" back through arguments or use CHKERRV (worse because errors won't propagate up correctly). http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Sys/CHKERRQ.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From xdliang at gmail.com Wed Apr 27 11:41:22 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Wed, 27 Apr 2011 12:41:22 -0400 Subject: [petsc-users] Changing the diagonal of a matrix via a vector In-Reply-To: References: Message-ID: On Wed, Apr 27, 2011 at 12:03 PM, Jed Brown wrote: > On Wed, Apr 27, 2011 at 17:57, Xiangdong Liang wrote: > >> I am a novice to Petsc and parallel computing. I have created a mpi sparse >> matrix A (size n-by-n) and a parallel vector b (size n-by-1). Now I want to >> modify the diagonal of A by adding the values of vector b. Namely, A(i,i) = >> A(i,i) + b(i) and all the off-diagonal elements remains the same. I am >> worrying that when I use MatSetValue or MatSetValues, b(i) may not be >> accessed by some particular processor since VecGetValues can only get values >> on the same processor. One possible solution I am thinking is converting >> vector b to a diagonal matrix B and then do the MatAXPY operation. However, >> using MatSetValue to set diagonal elements of B, B(i,i) = b(i), still >> faces the similar problem. Can anyone give me some suggestion? Thanks. >> > > MatDiagonalSet(A,b,ADD_VALUES) > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Mat/MatDiagonalSet.html > > >> P.S. When I compiled my program, I get warnings like that: warning: return >> makes pointer from integer without a cast. Actually, these lines are >> standard Petsc functions like that: >> >> ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr); >> ierr = VecDestroy(x); CHKERRQ(ierr); >> >> How can I get rid of these warnings? 
>> > > Either make your function return int (or PetscErrorCode), passing "return > values" back through arguments or use CHKERRV (worse because errors won't > propagate up correctly). > > Thanks a lot, Jed. I am using Petsc's built-in function, VecCreate and MatCreate. they are supposed to return PetscErrorCode. However, I still get "warning: return makes pointer from integer without a cast" for these built-in functions. > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/Sys/CHKERRQ.html > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Wed Apr 27 12:41:43 2011 From: jed at 59A2.org (Jed Brown) Date: Wed, 27 Apr 2011 19:41:43 +0200 Subject: [petsc-users] Changing the diagonal of a matrix via a vector In-Reply-To: References: Message-ID: On Wed, Apr 27, 2011 at 18:41, Xiangdong Liang wrote: > Thanks a lot, Jed. I am using Petsc's built-in function, VecCreate and > MatCreate. they are supposed to return PetscErrorCode. However, I still get > "warning: return makes pointer from integer without a cast" for these > built-in functions. There is a "return" inside the CHKERRQ macro. You either need to make *your* function return PetscErrorCode or use a different checking macro (e.g. CHKERRABORT). -------------- next part -------------- An HTML attachment was scrubbed... URL: From xdliang at gmail.com Wed Apr 27 15:35:03 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Wed, 27 Apr 2011 16:35:03 -0400 Subject: [petsc-users] create a new large vector via combining existing small vectors Message-ID: Hello everyone, I have a problem with creating a new large vector via combining existing small vectors. Suppose I have two vectors v1 and v2 (size n-by-1) already. I want to have a new vector vout (size 2n-by-1) with vout(1:n) = v1 and vout(n+1:2*n) = v2. Is there any quick way to create vout with Petsc's built in functions? Thanks. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Apr 27 16:09:28 2011 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Apr 2011 16:09:28 -0500 Subject: [petsc-users] create a new large vector via combining existing small vectors In-Reply-To: References: Message-ID: On Apr 27, 2011, at 3:35 PM, Xiangdong Liang wrote: > Hello everyone, > > I have a problem with creating a new large vector via combining existing small vectors. Suppose I have two vectors v1 and v2 (size n-by-1) already. I want to have a new vector vout (size 2n-by-1) with vout(1:n) = v1 and vout(n+1:2*n) = v2. Is there any quick way to create vout with Petsc's built in functions? Thanks. No. You can use VecCreate() and then a couple of VecScatters to get the entries from the two small vectors to the large one. Barry > > Best, > Xiangdong From ilyascfd at gmail.com Thu Apr 28 05:36:18 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Thu, 28 Apr 2011 13:36:18 +0300 Subject: [petsc-users] setting up DA matrix for 3D periodic domain In-Reply-To: References: Message-ID: Jed, Sorry for my late response. Thank you very much. Ilyas. 2011/4/24 Jed Brown > On Sun, Apr 24, 2011 at 14:31, ilyas ilyas wrote: > >> According to this explanation, If I would set up a matrix for 3D periodic >> domain using DAs with DA_ XYZPERIODIC, >> The code segment given below could handle periodicity "without specifying >> boundary information within the loop" ? >> > > Yes, that code is fine. PETSc translates the periodic contributions. 
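A sketch of the VecScatter approach Barry suggests above for building vout = [v1; v2] from two parallel vectors of global length n. This is PETSc 3.1-era code (VecScatterDestroy/ISDestroy take the object rather than a pointer); the names and the choice of simply mirroring v1's local sizes in the new vector are illustrative, since the scatter routes global indices to whichever process owns them:

#include "petscvec.h"

PetscErrorCode ConcatenateVecs(Vec v1, Vec v2, Vec *vout)
{
  PetscErrorCode ierr;
  PetscInt       n,nlocal,low,high;
  IS             from,to;
  VecScatter     scat;
  Vec            w;

  PetscFunctionBegin;
  ierr = VecGetSize(v1,&n);CHKERRQ(ierr);
  ierr = VecGetLocalSize(v1,&nlocal);CHKERRQ(ierr);
  ierr = VecCreateMPI(PETSC_COMM_WORLD,2*nlocal,2*n,&w);CHKERRQ(ierr);

  /* scatter v1 into global entries 0..n-1 of w */
  ierr = VecGetOwnershipRange(v1,&low,&high);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD,high-low,low,1,&from);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD,high-low,low,1,&to);CHKERRQ(ierr);
  ierr = VecScatterCreate(v1,from,w,to,&scat);CHKERRQ(ierr);
  ierr = VecScatterBegin(scat,v1,w,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(scat,v1,w,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(scat);CHKERRQ(ierr);
  ierr = ISDestroy(from);CHKERRQ(ierr);
  ierr = ISDestroy(to);CHKERRQ(ierr);

  /* scatter v2 into global entries n..2n-1 of w */
  ierr = VecGetOwnershipRange(v2,&low,&high);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD,high-low,low,1,&from);CHKERRQ(ierr);
  ierr = ISCreateStride(PETSC_COMM_WORLD,high-low,n+low,1,&to);CHKERRQ(ierr);
  ierr = VecScatterCreate(v2,from,w,to,&scat);CHKERRQ(ierr);
  ierr = VecScatterBegin(scat,v2,w,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(scat,v2,w,INSERT_VALUES,SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(scat);CHKERRQ(ierr);
  ierr = ISDestroy(from);CHKERRQ(ierr);
  ierr = ISDestroy(to);CHKERRQ(ierr);

  *vout = w;
  PetscFunctionReturn(0);
}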
> -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Thu Apr 28 06:03:11 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Thu, 28 Apr 2011 14:03:11 +0300 Subject: [petsc-users] working with different size of arrays in a single DA Message-ID: Hi, May be, It is a simple question, but I am little bit confused. If I have two different size of arrays (one is cell-based which is from 1 to N, other one is face-based which is from 1 to N+1). How can I create and work with them within a single DA structure, for example in evaluating a function or setting up a matrix ? Regards, Ilyas. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 28 06:50:16 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 28 Apr 2011 13:50:16 +0200 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: On Thu, Apr 28, 2011 at 13:03, ilyas ilyas wrote: > May be, It is a simple question, but I am little bit confused. > If I have two different size of arrays (one is cell-based which is from 1 > to N, > other one is face-based which is from 1 to N+1). > How can I create and work with them within a single DA structure, > for example in evaluating a function or setting up a matrix ? > Two choices: 1. Increase the block size (number of components per node) and just write the identity into the equatiosn for the "N+1" cell (which does not exist). This will normally give better memory performance and the few extra trivial equations around the margin are not a big deal. 2. Use two separate DAs and make the parallel decomposition compatible. You can put them together into one system using DMComposite. This is usually overkill for staggered grids, but extends to general multi-physics problems. Support for this option is better in petsc-dev, see, for example, http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/src/snes/examples/tutorials/ex28.c.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From ilyascfd at gmail.com Thu Apr 28 08:41:43 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Thu, 28 Apr 2011 16:41:43 +0300 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: Dear Jed, 2011/4/28 Jed Brown > On Thu, Apr 28, 2011 at 13:03, ilyas ilyas wrote: > >> May be, It is a simple question, but I am little bit confused. >> If I have two different size of arrays (one is cell-based which is from 1 >> to N, >> other one is face-based which is from 1 to N+1). >> How can I create and work with them within a single DA structure, >> for example in evaluating a function or setting up a matrix ? >> > > Two choices: > > 1. Increase the block size (number of components per node) and just write > the identity into the equatiosn for the "N+1" cell (which does not exist). > This will normally give better memory performance and the few extra trivial > equations around the margin are not a big deal. > Would you please explain more the first option? > > 2. Use two separate DAs and make the parallel decomposition compatible. You > can put them together into one system using DMComposite. This is usually > overkill for staggered grids, but extends to general multi-physics problems. > Support for this option is better in petsc-dev, see, for example, > > > http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/src/snes/examples/tutorials/ex28.c.html > Thank you, Ilyas. 
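Returning to the earlier question about stacking v1 and v2 (each of global size n) into vout of size 2n: below is a hedged sketch of the VecScatter approach Barry describes. The helper name and the choice to move each process's locally owned rows are illustrative assumptions, and the Destroy calls follow the 3.1-era signatures used in this thread.

#include "petscvec.h"

/* Copy all of v (global size n) into rows [offset, offset+n) of the larger
   vector vout, which the caller has already created, e.g. with
   VecCreate()/VecSetSizes(vout, PETSC_DECIDE, 2*n)/VecSetFromOptions(). */
static PetscErrorCode ScatterIntoBlock(Vec v, Vec vout, PetscInt offset)
{
  PetscErrorCode ierr;
  PetscInt       rstart, rend;
  IS             from, to;
  VecScatter     ctx;

  ierr = VecGetOwnershipRange(v, &rstart, &rend);CHKERRQ(ierr);
  /* each process moves the rows it owns in v ... */
  ierr = ISCreateStride(PETSC_COMM_SELF, rend-rstart, rstart, 1, &from);CHKERRQ(ierr);
  /* ... to the same rows shifted by offset in vout */
  ierr = ISCreateStride(PETSC_COMM_SELF, rend-rstart, offset+rstart, 1, &to);CHKERRQ(ierr);
  ierr = VecScatterCreate(v, from, vout, to, &ctx);CHKERRQ(ierr);
  ierr = VecScatterBegin(ctx, v, vout, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(ctx, v, vout, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(ctx);CHKERRQ(ierr);
  ierr = ISDestroy(from);CHKERRQ(ierr);
  ierr = ISDestroy(to);CHKERRQ(ierr);
  return 0;
}

Calling ScatterIntoBlock(v1, vout, 0) and then ScatterIntoBlock(v2, vout, n) gives vout(1:n) = v1 and vout(n+1:2n) = v2.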
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bartlomiej.wach at yahoo.pl Thu Apr 28 09:02:23 2011 From: bartlomiej.wach at yahoo.pl (=?utf-8?B?QmFydMWCb21pZWogVw==?=) Date: Thu, 28 Apr 2011 15:02:23 +0100 (BST) Subject: [petsc-users] Large matrixes on single machine Message-ID: <458933.34056.qm@web28313.mail.ukl.yahoo.com> Hello, I was trying to allocate a sparse AIJ matrix of over 800 entries MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); (with proper nonzeros vector) results in an Maximum memory PetscMalloc()ed 315699888 OS cannot compute size of entire process (in ubuntu) Can this be dealt with somehow? I am aware of the 4gb limitation of memory. If not, how would one run it on several machines? Thank You very much. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 28 09:33:52 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 28 Apr 2011 16:33:52 +0200 Subject: [petsc-users] Large matrixes on single machine In-Reply-To: <458933.34056.qm@web28313.mail.ukl.yahoo.com> References: <458933.34056.qm@web28313.mail.ukl.yahoo.com> Message-ID: On Thu, Apr 28, 2011 at 16:02, Bart?omiej W wrote: > Hello, > > I was trying to allocate a sparse AIJ matrix of over 800 entries > > MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); > > (with proper nonzeros vector) > results in an > Maximum memory PetscMalloc()ed 315699888 OS cannot compute size of entire > process > (in ubuntu) What was in the nnz array? If you don't expect the problem to exceed the addressable memory, then the array is probably corrupt. If you really mean to be solving a very large problem, you will have to get a 64-bit machine and configure --with-64-bit-indices, or run in parallel. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Thu Apr 28 09:36:16 2011 From: jed at 59A2.org (Jed Brown) Date: Thu, 28 Apr 2011 16:36:16 +0200 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: On Thu, Apr 28, 2011 at 15:41, ilyas ilyas wrote: > Would you please explain more the first option? You create a system with larger block size. If you have two node-centered values and one cell-centered, you would use a block size of 3. Each center would be associated with the node to its lower left, for example. There will be a fringe of "cell-centers" that extend out of domain on the right and top, you use x = 0 for these equations and the Jacobian for those equations will just have a 1 on the diagonal. -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Apr 28 11:54:59 2011 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 28 Apr 2011 11:54:59 -0500 (CDT) Subject: [petsc-users] Large matrixes on single machine In-Reply-To: References: <458933.34056.qm@web28313.mail.ukl.yahoo.com> Message-ID: On Thu, 28 Apr 2011, Jed Brown wrote: > On Thu, Apr 28, 2011 at 16:02, Bart?omiej W wrote: > > > Hello, > > > > I was trying to allocate a sparse AIJ matrix of over 800 entries > > > > MatSeqAIJSetPreallocation(L,PETSC_NULL,nnz); > > > > (with proper nonzeros vector) > > results in an > > Maximum memory PetscMalloc()ed 315699888 OS cannot compute size of entire > > process > > (in ubuntu) > > > What was in the nnz array? If you don't expect the problem to exceed the > addressable memory, then the array is probably corrupt. 
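For reference, this is roughly what the preallocation call being discussed looks like when the nnz array is well formed; the tridiagonal fill pattern and the function name are purely illustrative and not taken from the original post.

#include "petscmat.h"

/* Sequential AIJ matrix of size N with an exact per-row nonzero count. */
PetscErrorCode BuildTridiagonal(PetscInt N, Mat *L)
{
  PetscErrorCode ierr;
  PetscInt       i, *nnz;

  ierr = PetscMalloc(N*sizeof(PetscInt), &nnz);CHKERRQ(ierr);
  for (i = 0; i < N; i++) nnz[i] = (i == 0 || i == N-1) ? 2 : 3;

  ierr = MatCreate(PETSC_COMM_SELF, L);CHKERRQ(ierr);
  ierr = MatSetSizes(*L, N, N, N, N);CHKERRQ(ierr);
  ierr = MatSetType(*L, MATSEQAIJ);CHKERRQ(ierr);
  /* when nnz is supplied, the scalar count (PETSC_NULL/0 in the post above)
     is ignored; a corrupted nnz array is what makes the allocation blow up */
  ierr = MatSeqAIJSetPreallocation(*L, 0, nnz);CHKERRQ(ierr);
  ierr = PetscFree(nnz);CHKERRQ(ierr);
  return 0;
}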
If you really mean > to be solving a very large problem, you will have to get a 64-bit machine > and configure --with-64-bit-indices, or run in parallel. What was the complete error message? The above says '315MB in use'. So was the code trying to allocate 3GB - when it failed? Also How many total non-zeros in the matrix? Satish From xdliang at gmail.com Fri Apr 29 10:50:06 2011 From: xdliang at gmail.com (Xiangdong Liang) Date: Fri, 29 Apr 2011 11:50:06 -0400 Subject: [petsc-users] questions about parallel vectors Message-ID: Hello everyone, I am trying to create a sparse matrix A, which depends on a parallel vector v. For example, my function looks like this: Mat myfun(MPI_Comm comm, Vec v, other parameters). When I set the value A(i,j) = v[k], v[k] may not be obtained by VecGetValues since that operation can only get values on the same processor. I am thinking 1) create v as an array and pass this array into myfun. 2) create another vector v2, which is a full copy of parallel v through VecScatter. 3) when I first create the initial vec v, using VecCreate(PETSC_COMM_SELF,v) or VecCreateSeq. Does this guarantee that all the processors creating matrix A have all the components of vector v? I think 1) and 2) are going to work, but not sure about option 3). I have no idea which would have better performance. Can you give me some suggestions on how to handle this problem? Thanks. Another quick question, what is the difference between PetscViewerSetFormat and PetscViewerPushFormat? Best, Xiangdong From abhyshr at mcs.anl.gov Fri Apr 29 14:41:25 2011 From: abhyshr at mcs.anl.gov (Shri) Date: Fri, 29 Apr 2011 14:41:25 -0500 (CDT) Subject: [petsc-users] questions about parallel vectors In-Reply-To: Message-ID: <175206172.8268.1304106085650.JavaMail.root@zimbra.anl.gov> ----- Original Message ----- > Hello everyone, > > I am trying to create a sparse matrix A, which depends on a parallel > vector v. For example, my function looks like this: Mat > myfun(MPI_Comm comm, Vec v, other parameters). When I set the value > A(i,j) = v[k], v[k] may not be obtained by VecGetValues since that > operation can only get values on the same processor. I am thinking > > 1) create v as an array and pass this array into myfun. > 2) create another vector v2, which is a full copy of parallel v > through VecScatter. Do you need all the vector elements on each processor to set the matrix values or just a subset of them? ' If you need all the vector elements then you can use VecScattertoAll http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Vec/VecScatterCreateToAll.html If you only need a subset then you could create v as a ghosted vector. See the example http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/vec/vec/examples/tutorials/ex9.c.html > 3) when I first create the initial vec v, using > VecCreate(PETSC_COMM_SELF,v) or VecCreateSeq. Does this guarantee that > all the processors creating matrix A have all the components of vector > v? > > I think 1) and 2) are going to work, but not sure about option 3). I > have no idea which would have better performance. Can you give me some > suggestions on how to handle this problem? Thanks. > > Another quick question, what is the difference between > PetscViewerSetFormat and PetscViewerPushFormat? 
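Here is a minimal sketch of the VecScatterCreateToAll route Shri suggests above, for the case where every process really does need every entry of v while assembling A. The helper name is an assumption; when only a few off-process entries are needed, the ghosted-vector example (ex9.c) linked above is the better fit.

#include "petscvec.h"

/* Gather a full sequential copy of the parallel vector v on every process. */
static PetscErrorCode GatherVecToAll(Vec v, Vec *vall)
{
  PetscErrorCode ierr;
  VecScatter     ctx;

  ierr = VecScatterCreateToAll(v, &ctx, vall);CHKERRQ(ierr);
  ierr = VecScatterBegin(ctx, v, *vall, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(ctx, v, *vall, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterDestroy(ctx);CHKERRQ(ierr);   /* 3.1-era signature */
  return 0;
}

Because *vall is a sequential copy of the whole vector, VecGetArray(*vall,&a) lets any v[k] be read by its global index k while calling MatSetValues(); destroy *vall with VecDestroy() after assembly.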
> > Best, > Xiangdong From ilyascfd at gmail.com Sat Apr 30 06:17:43 2011 From: ilyascfd at gmail.com (ilyas ilyas) Date: Sat, 30 Apr 2011 14:17:43 +0300 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: Thank you Jed, I guess the thing that makes it complicated application of boundary conditions. Since XYZGHOSTED is not available in fortran, I am using XYZPERIODIC in order to implement bc's, as it is done in ex31.c in SNES Combining XYZPERIODIC with larger block size is still not much clear for me. By the way, what is the current status of DMDACreate3D and other DMDA routines and their fortran support ? According to ex11f90.F and ex22f.F in the manual page of DMDACreate3d , these routine(s) provide ghost cell support for fortran in 3D ? If ghost cell support is available, implementation would be relatively easy Cheers, Ilyas 2011/4/28 Jed Brown > On Thu, Apr 28, 2011 at 15:41, ilyas ilyas wrote: > >> Would you please explain more the first option? > > > You create a system with larger block size. If you have two node-centered > values and one cell-centered, you would use a block size of 3. Each center > would be associated with the node to its lower left, for example. There will > be a fringe of "cell-centers" that extend out of domain on the right and > top, you use x = 0 for these equations and the Jacobian for those equations > will just have a 1 on the diagonal. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at 59A2.org Sat Apr 30 10:45:06 2011 From: jed at 59A2.org (Jed Brown) Date: Sat, 30 Apr 2011 17:45:06 +0200 Subject: [petsc-users] working with different size of arrays in a single DA In-Reply-To: References: Message-ID: On Sat, Apr 30, 2011 at 13:17, ilyas ilyas wrote: > I guess the thing that makes it complicated application of boundary > conditions. > Since XYZGHOSTED is not available in fortran, > I am using XYZPERIODIC in order to implement bc's, as it is done in ex31.c > in SNES > Combining XYZPERIODIC with larger block size is still not much clear for > me. > > By the way, what is the current status of DMDACreate3D > and other DMDA routines and their fortran support ? > According to ex11f90.F and ex22f.F in the manual page of DMDACreate3d , > these routine(s) provide ghost cell support for fortran in 3D ? > If ghost cell support is available, implementation would be relatively easy > There is Fortran support for periodic boundaries. Also, petsc-dev has support for ghost cells even when they do not imply any wrapping, see DMDA_BOUNDARY_GHOST. http://www.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/manualpages/DM/DMDACreate3d.html -------------- next part -------------- An HTML attachment was scrubbed... URL: From stali at geology.wisc.edu Sat Apr 30 12:58:52 2011 From: stali at geology.wisc.edu (Tabrez Ali) Date: Sat, 30 Apr 2011 12:58:52 -0500 Subject: [petsc-users] Preallocation (Unstructured FE) Message-ID: <4DBC4DDC.1010604@geology.wisc.edu> Petsc Developers/Users I having some performance issues with preallocation in a fully unstructured FE code. It would be very helpful if those using FE codes can comment. For a problem of size 100K nodes and 600K tet elements (on 1 cpu) 1. If I calculate the _exact_ number of non-zeros per row (using a running list in Fortran) by looping over nodes & elements, the code takes 17 mins (to calculate nnz's/per row, assemble and solve). 2. 
If I don't use a running list and simply get the average of the max number of nodes a node might be connected to (again by looping over nodes & elements but not using a running list) then it takes 8 mins. 3. If I just magically guess the right value calculated in 2 and use that as the average nnz per row then it only takes 25 secs. Basically in all cases Assembly and Solve are very fast (a few seconds) but the nnz calculation itself (in 1 and 2) takes a long time. How can this be cut down? Is there a heuristic way to estimate the number (as done in 3), even if it slightly overestimates the nnz's per row, or are there efficient ways to do steps 1 or 2? Right now I have do i=1,num_nodes; do j=1,num_elements ... which obviously is slow for large numbers of nodes/elements. Thanks in advance Tabrez
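On the nnz question above: the quadratic cost comes from scanning all elements once per node. The usual fix is to invert the loops and visit each element exactly once, accumulating a neighbour list per node, so an exact per-row count costs a single pass over the mesh. Below is a hedged C sketch of that idea (the same structure carries over directly to Fortran); the 4-node tet connectivity layout, 0-based numbering, one scalar unknown per node and all names are assumptions for illustration only.

#include <stdlib.h>
#include <string.h>

/* nnz[i] = number of nonzeros in row i (unique mesh neighbours of node i,
   plus the diagonal), computed in one sweep over the elements. */
static void CountNonzerosPerRow(int nnode, int nelem, const int *elem, int *nnz)
{
  int e, a, b, i, *count, *start, *adj;

  /* pass 1: each tet contributes 3 candidate neighbours to each of its 4 nodes */
  count = (int*)calloc(nnode, sizeof(int));
  for (e = 0; e < nelem; e++)
    for (a = 0; a < 4; a++) count[elem[4*e+a]] += 3;

  /* pass 2: gather the candidate neighbours (duplicates allowed for now) */
  start = (int*)malloc((nnode+1)*sizeof(int));
  start[0] = 0;
  for (i = 0; i < nnode; i++) start[i+1] = start[i] + count[i];
  adj = (int*)malloc(start[nnode]*sizeof(int));
  memset(count, 0, nnode*sizeof(int));
  for (e = 0; e < nelem; e++)
    for (a = 0; a < 4; a++)
      for (b = 0; b < 4; b++)
        if (a != b) {
          int na = elem[4*e+a];
          adj[start[na] + count[na]++] = elem[4*e+b];
        }

  /* pass 3: sort each short per-node list and count the unique entries */
  for (i = 0; i < nnode; i++) {
    int lo = start[i], n = count[i], j, uniq = 0;
    for (j = lo+1; j < lo+n; j++) {            /* insertion sort */
      int v = adj[j], k = j-1;
      while (k >= lo && adj[k] > v) { adj[k+1] = adj[k]; k--; }
      adj[k+1] = v;
    }
    for (j = lo; j < lo+n; j++)
      if (j == lo || adj[j] != adj[j-1]) uniq++;
    nnz[i] = uniq + 1;                         /* + diagonal */
  }
  free(adj); free(start); free(count);
}

With one unknown per node the resulting nnz array can be passed straight to MatSeqAIJSetPreallocation(); since every element is touched only once, the count should take seconds rather than minutes even for the 100K-node / 600K-element mesh described.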