From erlend.pedersen at holberger.com  Fri Feb  1 05:54:24 2008
From: erlend.pedersen at holberger.com (Erlend Pedersen :.)
Date: Fri, 01 Feb 2008 12:54:24 +0100
Subject: Overdetermined, non-linear
Message-ID: <1201866864.6394.25.camel@erlend-ws.in.holberger.com>

I am attempting to use the PETSc nonlinear solver on an overdetermined
system of non-linear equations. Hence, the Jacobian is not square, and
so far we have unfortunately not succeeded with any combination of snes,
ksp and pc.

Could you confirm that snes actually works for overdetermined systems,
and if so, is there an application example we could look at in order to
make sure there is nothing wrong with our test-setup?

We have previously used the MINPACK routine LMDER very successfully, but
for our current problem sizes we rely on the use of sparse matrix
representations and parallel architectures. PETSc's abstractions and
automatic MPI makes this system very attractive for us, and we have
already used the PETSc LSQR solver with great success.

Thank you very much.


Regards,
Erlend Pedersen :.


From geenen at gmail.com  Sat Feb  2 03:32:37 2008
From: geenen at gmail.com (Thomas Geenen)
Date: Sat, 2 Feb 2008 10:32:37 +0100
Subject: assembly
Message-ID: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com>

Dear Petsc users,

I would like to understand what is slowing down the assembly phase of my matrix.
I create a matrix with MatCreateMPIAIJ i make a rough guess of the
number of off diagonal entries and then use a conservative value to
make sure I do not need extra mallocs. (the number of diagonal entries
is exact)
next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
The first time i call MatSetValues and MatAssemblyBegin,
MatAssemblyEnd it takes about 170 seconds
the second time 0.3 seconds.
I run it on 6 cpu's and I do fill quit a number of row-entries on the
"wrong" cpu. However thats also the case the second run. I checked
that there are no additional mallocs
MatGetInfo info.mallocs=0 both after MatSetValues and after
MatAssemblyBegin, MatAssemblyEnd.

cheers
Thomas


From jiaxun_hou at yahoo.com.cn  Sat Feb  2 06:49:03 2008
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Sat, 2 Feb 2008 20:49:03 +0800 (CST)
Subject: About unpreconditioned residuals in Left Preconditioned GMRES
Message-ID: <520833.35747.qm@web15802.mail.cnb.yahoo.com>

Hi everyone,

I want to use the Left Preconditioned GMRES to solve a linear system, and the stopping criterion  must  be  based on the  actual residuals (b-Ax). But the  GMRES codes of PETSc seems to use the preconditioned residuals (B^-1(b-Ax)) only. In addition,  when I set KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED), I receive the error message: "Currently can use GMRES with only preconditioned residual (right preconditioning not coded)". So, is there any way to set stopping criterion based on the actual residuals?

Best regards,
Jiaxun
 
       
---------------------------------
??????????????????? 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080202/1f931b62/attachment.htm>

From dave.mayhem23 at gmail.com  Sat Feb  2 07:29:33 2008
From: dave.mayhem23 at gmail.com (Dave May)
Date: Sun, 3 Feb 2008 00:29:33 +1100
Subject: About unpreconditioned residuals in Left Preconditioned GMRES
In-Reply-To: <520833.35747.qm@web15802.mail.cnb.yahoo.com>
References: <520833.35747.qm@web15802.mail.cnb.yahoo.com>
Message-ID: <956373f0802020529m5501b2b8t44549bebe9063e47@mail.gmail.com>

Hi,
    You can use the function

PetscErrorCode PETSCKSP_DLLEXPORT KSPSetConvergenceTest(KSP
ksp,PetscErrorCode
(*converge)(KSP,PetscInt,PetscReal,KSPConvergedReason*,void*),void
*cctx)

to define your own convergence test.

Cheers,
    Dave.


2008/2/2 jiaxun hou <jiaxun_hou at yahoo.com.cn>:

> Hi everyone,
>
> I want to use the Left Preconditioned GMRES to solve a linear system, and
> the stopping criterion  must  be  based on the  actual residuals (b-Ax). But
> the  GMRES codes of PETSc seems to use the preconditioned residuals
> (B^-1(b-Ax)) only. In addition,  when I set
> KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED), I receive the error message:
> "Currently can use GMRES with only preconditioned residual (right
> preconditioning not coded)". So, is there any way to set stopping criterion
> based on the actual residuals?
>
> Best regards,
> Jiaxun
>
> ------------------------------
> ???????????????????<http://cn.mail.yahoo.com/gc/index.html?entry=5&souce=mail_mailletter_tagline>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080203/0f526776/attachment.htm>

From jiaxun_hou at yahoo.com.cn  Sat Feb  2 10:09:02 2008
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Sun, 3 Feb 2008 00:09:02 +0800 (CST)
Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20About=20unpreconditioned=20residual?=
 =?gb2312?q?s=20in=20Left=20Preconditioned=20GMRES?=
In-Reply-To: <956373f0802020529m5501b2b8t44549bebe9063e47@mail.gmail.com>
Message-ID: <491152.9114.qm@web15815.mail.cnb.yahoo.com>

Thank you, Dave. But there is still a question: how can I get the residual vector in each iteration? It seems difficult to get it without modifying the GMRES codes.

Best regards,
Jiaxun
Dave May <dave.mayhem23 at gmail.com> ??? Hi,
    You can use the function

PetscErrorCode PETSCKSP_DLLEXPORT KSPSetConvergenceTest(KSP ksp,PetscErrorCode (*converge)(KSP,PetscInt,PetscReal,KSPConvergedReason*,void*),void *cctx)
to define your own convergence test.
 
Cheers,
    Dave.


2008/2/2 jiaxun hou <jiaxun_hou at yahoo.com.cn>:
 Hi everyone,

I want to use the Left Preconditioned GMRES to solve a linear system, and the stopping criterion  must  be  based on the  actual residuals (b-Ax). But the  GMRES codes of PETSc seems to use the preconditioned residuals (B^-1(b-Ax)) only. In addition,  when I set KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED), I receive the error message: "Currently can use GMRES with only preconditioned residual (right preconditioning not coded)". So, is there any way to set stopping criterion based on the actual residuals?
 
Best regards,
Jiaxun
           

---------------------------------
??????????????????? 

 
---------------------------------
??????????????????? 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080203/b6fa6a95/attachment.htm>

From hzhang at mcs.anl.gov  Sat Feb  2 11:33:51 2008
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Sat, 2 Feb 2008 11:33:51 -0600 (CST)
Subject: assembly
In-Reply-To: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0802021128260.521@terra.mcs.anl.gov>


On Sat, 2 Feb 2008, Thomas Geenen wrote:

> Dear Petsc users,
>
> I would like to understand what is slowing down the assembly phase of my matrix.
> I create a matrix with MatCreateMPIAIJ i make a rough guess of the
> number of off diagonal entries and then use a conservative value to
> make sure I do not need extra mallocs. (the number of diagonal entries
> is exact)
> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
> The first time i call MatSetValues and MatAssemblyBegin,
> MatAssemblyEnd it takes about 170 seconds
> the second time 0.3 seconds.
> I run it on 6 cpu's and I do fill quit a number of row-entries on the
> "wrong" cpu. However thats also the case the second run. I checked
> that there are no additional mallocs
> MatGetInfo info.mallocs=0 both after MatSetValues and after
> MatAssemblyBegin, MatAssemblyEnd.

Run your code with the option '-log_summary' and check which function
call dominates the execution time.

> I run it on 6 cpu's and I do fill quit a number of row-entries on the
> "wrong" cpu.

Likely, the communication that sending the entries to the
corrected cpu consume the time. Can you fill the entries in the
correct cpu?

Hong
>
> cheers
> Thomas
>
>


From geenen at gmail.com  Sat Feb  2 12:30:49 2008
From: geenen at gmail.com (Thomas Geenen)
Date: Sat, 2 Feb 2008 19:30:49 +0100
Subject: assembly
In-Reply-To: <Pine.LNX.4.64.0802021128260.521@terra.mcs.anl.gov>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <Pine.LNX.4.64.0802021128260.521@terra.mcs.anl.gov>
Message-ID: <200802021930.49084.geenen@gmail.com>

On Saturday 02 February 2008 18:33, Hong Zhang wrote:
> On Sat, 2 Feb 2008, Thomas Geenen wrote:
> > Dear Petsc users,
> >
> > I would like to understand what is slowing down the assembly phase of my
> > matrix. I create a matrix with MatCreateMPIAIJ i make a rough guess of
> > the number of off diagonal entries and then use a conservative value to
> > make sure I do not need extra mallocs. (the number of diagonal entries is
> > exact)
> > next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
> > The first time i call MatSetValues and MatAssemblyBegin,
> > MatAssemblyEnd it takes about 170 seconds
> > the second time 0.3 seconds.
> > I run it on 6 cpu's and I do fill quit a number of row-entries on the
> > "wrong" cpu. However thats also the case the second run. I checked
> > that there are no additional mallocs
> > MatGetInfo info.mallocs=0 both after MatSetValues and after
> > MatAssemblyBegin, MatAssemblyEnd.
>
> Run your code with the option '-log_summary' and check which function
> call dominates the execution time.

the time is spend in MatStashScatterGetMesg_Private

>
> > I run it on 6 cpu's and I do fill quit a number of row-entries on the
> > "wrong" cpu.
>
> Likely, the communication that sending the entries to the
> corrected cpu consume the time. Can you fill the entries in the
> correct cpu?

the second time the entries are filled on the wrong CPU as well.
i am curious about the difference in time between run 1 and 2.

>
> Hong
>
> > cheers
> > Thomas


From bsmith at mcs.anl.gov  Sat Feb  2 16:10:44 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 2 Feb 2008 16:10:44 -0600
Subject: =?GB2312?Q?Re:_=BB=D8=B8=B4=A3=BA_Re:_About_unpreconditioned_res?=
 =?GB2312?Q?iduals_in_Left_Preconditioned_GMRES?=
In-Reply-To: <491152.9114.qm@web15815.mail.cnb.yahoo.com>
References: <491152.9114.qm@web15815.mail.cnb.yahoo.com>
Message-ID: <2BEBE7C4-DF7E-4B6D-9F47-510DFAC86153@mcs.anl.gov>


    To calculate the true residual norm at each iteration of left  
preconditioned GMRES requires
actually forming b - A*x which means computing A*x which means  
computing x (which is not available
without additional calculations at each iteration). This is why we do  
not support left preconditioning with
true residual norm convergence test.

    You should use the KSP type of FGMRES, it is written using right  
preconditioning and for a standard PC
is identical to regular GMRES.

    Barry

On Feb 2, 2008, at 10:09 AM, jiaxun hou wrote:

> Thank you, Dave. But there is still a question: how can I get the  
> residual vector in each iteration? It seems difficult to get it  
> without modifying the GMRES codes.
>
> Best regards,
> Jiaxun
> Dave May <dave.mayhem23 at gmail.com> ???
> Hi,
>     You can use the function
> PetscErrorCode PETSCKSP_DLLEXPORT KSPSetConvergenceTest(KSP  
> ksp,PetscErrorCode (*converge) 
> (KSP,PetscInt,PetscReal,KSPConvergedReason*,void*),void *cctx)
> to define your own convergence test.
>
> Cheers,
>     Dave.
>
>
>
>
>
> 2008/2/2 jiaxun hou <jiaxun_hou at yahoo.com.cn>:
> Hi everyone,
>
> I want to use the Left Preconditioned GMRES to solve a linear  
> system, and the stopping criterion  must  be  based on the  actual  
> residuals (b-Ax). But the  GMRES codes of PETSc seems to use the  
> preconditioned residuals (B^-1(b-Ax)) only. In addition,  when I set  
> KSPSetNormType(ksp,KSP_NORM_UNPRECONDITIONED), I receive the error  
> message: "Currently can use GMRES with only preconditioned residual  
> (right preconditioning not coded)". So, is there any way to set  
> stopping criterion based on the actual residuals?
>
> Best regards,
> Jiaxun
> ???????????????????
>
>
>
> ???????????????????

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080202/063c9530/attachment.htm>

From bsmith at mcs.anl.gov  Sat Feb  2 16:19:37 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 2 Feb 2008 16:19:37 -0600
Subject: assembly
In-Reply-To: <200802021930.49084.geenen@gmail.com>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <Pine.LNX.4.64.0802021128260.521@terra.mcs.anl.gov> <200802021930.49084.geenen@gmail.com>
Message-ID: <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov>


    The matstash has a concept of preallocation also. During the first  
setvalues
it is allocating more and more memory for the stash. In the second  
setvalues
the stash is large enough so does not require any addition allocation.

    You can use the option -matstash_initial_size <size> to allocate  
enough space
initially so that the first setvalues is also fast. It does not look  
like there is a way
coded to get the <size> that you should use. It should be set to the  
maximum nonzeros
any process has that belongs to other processes. The stash handling  
code is
in src/mat/utils/matstash.c, perhaps you can figure out how to  
printout with PetscInfo()
the sizes needed?


    Barry


On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote:

> On Saturday 02 February 2008 18:33, Hong Zhang wrote:
>> On Sat, 2 Feb 2008, Thomas Geenen wrote:
>>> Dear Petsc users,
>>>
>>> I would like to understand what is slowing down the assembly phase  
>>> of my
>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough  
>>> guess of
>>> the number of off diagonal entries and then use a conservative  
>>> value to
>>> make sure I do not need extra mallocs. (the number of diagonal  
>>> entries is
>>> exact)
>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
>>> The first time i call MatSetValues and MatAssemblyBegin,
>>> MatAssemblyEnd it takes about 170 seconds
>>> the second time 0.3 seconds.
>>> I run it on 6 cpu's and I do fill quit a number of row-entries on  
>>> the
>>> "wrong" cpu. However thats also the case the second run. I checked
>>> that there are no additional mallocs
>>> MatGetInfo info.mallocs=0 both after MatSetValues and after
>>> MatAssemblyBegin, MatAssemblyEnd.
>>
>> Run your code with the option '-log_summary' and check which function
>> call dominates the execution time.
>
> the time is spend in MatStashScatterGetMesg_Private
>
>>
>>> I run it on 6 cpu's and I do fill quit a number of row-entries on  
>>> the
>>> "wrong" cpu.
>>
>> Likely, the communication that sending the entries to the
>> corrected cpu consume the time. Can you fill the entries in the
>> correct cpu?
>
> the second time the entries are filled on the wrong CPU as well.
> i am curious about the difference in time between run 1 and 2.
>
>>
>> Hong
>>
>>> cheers
>>> Thomas
>


From geenen at gmail.com  Sun Feb  3 06:44:56 2008
From: geenen at gmail.com (Thomas Geenen)
Date: Sun, 3 Feb 2008 13:44:56 +0100
Subject: assembly
In-Reply-To: <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov>
Message-ID: <200802031344.56290.geenen@gmail.com>

i call 
ierr = MatStashSetInitialSize(A[*seqsolve],stash_size, 
stash_size);CHKERRQ(ierr);
with 100 000 000 for the stash size to make sure that's not the bottleneck

the assemble time remains unchanged however.

nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485 
reallocs in MatAssemblyBegin_MPIAIJ =  0

cheers
Thomas

On Saturday 02 February 2008 23:19, Barry Smith wrote:
>     The matstash has a concept of preallocation also. During the first
> setvalues
> it is allocating more and more memory for the stash. In the second
> setvalues
> the stash is large enough so does not require any addition allocation.
>
>     You can use the option -matstash_initial_size <size> to allocate
> enough space
> initially so that the first setvalues is also fast. It does not look
> like there is a way
> coded to get the <size> that you should use. It should be set to the
> maximum nonzeros
> any process has that belongs to other processes. The stash handling
> code is
> in src/mat/utils/matstash.c, perhaps you can figure out how to
> printout with PetscInfo()
> the sizes needed?
>
>
>     Barry
>
> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote:
> > On Saturday 02 February 2008 18:33, Hong Zhang wrote:
> >> On Sat, 2 Feb 2008, Thomas Geenen wrote:
> >>> Dear Petsc users,
> >>>
> >>> I would like to understand what is slowing down the assembly phase
> >>> of my
> >>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough
> >>> guess of
> >>> the number of off diagonal entries and then use a conservative
> >>> value to
> >>> make sure I do not need extra mallocs. (the number of diagonal
> >>> entries is
> >>> exact)
> >>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
> >>> The first time i call MatSetValues and MatAssemblyBegin,
> >>> MatAssemblyEnd it takes about 170 seconds
> >>> the second time 0.3 seconds.
> >>> I run it on 6 cpu's and I do fill quit a number of row-entries on
> >>> the
> >>> "wrong" cpu. However thats also the case the second run. I checked
> >>> that there are no additional mallocs
> >>> MatGetInfo info.mallocs=0 both after MatSetValues and after
> >>> MatAssemblyBegin, MatAssemblyEnd.
> >>
> >> Run your code with the option '-log_summary' and check which function
> >> call dominates the execution time.
> >
> > the time is spend in MatStashScatterGetMesg_Private
> >
> >>> I run it on 6 cpu's and I do fill quit a number of row-entries on
> >>> the
> >>> "wrong" cpu.
> >>
> >> Likely, the communication that sending the entries to the
> >> corrected cpu consume the time. Can you fill the entries in the
> >> correct cpu?
> >
> > the second time the entries are filled on the wrong CPU as well.
> > i am curious about the difference in time between run 1 and 2.
> >
> >> Hong
> >>
> >>> cheers
> >>> Thomas


From bsmith at mcs.anl.gov  Sun Feb  3 13:51:51 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sun, 3 Feb 2008 13:51:51 -0600
Subject: assembly
In-Reply-To: <200802031344.56290.geenen@gmail.com>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> <200802031344.56290.geenen@gmail.com>
Message-ID: <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov>


    Hmmm, are you saying the first round of setting values still
takes much longer then the second round? Or is it the time
in MatAssemblyBegin() much longer the first time?

   The MatAssembly process has one piece of code that's
work is order n*size; where n is the stash size and size is the
number of processes, all other work is only order n.

    Could you send the -log_summary output?

    Barry


    The a
On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote:

> i call
> ierr = MatStashSetInitialSize(A[*seqsolve],stash_size,
> stash_size);CHKERRQ(ierr);
> with 100 000 000 for the stash size to make sure that's not the  
> bottleneck
>
> the assemble time remains unchanged however.
>
> nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485
> reallocs in MatAssemblyBegin_MPIAIJ =  0
>
> cheers
> Thomas
>
> On Saturday 02 February 2008 23:19, Barry Smith wrote:
>>    The matstash has a concept of preallocation also. During the first
>> setvalues
>> it is allocating more and more memory for the stash. In the second
>> setvalues
>> the stash is large enough so does not require any addition  
>> allocation.
>>
>>    You can use the option -matstash_initial_size <size> to allocate
>> enough space
>> initially so that the first setvalues is also fast. It does not look
>> like there is a way
>> coded to get the <size> that you should use. It should be set to the
>> maximum nonzeros
>> any process has that belongs to other processes. The stash handling
>> code is
>> in src/mat/utils/matstash.c, perhaps you can figure out how to
>> printout with PetscInfo()
>> the sizes needed?
>>
>>
>>    Barry
>>
>> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote:
>>> On Saturday 02 February 2008 18:33, Hong Zhang wrote:
>>>> On Sat, 2 Feb 2008, Thomas Geenen wrote:
>>>>> Dear Petsc users,
>>>>>
>>>>> I would like to understand what is slowing down the assembly phase
>>>>> of my
>>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough
>>>>> guess of
>>>>> the number of off diagonal entries and then use a conservative
>>>>> value to
>>>>> make sure I do not need extra mallocs. (the number of diagonal
>>>>> entries is
>>>>> exact)
>>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
>>>>> The first time i call MatSetValues and MatAssemblyBegin,
>>>>> MatAssemblyEnd it takes about 170 seconds
>>>>> the second time 0.3 seconds.
>>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on
>>>>> the
>>>>> "wrong" cpu. However thats also the case the second run. I checked
>>>>> that there are no additional mallocs
>>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after
>>>>> MatAssemblyBegin, MatAssemblyEnd.
>>>>
>>>> Run your code with the option '-log_summary' and check which  
>>>> function
>>>> call dominates the execution time.
>>>
>>> the time is spend in MatStashScatterGetMesg_Private
>>>
>>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on
>>>>> the
>>>>> "wrong" cpu.
>>>>
>>>> Likely, the communication that sending the entries to the
>>>> corrected cpu consume the time. Can you fill the entries in the
>>>> correct cpu?
>>>
>>> the second time the entries are filled on the wrong CPU as well.
>>> i am curious about the difference in time between run 1 and 2.
>>>
>>>> Hong
>>>>
>>>>> cheers
>>>>> Thomas
>


From grs2103 at columbia.edu  Sun Feb  3 16:29:43 2008
From: grs2103 at columbia.edu (Gideon Simpson)
Date: Sun, 3 Feb 2008 17:29:43 -0500
Subject: intel mkl on os x
Message-ID: <9BC9DBA4-C8FC-4C53-8D40-748DEF0AF709@columbia.edu>

If I wished to use the intel MKL instead of Apple's vecLib framework  
for my BLAS/LAPACK, what would be the appropriate flags to give petsc  
when it's configuring?

-gideon


From bsmith at mcs.anl.gov  Sun Feb  3 18:13:13 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sun, 3 Feb 2008 18:13:13 -0600
Subject: intel mkl on os x
In-Reply-To: <9BC9DBA4-C8FC-4C53-8D40-748DEF0AF709@columbia.edu>
References: <9BC9DBA4-C8FC-4C53-8D40-748DEF0AF709@columbia.edu>
Message-ID: <E2ED16AF-A867-4C75-8949-26AA00DC5071@mcs.anl.gov>


    Locate the library libmkl_lapack.a then use --with-blas-lapack- 
dir=/the path to the libmkl_lapack.a

    Good luck,

    Barry

On Feb 3, 2008, at 4:29 PM, Gideon Simpson wrote:

> If I wished to use the intel MKL instead of Apple's vecLib framework  
> for my BLAS/LAPACK, what would be the appropriate flags to give  
> petsc when it's configuring?
>
> -gideon
>


From knepley at gmail.com  Sun Feb  3 19:59:00 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Sun, 3 Feb 2008 19:59:00 -0600
Subject: Overdetermined, non-linear
In-Reply-To: <1201866864.6394.25.camel@erlend-ws.in.holberger.com>
References: <1201866864.6394.25.camel@erlend-ws.in.holberger.com>
Message-ID: <a9f269830802031759l1edba6e2rba838012b99a445d@mail.gmail.com>

On Feb 1, 2008 5:54 AM, Erlend Pedersen :.
<erlend.pedersen at holberger.com> wrote:
> I am attempting to use the PETSc nonlinear solver on an overdetermined
> system of non-linear equations. Hence, the Jacobian is not square, and
> so far we have unfortunately not succeeded with any combination of snes,
> ksp and pc.
>
> Could you confirm that snes actually works for overdetermined systems,
> and if so, is there an application example we could look at in order to
> make sure there is nothing wrong with our test-setup?
>
> We have previously used the MINPACK routine LMDER very successfully, but
> for our current problem sizes we rely on the use of sparse matrix
> representations and parallel architectures. PETSc's abstractions and
> automatic MPI makes this system very attractive for us, and we have
> already used the PETSc LSQR solver with great success.

So in the sense that SNES is really just an iteration with an embedded solve,
yes it can solve non-square nonlinear systems. However, the user has to
understand what is meant by the Function and Jacobian evaluation methods.
I suggest implementing the simplest algorithm for non-square systems:

http://en.wikipedia.org/wiki/Gauss-Newton_algorithm

By implement, I mean your Function and Jacobian methods should return the
correct terms. I believe the reason you have not seen convergence is that
the result of the solve does not "mean" the correct thing for the iteration
in your current setup.

   Matt

> Thank you very much.
>
>
> Regards,
> Erlend Pedersen :.
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From recrusader at gmail.com  Mon Feb  4 00:37:23 2008
From: recrusader at gmail.com (Yujie)
Date: Mon, 4 Feb 2008 14:37:23 +0800
Subject: how to inverse a sparse matrix in Petsc?
Message-ID: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com>

Hi,
Now, I want to inverse a sparse matrix. I have browsed the manual, however,
I can't find some information. could you give me some advice?

thanks a lot.

Regards,
Yujie
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080204/53601c73/attachment.htm>

From knepley at gmail.com  Mon Feb  4 00:46:29 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 4 Feb 2008 00:46:29 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com>
References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com>
Message-ID: <a9f269830802032246s3614cd20n2837dd160b22c685@mail.gmail.com>

On Feb 4, 2008 12:37 AM, Yujie <recrusader at gmail.com> wrote:
> Hi,
> Now, I want to inverse a sparse matrix. I have browsed the manual, however,
> I can't find some information. could you give me some advice?

This is generally a bad idea since the inverse is dense. However, you can
use sparse direct factorization if you configure with 3rd party packages like
MUMPS, SuperLU, DSCPACK, or Spooles.

   Matt

> thanks a lot.
>
> Regards,
> Yujie
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From bsmith at mcs.anl.gov  Mon Feb  4 07:10:15 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 4 Feb 2008 07:10:15 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com>
References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com>
Message-ID: <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov>


    For sequential AIJ matrices you can fill the B matrix with the  
identity and then use
MatMatSolve().

    Note since the inverse of a sparse matrix is dense the B matrix is  
a SeqDense matrix.

    Barry

On Feb 4, 2008, at 12:37 AM, Yujie wrote:

> Hi,
> Now, I want to inverse a sparse matrix. I have browsed the manual,  
> however, I can't find some information. could you give me some advice?
>
> thanks a lot.
>
> Regards,
> Yujie
>


From li76pan at yahoo.com  Mon Feb  4 07:49:36 2008
From: li76pan at yahoo.com (li pan)
Date: Mon, 4 Feb 2008 05:49:36 -0800 (PST)
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov>
Message-ID: <351384.55725.qm@web36802.mail.mud.yahoo.com>

hi,
Does MatMatSolve() use Gauss elimination method?

thanx

pan


--- Barry Smith <bsmith at mcs.anl.gov> wrote:

> 
>     For sequential AIJ matrices you can fill the B
> matrix with the  
> identity and then use
> MatMatSolve().
> 
>     Note since the inverse of a sparse matrix is
> dense the B matrix is  
> a SeqDense matrix.
> 
>     Barry
> 
> On Feb 4, 2008, at 12:37 AM, Yujie wrote:
> 
> > Hi,
> > Now, I want to inverse a sparse matrix. I have
> browsed the manual,  
> > however, I can't find some information. could you
> give me some advice?
> >
> > thanks a lot.
> >
> > Regards,
> > Yujie
> >
> 
> 


      ____________________________________________________________________________________
Never miss a thing.  Make Yahoo your home page. 
http://www.yahoo.com/r/hs


From bsmith at mcs.anl.gov  Mon Feb  4 07:58:27 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 4 Feb 2008 07:58:27 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <351384.55725.qm@web36802.mail.mud.yahoo.com>
References: <351384.55725.qm@web36802.mail.mud.yahoo.com>
Message-ID: <D375794D-E5CB-47EA-B3B5-8009A6A34043@mcs.anl.gov>


    Yes. It uses the LU factorization of the matrix computed with  
MatLUFactor().

    Barry

On Feb 4, 2008, at 7:49 AM, li pan wrote:

> hi,
> Does MatMatSolve() use Gauss elimination method?
>
> thanx
>
> pan
>
>
> --- Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>>
>>    For sequential AIJ matrices you can fill the B
>> matrix with the
>> identity and then use
>> MatMatSolve().
>>
>>    Note since the inverse of a sparse matrix is
>> dense the B matrix is
>> a SeqDense matrix.
>>
>>    Barry
>>
>> On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>
>>> Hi,
>>> Now, I want to inverse a sparse matrix. I have
>> browsed the manual,
>>> however, I can't find some information. could you
>> give me some advice?
>>>
>>> thanks a lot.
>>>
>>> Regards,
>>> Yujie
>>>
>>
>>
>
>
>
>       
> ____________________________________________________________________________________
> Never miss a thing.  Make Yahoo your home page.
> http://www.yahoo.com/r/hs
>


From dave.mayhem23 at gmail.com  Mon Feb  4 08:04:17 2008
From: dave.mayhem23 at gmail.com (Dave May)
Date: Tue, 5 Feb 2008 01:04:17 +1100
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov>
References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com>
	 <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov>
Message-ID: <956373f0802040604u7cfaa7e1t682e5b36f1791a4@mail.gmail.com>

Hi,
    Does anyone know how much faster (approximately) using MatMatSolve is
compared
to using PCComputeExplicitOperator(), when the PC in the latter function is
defined to be LU?

Cheers,
    Dave.


On Feb 5, 2008 12:10 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:

>
>    For sequential AIJ matrices you can fill the B matrix with the
> identity and then use
> MatMatSolve().
>
>    Note since the inverse of a sparse matrix is dense the B matrix is
> a SeqDense matrix.
>
>    Barry
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080205/44383e32/attachment.htm>

From bsmith at mcs.anl.gov  Mon Feb  4 08:06:58 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 4 Feb 2008 08:06:58 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <956373f0802040604u7cfaa7e1t682e5b36f1791a4@mail.gmail.com>
References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> <956373f0802040604u7cfaa7e1t682e5b36f1791a4@mail.gmail.com>
Message-ID: <74F47236-A9CB-4FB9-83F3-71F62DF07868@mcs.anl.gov>


   They should be pretty much the same. In both cases the huge bulk of  
the time
is spent in the triangular solves.

    Barry

On Feb 4, 2008, at 8:04 AM, Dave May wrote:

> Hi,
>     Does anyone know how much faster (approximately) using  
> MatMatSolve is compared
> to using PCComputeExplicitOperator(), when the PC in the latter  
> function is defined to be LU?
>
> Cheers,
>     Dave.
>
>
> On Feb 5, 2008 12:10 AM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>    For sequential AIJ matrices you can fill the B matrix with the
> identity and then use
> MatMatSolve().
>
>    Note since the inverse of a sparse matrix is dense the B matrix is
> a SeqDense matrix.
>
>    Barry
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080204/1bfa019d/attachment.htm>

From geenen at gmail.com  Mon Feb  4 10:41:11 2008
From: geenen at gmail.com (Thomas Geenen)
Date: Mon, 4 Feb 2008 17:41:11 +0100
Subject: assembly
In-Reply-To: <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com>
	 <200802021930.49084.geenen@gmail.com>
	 <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov>
	 <200802031344.56290.geenen@gmail.com>
	 <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov>
Message-ID: <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com>

On Feb 3, 2008 8:51 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>     Hmmm, are you saying the first round of setting values still
> takes much longer then the second round?

yes

>Or is it the time
> in MatAssemblyBegin() much longer the first time?
>
>    The MatAssembly process has one piece of code that's
> work is order n*size; where n is the stash size and size is the
> number of processes, all other work is only order n.
>
>     Could you send the -log_summary output?

the timing is cumulative i guess?
in between these two solves i solve a smaller system for which i do
not include the timing.

run 1
                         Max       Max/Min        Avg      Total
Time (sec):           2.154e+02      1.00001   2.154e+02
Objects:              2.200e+01      1.00000   2.200e+01
Flops:                0.000e+00      0.00000   0.000e+00  0.000e+00
Flops/sec:            0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Messages:         1.750e+01      1.25000   1.633e+01  9.800e+01
MPI Message Lengths:  3.460e+06      1.29903   1.855e+05  1.818e+07
MPI Reductions:       4.167e+00      1.00000

Flop counting convention: 1 flop = 1 real number operation of type
(multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length
N --> 2N flops
                            and VecAXPY() for complex vectors of
length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  ---
Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts
%Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.1537e+02 100.0%  0.0000e+00   0.0%  9.800e+01
100.0%  1.855e+05      100.0%  2.500e+01 100.0%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on
interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops/sec: Max - maximum over all processors
                       Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush()
and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message
lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################


Event                Count      Time (sec)     Flops/sec
          --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatAssemblyBegin       1 1.0 2.9536e-01 4.0 0.00e+00 0.0 4.2e+01
4.2e+05 2.0e+00  0  0 43 98  8   0  0 43 98  8     0
MatAssemblyEnd         1 1.0 2.1410e+02 1.0 0.00e+00 0.0 2.8e+01
8.2e+03 7.0e+00 99  0 29  1 28  99  0 29  1 28     0
MatZeroEntries         1 1.0 9.3739e-02 5.9 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                 4 1.0 3.9721e-04 2.2 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

              Matrix     3              0          0     0
           Index Set     6              6      45500     0
                 Vec     6              1     196776     0
         Vec Scatter     3              0          0     0
   IS L to G Mapping     2              0          0     0
       Krylov Solver     1              0          0     0
      Preconditioner     1              0          0     0
========================================================================================================================
Average time to get PetscTime(): 1.71661e-06
Average time for MPI_Barrier(): 0.000159979
Average time for zero size MPI_Send(): 1.29938e-05
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
sizeof(PetscScalar) 8
Configure run at: Fri Sep 28 23:34:20 2007

run2
                         Max       Max/Min        Avg      Total
Time (sec):           2.298e+02      1.00000   2.298e+02
Objects:              2.600e+02      1.00000   2.600e+02
Flops:                1.265e+09      1.17394   1.161e+09  6.969e+09
Flops/sec:            5.505e+06      1.17394   5.054e+06  3.032e+07
MPI Messages:         1.436e+03      1.20816   1.326e+03  7.956e+03
MPI Message Lengths:  2.120e+07      1.23141   1.457e+04  1.159e+08
MPI Reductions:       4.192e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type
(multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length
N --> 2N flops
                            and VecAXPY() for complex vectors of
length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  ---
Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts
%Total     Avg         %Total   counts   %Total
 0:      Main Stage: 2.2943e+02  99.8%  6.9689e+09 100.0%  7.944e+03
99.8%  1.457e+04      100.0%  2.230e+02   8.9%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on
interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops/sec: Max - maximum over all processors
                       Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush()
and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message
lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################


Event                Count      Time (sec)     Flops/sec
          --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult              135 1.0 2.0830e+00 1.4 2.30e+08 1.9 2.4e+03
1.3e+04 0.0e+00  1 26 30 26  0   1 26 30 26  0   862
MatMultAdd            40 1.0 3.2598e-01 4.5 2.68e+07 5.7 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    31
MatSolve              44 1.0 6.7841e-01 1.7 1.93e+08 1.7 0.0e+00
0.0e+00 0.0e+00  0  7  0  0  0   0  7  0  0  0   672
MatRelax              80 1.0 4.2949e+00 1.6 8.77e+07 1.2 2.2e+03
1.3e+04 0.0e+00  1 23 28 26  0   1 23 28 26  0   374
MatLUFactorSym         1 1.0 7.6739e-02 1.1 0.00e+00 0.0 0.0e+00
0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLUFactorNum         2 1.0 2.9370e+00 1.5 2.53e+08 1.5 0.0e+00
0.0e+00 0.0e+00  1 44  0  0  0   1 44  0  0  0  1037
MatILUFactorSym        1 1.0 6.7334e-01 1.0 0.00e+00 0.0 0.0e+00
0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       7 1.0 7.1652e-01 4.4 0.00e+00 0.0 1.3e+02
2.9e+05 8.0e+00  0  0  2 32  0   0  0  2 32  4     0
MatAssemblyEnd         7 1.0 2.1473e+02 1.0 0.00e+00 0.0 8.4e+01
3.2e+03 2.2e+01 93  0  1  0  1  94  0  1  0 10     0
MatGetRowIJ            2 1.0 1.8899e-03 1.1 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice       1 1.0 3.1915e-02 1.0 0.00e+00 0.0 0.0e+00
0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatGetOrdering         2 1.0 1.2184e-02 1.6 0.00e+00 0.0 0.0e+00
0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
MatIncreaseOvrlp       1 1.0 7.6865e-02 1.0 0.00e+00 0.0 1.1e+03
2.4e+03 2.0e+01  0  0 13  2  1   0  0 13  2  9     0
MatZeroEntries         3 1.0 1.0429e-01 4.7 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MAT_GetRedundantMatrix       1 1.0 3.0144e-02 1.2 0.00e+00 0.0 9.0e+01
8.0e+04 2.0e+00  0  0  1  6  0   0  0  1  6  1     0
VecDot                39 1.0 9.0680e-01140.0 2.95e+08191.3 0.0e+00
0.0e+00 3.9e+01  0  0  0  0  2   0  0  0  0 17    11
VecMDot                8 1.0 2.8777e-03 2.8 4.16e+07 3.3 0.0e+00
0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  4    84
VecNorm               31 1.0 6.2301e-02 4.2 7.17e+07 5.8 0.0e+00
0.0e+00 3.1e+01  0  0  0  0  1   0  0  0  0 14    86
VecScale              85 1.0 1.7729e-03 1.4 4.94e+08 1.4 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2138
VecCopy                4 1.0 6.2108e-04 1.5 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               139 1.0 3.5934e-03 1.4 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               98 1.0 5.1496e-03 1.3 9.24e+08 1.1 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4705
VecAYPX               40 1.0 1.0311e-02 1.4 1.90e+08 1.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   969
VecWAXPY              75 1.0 2.6060e-02 1.4 1.22e+08 1.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   618
VecMAXPY               9 1.0 2.1315e-04 1.6 3.01e+08 1.2 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1355
VecAssemblyBegin       4 1.0 1.9898e-03 1.2 0.00e+00 0.0 0.0e+00
0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  5     0
VecAssemblyEnd         4 1.0 2.1219e-05 1.2 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin      267 1.0 1.0224e-01 1.6 0.00e+00 0.0 6.3e+03
1.1e+04 0.0e+00  0  0 79 59  0   0  0 79 59  0     0
VecScatterEnd        267 1.0 7.8653e-0111.3 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPGMRESOrthog         4 1.0 3.7677e-03 1.9 4.78e+07 2.4 0.0e+00
0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  5   141
KSPSetup               6 1.0 1.0260e-02 1.0 0.00e+00 0.0 0.0e+00
0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
KSPSolve               2 1.0 7.8238e+00 1.0 1.16e+08 1.2 5.9e+03
1.1e+04 8.1e+01  3 70 75 54  3   3 70 75 54 36   622
PCSetUp                3 1.0 5.1323e+00 1.2 1.29e+08 1.2 1.6e+03
9.1e+03 7.3e+01  2 47 20 13  3   2 47 20 13 33   636
PCSetUpOnBlocks        1 1.0 1.3325e+00 1.0 1.47e+08 1.0 0.0e+00
0.0e+00 3.0e+00  1 17  0  0  0   1 17  0  0  1   871
PCApply               44 1.0 5.7917e+00 1.1 9.60e+07 1.2 4.7e+03
1.0e+04 0.0e+00  2 41 59 42  0   2 41 59 42  0   497
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

              Matrix    16              0          0     0
           Index Set    36             29     256760     0
                 Vec   176             93   16582464     0
         Vec Scatter    10              0          0     0
   IS L to G Mapping     4              0          0     0
       Krylov Solver     6              0          0     0
      Preconditioner     6              0          0     0
              Viewer     4              2          0     0
           Container     2              0          0     0
========================================================================================================================
Average time to get PetscTime(): 8.10623e-07
Average time for MPI_Barrier(): 0.000178194
Average time for zero size MPI_Send(): 1.33117e-05
OptionTable: -mg_levels_ksp_type richardson
OptionTable: -mg_levels_pc_sor_omega 1.05
OptionTable: -mg_levels_pc_type sor
OptionTable: -pc_ml_PrintLevel 4
OptionTable: -pc_ml_maxNlevels 2
OptionTable: -pc_type ml
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
sizeof(PetscScalar) 8
Configure run at: Fri Sep 28 23:34:20 2007


>
>     Barry
>
>
>     The a
>
> On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote:
>
> > i call
> > ierr = MatStashSetInitialSize(A[*seqsolve],stash_size,
> > stash_size);CHKERRQ(ierr);
> > with 100 000 000 for the stash size to make sure that's not the
> > bottleneck
> >
> > the assemble time remains unchanged however.
> >
> > nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485
> > reallocs in MatAssemblyBegin_MPIAIJ =  0
> >
> > cheers
> > Thomas
> >
> > On Saturday 02 February 2008 23:19, Barry Smith wrote:
> >>    The matstash has a concept of preallocation also. During the first
> >> setvalues
> >> it is allocating more and more memory for the stash. In the second
> >> setvalues
> >> the stash is large enough so does not require any addition
> >> allocation.
> >>
> >>    You can use the option -matstash_initial_size <size> to allocate
> >> enough space
> >> initially so that the first setvalues is also fast. It does not look
> >> like there is a way
> >> coded to get the <size> that you should use. It should be set to the
> >> maximum nonzeros
> >> any process has that belongs to other processes. The stash handling
> >> code is
> >> in src/mat/utils/matstash.c, perhaps you can figure out how to
> >> printout with PetscInfo()
> >> the sizes needed?
> >>
> >>
> >>    Barry
> >>
> >> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote:
> >>> On Saturday 02 February 2008 18:33, Hong Zhang wrote:
> >>>> On Sat, 2 Feb 2008, Thomas Geenen wrote:
> >>>>> Dear Petsc users,
> >>>>>
> >>>>> I would like to understand what is slowing down the assembly phase
> >>>>> of my
> >>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough
> >>>>> guess of
> >>>>> the number of off diagonal entries and then use a conservative
> >>>>> value to
> >>>>> make sure I do not need extra mallocs. (the number of diagonal
> >>>>> entries is
> >>>>> exact)
> >>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
> >>>>> The first time i call MatSetValues and MatAssemblyBegin,
> >>>>> MatAssemblyEnd it takes about 170 seconds
> >>>>> the second time 0.3 seconds.
> >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on
> >>>>> the
> >>>>> "wrong" cpu. However thats also the case the second run. I checked
> >>>>> that there are no additional mallocs
> >>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after
> >>>>> MatAssemblyBegin, MatAssemblyEnd.
> >>>>
> >>>> Run your code with the option '-log_summary' and check which
> >>>> function
> >>>> call dominates the execution time.
> >>>
> >>> the time is spend in MatStashScatterGetMesg_Private
> >>>
> >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on
> >>>>> the
> >>>>> "wrong" cpu.
> >>>>
> >>>> Likely, the communication that sending the entries to the
> >>>> corrected cpu consume the time. Can you fill the entries in the
> >>>> correct cpu?
> >>>
> >>> the second time the entries are filled on the wrong CPU as well.
> >>> i am curious about the difference in time between run 1 and 2.
> >>>
> >>>> Hong
> >>>>
> >>>>> cheers
> >>>>> Thomas
> >
>
>


From knepley at gmail.com  Mon Feb  4 10:47:44 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 4 Feb 2008 10:47:44 -0600
Subject: assembly
In-Reply-To: <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com>
	 <200802021930.49084.geenen@gmail.com>
	 <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov>
	 <200802031344.56290.geenen@gmail.com>
	 <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov>
	 <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com>
Message-ID: <a9f269830802040847k30b98983r2d92ccd3b8e6bb0a@mail.gmail.com>

On Feb 4, 2008 10:41 AM, Thomas Geenen <geenen at gmail.com> wrote:
> On Feb 3, 2008 8:51 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >     Hmmm, are you saying the first round of setting values still
> > takes much longer then the second round?
>
> yes
>
> >Or is it the time
> > in MatAssemblyBegin() much longer the first time?
> >
> >    The MatAssembly process has one piece of code that's
> > work is order n*size; where n is the stash size and size is the
> > number of processes, all other work is only order n.
> >
> >     Could you send the -log_summary output?
>
> the timing is cumulative i guess?
> in between these two solves i solve a smaller system for which i do
> not include the timing.

I ma having a little trouble reading this. I think the easiest thing to do
is wrap the two section of code in their own sections:

PetscLogStageRegister(&stage1, "First assembly");
PetscLogStageRegister(&stage2, "Second assembly");

PetscLogStagePush(stage1);
<code for first assembly>
PetscLogStagePop();

PetscLogStagePush(stage2);
<code for second assembly>
PetscLogStagePop();

Then we can also get a look at how many messages are sent
and how big they are.

  Thanks,

     Matt

> run 1
>                          Max       Max/Min        Avg      Total
> Time (sec):           2.154e+02      1.00001   2.154e+02
> Objects:              2.200e+01      1.00000   2.200e+01
> Flops:                0.000e+00      0.00000   0.000e+00  0.000e+00
> Flops/sec:            0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Messages:         1.750e+01      1.25000   1.633e+01  9.800e+01
> MPI Message Lengths:  3.460e+06      1.29903   1.855e+05  1.818e+07
> MPI Reductions:       4.167e+00      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length
> N --> 2N flops
>                             and VecAXPY() for complex vectors of
> length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  ---
> Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts
> %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 2.1537e+02 100.0%  0.0000e+00   0.0%  9.800e+01
> 100.0%  1.855e+05      100.0%  2.500e+01 100.0%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops/sec: Max - maximum over all processors
>                        Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
>       %T - percent time in this phase         %F - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message
> lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> ------------------------------------------------------------------------------------------------------------------------
>
>
>       ##########################################################
>       #                                                        #
>       #                          WARNING!!!                    #
>       #                                                        #
>       #   This code was run without the PreLoadBegin()         #
>       #   macros. To get timing results we always recommend    #
>       #   preloading. otherwise timing numbers may be          #
>       #   meaningless.                                         #
>       ##########################################################
>
>
> Event                Count      Time (sec)     Flops/sec
>           --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
> len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> MatAssemblyBegin       1 1.0 2.9536e-01 4.0 0.00e+00 0.0 4.2e+01
> 4.2e+05 2.0e+00  0  0 43 98  8   0  0 43 98  8     0
> MatAssemblyEnd         1 1.0 2.1410e+02 1.0 0.00e+00 0.0 2.8e+01
> 8.2e+03 7.0e+00 99  0 29  1 28  99  0 29  1 28     0
> MatZeroEntries         1 1.0 9.3739e-02 5.9 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet                 4 1.0 3.9721e-04 2.2 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions   Memory  Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
>               Matrix     3              0          0     0
>            Index Set     6              6      45500     0
>                  Vec     6              1     196776     0
>          Vec Scatter     3              0          0     0
>    IS L to G Mapping     2              0          0     0
>        Krylov Solver     1              0          0     0
>       Preconditioner     1              0          0     0
> ========================================================================================================================
> Average time to get PetscTime(): 1.71661e-06
> Average time for MPI_Barrier(): 0.000159979
> Average time for zero size MPI_Send(): 1.29938e-05
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8
> Configure run at: Fri Sep 28 23:34:20 2007
>
> run2
>                          Max       Max/Min        Avg      Total
> Time (sec):           2.298e+02      1.00000   2.298e+02
> Objects:              2.600e+02      1.00000   2.600e+02
> Flops:                1.265e+09      1.17394   1.161e+09  6.969e+09
> Flops/sec:            5.505e+06      1.17394   5.054e+06  3.032e+07
> MPI Messages:         1.436e+03      1.20816   1.326e+03  7.956e+03
> MPI Message Lengths:  2.120e+07      1.23141   1.457e+04  1.159e+08
> MPI Reductions:       4.192e+02      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length
> N --> 2N flops
>                             and VecAXPY() for complex vectors of
> length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  ---
> Messages ---  -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts
> %Total     Avg         %Total   counts   %Total
>  0:      Main Stage: 2.2943e+02  99.8%  6.9689e+09 100.0%  7.944e+03
> 99.8%  1.457e+04      100.0%  2.230e+02   8.9%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops/sec: Max - maximum over all processors
>                        Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
>       %T - percent time in this phase         %F - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message
> lengths in this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> ------------------------------------------------------------------------------------------------------------------------
>
>
>       ##########################################################
>       #                                                        #
>       #                          WARNING!!!                    #
>       #                                                        #
>       #   This code was run without the PreLoadBegin()         #
>       #   macros. To get timing results we always recommend    #
>       #   preloading. otherwise timing numbers may be          #
>       #   meaningless.                                         #
>       ##########################################################
>
>
> Event                Count      Time (sec)     Flops/sec
>           --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
> len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> MatMult              135 1.0 2.0830e+00 1.4 2.30e+08 1.9 2.4e+03
> 1.3e+04 0.0e+00  1 26 30 26  0   1 26 30 26  0   862
> MatMultAdd            40 1.0 3.2598e-01 4.5 2.68e+07 5.7 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    31
> MatSolve              44 1.0 6.7841e-01 1.7 1.93e+08 1.7 0.0e+00
> 0.0e+00 0.0e+00  0  7  0  0  0   0  7  0  0  0   672
> MatRelax              80 1.0 4.2949e+00 1.6 8.77e+07 1.2 2.2e+03
> 1.3e+04 0.0e+00  1 23 28 26  0   1 23 28 26  0   374
> MatLUFactorSym         1 1.0 7.6739e-02 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatLUFactorNum         2 1.0 2.9370e+00 1.5 2.53e+08 1.5 0.0e+00
> 0.0e+00 0.0e+00  1 44  0  0  0   1 44  0  0  0  1037
> MatILUFactorSym        1 1.0 6.7334e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyBegin       7 1.0 7.1652e-01 4.4 0.00e+00 0.0 1.3e+02
> 2.9e+05 8.0e+00  0  0  2 32  0   0  0  2 32  4     0
> MatAssemblyEnd         7 1.0 2.1473e+02 1.0 0.00e+00 0.0 8.4e+01
> 3.2e+03 2.2e+01 93  0  1  0  1  94  0  1  0 10     0
> MatGetRowIJ            2 1.0 1.8899e-03 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetSubMatrice       1 1.0 3.1915e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0
> MatGetOrdering         2 1.0 1.2184e-02 1.6 0.00e+00 0.0 0.0e+00
> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
> MatIncreaseOvrlp       1 1.0 7.6865e-02 1.0 0.00e+00 0.0 1.1e+03
> 2.4e+03 2.0e+01  0  0 13  2  1   0  0 13  2  9     0
> MatZeroEntries         3 1.0 1.0429e-01 4.7 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MAT_GetRedundantMatrix       1 1.0 3.0144e-02 1.2 0.00e+00 0.0 9.0e+01
> 8.0e+04 2.0e+00  0  0  1  6  0   0  0  1  6  1     0
> VecDot                39 1.0 9.0680e-01140.0 2.95e+08191.3 0.0e+00
> 0.0e+00 3.9e+01  0  0  0  0  2   0  0  0  0 17    11
> VecMDot                8 1.0 2.8777e-03 2.8 4.16e+07 3.3 0.0e+00
> 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  4    84
> VecNorm               31 1.0 6.2301e-02 4.2 7.17e+07 5.8 0.0e+00
> 0.0e+00 3.1e+01  0  0  0  0  1   0  0  0  0 14    86
> VecScale              85 1.0 1.7729e-03 1.4 4.94e+08 1.4 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2138
> VecCopy                4 1.0 6.2108e-04 1.5 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet               139 1.0 3.5934e-03 1.4 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY               98 1.0 5.1496e-03 1.3 9.24e+08 1.1 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4705
> VecAYPX               40 1.0 1.0311e-02 1.4 1.90e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   969
> VecWAXPY              75 1.0 2.6060e-02 1.4 1.22e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   618
> VecMAXPY               9 1.0 2.1315e-04 1.6 3.01e+08 1.2 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1355
> VecAssemblyBegin       4 1.0 1.9898e-03 1.2 0.00e+00 0.0 0.0e+00
> 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  5     0
> VecAssemblyEnd         4 1.0 2.1219e-05 1.2 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecScatterBegin      267 1.0 1.0224e-01 1.6 0.00e+00 0.0 6.3e+03
> 1.1e+04 0.0e+00  0  0 79 59  0   0  0 79 59  0     0
> VecScatterEnd        267 1.0 7.8653e-0111.3 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPGMRESOrthog         4 1.0 3.7677e-03 1.9 4.78e+07 2.4 0.0e+00
> 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  5   141
> KSPSetup               6 1.0 1.0260e-02 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
> KSPSolve               2 1.0 7.8238e+00 1.0 1.16e+08 1.2 5.9e+03
> 1.1e+04 8.1e+01  3 70 75 54  3   3 70 75 54 36   622
> PCSetUp                3 1.0 5.1323e+00 1.2 1.29e+08 1.2 1.6e+03
> 9.1e+03 7.3e+01  2 47 20 13  3   2 47 20 13 33   636
> PCSetUpOnBlocks        1 1.0 1.3325e+00 1.0 1.47e+08 1.0 0.0e+00
> 0.0e+00 3.0e+00  1 17  0  0  0   1 17  0  0  1   871
> PCApply               44 1.0 5.7917e+00 1.1 9.60e+07 1.2 4.7e+03
> 1.0e+04 0.0e+00  2 41 59 42  0   2 41 59 42  0   497
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions   Memory  Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
>               Matrix    16              0          0     0
>            Index Set    36             29     256760     0
>                  Vec   176             93   16582464     0
>          Vec Scatter    10              0          0     0
>    IS L to G Mapping     4              0          0     0
>        Krylov Solver     6              0          0     0
>       Preconditioner     6              0          0     0
>               Viewer     4              2          0     0
>            Container     2              0          0     0
> ========================================================================================================================
> Average time to get PetscTime(): 8.10623e-07
> Average time for MPI_Barrier(): 0.000178194
> Average time for zero size MPI_Send(): 1.33117e-05
> OptionTable: -mg_levels_ksp_type richardson
> OptionTable: -mg_levels_pc_sor_omega 1.05
> OptionTable: -mg_levels_pc_type sor
> OptionTable: -pc_ml_PrintLevel 4
> OptionTable: -pc_ml_maxNlevels 2
> OptionTable: -pc_type ml
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8
> Configure run at: Fri Sep 28 23:34:20 2007
>
>
> >
> >     Barry
> >
> >
> >     The a
> >
> > On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote:
> >
> > > i call
> > > ierr = MatStashSetInitialSize(A[*seqsolve],stash_size,
> > > stash_size);CHKERRQ(ierr);
> > > with 100 000 000 for the stash size to make sure that's not the
> > > bottleneck
> > >
> > > the assemble time remains unchanged however.
> > >
> > > nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485
> > > reallocs in MatAssemblyBegin_MPIAIJ =  0
> > >
> > > cheers
> > > Thomas
> > >
> > > On Saturday 02 February 2008 23:19, Barry Smith wrote:
> > >>    The matstash has a concept of preallocation also. During the first
> > >> setvalues
> > >> it is allocating more and more memory for the stash. In the second
> > >> setvalues
> > >> the stash is large enough so does not require any addition
> > >> allocation.
> > >>
> > >>    You can use the option -matstash_initial_size <size> to allocate
> > >> enough space
> > >> initially so that the first setvalues is also fast. It does not look
> > >> like there is a way
> > >> coded to get the <size> that you should use. It should be set to the
> > >> maximum nonzeros
> > >> any process has that belongs to other processes. The stash handling
> > >> code is
> > >> in src/mat/utils/matstash.c, perhaps you can figure out how to
> > >> printout with PetscInfo()
> > >> the sizes needed?
> > >>
> > >>
> > >>    Barry
> > >>
> > >> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote:
> > >>> On Saturday 02 February 2008 18:33, Hong Zhang wrote:
> > >>>> On Sat, 2 Feb 2008, Thomas Geenen wrote:
> > >>>>> Dear Petsc users,
> > >>>>>
> > >>>>> I would like to understand what is slowing down the assembly phase
> > >>>>> of my
> > >>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough
> > >>>>> guess of
> > >>>>> the number of off diagonal entries and then use a conservative
> > >>>>> value to
> > >>>>> make sure I do not need extra mallocs. (the number of diagonal
> > >>>>> entries is
> > >>>>> exact)
> > >>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
> > >>>>> The first time i call MatSetValues and MatAssemblyBegin,
> > >>>>> MatAssemblyEnd it takes about 170 seconds
> > >>>>> the second time 0.3 seconds.
> > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on
> > >>>>> the
> > >>>>> "wrong" cpu. However thats also the case the second run. I checked
> > >>>>> that there are no additional mallocs
> > >>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after
> > >>>>> MatAssemblyBegin, MatAssemblyEnd.
> > >>>>
> > >>>> Run your code with the option '-log_summary' and check which
> > >>>> function
> > >>>> call dominates the execution time.
> > >>>
> > >>> the time is spend in MatStashScatterGetMesg_Private
> > >>>
> > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on
> > >>>>> the
> > >>>>> "wrong" cpu.
> > >>>>
> > >>>> Likely, the communication that sending the entries to the
> > >>>> corrected cpu consume the time. Can you fill the entries in the
> > >>>> correct cpu?
> > >>>
> > >>> the second time the entries are filled on the wrong CPU as well.
> > >>> i am curious about the difference in time between run 1 and 2.
> > >>>
> > >>>> Hong
> > >>>>
> > >>>>> cheers
> > >>>>> Thomas
> > >
> >
> >
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From geenen at gmail.com  Mon Feb  4 11:34:29 2008
From: geenen at gmail.com (Thomas Geenen)
Date: Mon, 4 Feb 2008 18:34:29 +0100
Subject: assembly
In-Reply-To: <a9f269830802040847k30b98983r2d92ccd3b8e6bb0a@mail.gmail.com>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com>
	 <200802021930.49084.geenen@gmail.com>
	 <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov>
	 <200802031344.56290.geenen@gmail.com>
	 <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov>
	 <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com>
	 <a9f269830802040847k30b98983r2d92ccd3b8e6bb0a@mail.gmail.com>
Message-ID: <8aa042e10802040934p7308142fjff32c882308eceda@mail.gmail.com>

hi matt,

this is indeed much clearer
i put the push and pop around MatAssemblyBegin/End

1:  First_assembly: 1.4724e+02  90.8%  0.0000e+00   0.0%  7.000e+01
0.9%  2.266e+03       15.5%  9.000e+00   0.2%
2: Second_assembly: 2.3823e-01   0.1%  0.0000e+00   0.0%  7.000e+01
0.9%  1.276e+02        0.9%  9.000e+00   0.2%
3:  Third_assembly: 5.0168e-01   0.3%  0.0000e+00   0.0%  4.200e+01
0.5%  2.237e+03       15.4%  3.000e+00   0.1%

The second assembly is another system of equations (pressure
correction in simpler)
so 1 and 3 are 1 and 2 ......

cheers
Thomas


---------------------------------------------- PETSc Performance
Summary: ----------------------------------------------

Unknown Name on a linux-gnu named etna.geo.uu.nl with 6 processors, by
geenen Mon Feb  4 18:27:48 2008
Using Petsc Release Version 2.3.3, Patch 3, Fri Jun 15 16:51:25 CDT
2007 HG revision: f051789beadcd36f77fb6111d20225e26ed7cc0d

                         Max       Max/Min        Avg      Total
Time (sec):           1.621e+02      1.00000   1.621e+02
Objects:              2.600e+02      1.00000   2.600e+02
Flops:                1.265e+09      1.17394   1.161e+09  6.969e+09
Flops/sec:            7.806e+06      1.17393   7.166e+06  4.300e+07
MPI Messages:         1.436e+03      1.20816   1.326e+03  7.956e+03
MPI Message Lengths:  2.120e+07      1.23141   1.457e+04  1.159e+08
MPI Reductions:       9.862e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type
(multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length
N --> 2N flops
                            and VecAXPY() for complex vectors of
length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  ---
Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts
%Total     Avg         %Total   counts   %Total
 0:      Main Stage: 1.3160e+01   8.1%  6.9689e+09 100.0%  7.762e+03
97.6%  9.941e+03       68.2%  2.020e+02   3.4%
 1:  First_assembly: 1.4724e+02  90.8%  0.0000e+00   0.0%  7.000e+01
0.9%  2.266e+03       15.5%  9.000e+00   0.2%
 2: Second_assembly: 2.3823e-01   0.1%  0.0000e+00   0.0%  7.000e+01
0.9%  1.276e+02        0.9%  9.000e+00   0.2%
 3:  Third_assembly: 5.0168e-01   0.3%  0.0000e+00   0.0%  4.200e+01
0.5%  2.237e+03       15.4%  3.000e+00   0.1%

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on
interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops/sec: Max - maximum over all processors
                       Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush()
and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message
lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
over all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################


Event                Count      Time (sec)     Flops/sec
          --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

MatMult              135 1.0 1.5541e+00 1.3 3.03e+08 1.8 2.4e+03
1.3e+04 0.0e+00  1 26 30 26  0  11 26 30 38  0  1155
MatMultAdd            40 1.0 3.2611e-01 8.3 3.64e+07 7.7 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0    31
MatSolve              44 1.0 6.7682e-01 1.7 1.94e+08 1.7 0.0e+00
0.0e+00 0.0e+00  0  7  0  0  0   4  7  0  0  0   673
MatRelax              80 1.0 3.4453e+00 1.4 9.46e+07 1.1 2.2e+03
1.3e+04 0.0e+00  2 23 28 26  0  22 23 29 38  0   466
MatLUFactorSym         1 1.0 6.7567e-02 1.1 0.00e+00 0.0 0.0e+00
0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatLUFactorNum         2 1.0 2.8804e+00 1.4 2.53e+08 1.4 0.0e+00
0.0e+00 0.0e+00  1 44  0  0  0  17 44  0  0  0  1058
MatILUFactorSym        1 1.0 6.7676e-01 1.0 0.00e+00 0.0 0.0e+00
0.0e+00 1.0e+00  0  0  0  0  0   5  0  0  0  0     0
MatAssemblyBegin       4 1.0 2.7711e-0237.3 0.00e+00 0.0 0.0e+00
0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatAssemblyEnd         4 1.0 2.4401e-02 1.2 0.00e+00 0.0 2.8e+01
4.7e+02 7.0e+00  0  0  0  0  0   0  0  0  0  3     0
MatGetRowIJ            2 1.0 1.2948e-02 7.5 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrice       1 1.0 2.8603e-02 1.1 0.00e+00 0.0 0.0e+00
0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatGetOrdering         2 1.0 2.3054e-02 2.4 0.00e+00 0.0 0.0e+00
0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
MatIncreaseOvrlp       1 1.0 8.1528e-02 1.0 0.00e+00 0.0 1.1e+03
2.4e+03 2.0e+01  0  0 13  2  0   1  0 14  3 10     0
MatZeroEntries         3 1.0 3.4422e-02 1.8 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MAT_GetRedundantMatrix       1 1.0 3.5774e-02 1.6 0.00e+00 0.0 9.0e+01
8.0e+04 2.0e+00  0  0  1  6  0   0  0  1  9  1     0
VecDot                39 1.0 5.8092e-0131.0 1.02e+0842.3 0.0e+00
0.0e+00 3.9e+01  0  0  0  0  1   2  0  0  0 19    17
VecMDot                8 1.0 3.4735e-03 2.9 4.52e+07 5.5 0.0e+00
0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  4    69
VecNorm               31 1.0 3.8690e-02 4.1 1.11e+08 5.6 0.0e+00
0.0e+00 3.1e+01  0  0  0  0  1   0  0  0  0 15   139
VecScale              85 1.0 1.7631e-03 1.2 5.59e+08 1.9 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2150
VecCopy                4 1.0 5.5027e-04 1.4 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               139 1.0 3.0956e-03 1.4 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAXPY               98 1.0 5.0848e-03 1.3 9.35e+08 1.1 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4765
VecAYPX               40 1.0 1.0264e-02 1.4 2.01e+08 1.1 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   973
VecWAXPY              75 1.0 2.6191e-02 1.4 1.22e+08 1.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   615
VecMAXPY               9 1.0 2.1935e-04 1.7 2.93e+08 1.1 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1317
VecAssemblyBegin       4 1.0 1.9331e-03 1.2 0.00e+00 0.0 0.0e+00
0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  6     0
VecAssemblyEnd         4 1.0 2.3842e-05 1.4 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin      267 1.0 1.0370e-01 1.5 0.00e+00 0.0 6.3e+03
1.1e+04 0.0e+00  0  0 79 59  0   1  0 81 86  0     0
VecScatterEnd        267 1.0 4.1189e-01 3.1 0.00e+00 0.0 0.0e+00
0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0     0
KSPGMRESOrthog         4 1.0 4.5178e-03 2.0 5.22e+07 3.8 0.0e+00
0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  6   117
KSPSetup               6 1.0 7.9882e-03 1.0 0.00e+00 0.0 0.0e+00
0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
KSPSolve               2 1.0 6.6590e+00 1.0 1.36e+08 1.2 5.9e+03
1.1e+04 8.1e+01  4 70 75 54  1  51 70 76 80 40   731
PCSetUp                3 1.0 5.0877e+00 1.2 1.29e+08 1.2 1.6e+03
9.1e+03 7.3e+01  3 47 20 13  1  34 47 21 19 36   642
PCSetUpOnBlocks        1 1.0 1.3292e+00 1.0 1.46e+08 1.0 0.0e+00
0.0e+00 3.0e+00  1 17  0  0  0  10 17  0  0  1   873
PCApply               44 1.0 4.7444e+00 1.1 1.16e+08 1.2 4.7e+03
1.0e+04 0.0e+00  3 41 59 42  0  34 41 60 61  0   607

--- Event Stage 1: First_assembly

MatAssemblyBegin       1 1.0 3.0375e-01 3.5 0.00e+00 0.0 4.2e+01
4.2e+05 2.0e+00  0  0  1 15  0   0  0 60 99 22     0
MatAssemblyEnd         1 1.0 1.4709e+02 1.0 0.00e+00 0.0 2.8e+01
8.2e+03 7.0e+00 91  0  0  0  0 100  0 40  1 78     0

--- Event Stage 2: Second_assembly

MatAssemblyBegin       1 1.0 1.7451e-02 5.0 0.00e+00 0.0 4.2e+01
2.4e+04 2.0e+00  0  0  1  1  0   5  0 60 98 22     0
MatAssemblyEnd         1 1.0 2.3056e-01 1.0 0.00e+00 0.0 2.8e+01
8.4e+02 7.0e+00  0  0  0  0  0  95  0 40  2 78     0

--- Event Stage 3: Third_assembly

MatAssemblyBegin       1 1.0 3.3676e-01 3.8 0.00e+00 0.0 4.2e+01
4.2e+05 2.0e+00  0  0  1 15  0  45  0100100 67     0
MatAssemblyEnd         1 1.0 3.3125e-01 1.7 0.00e+00 0.0 0.0e+00
0.0e+00 1.0e+00  0  0  0  0  0  55  0  0  0 33     0

--- Event Stage 4: Unknown


--- Event Stage 5: Unknown


--- Event Stage 6: Unknown


--- Event Stage 7: Unknown


--- Event Stage 8: Unknown


--- Event Stage 9: Unknown

------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

              Matrix    16              0          0     0
           Index Set    32             25     207888     0
                 Vec   172             91   16374224     0
         Vec Scatter     8              0          0     0
   IS L to G Mapping     4              0          0     0
       Krylov Solver     6              0          0     0
      Preconditioner     6              0          0     0
              Viewer     4              2          0     0
           Container     2              0          0     0

--- Event Stage 1: First_assembly

           Index Set     2              2      44140     0
                 Vec     2              1     196776     0
         Vec Scatter     1              0          0     0

--- Event Stage 2: Second_assembly

           Index Set     2              2       4732     0
                 Vec     2              1      11464     0
         Vec Scatter     1              0          0     0

--- Event Stage 3: Third_assembly


--- Event Stage 4: Unknown


--- Event Stage 5: Unknown


--- Event Stage 6: Unknown


--- Event Stage 7: Unknown


--- Event Stage 8: Unknown


--- Event Stage 9: Unknown

========================================================================================================================
Average time to get PetscTime(): 8.82149e-07
Average time for MPI_Barrier(): 0.000153208
Average time for zero size MPI_Send(): 1.86761e-05
OptionTable: -mg_levels_ksp_type richardson
OptionTable: -mg_levels_pc_sor_omega 1.05
OptionTable: -mg_levels_pc_type sor
OptionTable: -pc_ml_PrintLevel 4
OptionTable: -pc_ml_maxNlevels 2
OptionTable: -pc_type ml
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
sizeof(PetscScalar) 8
Configure run at: Fri Sep 28 23:34:20 2007

On Feb 4, 2008 5:47 PM, Matthew Knepley <knepley at gmail.com> wrote:
> On Feb 4, 2008 10:41 AM, Thomas Geenen <geenen at gmail.com> wrote:
> > On Feb 3, 2008 8:51 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > >
> > >     Hmmm, are you saying the first round of setting values still
> > > takes much longer then the second round?
> >
> > yes
> >
> > >Or is it the time
> > > in MatAssemblyBegin() much longer the first time?
> > >
> > >    The MatAssembly process has one piece of code that's
> > > work is order n*size; where n is the stash size and size is the
> > > number of processes, all other work is only order n.
> > >
> > >     Could you send the -log_summary output?
> >
> > the timing is cumulative i guess?
> > in between these two solves i solve a smaller system for which i do
> > not include the timing.
>
> I ma having a little trouble reading this. I think the easiest thing to do
> is wrap the two section of code in their own sections:
>
> PetscLogStageRegister(&stage1, "First assembly");
> PetscLogStageRegister(&stage2, "Second assembly");
>
> PetscLogStagePush(stage1);
> <code for first assembly>
> PetscLogStagePop();
>
> PetscLogStagePush(stage2);
> <code for second assembly>
> PetscLogStagePop();
>
> Then we can also get a look at how many messages are sent
> and how big they are.
>
>   Thanks,
>
>      Matt
>
>
> > run 1
> >                          Max       Max/Min        Avg      Total
> > Time (sec):           2.154e+02      1.00001   2.154e+02
> > Objects:              2.200e+01      1.00000   2.200e+01
> > Flops:                0.000e+00      0.00000   0.000e+00  0.000e+00
> > Flops/sec:            0.000e+00      0.00000   0.000e+00  0.000e+00
> > MPI Messages:         1.750e+01      1.25000   1.633e+01  9.800e+01
> > MPI Message Lengths:  3.460e+06      1.29903   1.855e+05  1.818e+07
> > MPI Reductions:       4.167e+00      1.00000
> >
> > Flop counting convention: 1 flop = 1 real number operation of type
> > (multiply/divide/add/subtract)
> >                             e.g., VecAXPY() for real vectors of length
> > N --> 2N flops
> >                             and VecAXPY() for complex vectors of
> > length N --> 8N flops
> >
> > Summary of Stages:   ----- Time ------  ----- Flops -----  ---
> > Messages ---  -- Message Lengths --  -- Reductions --
> >                         Avg     %Total     Avg     %Total   counts
> > %Total     Avg         %Total   counts   %Total
> >  0:      Main Stage: 2.1537e+02 100.0%  0.0000e+00   0.0%  9.800e+01
> > 100.0%  1.855e+05      100.0%  2.500e+01 100.0%
> >
> > ------------------------------------------------------------------------------------------------------------------------
> > See the 'Profiling' chapter of the users' manual for details on
> > interpreting output.
> > Phase summary info:
> >    Count: number of times phase was executed
> >    Time and Flops/sec: Max - maximum over all processors
> >                        Ratio - ratio of maximum to minimum over all processors
> >    Mess: number of messages sent
> >    Avg. len: average message length
> >    Reduct: number of global reductions
> >    Global: entire computation
> >    Stage: stages of a computation. Set stages with PetscLogStagePush()
> > and PetscLogStagePop().
> >       %T - percent time in this phase         %F - percent flops in this phase
> >       %M - percent messages in this phase     %L - percent message
> > lengths in this phase
> >       %R - percent reductions in this phase
> >    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> > over all processors)
> > ------------------------------------------------------------------------------------------------------------------------
> >
> >
> >       ##########################################################
> >       #                                                        #
> >       #                          WARNING!!!                    #
> >       #                                                        #
> >       #   This code was run without the PreLoadBegin()         #
> >       #   macros. To get timing results we always recommend    #
> >       #   preloading. otherwise timing numbers may be          #
> >       #   meaningless.                                         #
> >       ##########################################################
> >
> >
> > Event                Count      Time (sec)     Flops/sec
> >           --- Global ---  --- Stage ---   Total
> >                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
> > len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> > ------------------------------------------------------------------------------------------------------------------------
> >
> > --- Event Stage 0: Main Stage
> >
> > MatAssemblyBegin       1 1.0 2.9536e-01 4.0 0.00e+00 0.0 4.2e+01
> > 4.2e+05 2.0e+00  0  0 43 98  8   0  0 43 98  8     0
> > MatAssemblyEnd         1 1.0 2.1410e+02 1.0 0.00e+00 0.0 2.8e+01
> > 8.2e+03 7.0e+00 99  0 29  1 28  99  0 29  1 28     0
> > MatZeroEntries         1 1.0 9.3739e-02 5.9 0.00e+00 0.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > VecSet                 4 1.0 3.9721e-04 2.2 0.00e+00 0.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > ------------------------------------------------------------------------------------------------------------------------
> >
> > Memory usage is given in bytes:
> >
> > Object Type          Creations   Destructions   Memory  Descendants' Mem.
> >
> > --- Event Stage 0: Main Stage
> >
> >               Matrix     3              0          0     0
> >            Index Set     6              6      45500     0
> >                  Vec     6              1     196776     0
> >          Vec Scatter     3              0          0     0
> >    IS L to G Mapping     2              0          0     0
> >        Krylov Solver     1              0          0     0
> >       Preconditioner     1              0          0     0
> > ========================================================================================================================
> > Average time to get PetscTime(): 1.71661e-06
> > Average time for MPI_Barrier(): 0.000159979
> > Average time for zero size MPI_Send(): 1.29938e-05
> > Compiled without FORTRAN kernels
> > Compiled with full precision matrices (default)
> > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> > sizeof(PetscScalar) 8
> > Configure run at: Fri Sep 28 23:34:20 2007
> >
> > run2
> >                          Max       Max/Min        Avg      Total
> > Time (sec):           2.298e+02      1.00000   2.298e+02
> > Objects:              2.600e+02      1.00000   2.600e+02
> > Flops:                1.265e+09      1.17394   1.161e+09  6.969e+09
> > Flops/sec:            5.505e+06      1.17394   5.054e+06  3.032e+07
> > MPI Messages:         1.436e+03      1.20816   1.326e+03  7.956e+03
> > MPI Message Lengths:  2.120e+07      1.23141   1.457e+04  1.159e+08
> > MPI Reductions:       4.192e+02      1.00000
> >
> > Flop counting convention: 1 flop = 1 real number operation of type
> > (multiply/divide/add/subtract)
> >                             e.g., VecAXPY() for real vectors of length
> > N --> 2N flops
> >                             and VecAXPY() for complex vectors of
> > length N --> 8N flops
> >
> > Summary of Stages:   ----- Time ------  ----- Flops -----  ---
> > Messages ---  -- Message Lengths --  -- Reductions --
> >                         Avg     %Total     Avg     %Total   counts
> > %Total     Avg         %Total   counts   %Total
> >  0:      Main Stage: 2.2943e+02  99.8%  6.9689e+09 100.0%  7.944e+03
> > 99.8%  1.457e+04      100.0%  2.230e+02   8.9%
> >
> > ------------------------------------------------------------------------------------------------------------------------
> > See the 'Profiling' chapter of the users' manual for details on
> > interpreting output.
> > Phase summary info:
> >    Count: number of times phase was executed
> >    Time and Flops/sec: Max - maximum over all processors
> >                        Ratio - ratio of maximum to minimum over all processors
> >    Mess: number of messages sent
> >    Avg. len: average message length
> >    Reduct: number of global reductions
> >    Global: entire computation
> >    Stage: stages of a computation. Set stages with PetscLogStagePush()
> > and PetscLogStagePop().
> >       %T - percent time in this phase         %F - percent flops in this phase
> >       %M - percent messages in this phase     %L - percent message
> > lengths in this phase
> >       %R - percent reductions in this phase
> >    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> > over all processors)
> > ------------------------------------------------------------------------------------------------------------------------
> >
> >
> >       ##########################################################
> >       #                                                        #
> >       #                          WARNING!!!                    #
> >       #                                                        #
> >       #   This code was run without the PreLoadBegin()         #
> >       #   macros. To get timing results we always recommend    #
> >       #   preloading. otherwise timing numbers may be          #
> >       #   meaningless.                                         #
> >       ##########################################################
> >
> >
> > Event                Count      Time (sec)     Flops/sec
> >           --- Global ---  --- Stage ---   Total
> >                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
> > len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> > ------------------------------------------------------------------------------------------------------------------------
> >
> > --- Event Stage 0: Main Stage
> >
> > MatMult              135 1.0 2.0830e+00 1.4 2.30e+08 1.9 2.4e+03
> > 1.3e+04 0.0e+00  1 26 30 26  0   1 26 30 26  0   862
> > MatMultAdd            40 1.0 3.2598e-01 4.5 2.68e+07 5.7 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    31
> > MatSolve              44 1.0 6.7841e-01 1.7 1.93e+08 1.7 0.0e+00
> > 0.0e+00 0.0e+00  0  7  0  0  0   0  7  0  0  0   672
> > MatRelax              80 1.0 4.2949e+00 1.6 8.77e+07 1.2 2.2e+03
> > 1.3e+04 0.0e+00  1 23 28 26  0   1 23 28 26  0   374
> > MatLUFactorSym         1 1.0 7.6739e-02 1.1 0.00e+00 0.0 0.0e+00
> > 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > MatLUFactorNum         2 1.0 2.9370e+00 1.5 2.53e+08 1.5 0.0e+00
> > 0.0e+00 0.0e+00  1 44  0  0  0   1 44  0  0  0  1037
> > MatILUFactorSym        1 1.0 6.7334e-01 1.0 0.00e+00 0.0 0.0e+00
> > 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > MatAssemblyBegin       7 1.0 7.1652e-01 4.4 0.00e+00 0.0 1.3e+02
> > 2.9e+05 8.0e+00  0  0  2 32  0   0  0  2 32  4     0
> > MatAssemblyEnd         7 1.0 2.1473e+02 1.0 0.00e+00 0.0 8.4e+01
> > 3.2e+03 2.2e+01 93  0  1  0  1  94  0  1  0 10     0
> > MatGetRowIJ            2 1.0 1.8899e-03 1.1 0.00e+00 0.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > MatGetSubMatrice       1 1.0 3.1915e-02 1.0 0.00e+00 0.0 0.0e+00
> > 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0
> > MatGetOrdering         2 1.0 1.2184e-02 1.6 0.00e+00 0.0 0.0e+00
> > 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
> > MatIncreaseOvrlp       1 1.0 7.6865e-02 1.0 0.00e+00 0.0 1.1e+03
> > 2.4e+03 2.0e+01  0  0 13  2  1   0  0 13  2  9     0
> > MatZeroEntries         3 1.0 1.0429e-01 4.7 0.00e+00 0.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > MAT_GetRedundantMatrix       1 1.0 3.0144e-02 1.2 0.00e+00 0.0 9.0e+01
> > 8.0e+04 2.0e+00  0  0  1  6  0   0  0  1  6  1     0
> > VecDot                39 1.0 9.0680e-01140.0 2.95e+08191.3 0.0e+00
> > 0.0e+00 3.9e+01  0  0  0  0  2   0  0  0  0 17    11
> > VecMDot                8 1.0 2.8777e-03 2.8 4.16e+07 3.3 0.0e+00
> > 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  4    84
> > VecNorm               31 1.0 6.2301e-02 4.2 7.17e+07 5.8 0.0e+00
> > 0.0e+00 3.1e+01  0  0  0  0  1   0  0  0  0 14    86
> > VecScale              85 1.0 1.7729e-03 1.4 4.94e+08 1.4 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2138
> > VecCopy                4 1.0 6.2108e-04 1.5 0.00e+00 0.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > VecSet               139 1.0 3.5934e-03 1.4 0.00e+00 0.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > VecAXPY               98 1.0 5.1496e-03 1.3 9.24e+08 1.1 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4705
> > VecAYPX               40 1.0 1.0311e-02 1.4 1.90e+08 1.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   969
> > VecWAXPY              75 1.0 2.6060e-02 1.4 1.22e+08 1.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   618
> > VecMAXPY               9 1.0 2.1315e-04 1.6 3.01e+08 1.2 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1355
> > VecAssemblyBegin       4 1.0 1.9898e-03 1.2 0.00e+00 0.0 0.0e+00
> > 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  5     0
> > VecAssemblyEnd         4 1.0 2.1219e-05 1.2 0.00e+00 0.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > VecScatterBegin      267 1.0 1.0224e-01 1.6 0.00e+00 0.0 6.3e+03
> > 1.1e+04 0.0e+00  0  0 79 59  0   0  0 79 59  0     0
> > VecScatterEnd        267 1.0 7.8653e-0111.3 0.00e+00 0.0 0.0e+00
> > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > KSPGMRESOrthog         4 1.0 3.7677e-03 1.9 4.78e+07 2.4 0.0e+00
> > 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  5   141
> > KSPSetup               6 1.0 1.0260e-02 1.0 0.00e+00 0.0 0.0e+00
> > 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
> > KSPSolve               2 1.0 7.8238e+00 1.0 1.16e+08 1.2 5.9e+03
> > 1.1e+04 8.1e+01  3 70 75 54  3   3 70 75 54 36   622
> > PCSetUp                3 1.0 5.1323e+00 1.2 1.29e+08 1.2 1.6e+03
> > 9.1e+03 7.3e+01  2 47 20 13  3   2 47 20 13 33   636
> > PCSetUpOnBlocks        1 1.0 1.3325e+00 1.0 1.47e+08 1.0 0.0e+00
> > 0.0e+00 3.0e+00  1 17  0  0  0   1 17  0  0  1   871
> > PCApply               44 1.0 5.7917e+00 1.1 9.60e+07 1.2 4.7e+03
> > 1.0e+04 0.0e+00  2 41 59 42  0   2 41 59 42  0   497
> > ------------------------------------------------------------------------------------------------------------------------
> >
> > Memory usage is given in bytes:
> >
> > Object Type          Creations   Destructions   Memory  Descendants' Mem.
> >
> > --- Event Stage 0: Main Stage
> >
> >               Matrix    16              0          0     0
> >            Index Set    36             29     256760     0
> >                  Vec   176             93   16582464     0
> >          Vec Scatter    10              0          0     0
> >    IS L to G Mapping     4              0          0     0
> >        Krylov Solver     6              0          0     0
> >       Preconditioner     6              0          0     0
> >               Viewer     4              2          0     0
> >            Container     2              0          0     0
> > ========================================================================================================================
> > Average time to get PetscTime(): 8.10623e-07
> > Average time for MPI_Barrier(): 0.000178194
> > Average time for zero size MPI_Send(): 1.33117e-05
> > OptionTable: -mg_levels_ksp_type richardson
> > OptionTable: -mg_levels_pc_sor_omega 1.05
> > OptionTable: -mg_levels_pc_type sor
> > OptionTable: -pc_ml_PrintLevel 4
> > OptionTable: -pc_ml_maxNlevels 2
> > OptionTable: -pc_type ml
> > Compiled without FORTRAN kernels
> > Compiled with full precision matrices (default)
> > sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> > sizeof(PetscScalar) 8
> > Configure run at: Fri Sep 28 23:34:20 2007
> >
> >
> > >
> > >     Barry
> > >
> > >
> > >     The a
> > >
> > > On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote:
> > >
> > > > i call
> > > > ierr = MatStashSetInitialSize(A[*seqsolve],stash_size,
> > > > stash_size);CHKERRQ(ierr);
> > > > with 100 000 000 for the stash size to make sure that's not the
> > > > bottleneck
> > > >
> > > > the assemble time remains unchanged however.
> > > >
> > > > nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485
> > > > reallocs in MatAssemblyBegin_MPIAIJ =  0
> > > >
> > > > cheers
> > > > Thomas
> > > >
> > > > On Saturday 02 February 2008 23:19, Barry Smith wrote:
> > > >>    The matstash has a concept of preallocation also. During the first
> > > >> setvalues
> > > >> it is allocating more and more memory for the stash. In the second
> > > >> setvalues
> > > >> the stash is large enough so does not require any addition
> > > >> allocation.
> > > >>
> > > >>    You can use the option -matstash_initial_size <size> to allocate
> > > >> enough space
> > > >> initially so that the first setvalues is also fast. It does not look
> > > >> like there is a way
> > > >> coded to get the <size> that you should use. It should be set to the
> > > >> maximum nonzeros
> > > >> any process has that belongs to other processes. The stash handling
> > > >> code is
> > > >> in src/mat/utils/matstash.c, perhaps you can figure out how to
> > > >> printout with PetscInfo()
> > > >> the sizes needed?
> > > >>
> > > >>
> > > >>    Barry
> > > >>
> > > >> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote:
> > > >>> On Saturday 02 February 2008 18:33, Hong Zhang wrote:
> > > >>>> On Sat, 2 Feb 2008, Thomas Geenen wrote:
> > > >>>>> Dear Petsc users,
> > > >>>>>
> > > >>>>> I would like to understand what is slowing down the assembly phase
> > > >>>>> of my
> > > >>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough
> > > >>>>> guess of
> > > >>>>> the number of off diagonal entries and then use a conservative
> > > >>>>> value to
> > > >>>>> make sure I do not need extra mallocs. (the number of diagonal
> > > >>>>> entries is
> > > >>>>> exact)
> > > >>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
> > > >>>>> The first time i call MatSetValues and MatAssemblyBegin,
> > > >>>>> MatAssemblyEnd it takes about 170 seconds
> > > >>>>> the second time 0.3 seconds.
> > > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on
> > > >>>>> the
> > > >>>>> "wrong" cpu. However thats also the case the second run. I checked
> > > >>>>> that there are no additional mallocs
> > > >>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after
> > > >>>>> MatAssemblyBegin, MatAssemblyEnd.
> > > >>>>
> > > >>>> Run your code with the option '-log_summary' and check which
> > > >>>> function
> > > >>>> call dominates the execution time.
> > > >>>
> > > >>> the time is spend in MatStashScatterGetMesg_Private
> > > >>>
> > > >>>>> I run it on 6 cpu's and I do fill quit a number of row-entries on
> > > >>>>> the
> > > >>>>> "wrong" cpu.
> > > >>>>
> > > >>>> Likely, the communication that sending the entries to the
> > > >>>> corrected cpu consume the time. Can you fill the entries in the
> > > >>>> correct cpu?
> > > >>>
> > > >>> the second time the entries are filled on the wrong CPU as well.
> > > >>> i am curious about the difference in time between run 1 and 2.
> > > >>>
> > > >>>> Hong
> > > >>>>
> > > >>>>> cheers
> > > >>>>> Thomas
> > > >
> > >
> > >
> >
> >
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
>


From recrusader at gmail.com  Mon Feb  4 12:20:28 2008
From: recrusader at gmail.com (Yujie)
Date: Mon, 4 Feb 2008 10:20:28 -0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov>
References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com>
	 <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov>
Message-ID: <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com>

what is the difference between sequantial and parallel AIJ matrix? Assuming
there is a matrix A, if I partitaion this matrix into A1, A2, Ai... An.
A is a parallel AIJ matrix at the whole view, Ai
is a sequential AIJ matrix? I want to operate Ai at each node.
In addition, whether is it possible to get general inverse using
MatMatSolve() if the matrix is not square? Thanks a lot.

Regards,
Yujie


On 2/4/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
>     For sequential AIJ matrices you can fill the B matrix with the
> identity and then use
> MatMatSolve().
>
>     Note since the inverse of a sparse matrix is dense the B matrix is
> a SeqDense matrix.
>
>     Barry
>
> On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>
> > Hi,
> > Now, I want to inverse a sparse matrix. I have browsed the manual,
> > however, I can't find some information. could you give me some advice?
> >
> > thanks a lot.
> >
> > Regards,
> > Yujie
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080204/4dd1ff62/attachment.htm>

From bsmith at mcs.anl.gov  Mon Feb  4 12:21:05 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 4 Feb 2008 12:21:05 -0600
Subject: assembly
In-Reply-To: <8aa042e10802040934p7308142fjff32c882308eceda@mail.gmail.com>
References: <8aa042e10802020132j5814522dh46303aba83cdc4d1@mail.gmail.com> <200802021930.49084.geenen@gmail.com> <37FE47D2-1A4A-4980-AD8D-EEF888109C93@mcs.anl.gov> <200802031344.56290.geenen@gmail.com> <964E08D4-3BBC-4BF8-8113-292D3302C745@mcs.anl.gov> <8aa042e10802040841n5429ee0bvb67f5f750f57472a@mail.gmail.com> <a9f269830802040847k30b98983r2d92ccd3b8e6bb0a@mail.gmail.com> <8aa042e10802040934p7308142fjff32c882308eceda@mail.gmail.com>
Message-ID: <636792AA-B9FF-4788-82C5-7A3E008124BA@mcs.anl.gov>

MatAssemblyEnd         1 1.0 1.4709e+02


In MatAssemblyEnd all the messages with off-process values are  
received and then
MatSetValues is called with them onto the local matrices.

The only way this can be taking this huge amount of time is if the  
preallocation was
not done correctly. Can you please run with -info and send all the  
output

    Barry


On Feb 4, 2008, at 11:34 AM, Thomas Geenen wrote:

> hi matt,
>
> this is indeed much clearer
> i put the push and pop around MatAssemblyBegin/End
>
> 1:  First_assembly: 1.4724e+02  90.8%  0.0000e+00   0.0%  7.000e+01
> 0.9%  2.266e+03       15.5%  9.000e+00   0.2%
> 2: Second_assembly: 2.3823e-01   0.1%  0.0000e+00   0.0%  7.000e+01
> 0.9%  1.276e+02        0.9%  9.000e+00   0.2%
> 3:  Third_assembly: 5.0168e-01   0.3%  0.0000e+00   0.0%  4.200e+01
> 0.5%  2.237e+03       15.4%  3.000e+00   0.1%
>
> The second assembly is another system of equations (pressure
> correction in simpler)
> so 1 and 3 are 1 and 2 ......
>
> cheers
> Thomas
>
>
>
> ---------------------------------------------- PETSc Performance
> Summary: ----------------------------------------------
>
> Unknown Name on a linux-gnu named etna.geo.uu.nl with 6 processors, by
> geenen Mon Feb  4 18:27:48 2008
> Using Petsc Release Version 2.3.3, Patch 3, Fri Jun 15 16:51:25 CDT
> 2007 HG revision: f051789beadcd36f77fb6111d20225e26ed7cc0d
>
>                         Max       Max/Min        Avg      Total
> Time (sec):           1.621e+02      1.00000   1.621e+02
> Objects:              2.600e+02      1.00000   2.600e+02
> Flops:                1.265e+09      1.17394   1.161e+09  6.969e+09
> Flops/sec:            7.806e+06      1.17393   7.166e+06  4.300e+07
> MPI Messages:         1.436e+03      1.20816   1.326e+03  7.956e+03
> MPI Message Lengths:  2.120e+07      1.23141   1.457e+04  1.159e+08
> MPI Reductions:       9.862e+02      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                            e.g., VecAXPY() for real vectors of length
> N --> 2N flops
>                            and VecAXPY() for complex vectors of
> length N --> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  ---
> Messages ---  -- Message Lengths --  -- Reductions --
>                        Avg     %Total     Avg     %Total   counts
> %Total     Avg         %Total   counts   %Total
> 0:      Main Stage: 1.3160e+01   8.1%  6.9689e+09 100.0%  7.762e+03
> 97.6%  9.941e+03       68.2%  2.020e+02   3.4%
> 1:  First_assembly: 1.4724e+02  90.8%  0.0000e+00   0.0%  7.000e+01
> 0.9%  2.266e+03       15.5%  9.000e+00   0.2%
> 2: Second_assembly: 2.3823e-01   0.1%  0.0000e+00   0.0%  7.000e+01
> 0.9%  1.276e+02        0.9%  9.000e+00   0.2%
> 3:  Third_assembly: 5.0168e-01   0.3%  0.0000e+00   0.0%  4.200e+01
> 0.5%  2.237e+03       15.4%  3.000e+00   0.1%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on
> interpreting output.
> Phase summary info:
>   Count: number of times phase was executed
>   Time and Flops/sec: Max - maximum over all processors
>                       Ratio - ratio of maximum to minimum over all  
> processors
>   Mess: number of messages sent
>   Avg. len: average message length
>   Reduct: number of global reductions
>   Global: entire computation
>   Stage: stages of a computation. Set stages with PetscLogStagePush()
> and PetscLogStagePop().
>      %T - percent time in this phase         %F - percent flops in  
> this phase
>      %M - percent messages in this phase     %L - percent message
> lengths in this phase
>      %R - percent reductions in this phase
>   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time
> over all processors)
> ------------------------------------------------------------------------------------------------------------------------
>
>
>      ##########################################################
>      #                                                        #
>      #                          WARNING!!!                    #
>      #                                                        #
>      #   This code was run without the PreLoadBegin()         #
>      #   macros. To get timing results we always recommend    #
>      #   preloading. otherwise timing numbers may be          #
>      #   meaningless.                                         #
>      ##########################################################
>
>
> Event                Count      Time (sec)     Flops/sec
>          --- Global ---  --- Stage ---   Total
>                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
> len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> MatMult              135 1.0 1.5541e+00 1.3 3.03e+08 1.8 2.4e+03
> 1.3e+04 0.0e+00  1 26 30 26  0  11 26 30 38  0  1155
> MatMultAdd            40 1.0 3.2611e-01 8.3 3.64e+07 7.7 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0    31
> MatSolve              44 1.0 6.7682e-01 1.7 1.94e+08 1.7 0.0e+00
> 0.0e+00 0.0e+00  0  7  0  0  0   4  7  0  0  0   673
> MatRelax              80 1.0 3.4453e+00 1.4 9.46e+07 1.1 2.2e+03
> 1.3e+04 0.0e+00  2 23 28 26  0  22 23 29 38  0   466
> MatLUFactorSym         1 1.0 6.7567e-02 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatLUFactorNum         2 1.0 2.8804e+00 1.4 2.53e+08 1.4 0.0e+00
> 0.0e+00 0.0e+00  1 44  0  0  0  17 44  0  0  0  1058
> MatILUFactorSym        1 1.0 6.7676e-01 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 1.0e+00  0  0  0  0  0   5  0  0  0  0     0
> MatAssemblyBegin       4 1.0 2.7711e-0237.3 0.00e+00 0.0 0.0e+00
> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
> MatAssemblyEnd         4 1.0 2.4401e-02 1.2 0.00e+00 0.0 2.8e+01
> 4.7e+02 7.0e+00  0  0  0  0  0   0  0  0  0  3     0
> MatGetRowIJ            2 1.0 1.2948e-02 7.5 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetSubMatrice       1 1.0 2.8603e-02 1.1 0.00e+00 0.0 0.0e+00
> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0
> MatGetOrdering         2 1.0 2.3054e-02 2.4 0.00e+00 0.0 0.0e+00
> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
> MatIncreaseOvrlp       1 1.0 8.1528e-02 1.0 0.00e+00 0.0 1.1e+03
> 2.4e+03 2.0e+01  0  0 13  2  0   1  0 14  3 10     0
> MatZeroEntries         3 1.0 3.4422e-02 1.8 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MAT_GetRedundantMatrix       1 1.0 3.5774e-02 1.6 0.00e+00 0.0 9.0e+01
> 8.0e+04 2.0e+00  0  0  1  6  0   0  0  1  9  1     0
> VecDot                39 1.0 5.8092e-0131.0 1.02e+0842.3 0.0e+00
> 0.0e+00 3.9e+01  0  0  0  0  1   2  0  0  0 19    17
> VecMDot                8 1.0 3.4735e-03 2.9 4.52e+07 5.5 0.0e+00
> 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  4    69
> VecNorm               31 1.0 3.8690e-02 4.1 1.11e+08 5.6 0.0e+00
> 0.0e+00 3.1e+01  0  0  0  0  1   0  0  0  0 15   139
> VecScale              85 1.0 1.7631e-03 1.2 5.59e+08 1.9 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2150
> VecCopy                4 1.0 5.5027e-04 1.4 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet               139 1.0 3.0956e-03 1.4 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAXPY               98 1.0 5.0848e-03 1.3 9.35e+08 1.1 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4765
> VecAYPX               40 1.0 1.0264e-02 1.4 2.01e+08 1.1 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   973
> VecWAXPY              75 1.0 2.6191e-02 1.4 1.22e+08 1.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   615
> VecMAXPY               9 1.0 2.1935e-04 1.7 2.93e+08 1.1 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1317
> VecAssemblyBegin       4 1.0 1.9331e-03 1.2 0.00e+00 0.0 0.0e+00
> 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  6     0
> VecAssemblyEnd         4 1.0 2.3842e-05 1.4 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecScatterBegin      267 1.0 1.0370e-01 1.5 0.00e+00 0.0 6.3e+03
> 1.1e+04 0.0e+00  0  0 79 59  0   1  0 81 86  0     0
> VecScatterEnd        267 1.0 4.1189e-01 3.1 0.00e+00 0.0 0.0e+00
> 0.0e+00 0.0e+00  0  0  0  0  0   2  0  0  0  0     0
> KSPGMRESOrthog         4 1.0 4.5178e-03 2.0 5.22e+07 3.8 0.0e+00
> 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  6   117
> KSPSetup               6 1.0 7.9882e-03 1.0 0.00e+00 0.0 0.0e+00
> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
> KSPSolve               2 1.0 6.6590e+00 1.0 1.36e+08 1.2 5.9e+03
> 1.1e+04 8.1e+01  4 70 75 54  1  51 70 76 80 40   731
> PCSetUp                3 1.0 5.0877e+00 1.2 1.29e+08 1.2 1.6e+03
> 9.1e+03 7.3e+01  3 47 20 13  1  34 47 21 19 36   642
> PCSetUpOnBlocks        1 1.0 1.3292e+00 1.0 1.46e+08 1.0 0.0e+00
> 0.0e+00 3.0e+00  1 17  0  0  0  10 17  0  0  1   873
> PCApply               44 1.0 4.7444e+00 1.1 1.16e+08 1.2 4.7e+03
> 1.0e+04 0.0e+00  3 41 59 42  0  34 41 60 61  0   607
>
> --- Event Stage 1: First_assembly
>
> MatAssemblyBegin       1 1.0 3.0375e-01 3.5 0.00e+00 0.0 4.2e+01
> 4.2e+05 2.0e+00  0  0  1 15  0   0  0 60 99 22     0
> MatAssemblyEnd         1 1.0 1.4709e+02 1.0 0.00e+00 0.0 2.8e+01
> 8.2e+03 7.0e+00 91  0  0  0  0 100  0 40  1 78     0
>
> --- Event Stage 2: Second_assembly
>
> MatAssemblyBegin       1 1.0 1.7451e-02 5.0 0.00e+00 0.0 4.2e+01
> 2.4e+04 2.0e+00  0  0  1  1  0   5  0 60 98 22     0
> MatAssemblyEnd         1 1.0 2.3056e-01 1.0 0.00e+00 0.0 2.8e+01
> 8.4e+02 7.0e+00  0  0  0  0  0  95  0 40  2 78     0
>
> --- Event Stage 3: Third_assembly
>
> MatAssemblyBegin       1 1.0 3.3676e-01 3.8 0.00e+00 0.0 4.2e+01
> 4.2e+05 2.0e+00  0  0  1 15  0  45  0100100 67     0
> MatAssemblyEnd         1 1.0 3.3125e-01 1.7 0.00e+00 0.0 0.0e+00
> 0.0e+00 1.0e+00  0  0  0  0  0  55  0  0  0 33     0
>
> --- Event Stage 4: Unknown
>
>
> --- Event Stage 5: Unknown
>
>
> --- Event Stage 6: Unknown
>
>
> --- Event Stage 7: Unknown
>
>
> --- Event Stage 8: Unknown
>
>
> --- Event Stage 9: Unknown
>
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions   Memory  Descendants'  
> Mem.
>
> --- Event Stage 0: Main Stage
>
>              Matrix    16              0          0     0
>           Index Set    32             25     207888     0
>                 Vec   172             91   16374224     0
>         Vec Scatter     8              0          0     0
>   IS L to G Mapping     4              0          0     0
>       Krylov Solver     6              0          0     0
>      Preconditioner     6              0          0     0
>              Viewer     4              2          0     0
>           Container     2              0          0     0
>
> --- Event Stage 1: First_assembly
>
>           Index Set     2              2      44140     0
>                 Vec     2              1     196776     0
>         Vec Scatter     1              0          0     0
>
> --- Event Stage 2: Second_assembly
>
>           Index Set     2              2       4732     0
>                 Vec     2              1      11464     0
>         Vec Scatter     1              0          0     0
>
> --- Event Stage 3: Third_assembly
>
>
> --- Event Stage 4: Unknown
>
>
> --- Event Stage 5: Unknown
>
>
> --- Event Stage 6: Unknown
>
>
> --- Event Stage 7: Unknown
>
>
> --- Event Stage 8: Unknown
>
>
> --- Event Stage 9: Unknown
>
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> = 
> ======================================================================
> Average time to get PetscTime(): 8.82149e-07
> Average time for MPI_Barrier(): 0.000153208
> Average time for zero size MPI_Send(): 1.86761e-05
> OptionTable: -mg_levels_ksp_type richardson
> OptionTable: -mg_levels_pc_sor_omega 1.05
> OptionTable: -mg_levels_pc_type sor
> OptionTable: -pc_ml_PrintLevel 4
> OptionTable: -pc_ml_maxNlevels 2
> OptionTable: -pc_type ml
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8
> Configure run at: Fri Sep 28 23:34:20 2007
>
> On Feb 4, 2008 5:47 PM, Matthew Knepley <knepley at gmail.com> wrote:
>> On Feb 4, 2008 10:41 AM, Thomas Geenen <geenen at gmail.com> wrote:
>>> On Feb 3, 2008 8:51 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>>>
>>>>    Hmmm, are you saying the first round of setting values still
>>>> takes much longer then the second round?
>>>
>>> yes
>>>
>>>> Or is it the time
>>>> in MatAssemblyBegin() much longer the first time?
>>>>
>>>>   The MatAssembly process has one piece of code that's
>>>> work is order n*size; where n is the stash size and size is the
>>>> number of processes, all other work is only order n.
>>>>
>>>>    Could you send the -log_summary output?
>>>
>>> the timing is cumulative i guess?
>>> in between these two solves i solve a smaller system for which i do
>>> not include the timing.
>>
>> I ma having a little trouble reading this. I think the easiest  
>> thing to do
>> is wrap the two section of code in their own sections:
>>
>> PetscLogStageRegister(&stage1, "First assembly");
>> PetscLogStageRegister(&stage2, "Second assembly");
>>
>> PetscLogStagePush(stage1);
>> <code for first assembly>
>> PetscLogStagePop();
>>
>> PetscLogStagePush(stage2);
>> <code for second assembly>
>> PetscLogStagePop();
>>
>> Then we can also get a look at how many messages are sent
>> and how big they are.
>>
>>  Thanks,
>>
>>     Matt
>>
>>
>>> run 1
>>>                         Max       Max/Min        Avg      Total
>>> Time (sec):           2.154e+02      1.00001   2.154e+02
>>> Objects:              2.200e+01      1.00000   2.200e+01
>>> Flops:                0.000e+00      0.00000   0.000e+00  0.000e+00
>>> Flops/sec:            0.000e+00      0.00000   0.000e+00  0.000e+00
>>> MPI Messages:         1.750e+01      1.25000   1.633e+01  9.800e+01
>>> MPI Message Lengths:  3.460e+06      1.29903   1.855e+05  1.818e+07
>>> MPI Reductions:       4.167e+00      1.00000
>>>
>>> Flop counting convention: 1 flop = 1 real number operation of type
>>> (multiply/divide/add/subtract)
>>>                            e.g., VecAXPY() for real vectors of  
>>> length
>>> N --> 2N flops
>>>                            and VecAXPY() for complex vectors of
>>> length N --> 8N flops
>>>
>>> Summary of Stages:   ----- Time ------  ----- Flops -----  ---
>>> Messages ---  -- Message Lengths --  -- Reductions --
>>>                        Avg     %Total     Avg     %Total   counts
>>> %Total     Avg         %Total   counts   %Total
>>> 0:      Main Stage: 2.1537e+02 100.0%  0.0000e+00   0.0%  9.800e+01
>>> 100.0%  1.855e+05      100.0%  2.500e+01 100.0%
>>>
>>> ------------------------------------------------------------------------------------------------------------------------
>>> See the 'Profiling' chapter of the users' manual for details on
>>> interpreting output.
>>> Phase summary info:
>>>   Count: number of times phase was executed
>>>   Time and Flops/sec: Max - maximum over all processors
>>>                       Ratio - ratio of maximum to minimum over all  
>>> processors
>>>   Mess: number of messages sent
>>>   Avg. len: average message length
>>>   Reduct: number of global reductions
>>>   Global: entire computation
>>>   Stage: stages of a computation. Set stages with  
>>> PetscLogStagePush()
>>> and PetscLogStagePop().
>>>      %T - percent time in this phase         %F - percent flops in  
>>> this phase
>>>      %M - percent messages in this phase     %L - percent message
>>> lengths in this phase
>>>      %R - percent reductions in this phase
>>>   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max  
>>> time
>>> over all processors)
>>> ------------------------------------------------------------------------------------------------------------------------
>>>
>>>
>>>      ##########################################################
>>>      #                                                        #
>>>      #                          WARNING!!!                    #
>>>      #                                                        #
>>>      #   This code was run without the PreLoadBegin()         #
>>>      #   macros. To get timing results we always recommend    #
>>>      #   preloading. otherwise timing numbers may be          #
>>>      #   meaningless.                                         #
>>>      ##########################################################
>>>
>>>
>>> Event                Count      Time (sec)     Flops/sec
>>>          --- Global ---  --- Stage ---   Total
>>>                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
>>> len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
>>> ------------------------------------------------------------------------------------------------------------------------
>>>
>>> --- Event Stage 0: Main Stage
>>>
>>> MatAssemblyBegin       1 1.0 2.9536e-01 4.0 0.00e+00 0.0 4.2e+01
>>> 4.2e+05 2.0e+00  0  0 43 98  8   0  0 43 98  8     0
>>> MatAssemblyEnd         1 1.0 2.1410e+02 1.0 0.00e+00 0.0 2.8e+01
>>> 8.2e+03 7.0e+00 99  0 29  1 28  99  0 29  1 28     0
>>> MatZeroEntries         1 1.0 9.3739e-02 5.9 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> VecSet                 4 1.0 3.9721e-04 2.2 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> ------------------------------------------------------------------------------------------------------------------------
>>>
>>> Memory usage is given in bytes:
>>>
>>> Object Type          Creations   Destructions   Memory   
>>> Descendants' Mem.
>>>
>>> --- Event Stage 0: Main Stage
>>>
>>>              Matrix     3              0          0     0
>>>           Index Set     6              6      45500     0
>>>                 Vec     6              1     196776     0
>>>         Vec Scatter     3              0          0     0
>>>   IS L to G Mapping     2              0          0     0
>>>       Krylov Solver     1              0          0     0
>>>      Preconditioner     1              0          0     0
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> ====================================================================
>>> Average time to get PetscTime(): 1.71661e-06
>>> Average time for MPI_Barrier(): 0.000159979
>>> Average time for zero size MPI_Send(): 1.29938e-05
>>> Compiled without FORTRAN kernels
>>> Compiled with full precision matrices (default)
>>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
>>> sizeof(PetscScalar) 8
>>> Configure run at: Fri Sep 28 23:34:20 2007
>>>
>>> run2
>>>                         Max       Max/Min        Avg      Total
>>> Time (sec):           2.298e+02      1.00000   2.298e+02
>>> Objects:              2.600e+02      1.00000   2.600e+02
>>> Flops:                1.265e+09      1.17394   1.161e+09  6.969e+09
>>> Flops/sec:            5.505e+06      1.17394   5.054e+06  3.032e+07
>>> MPI Messages:         1.436e+03      1.20816   1.326e+03  7.956e+03
>>> MPI Message Lengths:  2.120e+07      1.23141   1.457e+04  1.159e+08
>>> MPI Reductions:       4.192e+02      1.00000
>>>
>>> Flop counting convention: 1 flop = 1 real number operation of type
>>> (multiply/divide/add/subtract)
>>>                            e.g., VecAXPY() for real vectors of  
>>> length
>>> N --> 2N flops
>>>                            and VecAXPY() for complex vectors of
>>> length N --> 8N flops
>>>
>>> Summary of Stages:   ----- Time ------  ----- Flops -----  ---
>>> Messages ---  -- Message Lengths --  -- Reductions --
>>>                        Avg     %Total     Avg     %Total   counts
>>> %Total     Avg         %Total   counts   %Total
>>> 0:      Main Stage: 2.2943e+02  99.8%  6.9689e+09 100.0%  7.944e+03
>>> 99.8%  1.457e+04      100.0%  2.230e+02   8.9%
>>>
>>> ------------------------------------------------------------------------------------------------------------------------
>>> See the 'Profiling' chapter of the users' manual for details on
>>> interpreting output.
>>> Phase summary info:
>>>   Count: number of times phase was executed
>>>   Time and Flops/sec: Max - maximum over all processors
>>>                       Ratio - ratio of maximum to minimum over all  
>>> processors
>>>   Mess: number of messages sent
>>>   Avg. len: average message length
>>>   Reduct: number of global reductions
>>>   Global: entire computation
>>>   Stage: stages of a computation. Set stages with  
>>> PetscLogStagePush()
>>> and PetscLogStagePop().
>>>      %T - percent time in this phase         %F - percent flops in  
>>> this phase
>>>      %M - percent messages in this phase     %L - percent message
>>> lengths in this phase
>>>      %R - percent reductions in this phase
>>>   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max  
>>> time
>>> over all processors)
>>> ------------------------------------------------------------------------------------------------------------------------
>>>
>>>
>>>      ##########################################################
>>>      #                                                        #
>>>      #                          WARNING!!!                    #
>>>      #                                                        #
>>>      #   This code was run without the PreLoadBegin()         #
>>>      #   macros. To get timing results we always recommend    #
>>>      #   preloading. otherwise timing numbers may be          #
>>>      #   meaningless.                                         #
>>>      ##########################################################
>>>
>>>
>>> Event                Count      Time (sec)     Flops/sec
>>>          --- Global ---  --- Stage ---   Total
>>>                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
>>> len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
>>> ------------------------------------------------------------------------------------------------------------------------
>>>
>>> --- Event Stage 0: Main Stage
>>>
>>> MatMult              135 1.0 2.0830e+00 1.4 2.30e+08 1.9 2.4e+03
>>> 1.3e+04 0.0e+00  1 26 30 26  0   1 26 30 26  0   862
>>> MatMultAdd            40 1.0 3.2598e-01 4.5 2.68e+07 5.7 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    31
>>> MatSolve              44 1.0 6.7841e-01 1.7 1.93e+08 1.7 0.0e+00
>>> 0.0e+00 0.0e+00  0  7  0  0  0   0  7  0  0  0   672
>>> MatRelax              80 1.0 4.2949e+00 1.6 8.77e+07 1.2 2.2e+03
>>> 1.3e+04 0.0e+00  1 23 28 26  0   1 23 28 26  0   374
>>> MatLUFactorSym         1 1.0 7.6739e-02 1.1 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> MatLUFactorNum         2 1.0 2.9370e+00 1.5 2.53e+08 1.5 0.0e+00
>>> 0.0e+00 0.0e+00  1 44  0  0  0   1 44  0  0  0  1037
>>> MatILUFactorSym        1 1.0 6.7334e-01 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> MatAssemblyBegin       7 1.0 7.1652e-01 4.4 0.00e+00 0.0 1.3e+02
>>> 2.9e+05 8.0e+00  0  0  2 32  0   0  0  2 32  4     0
>>> MatAssemblyEnd         7 1.0 2.1473e+02 1.0 0.00e+00 0.0 8.4e+01
>>> 3.2e+03 2.2e+01 93  0  1  0  1  94  0  1  0 10     0
>>> MatGetRowIJ            2 1.0 1.8899e-03 1.1 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> MatGetSubMatrice       1 1.0 3.1915e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  1     0
>>> MatGetOrdering         2 1.0 1.2184e-02 1.6 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
>>> MatIncreaseOvrlp       1 1.0 7.6865e-02 1.0 0.00e+00 0.0 1.1e+03
>>> 2.4e+03 2.0e+01  0  0 13  2  1   0  0 13  2  9     0
>>> MatZeroEntries         3 1.0 1.0429e-01 4.7 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> MAT_GetRedundantMatrix       1 1.0 3.0144e-02 1.2 0.00e+00 0.0 9.0e 
>>> +01
>>> 8.0e+04 2.0e+00  0  0  1  6  0   0  0  1  6  1     0
>>> VecDot                39 1.0 9.0680e-01140.0 2.95e+08191.3 0.0e+00
>>> 0.0e+00 3.9e+01  0  0  0  0  2   0  0  0  0 17    11
>>> VecMDot                8 1.0 2.8777e-03 2.8 4.16e+07 3.3 0.0e+00
>>> 0.0e+00 8.0e+00  0  0  0  0  0   0  0  0  0  4    84
>>> VecNorm               31 1.0 6.2301e-02 4.2 7.17e+07 5.8 0.0e+00
>>> 0.0e+00 3.1e+01  0  0  0  0  1   0  0  0  0 14    86
>>> VecScale              85 1.0 1.7729e-03 1.4 4.94e+08 1.4 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2138
>>> VecCopy                4 1.0 6.2108e-04 1.5 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> VecSet               139 1.0 3.5934e-03 1.4 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> VecAXPY               98 1.0 5.1496e-03 1.3 9.24e+08 1.1 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4705
>>> VecAYPX               40 1.0 1.0311e-02 1.4 1.90e+08 1.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   969
>>> VecWAXPY              75 1.0 2.6060e-02 1.4 1.22e+08 1.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   618
>>> VecMAXPY               9 1.0 2.1315e-04 1.6 3.01e+08 1.2 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  1355
>>> VecAssemblyBegin       4 1.0 1.9898e-03 1.2 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  5     0
>>> VecAssemblyEnd         4 1.0 2.1219e-05 1.2 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> VecScatterBegin      267 1.0 1.0224e-01 1.6 0.00e+00 0.0 6.3e+03
>>> 1.1e+04 0.0e+00  0  0 79 59  0   0  0 79 59  0     0
>>> VecScatterEnd        267 1.0 7.8653e-0111.3 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> KSPGMRESOrthog         4 1.0 3.7677e-03 1.9 4.78e+07 2.4 0.0e+00
>>> 0.0e+00 1.2e+01  0  0  0  0  0   0  0  0  0  5   141
>>> KSPSetup               6 1.0 1.0260e-02 1.0 0.00e+00 0.0 0.0e+00
>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  2     0
>>> KSPSolve               2 1.0 7.8238e+00 1.0 1.16e+08 1.2 5.9e+03
>>> 1.1e+04 8.1e+01  3 70 75 54  3   3 70 75 54 36   622
>>> PCSetUp                3 1.0 5.1323e+00 1.2 1.29e+08 1.2 1.6e+03
>>> 9.1e+03 7.3e+01  2 47 20 13  3   2 47 20 13 33   636
>>> PCSetUpOnBlocks        1 1.0 1.3325e+00 1.0 1.47e+08 1.0 0.0e+00
>>> 0.0e+00 3.0e+00  1 17  0  0  0   1 17  0  0  1   871
>>> PCApply               44 1.0 5.7917e+00 1.1 9.60e+07 1.2 4.7e+03
>>> 1.0e+04 0.0e+00  2 41 59 42  0   2 41 59 42  0   497
>>> ------------------------------------------------------------------------------------------------------------------------
>>>
>>> Memory usage is given in bytes:
>>>
>>> Object Type          Creations   Destructions   Memory   
>>> Descendants' Mem.
>>>
>>> --- Event Stage 0: Main Stage
>>>
>>>              Matrix    16              0          0     0
>>>           Index Set    36             29     256760     0
>>>                 Vec   176             93   16582464     0
>>>         Vec Scatter    10              0          0     0
>>>   IS L to G Mapping     4              0          0     0
>>>       Krylov Solver     6              0          0     0
>>>      Preconditioner     6              0          0     0
>>>              Viewer     4              2          0     0
>>>           Container     2              0          0     0
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> = 
>>> ====================================================================
>>> Average time to get PetscTime(): 8.10623e-07
>>> Average time for MPI_Barrier(): 0.000178194
>>> Average time for zero size MPI_Send(): 1.33117e-05
>>> OptionTable: -mg_levels_ksp_type richardson
>>> OptionTable: -mg_levels_pc_sor_omega 1.05
>>> OptionTable: -mg_levels_pc_type sor
>>> OptionTable: -pc_ml_PrintLevel 4
>>> OptionTable: -pc_ml_maxNlevels 2
>>> OptionTable: -pc_type ml
>>> Compiled without FORTRAN kernels
>>> Compiled with full precision matrices (default)
>>> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
>>> sizeof(PetscScalar) 8
>>> Configure run at: Fri Sep 28 23:34:20 2007
>>>
>>>
>>>>
>>>>    Barry
>>>>
>>>>
>>>>    The a
>>>>
>>>> On Feb 3, 2008, at 6:44 AM, Thomas Geenen wrote:
>>>>
>>>>> i call
>>>>> ierr = MatStashSetInitialSize(A[*seqsolve],stash_size,
>>>>> stash_size);CHKERRQ(ierr);
>>>>> with 100 000 000 for the stash size to make sure that's not the
>>>>> bottleneck
>>>>>
>>>>> the assemble time remains unchanged however.
>>>>>
>>>>> nstash in MatAssemblyBegin_MPIAIJ (CPU=0) = 109485
>>>>> reallocs in MatAssemblyBegin_MPIAIJ =  0
>>>>>
>>>>> cheers
>>>>> Thomas
>>>>>
>>>>> On Saturday 02 February 2008 23:19, Barry Smith wrote:
>>>>>>   The matstash has a concept of preallocation also. During the  
>>>>>> first
>>>>>> setvalues
>>>>>> it is allocating more and more memory for the stash. In the  
>>>>>> second
>>>>>> setvalues
>>>>>> the stash is large enough so does not require any addition
>>>>>> allocation.
>>>>>>
>>>>>>   You can use the option -matstash_initial_size <size> to  
>>>>>> allocate
>>>>>> enough space
>>>>>> initially so that the first setvalues is also fast. It does not  
>>>>>> look
>>>>>> like there is a way
>>>>>> coded to get the <size> that you should use. It should be set  
>>>>>> to the
>>>>>> maximum nonzeros
>>>>>> any process has that belongs to other processes. The stash  
>>>>>> handling
>>>>>> code is
>>>>>> in src/mat/utils/matstash.c, perhaps you can figure out how to
>>>>>> printout with PetscInfo()
>>>>>> the sizes needed?
>>>>>>
>>>>>>
>>>>>>   Barry
>>>>>>
>>>>>> On Feb 2, 2008, at 12:30 PM, Thomas Geenen wrote:
>>>>>>> On Saturday 02 February 2008 18:33, Hong Zhang wrote:
>>>>>>>> On Sat, 2 Feb 2008, Thomas Geenen wrote:
>>>>>>>>> Dear Petsc users,
>>>>>>>>>
>>>>>>>>> I would like to understand what is slowing down the assembly  
>>>>>>>>> phase
>>>>>>>>> of my
>>>>>>>>> matrix. I create a matrix with MatCreateMPIAIJ i make a rough
>>>>>>>>> guess of
>>>>>>>>> the number of off diagonal entries and then use a conservative
>>>>>>>>> value to
>>>>>>>>> make sure I do not need extra mallocs. (the number of diagonal
>>>>>>>>> entries is
>>>>>>>>> exact)
>>>>>>>>> next i call MatSetValues and MatAssemblyBegin, MatAssemblyEnd.
>>>>>>>>> The first time i call MatSetValues and MatAssemblyBegin,
>>>>>>>>> MatAssemblyEnd it takes about 170 seconds
>>>>>>>>> the second time 0.3 seconds.
>>>>>>>>> I run it on 6 cpu's and I do fill quit a number of row- 
>>>>>>>>> entries on
>>>>>>>>> the
>>>>>>>>> "wrong" cpu. However thats also the case the second run. I  
>>>>>>>>> checked
>>>>>>>>> that there are no additional mallocs
>>>>>>>>> MatGetInfo info.mallocs=0 both after MatSetValues and after
>>>>>>>>> MatAssemblyBegin, MatAssemblyEnd.
>>>>>>>>
>>>>>>>> Run your code with the option '-log_summary' and check which
>>>>>>>> function
>>>>>>>> call dominates the execution time.
>>>>>>>
>>>>>>> the time is spend in MatStashScatterGetMesg_Private
>>>>>>>
>>>>>>>>> I run it on 6 cpu's and I do fill quit a number of row- 
>>>>>>>>> entries on
>>>>>>>>> the
>>>>>>>>> "wrong" cpu.
>>>>>>>>
>>>>>>>> Likely, the communication that sending the entries to the
>>>>>>>> corrected cpu consume the time. Can you fill the entries in the
>>>>>>>> correct cpu?
>>>>>>>
>>>>>>> the second time the entries are filled on the wrong CPU as well.
>>>>>>> i am curious about the difference in time between run 1 and 2.
>>>>>>>
>>>>>>>> Hong
>>>>>>>>
>>>>>>>>> cheers
>>>>>>>>> Thomas
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
>>
>>
>


From knepley at gmail.com  Mon Feb  4 12:25:51 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 4 Feb 2008 12:25:51 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com>
References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com>
	 <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov>
	 <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com>
Message-ID: <a9f269830802041025s62aace17x7042a526262b08ba@mail.gmail.com>

On Feb 4, 2008 12:20 PM, Yujie <recrusader at gmail.com> wrote:
> what is the difference between sequantial and parallel AIJ matrix? Assuming
> there is a matrix A, if I partitaion this matrix into A1, A2, Ai... An.
> A is a parallel AIJ matrix at the whole view, Ai is a sequential AIJ matrix?

We mean parallel CSR format, which is described in the book Iterative Methods
for Sparse Linear Systems by Yousef Saad as well as in the PETSc manual.

> I want to operate Ai at each node.
>  In addition, whether is it possible to get general inverse using
> MatMatSolve() if the matrix is not square? Thanks a lot.

Rectangular matrices do not have inverses in that sense. You may want some
sort of pseudo-inverse, but it must be motivated by the problem you are solving.

   Matt

> Regards,
> Yujie
>
>
> On 2/4/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >     For sequential AIJ matrices you can fill the B matrix with the
> > identity and then use
> > MatMatSolve().
> >
> >     Note since the inverse of a sparse matrix is dense the B matrix is
> > a SeqDense matrix.
> >
> >     Barry
> >
> > On Feb 4, 2008, at 12:37 AM, Yujie wrote:
> >
> > > Hi,
> > > Now, I want to inverse a sparse matrix. I have browsed the manual,
> > > however, I can't find some information. could you give me some advice?
> > >
> > > thanks a lot.
> > >
> > > Regards,
> > > Yujie
> > >
> >
> >
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From bsmith at mcs.anl.gov  Mon Feb  4 12:26:12 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 4 Feb 2008 12:26:12 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com>
References: <7ff0ee010802032237y2ad6d05cp362ef5b9279b3a95@mail.gmail.com> <5EA032CF-2E1B-4692-8B46-B787B7778646@mcs.anl.gov> <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com>
Message-ID: <30E53953-81B2-4EF0-B02D-D9810EB8FA0B@mcs.anl.gov>


On Feb 4, 2008, at 12:20 PM, Yujie wrote:

> what is the difference between sequantial and parallel AIJ matrix?  
> Assuming there is a matrix A, if I partitaion this matrix into A1,  
> A2, Ai... An.
> A is a parallel AIJ matrix at the whole view, Ai is a sequential AIJ  
> matrix?

    It is not that simple. Ai is split into two parts 1) the "block  
diagonal" part and 2) the "off diagonal part" ; this is explained
in the manual page for MatCreateMPIAIJ(). If you want to do operations  
on the pieces you will need to understand the
code in src/mat/impls/aij/mpi.

    What do you want to do with Ai?

> I want to operate Ai at each node.
> In addition, whether is it possible to get general inverse using  
> MatMatSolve() if the matrix is not square?

  No, that requires much more complicated linear algebra technology  
then is in PETSc.

    Barry

> Thanks a lot.
>
> Regards,
> Yujie
>
>
> On 2/4/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>     For sequential AIJ matrices you can fill the B matrix with the
> identity and then use
> MatMatSolve().
>
>     Note since the inverse of a sparse matrix is dense the B matrix is
> a SeqDense matrix.
>
>     Barry
>
> On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>
> > Hi,
> > Now, I want to inverse a sparse matrix. I have browsed the manual,
> > however, I can't find some information. could you give me some  
> advice?
> >
> > thanks a lot.
> >
> > Regards,
> > Yujie
> >
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080204/e4a70a64/attachment.htm>

From w_subber at yahoo.com  Mon Feb  4 19:58:31 2008
From: w_subber at yahoo.com (Waad Subber)
Date: Mon, 4 Feb 2008 17:58:31 -0800 (PST)
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <7ff0ee010802041020n49db272aq878c038c4d531072@mail.gmail.com>
Message-ID: <602426.95557.qm@web38210.mail.mud.yahoo.com>

Hi
There was a discussion between Tim Stitt and petsc developers about matrix inversion, and it was really helpful. That was in last Nov. You can check the emails archive

http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html

Waad

Yujie <recrusader at gmail.com> wrote: what is the difference between sequantial and parallel AIJ matrix? Assuming there is a matrix A, if I partitaion this matrix into A1, A2, Ai... An.
A is a parallel AIJ matrix at the whole view, Ai is a sequential AIJ matrix? I want to operate Ai at each node. 
 In addition, whether is it possible to get  general inverse using MatMatSolve() if the matrix is not square? Thanks a lot.
 
Regards,
Yujie
 

On 2/4/08, Barry Smith <bsmith at mcs.anl.gov> wrote: 
    For sequential AIJ matrices you can fill the B matrix with the
identity and then use
MatMatSolve().

    Note since the inverse of a sparse matrix is dense the B matrix is
a SeqDense matrix.

     Barry

On Feb 4, 2008, at 12:37 AM, Yujie wrote:

> Hi,
> Now, I want to inverse a sparse matrix. I have browsed the manual,
> however, I can't find some information. could you give me some advice?
 >
> thanks a lot.
>
> Regards,
> Yujie
>


---------------------------------
Looking for last minute shopping deals?  Find them fast with Yahoo! Search.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080204/7d617a9b/attachment.htm>

From jiaxun_hou at yahoo.com.cn  Mon Feb  4 20:34:16 2008
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Tue, 5 Feb 2008 10:34:16 +0800 (CST)
Subject: Compare the accuracy of PETSc's GMRES algorithm  with Matlab's
Message-ID: <82303.99306.qm@web15812.mail.cnb.yahoo.com>

Hello everyone,

I want to solve a linear system with no preconditioned GMRES in PETSc, but the result is divergent. White I solve the exactly same system under Matlab, I get the convergent result. Which result can I trust?

And I print the residuals of them as the attachments. Thanks!

Best regards,
Jiaxun

       
---------------------------------
???????????????????????????????????? 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080205/7507b0fd/attachment.htm>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: matlab.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080205/7507b0fd/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: petsc.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080205/7507b0fd/attachment-0001.txt>

From erlend.pedersen at holberger.com  Tue Feb  5 03:26:56 2008
From: erlend.pedersen at holberger.com (Erlend Pedersen :.)
Date: Tue, 05 Feb 2008 10:26:56 +0100
Subject: Overdetermined, non-linear
In-Reply-To: <a9f269830802031759l1edba6e2rba838012b99a445d@mail.gmail.com>
References: <1201866864.6394.25.camel@erlend-ws.in.holberger.com>
	 <a9f269830802031759l1edba6e2rba838012b99a445d@mail.gmail.com>
Message-ID: <1202203616.27733.50.camel@erlend-ws.in.holberger.com>

On Sun, 2008-02-03 at 19:59 -0600, Matthew Knepley wrote:
> On Feb 1, 2008 5:54 AM, Erlend Pedersen :.
> <erlend.pedersen at holberger.com> wrote:
> > I am attempting to use the PETSc nonlinear solver on an overdetermined
> > system of non-linear equations. Hence, the Jacobian is not square, and
> > so far we have unfortunately not succeeded with any combination of snes,
> > ksp and pc.
> >
> > Could you confirm that snes actually works for overdetermined systems,
> > and if so, is there an application example we could look at in order to
> > make sure there is nothing wrong with our test-setup?
> >
> > We have previously used the MINPACK routine LMDER very successfully, but
> > for our current problem sizes we rely on the use of sparse matrix
> > representations and parallel architectures. PETSc's abstractions and
> > automatic MPI makes this system very attractive for us, and we have
> > already used the PETSc LSQR solver with great success.
> 
> So in the sense that SNES is really just an iteration with an embedded solve,
> yes it can solve non-square nonlinear systems. However, the user has to
> understand what is meant by the Function and Jacobian evaluation methods.
> I suggest implementing the simplest algorithm for non-square systems:
> 
> http://en.wikipedia.org/wiki/Gauss-Newton_algorithm
> 
> By implement, I mean your Function and Jacobian methods should return the
> correct terms. I believe the reason you have not seen convergence is that
> the result of the solve does not "mean" the correct thing for the iteration
> in your current setup.
> 
>    Matt

Thanks. Good to know that I should be able to get a working setup. Are
there by any chance any code examples that I could use to clue myself in
on how to transform my m equations of n unknonwns into a correct
function for the Gauss-Newton algorithm?

- Erlend :.


From tstitt at cscs.ch  Tue Feb  5 06:24:19 2008
From: tstitt at cscs.ch (Timothy Stitt)
Date: Tue, 05 Feb 2008 13:24:19 +0100
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <602426.95557.qm@web38210.mail.mud.yahoo.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>
Message-ID: <47A85573.4090607@cscs.ch>

Yes Yujie, I was able to put together a parallel code to invert a large 
sparse matrix with the help of the PETSc developers. If you need any 
help or maybe a Fortran code template just let me know.

Best,

Tim.

Waad Subber wrote:
> Hi
> There was a discussion between Tim Stitt and petsc developers about 
> matrix inversion, and it was really helpful. That was in last Nov. You 
> can check the emails archive
>
> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
>
> Waad
>
> */Yujie <recrusader at gmail.com>/* wrote:
>
>     what is the difference between sequantial and parallel AIJ matrix?
>     Assuming there is a matrix A, if
>     I partitaion this matrix into A1, A2, Ai... An.
>     A is a parallel AIJ matrix at the whole view, Ai
>     is a sequential AIJ matrix? I want to operate Ai at each node.
>     In addition, whether is it possible to get general inverse using
>     MatMatSolve() if the matrix is not square? Thanks a lot.
>
>     Regards,
>     Yujie
>
>
>     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>     <mailto:bsmith at mcs.anl.gov>> wrote:
>
>
>             For sequential AIJ matrices you can fill the B matrix with the
>         identity and then use
>         MatMatSolve().
>
>             Note since the inverse of a sparse matrix is dense the B
>         matrix is
>         a SeqDense matrix.
>
>             Barry
>
>         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>
>         > Hi,
>         > Now, I want to inverse a sparse matrix. I have browsed the
>         manual,
>         > however, I can't find some information. could you give me
>         some advice?
>         >
>         > thanks a lot.
>         >
>         > Regards,
>         > Yujie
>         >
>
>
>
> ------------------------------------------------------------------------
> Looking for last minute shopping deals? Find them fast with Yahoo! 
> Search. 
> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping> 


-- 
Timothy Stitt
HPC Applications Analyst

Swiss National Supercomputing Centre (CSCS)
Galleria 2 - Via Cantonale
CH-6928 Manno, Switzerland

+41 (0) 91 610 8233
stitt at cscs.ch


From knepley at gmail.com  Tue Feb  5 07:31:15 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 5 Feb 2008 07:31:15 -0600
Subject: Overdetermined, non-linear
In-Reply-To: <1202203616.27733.50.camel@erlend-ws.in.holberger.com>
References: <1201866864.6394.25.camel@erlend-ws.in.holberger.com>
	 <a9f269830802031759l1edba6e2rba838012b99a445d@mail.gmail.com>
	 <1202203616.27733.50.camel@erlend-ws.in.holberger.com>
Message-ID: <a9f269830802050531u55bc29b2uf1b52694d060adcd@mail.gmail.com>

On Feb 5, 2008 3:26 AM, Erlend Pedersen :.
<erlend.pedersen at holberger.com> wrote:
>
> On Sun, 2008-02-03 at 19:59 -0600, Matthew Knepley wrote:
> > On Feb 1, 2008 5:54 AM, Erlend Pedersen :.
> > <erlend.pedersen at holberger.com> wrote:
> > > I am attempting to use the PETSc nonlinear solver on an overdetermined
> > > system of non-linear equations. Hence, the Jacobian is not square, and
> > > so far we have unfortunately not succeeded with any combination of snes,
> > > ksp and pc.
> > >
> > > Could you confirm that snes actually works for overdetermined systems,
> > > and if so, is there an application example we could look at in order to
> > > make sure there is nothing wrong with our test-setup?
> > >
> > > We have previously used the MINPACK routine LMDER very successfully, but
> > > for our current problem sizes we rely on the use of sparse matrix
> > > representations and parallel architectures. PETSc's abstractions and
> > > automatic MPI makes this system very attractive for us, and we have
> > > already used the PETSc LSQR solver with great success.
> >
> > So in the sense that SNES is really just an iteration with an embedded solve,
> > yes it can solve non-square nonlinear systems. However, the user has to
> > understand what is meant by the Function and Jacobian evaluation methods.
> > I suggest implementing the simplest algorithm for non-square systems:
> >
> > http://en.wikipedia.org/wiki/Gauss-Newton_algorithm
> >
> > By implement, I mean your Function and Jacobian methods should return the
> > correct terms. I believe the reason you have not seen convergence is that
> > the result of the solve does not "mean" the correct thing for the iteration
> > in your current setup.
> >
> >    Matt
>
> Thanks. Good to know that I should be able to get a working setup. Are
> there by any chance any code examples that I could use to clue myself in
> on how to transform my m equations of n unknonwns into a correct
> function for the Gauss-Newton algorithm?

We do not have any nonlinear least-squares examples, unfortunately. At that
point, most users have gone over to formulating their problem directly as
an optimization problem (which allows more flexibility than least squares) and
have moved to TAO (http://www-unix.mcs.anl.gov/tao/) which does have
examples, I believe, for optimization of this kind.

If you know that you only ever want to do least squares, and you want to solve
the biggest, parallel problems, than stick with PETSc and build a nice
Gauss-Newton
(or Levenberg-Marquadt) solver. However, if you really want to solve a more
general optimization problem, I recommend reformulating it now and moving
to TAO. It is at least worth reading up on it.

  Thanks,

     Matt

> - Erlend :.
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From zonexo at gmail.com  Tue Feb  5 07:57:51 2008
From: zonexo at gmail.com (Ben Tay)
Date: Tue, 05 Feb 2008 21:57:51 +0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A85573.4090607@cscs.ch>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com> <47A85573.4090607@cscs.ch>
Message-ID: <47A86B5F.5010503@gmail.com>

Hi everyone,

I was reading about the topic abt inversing a sparse matrix. I have to 
solve a poisson eqn for my CFD code. Usually, I form a system of linear 
eqns and solve Ax=b. The "A" is always the same and only the "b" changes 
every timestep. Does it mean that if I'm able to get the inverse matrix 
A^(-1), in order to get x at every timestep, I only need to do a simple 
matrix multiplication ie x=A^(-1)*b ?

Hi Timothy, if the above is true, can you email me your Fortran code 
template? I'm also programming in fortran 90. Thank you very much

Regards.

Timothy Stitt wrote:
> Yes Yujie, I was able to put together a parallel code to invert a 
> large sparse matrix with the help of the PETSc developers. If you need 
> any help or maybe a Fortran code template just let me know.
>
> Best,
>
> Tim.
>
> Waad Subber wrote:
>> Hi
>> There was a discussion between Tim Stitt and petsc developers about 
>> matrix inversion, and it was really helpful. That was in last Nov. 
>> You can check the emails archive
>>
>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html 
>>
>>
>> Waad
>>
>> */Yujie <recrusader at gmail.com>/* wrote:
>>
>>     what is the difference between sequantial and parallel AIJ matrix?
>>     Assuming there is a matrix A, if
>>     I partitaion this matrix into A1, A2, Ai... An.
>>     A is a parallel AIJ matrix at the whole view, Ai
>>     is a sequential AIJ matrix? I want to operate Ai at each node.
>>     In addition, whether is it possible to get general inverse using
>>     MatMatSolve() if the matrix is not square? Thanks a lot.
>>
>>     Regards,
>>     Yujie
>>
>>
>>     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>>     <mailto:bsmith at mcs.anl.gov>> wrote:
>>
>>
>>             For sequential AIJ matrices you can fill the B matrix 
>> with the
>>         identity and then use
>>         MatMatSolve().
>>
>>             Note since the inverse of a sparse matrix is dense the B
>>         matrix is
>>         a SeqDense matrix.
>>
>>             Barry
>>
>>         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>
>>         > Hi,
>>         > Now, I want to inverse a sparse matrix. I have browsed the
>>         manual,
>>         > however, I can't find some information. could you give me
>>         some advice?
>>         >
>>         > thanks a lot.
>>         >
>>         > Regards,
>>         > Yujie
>>         >
>>
>>
>>
>> ------------------------------------------------------------------------
>> Looking for last minute shopping deals? Find them fast with Yahoo! 
>> Search. 
>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping> 
>
>
>
>


From dalcinl at gmail.com  Tue Feb  5 08:04:54 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 5 Feb 2008 11:04:54 -0300
Subject: Compare the accuracy of PETSc's GMRES algorithm with Matlab's
In-Reply-To: <82303.99306.qm@web15812.mail.cnb.yahoo.com>
References: <82303.99306.qm@web15812.mail.cnb.yahoo.com>
Message-ID: <e7ba66e40802050604x4af8b212sed4281759fab3add@mail.gmail.com>

Could you try to run your PETSc code with the option below?

-ksp_gmres_modifiedgramschmidt


On 2/4/08, jiaxun hou <jiaxun_hou at yahoo.com.cn> wrote:
> Hello everyone,
>
> I want to solve a linear system with no preconditioned GMRES in PETSc, but
> the result is divergent. White I solve the exactly same system under Matlab,
> I get the convergent result. Which result can I trust?
>
> And I print the residuals of them as the attachments. Thanks!
>
> Best regards,
> Jiaxun
>
>
>  ________________________________
> ????????????????????????????????????
> Matlab results
> ------------------------------------------------------------
>    1 ss= 0.8767977459  GRS=8.3589003100  GRS_new=7.3290649496
>    2 ss= 0.9942062562  GRS=7.3290649496  GRS_new=7.2866022253
>    3 ss= 0.9579384974  GRS=7.2866022253  GRS_new=6.9801167867
>    4 ss= 0.9770486148  GRS=6.9801167867  GRS_new=6.8199134376
>    5 ss= 0.9848398847  GRS=6.8199134376  GRS_new=6.7165227635
>    6 ss= 0.9817587194  GRS=6.7165227635  GRS_new=6.5940047873
>    7 ss= 0.9828044725  GRS=6.5940047873  GRS_new=6.4806173965
>    8 ss= 0.9923435431  GRS=6.4806173965  GRS_new=6.4309988285
>    9 ss= 0.9847288648  GRS=6.4309988285  GRS_new=6.3327901760
>   10 ss= 0.9999247510  GRS=6.3327901760  GRS_new=6.3323136396
>   11 ss= 0.9591518787  GRS=6.3323136396  GRS_new=6.0736505238
>   12 ss= 0.9998567242  GRS=6.0736505238  GRS_new=6.0727803165
>   13 ss= 0.9609943819  GRS=6.0727803165  GRS_new=5.8359077665
>   14 ss= 0.9999728993  GRS=5.8359077665  GRS_new=5.8357496091
>   15 ss= 0.9905739831  GRS=5.8357496091  GRS_new=5.7807417346
>   16 ss= 0.9937523276  GRS=5.7807417346  GRS_new=5.7446255541
>   17 ss= 0.9841963377  GRS=5.7446255541  GRS_new=5.6538394318
>   18 ss= 0.9953976161  GRS=5.6538394318  GRS_new=5.6278182921
>   19 ss= 0.9971858162  GRS=5.6278182921  GRS_new=5.6119805771
>   20 ss= 0.9891811740  GRS=5.6119805771  GRS_new=5.5512655356
>   21 ss= 0.9985693802  GRS=5.5512655356  GRS_new=5.5433237853
>   22 ss= 0.9902721074  GRS=5.5433237853  GRS_new=5.4893989269
>   23 ss= 0.9995190416  GRS=5.4893989269  GRS_new=5.4867587542
>   24 ss= 0.9852930755  GRS=5.4867587542  GRS_new=5.4060654074
>   25 ss= 0.9993241249  GRS=5.4060654074  GRS_new=5.4024115826
>   26 ss= 0.9886557159  GRS=5.4024115826  GRS_new=5.3411250907
>   27 ss= 0.9999955172  GRS=5.3411250907  GRS_new=5.3411011474
>   28 ss= 0.9825588209  GRS=5.3411011474  GRS_new=5.2479460454
>   29 ss= 0.9988457962  GRS=5.2479460454  GRS_new=5.2418888463
>   30 ss= 0.9884104840  GRS=5.2418888463  GRS_new=5.1811378916
>   31 ss= 0.9992362771  GRS=5.1811378916  GRS_new=5.1771809379
>   32 ss= 0.9858180681  GRS=5.1771809379  GRS_new=5.1037585103
>   33 ss= 0.9986080784  GRS=5.1037585103  GRS_new=5.0966544787
>   34 ss= 0.9867985434  GRS=5.0966544787  GRS_new=5.0293712159
>   35 ss= 0.9987564203  GRS=5.0293712159  GRS_new=5.0231167919
>   36 ss= 0.9882373404  GRS=5.0231167919  GRS_new=4.9640315790
>   37 ss= 0.9990354114  GRS=4.9640315790  GRS_new=4.9592433307
>   38 ss= 0.9868343640  GRS=4.9592433307  GRS_new=4.8939517382
>   39 ss= 0.9993989316  GRS=4.8939517382  GRS_new=4.8910101386
>   40 ss= 0.9876496755  GRS=4.8910101386  GRS_new=4.8306045761
>   41 ss= 0.9992609634  GRS=4.8306045761  GRS_new=4.8270345827
>   42 ss= 0.9900904643  GRS=4.8270345827  GRS_new=4.7792009111
>   43 ss= 0.9998679060  GRS=4.7792009111  GRS_new=4.7785696072
>   44 ss= 0.9863344717  GRS=4.7785696072  GRS_new=4.7132679291
>   45 ss= 0.9999844285  GRS=4.7132679291  GRS_new=4.7131945365
>   46 ss= 0.9910405864  GRS=4.7131945365  GRS_new=4.6709670773
>   47 ss= 0.9999609816  GRS=4.6709670773  GRS_new=4.6707848237
>   48 ss= 0.9899729173  GRS=4.6707848237  GRS_new=4.6239504780
>   49 ss= 0.9979852800  GRS=4.6239504780  GRS_new=4.6146345124
>   50 ss= 0.9933060164  GRS=4.6146345124  GRS_new=4.5837442249
>   51 ss= 0.9977925155  GRS=4.5837442249  GRS_new=4.5736256804
>   52 ss= 0.9908313109  GRS=4.5736256804  GRS_new=4.5316915283
>   53 ss= 0.9974726054  GRS=4.5316915283  GRS_new=4.5202381555
>   54 ss= 0.9933649859  GRS=4.5202381555  GRS_new=4.4902463115
>   55 ss= 0.9971691729  GRS=4.4902463115  GRS_new=4.4775352004
>   56 ss= 0.9923528355  GRS=4.4775352004  GRS_new=4.4432947523
>   57 ss= 0.9958902809  GRS=4.4432947523  GRS_new=4.4250340590
>   58 ss= 0.9956119858  GRS=4.4250340590  GRS_new=4.4056169468
>   59 ss= 0.9929633098  GRS=4.4056169468  GRS_new=4.3746159852
>   60 ss= 0.9965857536  GRS=4.3746159852  GRS_new=4.3596799681
>   61 ss= 0.9864800828  GRS=4.3596799681  GRS_new=4.3007374561
>   62 ss= 0.9993535743  GRS=4.3007374561  GRS_new=4.2979573489
>   63 ss= 0.9924403396  GRS=4.2979573489  GRS_new=4.2654662508
>   64 ss= 0.9999218865  GRS=4.2654662508  GRS_new=4.2651330604
>   65 ss= 0.9903604367  GRS=4.2651330604  GRS_new=4.2240190404
>   66 ss= 0.9999981102  GRS=4.2240190404  GRS_new=4.2240110578
>   67 ss= 0.9850734469  GRS=4.2240110578  GRS_new=4.1609611323
>   68 ss= 0.9998407715  GRS=4.1609611323  GRS_new=4.1602985889
>   69 ss= 0.9825309063  GRS=4.1602985889  GRS_new=4.0876219429
>   70 ss= 0.9999999323  GRS=4.0876219429  GRS_new=4.0876216663
>   71 ss= 0.9801304161  GRS=4.0876216663  GRS_new=4.0064023246
>   72 ss= 0.9998051423  GRS=4.0064023246  GRS_new=4.0056216462
>   73 ss= 0.9855617342  GRS=4.0056216462  GRS_new=3.9477874163
>   74 ss= 0.9995279352  GRS=3.9477874163  GRS_new=3.9459238048
>   75 ss= 0.9899078491  GRS=3.9459238048  GRS_new=3.9061009463
>   76 ss= 0.9997748883  GRS=3.9061009463  GRS_new=3.9052216374
>   77 ss= 0.9915984422  GRS=3.9052216374  GRS_new=3.8724116920
>   78 ss= 0.9999758238  GRS=3.8724116920  GRS_new=3.8723180717
>   79 ss= 0.9906834932  GRS=3.8723180717  GRS_new=3.8362415942
>   80 ss= 0.9999058873  GRS=3.8362415942  GRS_new=3.8358805550
>   81 ss= 0.9931303767  GRS=3.8358805550  GRS_new=3.8095295006
>   82 ss= 0.9994032497  GRS=3.8095295006  GRS_new=3.8072561629
>   83 ss= 0.9946211871  GRS=3.8072561629  GRS_new=3.7867776442
>   84 ss= 0.9975086306  GRS=3.7867776442  GRS_new=3.7773433822
>   85 ss= 0.9951749469  GRS=3.7773433822  GRS_new=3.7591175000
>   86 ss= 0.9965077732  GRS=3.7591175000  GRS_new=3.7459898092
>   87 ss= 0.9934497713  GRS=3.7459898092  GRS_new=3.7214527192
>   88 ss= 0.9934219923  GRS=3.7214527192  GRS_new=3.6969729745
>   89 ss= 0.9943188889  GRS=3.6969729745  GRS_new=3.6759700602
>   90 ss= 0.9911302936  GRS=3.6759700602  GRS_new=3.6433652850
>   91 ss= 0.9969167891  GRS=3.6433652850  GRS_new=3.6321320213
>   92 ss= 0.9850570261  GRS=3.6321320213  GRS_new=3.5778571672
>   93 ss= 0.9994276670  GRS=3.5778571672  GRS_new=3.5758094416
>   94 ss= 0.9825319957  GRS=3.5758094416  GRS_new=3.5133471868
>   95 ss= 0.9990651680  GRS=3.5133471868  GRS_new=3.5100627973
>   96 ss= 0.9873693338  GRS=3.5100627973  GRS_new=3.4657283658
>   97 ss= 0.9995260021  GRS=3.4657283658  GRS_new=3.4640856180
>   98 ss= 0.9852095868  GRS=3.4640856180  GRS_new=3.4128503602
>   99 ss= 0.9994646613  GRS=3.4128503602  GRS_new=3.4110233294
>  100 ss= 0.9838857947  GRS=3.4110233294  GRS_new=3.3560573991
>  101 ss= 0.9999092368  GRS=3.3560573991  GRS_new=3.3557527927
>  102 ss= 0.9887575579  GRS=3.3557527927  GRS_new=3.3180259361
>  103 ss= 0.9999098244  GRS=3.3180259361  GRS_new=3.3177267310
>  104 ss= 0.9859325023  GRS=3.3177267310  GRS_new=3.2710546178
>  105 ss= 0.9990376979  GRS=3.2710546178  GRS_new=3.2679068751
>  106 ss= 0.9880431974  GRS=3.2679068751  GRS_new=3.2288331576
>  107 ss= 0.9990743475  GRS=3.2288331576  GRS_new=3.2258443800
>  108 ss= 0.9890143770  GRS=3.2258443800  GRS_new=3.1904064698
>  109 ss= 0.9996039465  GRS=3.1904064698  GRS_new=3.1891428981
>  110 ss= 0.9911902290  GRS=3.1891428981  GRS_new=3.1610472793
>  111 ss= 0.9996627037  GRS=3.1610472793  GRS_new=3.1599810697
>  112 ss= 0.9873736362  GRS=3.1599810697  GRS_new=3.1200819991
>  113 ss= 0.9994522850  GRS=3.1200819991  GRS_new=3.1183730833
>  114 ss= 0.9892755061  GRS=3.1183730833  GRS_new=3.0849301101
>  115 ss= 0.9997734276  GRS=3.0849301101  GRS_new=3.0842311502
>  116 ss= 0.9883184336  GRS=3.0842311502  GRS_new=3.0482024992
>  117 ss= 0.9999009839  GRS=3.0482024992  GRS_new=3.0479006782
>  118 ss= 0.9854371189  GRS=3.0479006782  GRS_new=3.0035144631
>  119 ss= 0.9999937513  GRS=3.0035144631  GRS_new=3.0034956949
>  120 ss= 0.9816559742  GRS=3.0034956949  GRS_new=2.9483994923
>  121 ss= 0.9998133192  GRS=2.9483994923  GRS_new=2.9478490828
>  122 ss= 0.9897842140  GRS=2.9478490828  GRS_new=2.9177344875
>  123 ss= 0.9937438967  GRS=2.9177344875  GRS_new=2.8994808391
>  124 ss= 0.9928844144  GRS=2.8994808391  GRS_new=2.8788493350
>  125 ss= 0.9940354540  GRS=2.8788493350  GRS_new=2.8616783057
>  126 ss= 0.9903795737  GRS=2.8616783057  GRS_new=2.8341477405
>  127 ss= 0.9930147037  GRS=2.8341477405  GRS_new=2.8143503787
>  128 ss= 0.9954513964  GRS=2.8143503787  GRS_new=2.8015490145
>  129 ss= 0.9921800208  GRS=2.8015490145  GRS_new=2.7796409594
>  130 ss= 0.9939987784  GRS=2.7796409594  GRS_new=2.7629597180
>  131 ss= 0.9908792550  GRS=2.7629597180  GRS_new=2.7377594669
>  132 ss= 0.9973083911  GRS=2.7377594669  GRS_new=2.7303904893
>  133 ss= 0.9909726835  GRS=2.7303904893  GRS_new=2.7057423901
>  134 ss= 0.9988907774  GRS=2.7057423901  GRS_new=2.7027411194
>  135 ss= 0.9878359656  GRS=2.7027411194  GRS_new=2.6698648834
>  136 ss= 0.9998424757  GRS=2.6698648834  GRS_new=2.6694443147
>  137 ss= 0.9860925172  GRS=2.6694443147  GRS_new=2.6323190638
>  138 ss= 0.9998970195  GRS=2.6323190638  GRS_new=2.6320479862
>  139 ss= 0.9834325176  GRS=2.6320479862  GRS_new=2.5884415775
>  140 ss= 0.9997099187  GRS=2.5884415775  GRS_new=2.5876907191
>  141 ss= 0.9873985061  GRS=2.5876907191  GRS_new=2.5550819503
>  142 ss= 0.9997385347  GRS=2.5550819503  GRS_new=2.5544138850
>  143 ss= 0.9878404410  GRS=2.5544138850  GRS_new=2.5233533387
>  144 ss= 0.9999702108  GRS=2.5233533387  GRS_new=2.5232781699
>  145 ss= 0.9834843437  GRS=2.5232781699  GRS_new=2.4816045750
>  146 ss= 0.9999555852  GRS=2.4816045750  GRS_new=2.4814943549
>  147 ss= 0.9833714542  GRS=2.4814943549  GRS_new=2.4402307123
>  148 ss= 0.9998565798  GRS=2.4402307123  GRS_new=2.4398807339
>  149 ss= 0.9793940038  GRS=2.4398807339  GRS_new=2.3896045609
>  150 ss= 0.9980094016  GRS=2.3896045609  GRS_new=2.3848478179
>  151 ss= 0.9839947785  GRS=2.3848478179  GRS_new=2.3466778003
>  152 ss= 0.9946845124  GRS=2.3466778003  GRS_new=2.3342040635
>  153 ss= 0.9815926232  GRS=2.3342040635  GRS_new=2.2912374898
>  154 ss= 0.9928244685  GRS=2.2912374898  GRS_new=2.2747966429
>  155 ss= 0.9915544496  GRS=2.2747966429  GRS_new=2.2555847332
>  156 ss= 0.9860201299  GRS=2.2555847332  GRS_new=2.2240519517
>  157 ss= 0.9920398136  GRS=2.2240519517  GRS_new=2.2063480837
>  158 ss= 0.9850997487  GRS=2.2063480837  GRS_new=2.1734729429
>  159 ss= 0.9846737498  GRS=2.1734729429  GRS_new=2.1401617527
>  160 ss= 0.9889998946  GRS=2.1401617527  GRS_new=2.1166197478
>  161 ss= 0.9886237198  GRS=2.1166197478  GRS_new=2.0925404885
>  162 ss= 0.9864441737  GRS=2.0925404885  GRS_new=2.0641743731
>  163 ss= 0.9889421019  GRS=2.0641743731  GRS_new=2.0413489432
>  164 ss= 0.9875293875  GRS=2.0413489432  GRS_new=2.0158920717
>  165 ss= 0.9923696106  GRS=2.0158920717  GRS_new=2.0005100301
>  166 ss= 0.9917055125  GRS=2.0005100301  GRS_new=1.9839168246
>  167 ss= 0.9883956382  GRS=1.9839168246  GRS_new=1.9608947359
>  168 ss= 0.9951669546  GRS=1.9608947359  GRS_new=1.9514176427
>  169 ss= 0.9846800082  GRS=1.9514176427  GRS_new=1.9215219403
>  170 ss= 0.9851222204  GRS=1.9215219403  GRS_new=1.8929339605
>  171 ss= 0.9913306028  GRS=1.8929339605  GRS_new=1.8765233641
>  172 ss= 0.9772904480  GRS=1.8765233641  GRS_new=1.8339083592
>  173 ss= 0.9958226278  GRS=1.8339083592  GRS_new=1.8262474413
>  174 ss= 0.9723712619  GRS=1.8262474413  GRS_new=1.7757905291
>  175 ss= 0.9978494633  GRS=1.7757905291  GRS_new=1.7719716264
>  176 ss= 0.9699641657  GRS=1.7719716264  GRS_new=1.7187489803
>  177 ss= 0.9978279403  GRS=1.7187489803  GRS_new=1.7150157549
>  178 ss= 0.9651257580  GRS=1.7150157549  GRS_new=1.6552058804
>  179 ss= 0.9958703980  GRS=1.6552058804  GRS_new=1.6483705389
>  180 ss= 0.9583941862  GRS=1.6483705389  GRS_new=1.5797887411
>  181 ss= 0.9943257865  GRS=1.5797887411  GRS_new=1.5708246826
>  182 ss= 0.9689013349  GRS=1.5708246826  GRS_new=1.5219741318
>  183 ss= 0.9927423131  GRS=1.5219741318  GRS_new=1.5109281201
>  184 ss= 0.9760067066  GRS=1.5109281201  GRS_new=1.4746759785
>  185 ss= 0.9900954777  GRS=1.4746759785  GRS_new=1.4600700174
>  186 ss= 0.9696993568  GRS=1.4600700174  GRS_new=1.4158289568
>  187 ss= 0.9802346349  GRS=1.4158289568  GRS_new=1.3878445806
>  188 ss= 0.9724237348  GRS=1.3878445806  GRS_new=1.3495730103
>  189 ss= 0.9796598123  GRS=1.3495730103  GRS_new=1.3221224419
>  190 ss= 0.9595089723  GRS=1.3221224419  GRS_new=1.2685883455
>  191 ss= 0.9795574759  GRS=1.2685883455  GRS_new=1.2426551977
>  192 ss= 0.9844421304  GRS=1.2426551977  GRS_new=1.2233221302
>  193 ss= 0.9545711990  GRS=1.2233221302  GRS_new=1.1677480726
>  194 ss= 0.9856428311  GRS=1.1677480726  GRS_new=1.1509825164
>  195 ss= 0.9526429154  GRS=1.1509825164  GRS_new=1.0964753400
>  196 ss= 0.9696093276  GRS=1.0964753400  GRS_new=1.0631527171
>  197 ss= 0.9687936691  GRS=1.0631527171  GRS_new=1.0299756216
>  198 ss= 0.9749973851  GRS=1.0299756216  GRS_new=1.0042235378
>  199 ss= 0.9305657933  GRS=1.0042235378  GRS_new=0.9344960731
>  200 ss= 0.9663929841  GRS=0.9344960731  GRS_new=0.9030904487
>  201 ss= 0.9463180084  GRS=0.9030904487  GRS_new=0.8546107548
>  202 ss= 0.9440343320  GRS=0.8546107548  GRS_new=0.8067818930
>  203 ss= 0.9692366995  GRS=0.8067818930  GRS_new=0.7819626192
>  204 ss= 0.9647239092  GRS=0.7819626192  GRS_new=0.7543780349
>  205 ss= 0.9586533240  GRS=0.7543780349  GRS_new=0.7231870107
>  206 ss= 0.9512701628  GRS=0.7231870107  GRS_new=0.6879462254
>  207 ss= 0.9863931602  GRS=0.6879462254  GRS_new=0.6785854513
>  208 ss= 0.9436979088  GRS=0.6785854513  GRS_new=0.6403796714
>  209 ss= 0.9902642308  GRS=0.6403796714  GRS_new=0.6341450827
>  210 ss= 0.9265021328  GRS=0.6341450827  GRS_new=0.5875367716
>  211 ss= 0.9812524410  GRS=0.5875367716  GRS_new=0.5765218913
>  212 ss= 0.9309942287  GRS=0.5765218913  GRS_new=0.5367385535
>  213 ss= 0.9730496129  GRS=0.5367385535  GRS_new=0.5222732417
>  214 ss= 0.9656883272  GRS=0.5222732417  GRS_new=0.5043531731
>  215 ss= 0.9452863486  GRS=0.5043531731  GRS_new=0.4767581694
>  216 ss= 0.9438553901  GRS=0.4767581694  GRS_new=0.4499907680
>  217 ss= 0.9427151241  GRS=0.4499907680  GRS_new=0.4242131027
>  218 ss= 0.9141056503  GRS=0.4242131027  GRS_new=0.3877755941
>  219 ss= 0.9652966758  GRS=0.3877755941  GRS_new=0.3743184919
>  220 ss= 0.9380852210  GRS=0.3743184919  GRS_new=0.3511426452
>  221 ss= 0.9857748287  GRS=0.3511426452  GRS_new=0.3461475809
>  222 ss= 0.9011745983  GRS=0.3461475809  GRS_new=0.3119394072
>  223 ss= 0.9958789390  GRS=0.3119394072  GRS_new=0.3106538859
>  224 ss= 0.8942056291  GRS=0.3106538859  GRS_new=0.2777884535
>  225 ss= 0.9860175937  GRS=0.2777884535  GRS_new=0.2739043025
>  226 ss= 0.9569264287  GRS=0.2739043025  GRS_new=0.2621062660
>  227 ss= 0.9784558926  GRS=0.2621062660  GRS_new=0.2564594204
>  228 ss= 0.9702851128  GRS=0.2564594204  GRS_new=0.2488387576
>  229 ss= 0.9368992686  GRS=0.2488387576  GRS_new=0.2331368500
>  230 ss= 0.9534314432  GRS=0.2331368500  GRS_new=0.2222800034
>  231 ss= 0.9462475811  GRS=0.2222800034  GRS_new=0.2103319155
>  232 ss= 0.9506818789  GRS=0.2103319155  GRS_new=0.1999587407
>  233 ss= 0.9284295306  GRS=0.1999587407  GRS_new=0.1856475997
>  234 ss= 0.9646994083  GRS=0.1856475997  GRS_new=0.1790941296
>  235 ss= 0.9564933801  GRS=0.1790941296  GRS_new=0.1713023494
>  236 ss= 0.8964059043  GRS=0.1713023494  GRS_new=0.1535564374
>  237 ss= 0.9362210991  GRS=0.1535564374  GRS_new=0.1437627766
>  238 ss= 0.8842624203  GRS=0.1437627766  GRS_new=0.1271240208
>  239 ss= 0.9572135619  GRS=0.1271240208  GRS_new=0.1216848367
>  240 ss= 0.8843474708  GRS=0.1216848367  GRS_new=0.1076116776
>  241 ss= 0.9344896729  GRS=0.1076116776  GRS_new=0.1005620014
>  242 ss= 0.9446141679  GRS=0.1005620014  GRS_new=0.0949922913
>  243 ss= 0.9420503449  GRS=0.0949922913  GRS_new=0.0894875208
>  244 ss= 0.8996858936  GRS=0.0894875208  GRS_new=0.0805106601
>  245 ss= 0.8843030849  GRS=0.0805106601  GRS_new=0.0711958251
>  246 ss= 0.8495068207  GRS=0.0711958251  GRS_new=0.0604813390
>  247 ss= 0.9823275657  GRS=0.0604813390  GRS_new=0.0594124865
>  248 ss= 0.8032699443  GRS=0.0594124865  GRS_new=0.0477242647
>  249 ss= 0.9278236882  GRS=0.0477242647  GRS_new=0.0442797033
>  250 ss= 0.9228876267  GRS=0.0442797033  GRS_new=0.0408651903
>  251 ss= 0.9244258318  GRS=0.0408651903  GRS_new=0.0377768376
>  252 ss= 0.9452470204  GRS=0.0377768376  GRS_new=0.0357084431
>  253 ss= 0.9004540041  GRS=0.0357084431  GRS_new=0.0321538106
>  254 ss= 0.9615617722  GRS=0.0321538106  GRS_new=0.0309178751
>  255 ss= 0.8226991366  GRS=0.0309178751  GRS_new=0.0254361092
>  256 ss= 0.9052555692  GRS=0.0254361092  GRS_new=0.0230261795
>  257 ss= 0.8810736262  GRS=0.0230261795  GRS_new=0.0202877594
>  258 ss= 0.8902808119  GRS=0.0202877594  GRS_new=0.0180618029
>  259 ss= 0.8622991452  GRS=0.0180618029  GRS_new=0.0155746772
>  260 ss= 0.9209308003  GRS=0.0155746772  GRS_new=0.0143432000
>  261 ss= 0.8712647973  GRS=0.0143432000  GRS_new=0.0124967252
>  262 ss= 0.9652566894  GRS=0.0124967252  GRS_new=0.0120625476
>  263 ss= 0.8755993907  GRS=0.0120625476  GRS_new=0.0105619593
>  264 ss= 0.9532509670  GRS=0.0105619593  GRS_new=0.0100681980
>  265 ss= 0.7939620184  GRS=0.0100681980  GRS_new=0.0079937668
>  266 ss= 0.9274761528  GRS=0.0079937668  GRS_new=0.0074140281
>  267 ss= 0.8665778971  GRS=0.0074140281  GRS_new=0.0064248328
>  268 ss= 0.8479362252  GRS=0.0064248328  GRS_new=0.0054478485
>  269 ss= 0.8984785974  GRS=0.0054478485  GRS_new=0.0048947753
>  270 ss= 0.8307538442  GRS=0.0048947753  GRS_new=0.0040663534
>  271 ss= 0.7605938618  GRS=0.0040663534  GRS_new=0.0030928434
>  272 ss= 0.9282658981  GRS=0.0030928434  GRS_new=0.0028709811
>  273 ss= 0.8253071221  GRS=0.0028709811  GRS_new=0.0023694411
>  274 ss= 0.9363207077  GRS=0.0023694411  GRS_new=0.0022185568
>  275 ss= -0.6388998849  GRS=0.0022185568  GRS_new=-0.0014174357
>  276 ss= 0.9846474808  GRS=-0.0014174357  GRS_new=-0.0013956745
>  277 ss= 0.6798895257  GRS=-0.0013956745  GRS_new=-0.0009489045
>  278 ss= 0.8665430398  GRS=-0.0009489045  GRS_new=-0.0008222666
>  279 ss= 0.9124149039  GRS=-0.0008222666  GRS_new=-0.0007502483
>  280 ss= 0.8807164067  GRS=-0.0007502483  GRS_new=-0.0006607559
>  281 ss= 0.8804510347  GRS=-0.0006607559  GRS_new=-0.0005817633
>  282 ss= 0.8206397927  GRS=-0.0005817633  GRS_new=-0.0004774181
>  283 ss= 0.8096503797  GRS=-0.0004774181  GRS_new=-0.0003865417
>  284 ss= 0.8500015980  GRS=-0.0003865417  GRS_new=-0.0003285611
>  285 ss= 0.8026252203  GRS=-0.0003285611  GRS_new=-0.0002637114
>  286 ss= -0.7069351672  GRS=-0.0002637114  GRS_new=0.0001864269
>  287 ss= 0.9999108471  GRS=0.0001864269  GRS_new=0.0001864103
>  288 ss= 0.6379050807  GRS=0.0001864103  GRS_new=0.0001189120
>  289 ss= 0.9121889821  GRS=0.0001189120  GRS_new=0.0001084703
>  290 ss= 0.7271537479  GRS=0.0001084703  GRS_new=0.0000788746
>  291 ss= 0.8520989054  GRS=0.0000788746  GRS_new=0.0000672089
>  292 ss= 0.7999946682  GRS=0.0000672089  GRS_new=0.0000537668
>  293 ss= 0.8065995129  GRS=0.0000537668  GRS_new=0.0000433683
>  294 ss= 0.7179751982  GRS=0.0000433683  GRS_new=0.0000311373
>  295 ss= 0.9944276090  GRS=0.0000311373  GRS_new=0.0000309638
>  296 ss= 0.4960821305  GRS=0.0000309638  GRS_new=0.0000153606
>  297 ss= 0.9628692907  GRS=0.0000153606  GRS_new=0.0000147903
>  298 ss= 0.4440261274  GRS=0.0000147903  GRS_new=0.0000065673
>
>  PETSc results
> ------------------------------------------------------------
>  ss= 0.8767977459 GRS(0)=8.3589003100 GRS(1)=-7.3290649496
>  ss= 0.9942062562 GRS(1)=-7.3290649496 GRS(2)=7.2866022253
>  ss= 0.9579384974 GRS(2)=7.2866022253 GRS(3)=-6.9801167867
>  ss= 0.9770486148 GRS(3)=-6.9801167867 GRS(4)=6.8199134376
>  ss= 0.9848398847 GRS(4)=6.8199134376 GRS(5)=-6.7165227635
>  ss= 0.9817587194 GRS(5)=-6.7165227635 GRS(6)=6.5940047873
>  ss= 0.9828044725 GRS(6)=6.5940047873 GRS(7)=-6.4806173965
>  ss= 0.9923435431 GRS(7)=-6.4806173965 GRS(8)=6.4309988285
>  ss= 0.9847288648 GRS(8)=6.4309988285 GRS(9)=-6.3327901760
>  ss= 0.9999247510 GRS(9)=-6.3327901760 GRS(10)=6.3323136396
>  ss= 0.9591518787 GRS(10)=6.3323136396 GRS(11)=-6.0736505238
>  ss= 0.9998567242 GRS(11)=-6.0736505238 GRS(12)=6.0727803165
>  ss= 0.9609943819 GRS(12)=6.0727803165 GRS(13)=-5.8359077665
>  ss= 0.9999728993 GRS(13)=-5.8359077665 GRS(14)=5.8357496091
>  ss= 0.9905739831 GRS(14)=5.8357496091 GRS(15)=-5.7807417346
>  ss= 0.9937523276 GRS(15)=-5.7807417346 GRS(16)=5.7446255541
>  ss= 0.9841963377 GRS(16)=5.7446255541 GRS(17)=-5.6538394318
>  ss= 0.9953976161 GRS(17)=-5.6538394318 GRS(18)=5.6278182921
>  ss= 0.9971858162 GRS(18)=5.6278182921 GRS(19)=-5.6119805772
>  ss= 0.9891811740 GRS(19)=-5.6119805772 GRS(20)=5.5512655357
>  ss= 0.9985693802 GRS(20)=5.5512655357 GRS(21)=-5.5433237853
>  ss= 0.9902721075 GRS(21)=-5.5433237853 GRS(22)=5.4893989272
>  ss= 0.9995190415 GRS(22)=5.4893989272 GRS(23)=-5.4867587542
>  ss= 0.9852930754 GRS(23)=-5.4867587542 GRS(24)=5.4060654070
>  ss= 0.9993241250 GRS(24)=5.4060654070 GRS(25)=-5.4024115827
>  ss= 0.9886557155 GRS(25)=-5.4024115827 GRS(26)=5.3411250889
>  ss= 0.9999955172 GRS(26)=5.3411250889 GRS(27)=-5.3411011457
>  ss= 0.9825588208 GRS(27)=-5.3411011457 GRS(28)=5.2479460434
>  ss= 0.9988457956 GRS(28)=5.2479460434 GRS(29)=-5.2418888410
>  ss= 0.9884104836 GRS(29)=-5.2418888410 GRS(30)=5.1811378842
>  ss= 0.9992362779 GRS(30)=5.1811378842 GRS(31)=-5.1771809345
>  ss= 0.9858180688 GRS(31)=-5.1771809345 GRS(32)=5.1037585107
>  ss= 0.9986080812 GRS(32)=5.1037585107 GRS(33)=-5.0966544933
>  ss= 0.9867985464 GRS(33)=-5.0966544933 GRS(34)=5.0293712457
>  ss= 0.9987564222 GRS(34)=5.0293712457 GRS(35)=-5.0231168315
>  ss= 0.9882373460 GRS(35)=-5.0231168315 GRS(36)=4.9640316464
>  ss= 0.9990354060 GRS(36)=4.9640316464 GRS(37)=-4.9592433714
>  ss= 0.9868343675 GRS(37)=-4.9592433714 GRS(38)=4.8939517956
>  ss= 0.9993989288 GRS(38)=4.8939517956 GRS(39)=-4.8910101822
>  ss= 0.9876496783 GRS(39)=-4.8910101822 GRS(40)=4.8306046332
>  ss= 0.9992609725 GRS(40)=4.8306046332 GRS(41)=-4.8270346838
>  ss= 0.9900904698 GRS(41)=-4.8270346838 GRS(42)=4.7792010378
>  ss= 0.9998679144 GRS(42)=4.7792010378 GRS(43)=-4.7785697743
>  ss= 0.9863344290 GRS(43)=-4.7785697743 GRS(44)=4.7132678899
>  ss= 0.9999844323 GRS(44)=4.7132678899 GRS(45)=-4.7131945153
>  ss= 0.9910404741 GRS(45)=-4.7131945153 GRS(46)=4.6709665272
>  ss= 0.9999609765 GRS(46)=4.6709665272 GRS(47)=-4.6707842498
>  ss= 0.9899724790 GRS(47)=-4.6707842498 GRS(48)=4.6239478627
>  ss= 0.9979853488 GRS(48)=4.6239478627 GRS(49)=-4.6146322207
>  ss= 0.9933053898 GRS(49)=-4.6146322207 GRS(50)=4.5837390570
>  ss= 0.9977925127 GRS(50)=4.5837390570 GRS(51)=-4.5736205111
>  ss= 0.9908317992 GRS(51)=-4.5736205111 GRS(52)=4.5316886400
>  ss= 0.9974703441 GRS(52)=4.5316886400 GRS(53)=-4.5202250272
>  ss= 0.9933577593 GRS(53)=-4.5202250272 GRS(54)=4.4902006044
>  ss= 0.9971745163 GRS(54)=4.4902006044 GRS(55)=-4.4775136156
>  ss= 0.9923600818 GRS(55)=-4.4775136156 GRS(56)=4.4433057777
>  ss= 0.9958745981 GRS(56)=4.4433057777 GRS(57)=-4.4249753554
>  ss= 0.9955761873 GRS(57)=-4.4249753554 GRS(58)=4.4054000934
>  ss= 0.9930040219 GRS(58)=4.4054000934 GRS(59)=-4.3745800110
>  ss= 0.9966346650 GRS(59)=-4.3745800110 GRS(60)=4.3598580838
>  ss= 0.9863487872 GRS(60)=4.3598580838 GRS(61)=-4.3003407333
>  ss= 0.9993086675 GRS(61)=-4.3003407333 GRS(62)=4.2973677678
>  ss= 0.9925217950 GRS(62)=4.2973677678 GRS(63)=-4.2652311706
>  ss= 0.9999207600 GRS(63)=-4.2652311706 GRS(64)=4.2648931935
>  ss= 0.9902311161 GRS(64)=4.2648931935 GRS(65)=-4.2232299472
>  ss= 0.9999879339 GRS(65)=-4.2232299472 GRS(66)=4.2231789892
>  ss= 0.9851418699 GRS(66)=4.2231789892 GRS(67)=-4.1604304461
>  ss= 0.9998329649 GRS(67)=-4.1604304461 GRS(68)=4.1597355080
>  ss= 0.9823773748 GRS(68)=4.1597355080 GRS(69)=-4.0864300483
>  ss= 0.9999777038 GRS(69)=-4.0864300483 GRS(70)=4.0863389365
>  ss= 0.9824555504 GRS(70)=4.0863389365 GRS(71)=-4.0146463691
>  ss= 0.9999965224 GRS(71)=-4.0146463691 GRS(72)=4.0146324076
>  ss= 0.9814858949 GRS(72)=4.0146324076 GRS(73)=-3.9403050811
>  ss= 0.9999966647 GRS(73)=-3.9403050811 GRS(74)=3.9402919391
>  ss= 0.9935776881 GRS(74)=3.9402919391 GRS(75)=-3.9149861555
>  ss= 0.9996537787 GRS(75)=-3.9149861555 GRS(76)=3.9136307039
>  ss= 0.9996908126 GRS(76)=3.9136307039 GRS(77)=-3.9124206585
>  ss= 0.9940733429 GRS(77)=-3.9124206585 GRS(78)=3.8892330829
>  ss= 0.9917645614 GRS(78)=3.8892330829 GRS(79)=-3.8572035427
>  ss= 0.9979318848 GRS(79)=-3.8572035427 GRS(80)=3.8492264013
>  ss= 0.9968843841 GRS(80)=3.8492264013 GRS(81)=-3.8372336902
>  ss= 0.9999422648 GRS(81)=-3.8372336902 GRS(82)=3.8370121468
>  ss= 0.9998611209 GRS(82)=3.8370121468 GRS(83)=-3.8364792660
>  ss= 0.9969188227 GRS(83)=-3.8364792660 GRS(84)=3.8246583932
>  ss= 0.9993374910 GRS(84)=3.8246583932 GRS(85)=-3.8221245227
>  ss= 0.9960060274 GRS(85)=-3.8221245227 GRS(86)=3.8068590620
>  ss= 0.9992693029 GRS(86)=3.8068590620 GRS(87)=-3.8040774011
>  ss= 0.9986496902 GRS(87)=-3.8040774011 GRS(88)=3.7989407181
>  ss= 0.9993521740 GRS(88)=3.7989407181 GRS(89)=-3.7964796656
>  ss= 0.9995241475 GRS(89)=-3.7964796656 GRS(90)=3.7946731011
>  ss= 0.9996635044 GRS(90)=3.7946731011 GRS(91)=-3.7933962102
>  ss= 0.9990514354 GRS(91)=-3.7933962102 GRS(92)=3.7897979287
>  ss= 0.9999406006 GRS(92)=3.7897979287 GRS(93)=-3.7895728170
>  ss= 0.9989296448 GRS(93)=-3.7895728170 GRS(94)=3.7855166280
>  ss= 0.9998479921 GRS(94)=3.7855166280 GRS(95)=-3.7849411995
>  ss= 0.9990064433 GRS(95)=-3.7849411995 GRS(96)=3.7811806459
>  ss= 0.9993703821 GRS(96)=3.7811806459 GRS(97)=-3.7787999470
>  ss= 0.9999839199 GRS(97)=-3.7787999470 GRS(98)=3.7787391836
>  ss= 0.9990449258 GRS(98)=3.7787391836 GRS(99)=-3.7751302072
>  ss= 0.9962938985 GRS(99)=-3.7751302072 GRS(100)=3.7611391914
>  ss= 0.9987838060 GRS(100)=3.7611391914 GRS(101)=-3.7565649166
>  ss= 0.9975838636 GRS(101)=-3.7565649166 GRS(102)=3.7474885433
>  ss= 0.9997294989 GRS(102)=3.7474885433 GRS(103)=-3.7464748436
>  ss= 0.9989983184 GRS(103)=-3.7464748436 GRS(104)=3.7427220687
>  ss= 0.9989125566 GRS(104)=3.7427220687 GRS(105)=-3.7386520702
>  ss= 0.9957203280 GRS(105)=-3.7386520702 GRS(106)=3.7226518658
>  ss= 0.9991932085 GRS(106)=3.7226518658 GRS(107)=-3.7196484618
>  ss= 0.9997653132 GRS(107)=-3.7196484618 GRS(108)=3.7187755093
>  ss= 0.9992213997 GRS(108)=3.7187755093 GRS(109)=-3.7158800696
>  ss= 0.9952900827 GRS(109)=-3.7158800696 GRS(110)=3.6983785818
>  ss= 0.9985549601 GRS(110)=3.6983785818 GRS(111)=-3.6930342772
>  ss= 0.9994399191 GRS(111)=-3.6930342772 GRS(112)=3.6909658793
>  ss= 0.9983418456 GRS(112)=3.6909658793 GRS(113)=-3.6848456880
>  ss= 0.9994158990 GRS(113)=-3.6848456880 GRS(114)=3.6826933658
>  ss= 0.9982195750 GRS(114)=3.6826933658 GRS(115)=-3.6761366063
>  ss= 0.9991707575 GRS(115)=-3.6761366063 GRS(116)=3.6730881974
>  ss= 0.9987102098 GRS(116)=3.6730881974 GRS(117)=-3.6683506843
>  ss= 0.9986401624 GRS(117)=-3.6683506843 GRS(118)=3.6633623233
>  ss= 0.9987924884 GRS(118)=3.6633623233 GRS(119)=-3.6589387709
>  ss= 0.9984336194 GRS(119)=-3.6589387709 GRS(120)=3.6532074803
>  ss= 0.9993482297 GRS(120)=3.6532074803 GRS(121)=-3.6508264281
>  ss= 0.9971928551 GRS(121)=-3.6508264281 GRS(122)=3.6405780295
>  ss= 0.9994387455 GRS(122)=3.6405780295 GRS(123)=-3.6385347389
>  ss= 0.9942118921 GRS(123)=-3.6385347389 GRS(124)=3.6174745074
>  ss= 0.9986868853 GRS(124)=3.6174745074 GRS(125)=-3.6127243485
>  ss= 0.9988696701 GRS(125)=-3.6127243485 GRS(126)=3.6086407783
>  ss= 0.9996170184 GRS(126)=3.6086407783 GRS(127)=-3.6072587352
>  ss= 0.9990788944 GRS(127)=-3.6072587352 GRS(128)=3.6039360689
>  ss= 0.9993388154 GRS(128)=3.6039360689 GRS(129)=-3.6015532017
>  ss= 0.9990718332 GRS(129)=-3.6015532017 GRS(130)=3.5982103596
>  ss= 0.9976971136 GRS(130)=3.5982103596 GRS(131)=-3.5899240897
>  ss= 0.9992135898 GRS(131)=-3.5899240897 GRS(132)=3.5871009367
>  ss= 0.9954909155 GRS(132)=3.5871009367 GRS(133)=-3.5709263955
>  ss= 0.9991680969 GRS(133)=-3.5709263955 GRS(134)=3.5679557307
>  ss= 0.9983786001 GRS(134)=3.5679557307 GRS(135)=-3.5621706476
>  ss= 0.9996053240 GRS(135)=-3.5621706476 GRS(136)=3.5607647445
>  ss= 0.9993398136 GRS(136)=3.5607647445 GRS(137)=-3.5584139761
>  ss= 0.9991764202 GRS(137)=-3.5584139761 GRS(138)=3.5554833384
>  ss= 0.9998724792 GRS(138)=3.5554833384 GRS(139)=-3.5550299405
>  ss= 0.9993082128 GRS(139)=-3.5550299405 GRS(140)=3.5525706164
>  ss= 0.9999423075 GRS(140)=3.5525706164 GRS(141)=-3.5523656597
>  ss= 0.9941184498 GRS(141)=-3.5523656597 GRS(142)=3.5314722428
>  ss= 0.9991938940 GRS(142)=3.5314722428 GRS(143)=-3.5286255020
>  ss= 0.9992458884 GRS(143)=-3.5286255020 GRS(144)=3.5259645247
>  ss= 0.9997983606 GRS(144)=3.5259645247 GRS(145)=-3.5252535512
>  ss= 0.9994699733 GRS(145)=-3.5252535512 GRS(146)=3.5233850726
>  ss= 0.9951672976 GRS(146)=3.5233850726 GRS(147)=-3.5063576013
>  ss= 0.9993936705 GRS(147)=-3.5063576013 GRS(148)=3.5042315933
>  ss= 0.9974660116 GRS(148)=3.5042315933 GRS(149)=-3.4953519110
>  ss= 0.9991638024 GRS(149)=-3.4953519110 GRS(150)=3.4924291060
>  ss= 0.9997542087 GRS(150)=3.4924291060 GRS(151)=-3.4915706973
>  ss= 0.9988871428 GRS(151)=-3.4915706973 GRS(152)=3.4876850776
>  ss= 0.9995866915 GRS(152)=3.4876850776 GRS(153)=-3.4862435878
>  ss= 0.9988659428 GRS(153)=-3.4862435878 GRS(154)=3.4822899881
>  ss= 0.9995188309 GRS(154)=3.4822899881 GRS(155)=-3.4806144176
>  ss= 0.9989542490 GRS(155)=-3.4806144176 GRS(156)=3.4769745615
>  ss= 0.9993750477 GRS(156)=3.4769745615 GRS(157)=-3.4748016183
>  ss= 0.9990318039 GRS(157)=-3.4748016183 GRS(158)=3.4714373291
>  ss= 0.9991632994 GRS(158)=3.4714373291 GRS(159)=-3.4685327753
>  ss= 0.9989657278 GRS(159)=-3.4685327753 GRS(160)=3.4649453684
>  ss= 0.9989351160 GRS(160)=3.4649453684 GRS(161)=-3.4612556035
>  ss= 0.9986648427 GRS(161)=-3.4612556035 GRS(162)=3.4566342830
>  ss= 0.9986802695 GRS(162)=3.4566342830 GRS(163)=-3.4520724574
>  ss= 0.9999842765 GRS(163)=-3.4520724574 GRS(164)=3.4520181787
>  ss= 0.9962707501 GRS(164)=3.4520181787 GRS(165)=-3.4391447404
>  ss= 0.9992908775 GRS(165)=-3.4391447404 GRS(166)=3.4367059655
>  ss= 0.9991528773 GRS(166)=3.4367059655 GRS(167)=-3.4337946537
>  ss= 0.9988349701 GRS(167)=-3.4337946537 GRS(168)=3.4297941805
>  ss= 0.9994063998 GRS(168)=3.4297941805 GRS(169)=-3.4277582539
>  ss= 0.9988363765 GRS(169)=-3.4277582539 GRS(170)=3.4237696339
>  ss= 0.9994013434 GRS(170)=3.4237696339 GRS(171)=-3.4217199716
>  ss= 0.9986836331 GRS(171)=-3.4217199716 GRS(172)=3.4172157327
>  ss= 0.9992205093 GRS(172)=3.4172157327 GRS(173)=-3.4145520449
>  ss= 0.9962578169 GRS(173)=-3.4145520449 GRS(174)=3.4017741661
>  ss= 0.9990349573 GRS(174)=3.4017741661 GRS(175)=-3.3984913088
>  ss= 0.9995742569 GRS(175)=-3.3984913088 GRS(176)=3.3970444246
>  ss= 0.9988914825 GRS(176)=3.3970444246 GRS(177)=-3.3932787413
>  ss= 0.9993890209 GRS(177)=-3.3932787413 GRS(178)=3.3912055190
>  ss= 0.9985401554 GRS(178)=3.3912055190 GRS(179)=-3.3862548861
>  ss= 0.9992489307 GRS(179)=-3.3862548861 GRS(180)=3.3837115741
>  ss= 0.9998779270 GRS(180)=3.3837115741 GRS(181)=-3.3832985143
>  ss= 0.9991176810 GRS(181)=-3.3832985143 GRS(182)=3.3803133656
>  ss= 0.9993999107 GRS(182)=3.3803133656 GRS(183)=-3.3782848756
>  ss= 0.9990005881 GRS(183)=-3.3782848756 GRS(184)=3.3749085774
>  ss= 0.9993085669 GRS(184)=3.3749085774 GRS(185)=-3.3725750540
>  ss= 0.9988404869 GRS(185)=-3.3725750540 GRS(186)=3.3686645091
>  ss= 0.9992545135 GRS(186)=3.3686645091 GRS(187)=-3.3661532151
>  ss= 0.9981115878 GRS(187)=-3.3661532151 GRS(188)=3.3597965304
>  ss= 0.9991828961 GRS(188)=3.3597965304 GRS(189)=-3.3570512275
>  ss= 0.9997520639 GRS(189)=-3.3570512275 GRS(190)=3.3562188934
>  ss= 0.9991014452 GRS(190)=3.3562188934 GRS(191)=-3.3532031469
>  ss= 0.9993888343 GRS(191)=-3.3532031469 GRS(192)=3.3511537842
>  ss= 0.9990006788 GRS(192)=3.3511537842 GRS(193)=-3.3478049052
>  ss= 0.9992859047 GRS(193)=-3.3478049052 GRS(194)=3.3454142535
>  ss= 0.9988116463 GRS(194)=3.3454142535 GRS(195)=-3.3414387180
>  ss= 0.9992254629 GRS(195)=-3.3414387180 GRS(196)=3.3388506498
>  ss= 0.9973993121 GRS(196)=3.3388506498 GRS(197)=-3.3301673412
>  ss= 0.9991480513 GRS(197)=-3.3301673412 GRS(198)=3.3273302095
>  ss= 0.9995365296 GRS(198)=3.3273302095 GRS(199)=-3.3257880903
>  ss= 0.9990384042 GRS(199)=-3.3257880903 GRS(200)=3.3225900263
>  ss= 0.9992891727 GRS(200)=3.3225900263 GRS(201)=-3.3202282388
>  ss= 0.9987917588 GRS(201)=-3.3202282388 GRS(202)=3.3162166022
>  ss= 0.9991905749 GRS(202)=3.3162166022 GRS(203)=-3.3135323732
>  ss= 0.9945222759 GRS(203)=-3.3135323732 GRS(204)=3.2953817569
>  ss= 0.9990617879 GRS(204)=3.2953817569 GRS(205)=-3.2922899900
>  ss= 0.9992534719 GRS(205)=-3.2922899900 GRS(206)=3.2898322031
>  ss= 0.9961647199 GRS(206)=3.2898322031 GRS(207)=-3.2772147750
>  ss= 0.9990875122 GRS(207)=-3.2772147750 GRS(208)=3.2742243567
>  ss= 0.9991667318 GRS(208)=3.2742243567 GRS(209)=-3.2714960497
>  ss= 0.9994417361 GRS(209)=-3.2714960497 GRS(210)=3.2696696916
>  ss= 0.9987831388 GRS(210)=3.2696696916 GRS(211)=-3.2656909574
>  ss= 0.9991318601 GRS(211)=-3.2656909574 GRS(212)=3.2628558808
>  ss= 0.9992048809 GRS(212)=3.2628558808 GRS(213)=-3.2602615216
>  ss= 0.9999977715 GRS(213)=-3.2602615216 GRS(214)=3.2602542561
>  ss= 0.9990704168 GRS(214)=3.2602542561 GRS(215)=-3.2572235785
>  ss= 0.9990871785 GRS(215)=-3.2572235785 GRS(216)=3.2542503146
>  ss= 0.9992241431 GRS(216)=3.2542503146 GRS(217)=-3.2517254820
>  ss= 0.9992341296 GRS(217)=-3.2517254820 GRS(218)=3.2492350815
>  ss= 0.9991332641 GRS(218)=3.2492350815 GRS(219)=-3.2464188528
>  ss= 0.9989246373 GRS(219)=-3.2464188528 GRS(220)=3.2429277749
>  ss= 0.9999780993 GRS(220)=3.2429277749 GRS(221)=-3.2428567525
>  ss= 0.9992237669 GRS(221)=-3.2428567525 GRS(222)=3.2403395397
>  ss= 0.9992159985 GRS(222)=3.2403395397 GRS(223)=-3.2377991086
>  ss= 0.9991226513 GRS(223)=-3.2377991086 GRS(224)=3.2349584299
>  ss= 0.9992580461 GRS(224)=3.2349584299 GRS(225)=-3.2325582398
>  ss= 0.9993831678 GRS(225)=-3.2325582398 GRS(226)=3.2305642938
>  ss= 0.9992804217 GRS(226)=3.2305642938 GRS(227)=-3.2282396499
>  ss= 0.9992571243 GRS(227)=-3.2282396499 GRS(228)=3.2258414692
>  ss= 0.9995229352 GRS(228)=3.2258414692 GRS(229)=-3.2243025337
>  ss= 0.9992479966 GRS(229)=-3.2243025337 GRS(230)=3.2218778472
>  ss= 0.9993507731 GRS(230)=3.2218778472 GRS(231)=-3.2197861174
>  ss= 0.9994051829 GRS(231)=-3.2197861174 GRS(232)=3.2178709337
>  ss= 0.9993009431 GRS(232)=3.2178709337 GRS(233)=-3.2156214587
>  ss= 0.9993810026 GRS(233)=-3.2156214587 GRS(234)=3.2136309973
>  ss= 0.9991509825 GRS(234)=3.2136309973 GRS(235)=-3.2109025682
>  ss= 0.9993726158 GRS(235)=-3.2109025682 GRS(236)=3.2088880988
>  ss= 0.9997222850 GRS(236)=3.2088880988 GRS(237)=-3.2079969423
>  ss= 0.9991037372 GRS(237)=-3.2079969423 GRS(238)=3.2051217341
>  ss= 0.9999158291 GRS(238)=3.2051217341 GRS(239)=-3.2048519561
>  ss= 0.9972093241 GRS(239)=-3.2048519561 GRS(240)=3.1959082530
>  ss= 0.9993238488 GRS(240)=3.1959082530 GRS(241)=-3.1937473358
>  ss= 0.9991330096 GRS(241)=-3.1937473358 GRS(242)=3.1909783876
>  ss= 0.9994080319 GRS(242)=3.1909783876 GRS(243)=-3.1890894303
>  ss= 0.9993385501 GRS(243)=-3.1890894303 GRS(244)=3.1869800073
>  ss= 0.9999972375 GRS(244)=3.1869800073 GRS(245)=-3.1869712032
>  ss= 0.9993571022 GRS(245)=-3.1869712032 GRS(246)=3.1849223065
>  ss= 0.9993387979 GRS(246)=3.1849223065 GRS(247)=-3.1828164292
>  ss= 0.9993523547 GRS(247)=-3.1828164292 GRS(248)=3.1807550930
>  ss= 0.9993420252 GRS(248)=3.1807550930 GRS(249)=-3.1786622361
>  ss= 0.9996629413 GRS(249)=-3.1786622361 GRS(250)=3.1775908404
>  ss= 0.9993227720 GRS(250)=3.1775908404 GRS(251)=-3.1754388868
>  ss= 0.9995068178 GRS(251)=-3.1754388868 GRS(252)=3.1738728167
>  ss= 0.9993412723 GRS(252)=3.1738728167 GRS(253)=-3.1717820987
>  ss= 0.9994261036 GRS(253)=-3.1717820987 GRS(254)=3.1699618242
>  ss= 0.9993883498 GRS(254)=3.1699618242 GRS(255)=-3.1680229166
>  ss= 0.9993428349 GRS(255)=-3.1680229166 GRS(256)=3.1659410024
>  ss= 0.9994184780 GRS(256)=3.1659410024 GRS(257)=-3.1640999380
>  ss= 0.9993866187 GRS(257)=-3.1640999380 GRS(258)=3.1621591383
>  ss= 0.9994709376 GRS(258)=3.1621591383 GRS(259)=-3.1604861587
>  ss= 0.9994159200 GRS(259)=-3.1604861587 GRS(260)=3.1586401820
>  ss= 0.9994211964 GRS(260)=3.1586401820 GRS(261)=-3.1568119498
>  ss= 0.9994309409 GRS(261)=-3.1568119498 GRS(262)=3.1550155371
>  ss= 0.9994528128 GRS(262)=3.1550155371 GRS(263)=-3.1532891530
>  ss= 0.9993306065 GRS(263)=-3.1532891530 GRS(264)=3.1511783617
>  ss= 0.9994349559 GRS(264)=3.1511783617 GRS(265)=-3.1493978068
>  ss= 0.9999220100 GRS(265)=-3.1493978068 GRS(266)=3.1491521854
>  ss= 0.9993566767 GRS(266)=3.1491521854 GRS(267)=-3.1471262622
>  ss= 0.9994992300 GRS(267)=-3.1471262622 GRS(268)=3.1455502758
>  ss= 0.9992832754 GRS(268)=3.1455502758 GRS(269)=-3.1432957826
>  ss= 0.9993621556 GRS(269)=-3.1432957826 GRS(270)=3.1412908491
>  ss= 0.9992263941 GRS(270)=3.1412908491 GRS(271)=-3.1388607279
>  ss= 0.9993389644 GRS(271)=-3.1388607279 GRS(272)=3.1367858291
>  ss= 0.9993636102 GRS(272)=3.1367858291 GRS(273)=-3.1347896105
>  ss= 0.9993568803 GRS(273)=-3.1347896105 GRS(274)=3.1327735655
>  ss= 0.9993604667 GRS(274)=3.1327735655 GRS(275)=-3.1307700524
>  ss= 0.9993541678 GRS(275)=-3.1307700524 GRS(276)=3.1287481002
>  ss= 0.9993852133 GRS(276)=3.1287481002 GRS(277)=-3.1268245874
>  ss= 0.9993703101 GRS(277)=-3.1268245874 GRS(278)=3.1248556576
>  ss= 0.9993265741 GRS(278)=3.1248556576 GRS(279)=-3.1227512989
>  ss= 0.9993968034 GRS(279)=-3.1227512989 GRS(280)=3.1208676660
>  ss= 0.9993740036 GRS(280)=3.1208676660 GRS(281)=-3.1189140142
>  ss= 0.9998628189 GRS(281)=-3.1189140142 GRS(282)=3.1184861582
>  ss= 0.9994002598 GRS(282)=3.1184861582 GRS(283)=-3.1166158768
>  ss= 0.9993757232 GRS(283)=-3.1166158768 GRS(284)=3.1146702459
>  ss= 0.9998069433 GRS(284)=3.1146702459 GRS(285)=-3.1140689380
>  ss= 0.9994091675 GRS(285)=-3.1140689380 GRS(286)=3.1122290447
>  ss= 0.9993798463 GRS(286)=3.1122290447 GRS(287)=-3.1102989845
>  ss= 0.9990410297 GRS(287)=-3.1102989845 GRS(288)=3.1073163001
>  ss= 0.9994253213 GRS(288)=3.1073163001 GRS(289)=-3.1055305917
>  ss= 0.9993813198 GRS(289)=-3.1055305917 GRS(290)=3.1036092614
>  ss= 0.9992530678 GRS(290)=3.1036092614 GRS(291)=-3.1012910758
>  ss= 0.9995044672 GRS(291)=-3.1012910758 GRS(292)=3.0997542844
>  ss= 0.9993860763 GRS(292)=3.0997542844 GRS(293)=-3.0978512716
>  ss= 0.9993606501 GRS(293)=-3.0978512716 GRS(294)=3.0958706606
>  ss= 0.9993658646 GRS(294)=3.0958706606 GRS(295)=-3.0939074593
>  ss= 0.9993192768 GRS(295)=-3.0939074593 GRS(296)=3.0918013649
>  ss= 0.9994759923 GRS(296)=3.0918013649 GRS(297)=-3.0901812373
>  ss= 0.9993682724 GRS(297)=-3.0901812373 GRS(298)=3.0882290846
>  ss= 0.9993261365 GRS(298)=3.0882290846 GRS(299)=-3.0861480399
>  ss= 0.9995045375 GRS(299)=-3.0861480399 GRS(300)=3.0846189693
>  ss= 0.9993483698 GRS(300)=3.0846189693 GRS(301)=-3.0826089383
>  ss= 0.9993300737 GRS(301)=-3.0826089383 GRS(302)=3.0805438174
>  ss= 0.9993518109 GRS(302)=3.0805438174 GRS(303)=-3.0785470426
>  ss= 0.9993356624 GRS(303)=-3.0785470426 GRS(304)=3.0765018480
>  ss= 0.9993604538 GRS(304)=3.0765018480 GRS(305)=-3.0745342827
>  ss= 0.9993163946 GRS(305)=-3.0745342827 GRS(306)=3.0724325146
>  ss= 0.9993437039 GRS(306)=3.0724325146 GRS(307)=-3.0704160891
>  ss= 0.9991526250 GRS(307)=-3.0704160891 GRS(308)=3.0678142953
>  ss= 0.9993247846 GRS(308)=3.0678142953 GRS(309)=-3.0657428599
>  ss= 0.9994201749 GRS(309)=-3.0657428599 GRS(310)=3.0639652653
>  ss= 0.9992913585 GRS(310)=3.0639652653 GRS(311)=-3.0617940123
>  ss= 0.9993792182 GRS(311)=-3.0617940123 GRS(312)=3.0598933062
>  ss= 0.9990868152 GRS(312)=3.0598933062 GRS(313)=-3.0570990580
>  ss= 0.9993661564 GRS(313)=-3.0570990580 GRS(314)=3.0551613354
>  ss= 0.9995746355 GRS(314)=3.0551613354 GRS(315)=-3.0538617781
>  ss= 0.9992909820 GRS(315)=-3.0538617781 GRS(316)=3.0516965351
>  ss= 0.9995174353 GRS(316)=3.0516965351 GRS(317)=-3.0502238941
>  ss= 0.9995468513 GRS(317)=-3.0502238941 GRS(318)=3.0488416892
>  ss= 0.9999960894 GRS(318)=3.0488416892 GRS(319)=-3.0488297663
>  ss= 0.9995299833 GRS(319)=-3.0488297663 GRS(320)=3.0473967655
>  ss= 0.9993869847 GRS(320)=3.0473967655 GRS(321)=-3.0455286648
>  ss= 0.9969262118 GRS(321)=-3.0455286648 GRS(322)=3.0361673546
>  ss= 0.9994269862 GRS(322)=3.0361673546 GRS(323)=-3.0344275889
>  ss= 0.9993358123 GRS(323)=-3.0344275889 GRS(324)=3.0324121595
>  ss= 0.9995194150 GRS(324)=3.0324121595 GRS(325)=-3.0309548276
>  ss= 0.9993737671 GRS(325)=-3.0309548276 GRS(326)=3.0290567438
>  ss= 0.9993186418 GRS(326)=3.0290567438 GRS(327)=-3.0269928712
>  ss= 0.9993835771 GRS(327)=-3.0269928712 GRS(328)=3.0251269633
>  ss= 0.9993590702 GRS(328)=3.0251269633 GRS(329)=-3.0231880692
>  ss= 0.9997037403 GRS(329)=-3.0231880692 GRS(330)=3.0222924203
>  ss= 0.9993561904 GRS(330)=3.0222924203 GRS(331)=-3.0203466394
>  ss= 0.9993784183 GRS(331)=-3.0203466394 GRS(332)=3.0184692472
>  ss= 0.9993002817 GRS(332)=3.0184692472 GRS(333)=-3.0163571690
>  ss= 0.9993637662 GRS(333)=-3.0163571690 GRS(334)=3.0144380605
>  ss= 0.9995062884 GRS(334)=3.0144380605 GRS(335)=-3.0129497974
>  ss= 0.9993146857 GRS(335)=-3.0129497974 GRS(336)=3.0108849798
>  ss= 0.9994180850 GRS(336)=3.0108849798 GRS(337)=-3.0091329006
>  ss= 0.9988448398 GRS(337)=-3.0091329006 GRS(338)=3.0056568700
>  ss= 0.9993695263 GRS(338)=3.0056568700 GRS(339)=-3.0037618823
>  ss= 0.9996753233 GRS(339)=-3.0037618823 GRS(340)=3.0027866308
>  ss= 0.9992593698 GRS(340)=3.0027866308 GRS(341)=-3.0005626764
>  ss= 0.9995516107 GRS(341)=-3.0005626764 GRS(342)=2.9992172563
>  ss= 0.9988421528 GRS(342)=2.9992172563 GRS(343)=-2.9957446209
>  ss= 0.9994896854 GRS(343)=-2.9957446209 GRS(344)=2.9942158487
>  ss= 0.9995334412 GRS(344)=2.9942158487 GRS(345)=-2.9928188711
>  ss= 0.9995010777 GRS(345)=-2.9928188711 GRS(346)=2.9913256871
>  ss= 0.9994682910 GRS(346)=2.9913256871 GRS(347)=-2.9897351722
>  ss= 0.9993530559 GRS(347)=-2.9897351722 GRS(348)=2.9878009807
>  ss= 0.9994873794 GRS(348)=2.9878009807 GRS(349)=-2.9862693723
>  ss= 0.9994200506 GRS(349)=-2.9862693723 GRS(350)=2.9845374871
>  ss= 0.9990825480 GRS(350)=2.9845374871 GRS(351)=-2.9817993171
>  ss= 0.9994431179 GRS(351)=-2.9817993171 GRS(352)=2.9801388066
>  ss= 0.9993923854 GRS(352)=2.9801388066 GRS(353)=-2.9783280307
>  ss= 0.9996308745 GRS(353)=-2.9783280307 GRS(354)=2.9772286539
>  ss= 0.9994186741 GRS(354)=2.9772286539 GRS(355)=-2.9754979138
>  ss= 0.9993914987 GRS(355)=-2.9754979138 GRS(356)=2.9736873193
>  ss= 0.9994840367 GRS(356)=2.9736873193 GRS(357)=-2.9721530059
>  ss= 0.9994154374 GRS(357)=-2.9721530059 GRS(358)=2.9704155964
>  ss= 0.9994028270 GRS(358)=2.9704155964 GRS(359)=-2.9686417444
>  ss= 0.9994462895 GRS(359)=-2.9686417444 GRS(360)=2.9669979762
>  ss= 0.9994183126 GRS(360)=2.9669979762 GRS(361)=-2.9652721110
>  ss= 0.9994100620 GRS(361)=-2.9652721110 GRS(362)=2.9635227842
>  ss= 0.9994348253 GRS(362)=2.9635227842 GRS(363)=-2.9618478761
>  ss= 0.9994225680 GRS(363)=-2.9618478761 GRS(364)=2.9601376103
>  ss= 0.9994146048 GRS(364)=2.9601376103 GRS(365)=-2.9584047599
>  ss= 0.9994361762 GRS(365)=-2.9584047599 GRS(366)=2.9567367410
>  ss= 0.9994255562 GRS(366)=2.9567367410 GRS(367)=-2.9550382618
>  ss= 0.9994133788 GRS(367)=-2.9550382618 GRS(368)=2.9533047736
>  ss= 0.9994359414 GRS(368)=2.9533047736 GRS(369)=-2.9516389368
>  ss= 0.9994270259 GRS(369)=-2.9516389368 GRS(370)=2.9499477240
>  ss= 0.9994154864 GRS(370)=2.9499477240 GRS(371)=-2.9482234393
>  ss= 0.9994371249 GRS(371)=-2.9482234393 GRS(372)=2.9465639576
>  ss= 0.9994305019 GRS(372)=2.9465639576 GRS(373)=-2.9448858950
>  ss= 0.9994162432 GRS(373)=-2.9448858950 GRS(374)=2.9431667977
>  ss= 0.9994415671 GRS(374)=2.9431667977 GRS(375)=-2.9415232365
>  ss= 0.9994325367 GRS(375)=-2.9415232365 GRS(376)=2.9398540299
>  ss= 0.9994133030 GRS(376)=2.9398540299 GRS(377)=-2.9381292263
>  ss= 0.9994439229 GRS(377)=-2.9381292263 GRS(378)=2.9364954000
>  ss= 0.9994333716 GRS(378)=2.9364954000 GRS(379)=-2.9348314983
>  ss= 0.9994076337 GRS(379)=-2.9348314983 GRS(380)=2.9330930030
>  ss= 0.9994439768 GRS(380)=2.9330930030 GRS(381)=-2.9314621353
>  ss= 0.9994328745 GRS(381)=-2.9314621353 GRS(382)=2.9297996283
>  ss= 0.9993894478 GRS(382)=2.9297996283 GRS(383)=-2.9280108328
>  ss= 0.9994416553 GRS(383)=-2.9280108328 GRS(384)=2.9263759933
>  ss= 0.9994327591 GRS(384)=2.9263759933 GRS(385)=-2.9247160331
>  ss= 0.9996752211 GRS(385)=-2.9247160331 GRS(386)=2.9237661472
>  ss= 0.9994412103 GRS(386)=2.9237661472 GRS(387)=-2.9221323769
>  ss= 0.9994354005 GRS(387)=-2.9221323769 GRS(388)=2.9204825424
>  ss= 0.9995772094 GRS(388)=2.9204825424 GRS(389)=-2.9192477899
>  ss= 0.9994436575 GRS(389)=-2.9192477899 GRS(390)=2.9176236883
>  ss= 0.9994384983 GRS(390)=2.9176236883 GRS(391)=-2.9159854378
>  ss= 0.9995047771 GRS(391)=-2.9159854378 GRS(392)=2.9145413750
>  ss= 0.9994459262 GRS(392)=2.9145413750 GRS(393)=-2.9129265040
>  ss= 0.9994409367 GRS(393)=-2.9129265040 GRS(394)=2.9112979939
>  ss= 0.9993892623 GRS(394)=2.9112979939 GRS(395)=-2.9095199545
>  ss= 0.9994489008 GRS(395)=-2.9095199545 GRS(396)=2.9079165204
>  ss= 0.9994407083 GRS(396)=2.9079165204 GRS(397)=-2.9062901469
>  ss= 0.9999992564 GRS(397)=-2.9062901469 GRS(398)=2.9062879857
>  ss= 0.9994462376 GRS(398)=2.9062879857 GRS(399)=-2.9046785927
>  ss= 0.9994417929 GRS(399)=-2.9046785927 GRS(400)=2.9030571804
>  ss= 0.9994443601 GRS(400)=2.9030571804 GRS(401)=-2.9014441259
>  ss= 0.9994349489 GRS(401)=-2.9014441259 GRS(402)=2.8998046618
>  ss= 0.9994507091 GRS(402)=2.8998046618 GRS(403)=-2.8982118253
>  ss= 0.9991488880 GRS(403)=-2.8982118253 GRS(404)=2.8957451226
>  ss= 0.9994664648 GRS(404)=2.8957451226 GRS(405)=-2.8942001406
>  ss= 0.9994267905 GRS(405)=-2.8942001406 GRS(406)=2.8925411575
>  ss= 0.9995049185 GRS(406)=2.8925411575 GRS(407)=-2.8911091138
>  ss= 0.9994354727 GRS(407)=-2.8911091138 GRS(408)=2.8894770039
>  ss= 0.9998613315 GRS(408)=2.8894770039 GRS(409)=-2.8890763244
>  ss= 0.9994382314 GRS(409)=-2.8890763244 GRS(410)=2.8874533321
>  ss= 0.9993073995 GRS(410)=2.8874533321 GRS(411)=-2.8854534807
>  ss= 0.9994395437 GRS(411)=-2.8854534807 GRS(412)=2.8838363100
>  ss= 0.9993777033 GRS(412)=2.8838363100 GRS(413)=-2.8820417081
>  ss= 0.9994400705 GRS(413)=-2.8820417081 GRS(414)=2.8804279678
>  ss= 0.9994217500 GRS(414)=2.8804279678 GRS(415)=-2.8787623604
>  ss= 0.9994605371 GRS(415)=-2.8787623604 GRS(416)=2.8772093749
>  ss= 0.9994478522 GRS(416)=2.8772093749 GRS(417)=-2.8756207301
>  ss= 0.9994382054 GRS(417)=-2.8756207301 GRS(418)=2.8740052217
>  ss= 0.9994606002 GRS(418)=2.8740052217 GRS(419)=-2.8724549840
>  ss= 0.9994335068 GRS(419)=-2.8724549840 GRS(420)=2.8708277577
>  ss= 0.9994662513 GRS(420)=2.8708277577 GRS(421)=-2.8692954572
>  ss= 0.9994312466 GRS(421)=-2.8692954572 GRS(422)=2.8676635357
>  ss= 0.9994693955 GRS(422)=2.8676635357 GRS(423)=-2.8661419406
>  ss= 0.9994303104 GRS(423)=-2.8661419406 GRS(424)=2.8645091293
>  ss= 0.9994740117 GRS(424)=2.8645091293 GRS(425)=-2.8630024311
>  ss= 0.9994320424 GRS(425)=-2.8630024311 GRS(426)=2.8613763670
>  ss= 0.9994937902 GRS(426)=2.8613763670 GRS(427)=-2.8599279104
>  ss= 0.9994391154 GRS(427)=-2.8599279104 GRS(428)=2.8583238210
>  ss= 0.9998244039 GRS(428)=2.8583238210 GRS(429)=-2.8578219105
>  ss= 0.9994528308 GRS(429)=-2.8578219105 GRS(430)=2.8562581983
>  ss= 0.9994183073 GRS(430)=2.8562581983 GRS(431)=-2.8545967337
>  ss= 0.9994769203 GRS(431)=-2.8545967337 GRS(432)=2.8531035521
>  ss= 0.9993725204 GRS(432)=2.8531035521 GRS(433)=-2.8513132877
>  ss= 0.9994752875 GRS(433)=-2.8513132877 GRS(434)=2.8498171679
>  ss= 0.9991919106 GRS(434)=2.8498171679 GRS(435)=-2.8475142608
>  ss= 0.9994554057 GRS(435)=-2.8475142608 GRS(436)=2.8459635208
>  ss= 0.9998493353 GRS(436)=2.8459635208 GRS(437)=-2.8455347345
>  ss= 0.9992897356 GRS(437)=-2.8455347345 GRS(438)=2.8435136523
>  ss= 0.9995149138 GRS(438)=2.8435136523 GRS(439)=-2.8421343032
>  ss= 0.9994913111 GRS(439)=-2.8421343032 GRS(440)=2.8406885409
>  ss= 0.9994564266 GRS(440)=2.8406885409 GRS(441)=-2.8391444181
>  ss= 0.9994716723 GRS(441)=-2.8391444181 GRS(442)=2.8376444194
>  ss= 0.9994250562 GRS(442)=2.8376444194 GRS(443)=-2.8360129334
>  ss= 0.9994597350 GRS(443)=-2.8360129334 GRS(444)=2.8344807349
>  ss= 0.9993746578 GRS(444)=2.8344807349 GRS(445)=-2.8327082146
>  ss= 0.9994442346 GRS(445)=-2.8327082146 GRS(446)=2.8311338934
>  ss= 0.9999295844 GRS(446)=2.8311338934 GRS(447)=-2.8309345375
>  ss= 0.9994206614 GRS(447)=-2.8309345375 GRS(448)=2.8292944679
>  ss= 0.9995009941 GRS(448)=2.8292944679 GRS(449)=-2.8278826333
>  ss= 0.9993970511 GRS(449)=-2.8278826333 GRS(450)=2.8261775645
>  ss= 0.9994637400 GRS(450)=2.8261775645 GRS(451)=-2.8246619985
>  ss= 0.9994167795 GRS(451)=-2.8246619985 GRS(452)=2.8230145976
>  ss= 0.9994125541 GRS(452)=2.8230145976 GRS(453)=-2.8213562293
>  ss= 0.9994801773 GRS(453)=-2.8213562293 GRS(454)=2.8198896244
>  ss= 0.9995266347 GRS(454)=2.8198896244 GRS(455)=-2.8185547865
>  ss= 0.9996024419 GRS(455)=-2.8185547865 GRS(456)=2.8174342471
>  ss= 0.9994835605 GRS(456)=2.8174342471 GRS(457)=-2.8159792128
>  ss= 0.9988856001 GRS(457)=-2.8159792128 GRS(458)=2.8128410859
>  ss= 0.9994714230 GRS(458)=2.8128410859 GRS(459)=-2.8113542828
>  ss= 0.9993199998 GRS(459)=-2.8113542828 GRS(460)=2.8094425612
>  ss= 0.9994553524 GRS(460)=2.8094425612 GRS(461)=-2.8079124051
>  ss= 0.9993865791 GRS(461)=-2.8079124051 GRS(462)=2.8061899729
>  ss= 0.9994941828 GRS(462)=2.8061899729 GRS(463)=-2.8047705536
>  ss= 0.9995832541 GRS(463)=-2.8047705536 GRS(464)=2.8036016771
>  ss= 0.9994728529 GRS(464)=2.8036016771 GRS(465)=-2.8021237666
>  ss= 0.9994176712 GRS(465)=-2.8021237666 GRS(466)=2.8004920093
>  ss= 0.9994672136 GRS(466)=2.8004920093 GRS(467)=-2.7989999452
>  ss= 0.9994308286 GRS(467)=-2.7989999452 GRS(468)=2.7974068344
>  ss= 0.9994625246 GRS(468)=2.7974068344 GRS(469)=-2.7959032970
>  ss= 0.9994370966 GRS(469)=-2.7959032970 GRS(470)=2.7943294734
>  ss= 0.9994427972 GRS(470)=2.7943294734 GRS(471)=-2.7927724651
>  ss= 0.9994392836 GRS(471)=-2.7927724651 GRS(472)=2.7912065117
>  ss= 0.9994625470 GRS(472)=2.7912065117 GRS(473)=-2.7897063694
>  ss= 0.9994285596 GRS(473)=-2.7897063694 GRS(474)=2.7881122186
>  ss= 0.9994522292 GRS(474)=2.7881122186 GRS(475)=-2.7865849722
>  ss= 0.9994432728 GRS(475)=-2.7865849722 GRS(476)=2.7850336045
>  ss= 0.9994134803 GRS(476)=2.7850336045 GRS(477)=-2.7834001273
>  ss= 0.9994177564 GRS(477)=-2.7834001273 GRS(478)=2.7817795104
>  ss= 0.9995472340 GRS(478)=2.7817795104 GRS(479)=-2.7805200150
>  ss= 0.9997332345 GRS(479)=-2.7805200150 GRS(480)=2.7797782683
>  ss= 0.9994997631 GRS(480)=2.7797782683 GRS(481)=-2.7783877206
>  ss= 0.9995042222 GRS(481)=-2.7783877206 GRS(482)=2.7770102576
>  ss= 0.9995476122 GRS(482)=2.7770102576 GRS(483)=-2.7757539720
>  ss= 0.9994921444 GRS(483)=-2.7757539720 GRS(484)=2.7743442898
>  ss= 0.9994873122 GRS(484)=2.7743442898 GRS(485)=-2.7729219174
>  ss= 0.9999730391 GRS(485)=-2.7729219174 GRS(486)=2.7728471570
>  ss= 0.9994923417 GRS(486)=2.7728471570 GRS(487)=-2.7714394981
>  ss= 0.9994789142 GRS(487)=-2.7714394981 GRS(488)=2.7699953403
>  ss= 0.9992720272 GRS(488)=2.7699953403 GRS(489)=-2.7679788590
>  ss= 0.9994799968 GRS(489)=-2.7679788590 GRS(490)=2.7665395010
>  ss= 0.9994687697 GRS(490)=2.7665395010 GRS(491)=-2.7650698315
>  ss= 0.9994773622 GRS(491)=-2.7650698315 GRS(492)=2.7636247016
>  ss= 0.9994692161 GRS(492)=2.7636247016 GRS(493)=-2.7621578140
>  ss= 0.9994799462 GRS(493)=-2.7621578140 GRS(494)=2.7607213433
>  ss= 0.9991770779 GRS(494)=2.7607213433 GRS(495)=-2.7584494847
>  ss= 0.9994700180 GRS(495)=-2.7584494847 GRS(496)=2.7569875560
>  ss= 0.9994848689 GRS(496)=2.7569875560 GRS(497)=-2.7555673461
>  ss= 0.9995099823 GRS(497)=-2.7555673461 GRS(498)=2.7542170693
>  ss= 0.9994067694 GRS(498)=2.7542170693 GRS(499)=-2.7525831835
>  ss= 0.9994993553 GRS(499)=-2.7525831835 GRS(500)=2.7512051173
>  ss= 0.9994953347 GRS(500)=2.7512051173 GRS(501)=-2.7498166795
>  ss= 0.9994687424 GRS(501)=-2.7498166795 GRS(502)=2.7483558183
>  ss= 0.9995016204 GRS(502)=2.7483558183 GRS(503)=-2.7469860940
>  ss= 0.9994807365 GRS(503)=-2.7469860940 GRS(504)=2.7455596843
>  ss= 0.9994728852 GRS(504)=2.7455596843 GRS(505)=-2.7441124591
>  ss= 0.9994772929 GRS(505)=-2.7441124591 GRS(506)=2.7426780920
>  ss= 0.9994755747 GRS(506)=2.7426780920 GRS(507)=-2.7412397621
>  ss= 0.9994800621 GRS(507)=-2.7412397621 GRS(508)=2.7398144876
>  ss= 0.9989624525 GRS(508)=2.7398144876 GRS(509)=-2.7369717999
>  ss= 0.9994536459 GRS(509)=-2.7369717999 GRS(510)=2.7354764441
>  ss= 0.9994723305 GRS(510)=2.7354764441 GRS(511)=-2.7340330165
>  ss= 0.9994844897 GRS(511)=-2.7340330165 GRS(512)=2.7326235944
>  ss= 0.9994756251 GRS(512)=2.7326235944 GRS(513)=-2.7311906752
>  ss= 0.9994664488 GRS(513)=-2.7311906752 GRS(514)=2.7297334451
>  ss= 0.9994689458 GRS(514)=2.7297334451 GRS(515)=-2.7282838087
>  ss= 0.9994776639 GRS(515)=-2.7282838087 GRS(516)=2.7268587276
>  ss= 0.9994795503 GRS(516)=2.7268587276 GRS(517)=-2.7254395348
>  ss= 0.9994767193 GRS(517)=-2.7254395348 GRS(518)=2.7240133650
>  ss= 0.9994733901 GRS(518)=2.7240133650 GRS(519)=-2.7225788725
>  ss= 0.9994744144 GRS(519)=-2.7225788725 GRS(520)=2.7211479244
>  ss= 0.9994791983 GRS(520)=2.7211479244 GRS(521)=-2.7197307459
>  ss= 0.9994803647 GRS(521)=-2.7197307459 GRS(522)=2.7183174779
>  ss= 0.9994783137 GRS(522)=2.7183174779 GRS(523)=-2.7168993690
>  ss= 0.9994775802 GRS(523)=-2.7168993690 GRS(524)=2.7154800070
>  ss= 0.9994792642 GRS(524)=2.7154800070 GRS(525)=-2.7140659595
>  ss= 0.9994809563 GRS(525)=-2.7140659595 GRS(526)=2.7126572405
>  ss= 0.9994814491 GRS(526)=2.7126572405 GRS(527)=-2.7112505896
>  ss= 0.9994812712 GRS(527)=-2.7112505896 GRS(528)=2.7098441859
>  ss= 0.9994812986 GRS(528)=2.7098441859 GRS(529)=-2.7084385860
>  ss= 0.9994821262 GRS(529)=-2.7084385860 GRS(530)=2.7070359566
>  ss= 0.9994832602 GRS(530)=2.7070359566 GRS(531)=-2.7056371233
>  ss= 0.9994838017 GRS(531)=-2.7056371233 GRS(532)=2.7042404780
>  ss= 0.9994839049 GRS(532)=2.7042404780 GRS(533)=-2.7028448326
>  ss= 0.9994842695 GRS(533)=-2.7028448326 GRS(534)=2.7014508932
>  ss= 0.9994850175 GRS(534)=2.7014508932 GRS(535)=-2.7000596931
>  ss= 0.9994857425 GRS(535)=-2.7000596931 GRS(536)=2.6986711673
>  ss= 0.9994862214 GRS(536)=2.6986711673 GRS(537)=-2.6972846478
>  ss= 0.9994865976 GRS(537)=-2.6972846478 GRS(538)=2.6958998553
>  ss= 0.9994870841 GRS(538)=2.6958998553 GRS(539)=-2.6945170855
>  ss= 0.9994876964 GRS(539)=-2.6945170855 GRS(540)=2.6931366745
>  ss= 0.9994882853 GRS(540)=2.6931366745 GRS(541)=-2.6917585568
>  ss= 0.9994887772 GRS(541)=-2.6917585568 GRS(542)=2.6903824686
>  ss= 0.9994892462 GRS(542)=2.6903824686 GRS(543)=-2.6890083456
>  ss= 0.9994897650 GRS(543)=-2.6890083456 GRS(544)=2.6876363193
>  ss= 0.9994903157 GRS(544)=2.6876363193 GRS(545)=-2.6862664732
>  ss= 0.9994908504 GRS(545)=-2.6862664732 GRS(546)=2.6848987617
>  ss= 0.9994913574 GRS(546)=2.6848987617 GRS(547)=-2.6835331079
>  ss= 0.9994918585 GRS(547)=-2.6835331079 GRS(548)=2.6821694934
>  ss= 0.9994923736 GRS(548)=2.6821694934 GRS(549)=-2.6808079532
>  ss= 0.9994928989 GRS(549)=-2.6808079532 GRS(550)=2.6794485125
>  ss= 0.9994934180 GRS(550)=2.6794485125 GRS(551)=-2.6780911522
>  ss= 0.9994939260 GRS(551)=-2.6780911522 GRS(552)=2.6767358399
>  ss= 0.9994944323 GRS(552)=2.6767358399 GRS(553)=-2.6753825689
>  ss= 0.9994949438 GRS(553)=-2.6753825689 GRS(554)=2.6740313503
>  ss= 0.9994954571 GRS(554)=2.6740313503 GRS(555)=-2.6726821868
>  ss= 0.9994959667 GRS(555)=-2.6726821868 GRS(556)=2.6713350661
>  ss= 0.9994964726 GRS(556)=2.6713350661 GRS(557)=-2.6699899756
>  ss= 0.9994969778 GRS(557)=-2.6699899756 GRS(558)=2.6686469112
>  ss= 0.9994974838 GRS(558)=2.6686469112 GRS(559)=-2.6673058729
>  ss= 0.9994979894 GRS(559)=-2.6673058729 GRS(560)=2.6659668569
>  ss= 0.9994984930 GRS(560)=2.6659668569 GRS(561)=-2.6646298560
>  ss= 0.9994989950 GRS(561)=-2.6646298560 GRS(562)=2.6632948631
>  ss= 0.9994994962 GRS(562)=2.6632948631 GRS(563)=-2.6619618739
>  ss= 0.9994999969 GRS(563)=-2.6619618739 GRS(564)=2.6606308846
>  ss= 0.9995004966 GRS(564)=2.6606308846 GRS(565)=-2.6593018906
>  ss= 0.9995009952 GRS(565)=-2.6593018906 GRS(566)=2.6579748861
>  ss= 0.9995014925 GRS(566)=2.6579748861 GRS(567)=-2.6566498658
>  ss= 0.9995019890 GRS(567)=-2.6566498658 GRS(568)=2.6553268249
>  ss= 0.9995024846 GRS(568)=2.6553268249 GRS(569)=-2.6540057588
>  ss= 0.9995029792 GRS(569)=-2.6540057588 GRS(570)=2.6526866627
>  ss= 0.9995034728 GRS(570)=2.6526866627 GRS(571)=-2.6513695316
>  ss= 0.9995039653 GRS(571)=-2.6513695316 GRS(572)=2.6500543603
>  ss= 0.9995044569 GRS(572)=2.6500543603 GRS(573)=-2.6487411441
>  ss= 0.9995049476 GRS(573)=-2.6487411441 GRS(574)=2.6474298784
>  ss= 0.9995054373 GRS(574)=2.6474298784 GRS(575)=-2.6461205582
>  ss= 0.9995059259 GRS(575)=-2.6461205582 GRS(576)=2.6448131787
>  ss= 0.9995064137 GRS(576)=2.6448131787 GRS(577)=-2.6435077351
>  ss= 0.9995069004 GRS(577)=-2.6435077351 GRS(578)=2.6422042226
>  ss= 0.9995073863 GRS(578)=2.6422042226 GRS(579)=-2.6409026365
>  ss= 0.9995078711 GRS(579)=-2.6409026365 GRS(580)=2.6396029721
>  ss= 0.9995083550 GRS(580)=2.6396029721 GRS(581)=-2.6383052245
>  ss= 0.9995088380 GRS(581)=-2.6383052245 GRS(582)=2.6370093892
>  ss= 0.9995093200 GRS(582)=2.6370093892 GRS(583)=-2.6357154614
>  ss= 0.9995098011 GRS(583)=-2.6357154614 GRS(584)=2.6344234365
>  ss= 0.9995102812 GRS(584)=2.6344234365 GRS(585)=-2.6331333097
>  ss= 0.9995107604 GRS(585)=-2.6331333097 GRS(586)=2.6318450765
>  ss= 0.9995112386 GRS(586)=2.6318450765 GRS(587)=-2.6305587322
>  ss= 0.9995117159 GRS(587)=-2.6305587322 GRS(588)=2.6292742722
>  ss= 0.9995121923 GRS(588)=2.6292742722 GRS(589)=-2.6279916920
>  ss= 0.9995126677 GRS(589)=-2.6279916920 GRS(590)=2.6267109868
>  ss= 0.9995131423 GRS(590)=2.6267109868 GRS(591)=-2.6254321522
>  ss= 0.9995136159 GRS(591)=-2.6254321522 GRS(592)=2.6241551836
>  ss= 0.9995140885 GRS(592)=2.6241551836 GRS(593)=-2.6228800765
>  ss= 0.9995145603 GRS(593)=-2.6228800765 GRS(594)=2.6216068264
>  ss= 0.9995150311 GRS(594)=2.6216068264 GRS(595)=-2.6203354287
>  ss= 0.9995155011 GRS(595)=-2.6203354287 GRS(596)=2.6190658790
>  ss= 0.9995159701 GRS(596)=2.6190658790 GRS(597)=-2.6177981728
>  ss= 0.9995164382 GRS(597)=-2.6177981728 GRS(598)=2.6165323057
>  ss= 0.9995169054 GRS(598)=2.6165323057 GRS(599)=-2.6152682731
>  ss= 0.9995173717 GRS(599)=-2.6152682731 GRS(600)=2.6140060707
>  ss= 0.9995178371 GRS(600)=2.6140060707 GRS(601)=-2.6127456941
>  ss= 0.9995183017 GRS(601)=-2.6127456941 GRS(602)=2.6114871389
>  ss= 0.9995187653 GRS(602)=2.6114871389 GRS(603)=-2.6102304006
>  ss= 0.9995192280 GRS(603)=-2.6102304006 GRS(604)=2.6089754749
>  ss= 0.9995196899 GRS(604)=2.6089754749 GRS(605)=-2.6077223575
>  ss= 0.9995201508 GRS(605)=-2.6077223575 GRS(606)=2.6064710441
>  ss= 0.9995206109 GRS(606)=2.6064710441 GRS(607)=-2.6052215302
>  ss= 0.9995210701 GRS(607)=-2.6052215302 GRS(608)=2.6039738116
>  ss= 0.9995215284 GRS(608)=2.6039738116 GRS(609)=-2.6027278840
>  ss= 0.9995219858 GRS(609)=-2.6027278840 GRS(610)=2.6014837431
>  ss= 0.9995224424 GRS(610)=2.6014837431 GRS(611)=-2.6002413847
>  ss= 0.9995228980 GRS(611)=-2.6002413847 GRS(612)=2.5990008044
>  ss= 0.9995233529 GRS(612)=2.5990008044 GRS(613)=-2.5977619981
>  ss= 0.9995238068 GRS(613)=-2.5977619981 GRS(614)=2.5965249616
>  ss= 0.9995242599 GRS(614)=2.5965249616 GRS(615)=-2.5952896906
>  ss= 0.9995247121 GRS(615)=-2.5952896906 GRS(616)=2.5940561809
>  ss= 0.9995251635 GRS(616)=2.5940561809 GRS(617)=-2.5928244283
>  ss= 0.9995256140 GRS(617)=-2.5928244283 GRS(618)=2.5915944287
>  ss= 0.9995260637 GRS(618)=2.5915944287 GRS(619)=-2.5903661780
>  ss= 0.9995265125 GRS(619)=-2.5903661780 GRS(620)=2.5891396719
>  ss= 0.9995269604 GRS(620)=2.5891396719 GRS(621)=-2.5879149064
>  ss= 0.9995274075 GRS(621)=-2.5879149064 GRS(622)=2.5866918773
>  ss= 0.9995278538 GRS(622)=2.5866918773 GRS(623)=-2.5854705806
>  ss= 0.9995282992 GRS(623)=-2.5854705806 GRS(624)=2.5842510122
>  ss= 0.9995287438 GRS(624)=2.5842510122 GRS(625)=-2.5830331679
>  ss= 0.9995291876 GRS(625)=-2.5830331679 GRS(626)=2.5818170438
>  ss= 0.9995296305 GRS(626)=2.5818170438 GRS(627)=-2.5806026357
>  ss= 0.9995300726 GRS(627)=-2.5806026357 GRS(628)=2.5793899397
>  ss= 0.9995305138 GRS(628)=2.5793899397 GRS(629)=-2.5781789517
>  ss= 0.9995309542 GRS(629)=-2.5781789517 GRS(630)=2.5769696678
>  ss= 0.9995313938 GRS(630)=2.5769696678 GRS(631)=-2.5757620838
>  ss= 0.9995318326 GRS(631)=-2.5757620838 GRS(632)=2.5745561960
>  ss= 0.9995322705 GRS(632)=2.5745561960 GRS(633)=-2.5733520002
>  ss= 0.9995327077 GRS(633)=-2.5733520002 GRS(634)=2.5721494926
>  ss= 0.9995331440 GRS(634)=2.5721494926 GRS(635)=-2.5709486691
>  ss= 0.9995335795 GRS(635)=-2.5709486691 GRS(636)=2.5697495260
>  ss= 0.9995340142 GRS(636)=2.5697495260 GRS(637)=-2.5685520591
>  ss= 0.9995344481 GRS(637)=-2.5685520591 GRS(638)=2.5673562648
>  ss= 0.9995348811 GRS(638)=2.5673562648 GRS(639)=-2.5661621389
>  ss= 0.9995353134 GRS(639)=-2.5661621389 GRS(640)=2.5649696778
>  ss= 0.9995357449 GRS(640)=2.5649696778 GRS(641)=-2.5637788775
>  ss= 0.9995361755 GRS(641)=-2.5637788775 GRS(642)=2.5625897341
>  ss= 0.9995366054 GRS(642)=2.5625897341 GRS(643)=-2.5614022439
>  ss= 0.9995370345 GRS(643)=-2.5614022439 GRS(644)=2.5602164030
>  ss= 0.9995374628 GRS(644)=2.5602164030 GRS(645)=-2.5590322076
>  ss= 0.9995378902 GRS(645)=-2.5590322076 GRS(646)=2.5578496538
>  ss= 0.9995383169 GRS(646)=2.5578496538 GRS(647)=-2.5566687380
>  ss= 0.9995387428 GRS(647)=-2.5566687380 GRS(648)=2.5554894562
>  ss= 0.9995391680 GRS(648)=2.5554894562 GRS(649)=-2.5543118048
>  ss= 0.9995395923 GRS(649)=-2.5543118048 GRS(650)=2.5531357800
>  ss= 0.9995400159 GRS(650)=2.5531357800 GRS(651)=-2.5519613781
>  ss= 0.9995404387 GRS(651)=-2.5519613781 GRS(652)=2.5507885953
>  ss= 0.9995408607 GRS(652)=2.5507885953 GRS(653)=-2.5496174279
>  ss= 0.9995412819 GRS(653)=-2.5496174279 GRS(654)=2.5484478723
>  ss= 0.9995417024 GRS(654)=2.5484478723 GRS(655)=-2.5472799246
>  ss= 0.9995421220 GRS(655)=-2.5472799246 GRS(656)=2.5461135813
>  ss= 0.9995425410 GRS(656)=2.5461135813 GRS(657)=-2.5449488386
>  ss= 0.9995429591 GRS(657)=-2.5449488386 GRS(658)=2.5437856929
>  ss= 0.9995433765 GRS(658)=2.5437856929 GRS(659)=-2.5426241406
>  ss= 0.9995437931 GRS(659)=-2.5426241406 GRS(660)=2.5414641780
>  ss= 0.9995442090 GRS(660)=2.5414641780 GRS(661)=-2.5403058016
>  ss= 0.9995446241 GRS(661)=-2.5403058016 GRS(662)=2.5391490076
>  ss= 0.9995450385 GRS(662)=2.5391490076 GRS(663)=-2.5379937925
>  ss= 0.9995454521 GRS(663)=-2.5379937925 GRS(664)=2.5368401527
>  ss= 0.9995458649 GRS(664)=2.5368401527 GRS(665)=-2.5356880846
>  ss= 0.9995462770 GRS(665)=-2.5356880846 GRS(666)=2.5345375847
>  ss= 0.9995466884 GRS(666)=2.5345375847 GRS(667)=-2.5333886494
>  ss= 0.9995470990 GRS(667)=-2.5333886494 GRS(668)=2.5322412752
>  ss= 0.9995475089 GRS(668)=2.5322412752 GRS(669)=-2.5310954584
>  ss= 0.9995479180 GRS(669)=-2.5310954584 GRS(670)=2.5299511957
>  ss= 0.9995483264 GRS(670)=2.5299511957 GRS(671)=-2.5288084835
>  ss= 0.9995487340 GRS(671)=-2.5288084835 GRS(672)=2.5276673183
>  ss= 0.9995491409 GRS(672)=2.5276673183 GRS(673)=-2.5265276966
>  ss= 0.9995495471 GRS(673)=-2.5265276966 GRS(674)=2.5253896150
>  ss= 0.9995499526 GRS(674)=2.5253896150 GRS(675)=-2.5242530699
>  ss= 0.9995503573 GRS(675)=-2.5242530699 GRS(676)=2.5231180579
>  ss= 0.9995507613 GRS(676)=2.5231180579 GRS(677)=-2.5219845756
>  ss= 0.9995511646 GRS(677)=-2.5219845756 GRS(678)=2.5208526196
>  ss= 0.9995515671 GRS(678)=2.5208526196 GRS(679)=-2.5197221864
>  ss= 0.9995519689 GRS(679)=-2.5197221864 GRS(680)=2.5185932726
>  ss= 0.9995523700 GRS(680)=2.5185932726 GRS(681)=-2.5174658747
>  ss= 0.9995527704 GRS(681)=-2.5174658747 GRS(682)=2.5163399895
>  ss= 0.9995531701 GRS(682)=2.5163399895 GRS(683)=-2.5152156136
>  ss= 0.9995535691 GRS(683)=-2.5152156136 GRS(684)=2.5140927435
>  ss= 0.9995539673 GRS(684)=2.5140927435 GRS(685)=-2.5129713759
>  ss= 0.9995543648 GRS(685)=-2.5129713759 GRS(686)=2.5118515075
>  ss= 0.9995547617 GRS(686)=2.5118515075 GRS(687)=-2.5107331349
>  ss= 0.9995551578 GRS(687)=-2.5107331349 GRS(688)=2.5096162548
>  ss= 0.9995555532 GRS(688)=2.5096162548 GRS(689)=-2.5085008639
>  ss= 0.9995559479 GRS(689)=-2.5085008639 GRS(690)=2.5073869589
>  ss= 0.9995563419 GRS(690)=2.5073869589 GRS(691)=-2.5062745364
>  ss= 0.9995567352 GRS(691)=-2.5062745364 GRS(692)=2.5051635932
>  ss= 0.9995571279 GRS(692)=2.5051635932 GRS(693)=-2.5040541261
>  ss= 0.9995575198 GRS(693)=-2.5040541261 GRS(694)=2.5029461317
>  ss= 0.9995579110 GRS(694)=2.5029461317 GRS(695)=-2.5018396068
>  ss= 0.9995583016 GRS(695)=-2.5018396068 GRS(696)=2.5007345481
>  ss= 0.9995586914 GRS(696)=2.5007345481 GRS(697)=-2.4996309525
>  ss= 0.9995590806 GRS(697)=-2.4996309525 GRS(698)=2.4985288167
>  ss= 0.9995594691 GRS(698)=2.4985288167 GRS(699)=-2.4974281374
>  ss= 0.9995598568 GRS(699)=-2.4974281374 GRS(700)=2.4963289115
>  ss= 0.9995602440 GRS(700)=2.4963289115 GRS(701)=-2.4952311358
>  ss= 0.9995606304 GRS(701)=-2.4952311358 GRS(702)=2.4941348070
>  ss= 0.9995610161 GRS(702)=2.4941348070 GRS(703)=-2.4930399221
>  ss= 0.9995614012 GRS(703)=-2.4930399221 GRS(704)=2.4919464778
>  ss= 0.9995617856 GRS(704)=2.4919464778 GRS(705)=-2.4908544710
>  ss= 0.9995621693 GRS(705)=-2.4908544710 GRS(706)=2.4897638986
>  ss= 0.9995625524 GRS(706)=2.4897638986 GRS(707)=-2.4886747573
>  ss= 0.9995629348 GRS(707)=-2.4886747573 GRS(708)=2.4875870442
>  ss= 0.9995633165 GRS(708)=2.4875870442 GRS(709)=-2.4865007560
>  ss= 0.9995636976 GRS(709)=-2.4865007560 GRS(710)=2.4854158896
>  ss= 0.9995640779 GRS(710)=2.4854158896 GRS(711)=-2.4843324420
>  ss= 0.9995644577 GRS(711)=-2.4843324420 GRS(712)=2.4832504101
>  ss= 0.9995648367 GRS(712)=2.4832504101 GRS(713)=-2.4821697907
>  ss= 0.9995652151 GRS(713)=-2.4821697907 GRS(714)=2.4810905808
>  ss= 0.9995655929 GRS(714)=2.4810905808 GRS(715)=-2.4800127774
>  ss= 0.9995659700 GRS(715)=-2.4800127774 GRS(716)=2.4789363774
>  ss= 0.9995663464 GRS(716)=2.4789363774 GRS(717)=-2.4778613778
>  ss= 0.9995667222 GRS(717)=-2.4778613778 GRS(718)=2.4767877755
>  ss= 0.9995670973 GRS(718)=2.4767877755 GRS(719)=-2.4757155674
>  ss= 0.9995674718 GRS(719)=-2.4757155674 GRS(720)=2.4746447507
>  ss= 0.9995678457 GRS(720)=2.4746447507 GRS(721)=-2.4735753222
>  ss= 0.9995682188 GRS(721)=-2.4735753222 GRS(722)=2.4725072790
>  ss= 0.9995685914 GRS(722)=2.4725072790 GRS(723)=-2.4714406181
>  ss= 0.9995689633 GRS(723)=-2.4714406181 GRS(724)=2.4703753365
>  ss= 0.9995693346 GRS(724)=2.4703753365 GRS(725)=-2.4693114312
>  ss= 0.9995697052 GRS(725)=-2.4693114312 GRS(726)=2.4682488993
>  ss= 0.9995700752 GRS(726)=2.4682488993 GRS(727)=-2.4671877379
>  ss= 0.9995704445 GRS(727)=-2.4671877379 GRS(728)=2.4661279439
>  ss= 0.9995708133 GRS(728)=2.4661279439 GRS(729)=-2.4650695145
>  ss= 0.9995711813 GRS(729)=-2.4650695145 GRS(730)=2.4640124467
>  ss= 0.9995715488 GRS(730)=2.4640124467 GRS(731)=-2.4629567376
>  ss= 0.9995719156 GRS(731)=-2.4629567376 GRS(732)=2.4619023843
>  ss= 0.9995722818 GRS(732)=2.4619023843 GRS(733)=-2.4608493839
>  ss= 0.9995726474 GRS(733)=-2.4608493839 GRS(734)=2.4597977335
>  ss= 0.9995730123 GRS(734)=2.4597977335 GRS(735)=-2.4587474302
>  ss= 0.9995733767 GRS(735)=-2.4587474302 GRS(736)=2.4576984712
>  ss= 0.9995737404 GRS(736)=2.4576984712 GRS(737)=-2.4566508536
>  ss= 0.9995741035 GRS(737)=-2.4566508536 GRS(738)=2.4556045745
>  ss= 0.9995744659 GRS(738)=2.4556045745 GRS(739)=-2.4545596311
>  ss= 0.9995748278 GRS(739)=-2.4545596311 GRS(740)=2.4535160205
>  ss= 0.9995751890 GRS(740)=2.4535160205 GRS(741)=-2.4524737399
>  ss= 0.9995755496 GRS(741)=-2.4524737399 GRS(742)=2.4514327866
>  ss= 0.9995759096 GRS(742)=2.4514327866 GRS(743)=-2.4503931576
>  ss= 0.9995762690 GRS(743)=-2.4503931576 GRS(744)=2.4493548501
>  ss= 0.9995766278 GRS(744)=2.4493548501 GRS(745)=-2.4483178615
>  ss= 0.9995769860 GRS(745)=-2.4483178615 GRS(746)=2.4472821888
>  ss= 0.9995773436 GRS(746)=2.4472821888 GRS(747)=-2.4462478293
>  ss= 0.9995777006 GRS(747)=-2.4462478293 GRS(748)=2.4452147803
>  ss= 0.9995780569 GRS(748)=2.4452147803 GRS(749)=-2.4441830389
>  ss= 0.9995784127 GRS(749)=-2.4441830389 GRS(750)=2.4431526024
>  ss= 0.9995787679 GRS(750)=2.4431526024 GRS(751)=-2.4421234681
>  ss= 0.9995791225 GRS(751)=-2.4421234681 GRS(752)=2.4410956332
>  ss= 0.9995794764 GRS(752)=2.4410956332 GRS(753)=-2.4400690950
>  ss= 0.9995798298 GRS(753)=-2.4400690950 GRS(754)=2.4390438507
>  ss= 0.9995801826 GRS(754)=2.4390438507 GRS(755)=-2.4380198977
>  ss= 0.9995805348 GRS(755)=-2.4380198977 GRS(756)=2.4369972333
>  ss= 0.9995808864 GRS(756)=2.4369972333 GRS(757)=-2.4359758546
>  ss= 0.9995812374 GRS(757)=-2.4359758546 GRS(758)=2.4349557592
>  ss= 0.9995815879 GRS(758)=2.4349557592 GRS(759)=-2.4339369441
>  ss= 0.9995819377 GRS(759)=-2.4339369441 GRS(760)=2.4329194069
>  ss= 0.9995822870 GRS(760)=2.4329194069 GRS(761)=-2.4319031448
>  ss= 0.9995826357 GRS(761)=-2.4319031448 GRS(762)=2.4308881551
>  ss= 0.9995829837 GRS(762)=2.4308881551 GRS(763)=-2.4298744353
>  ss= 0.9995833313 GRS(763)=-2.4298744353 GRS(764)=2.4288619825
>  ss= 0.9995836782 GRS(764)=2.4288619825 GRS(765)=-2.4278507944
>  ss= 0.9995840246 GRS(765)=-2.4278507944 GRS(766)=2.4268408681
>  ss= 0.9995843703 GRS(766)=2.4268408681 GRS(767)=-2.4258322010
>  ss= 0.9995847156 GRS(767)=-2.4258322010 GRS(768)=2.4248247906
>  ss= 0.9995850602 GRS(768)=2.4248247906 GRS(769)=-2.4238186343
>  ss= 0.9995854043 GRS(769)=-2.4238186343 GRS(770)=2.4228137294
>  ss= 0.9995857477 GRS(770)=2.4228137294 GRS(771)=-2.4218100734
>  ss= 0.9995860907 GRS(771)=-2.4218100734 GRS(772)=2.4208076636
>  ss= 0.9995864330 GRS(772)=2.4208076636 GRS(773)=-2.4198064975
>  ss= 0.9995867748 GRS(773)=-2.4198064975 GRS(774)=2.4188065725
>  ss= 0.9995871161 GRS(774)=2.4188065725 GRS(775)=-2.4178078861
>  ss= 0.9995874567 GRS(775)=-2.4178078861 GRS(776)=2.4168104357
>  ss= 0.9995877968 GRS(776)=2.4168104357 GRS(777)=-2.4158142188
>  ss= 0.9995881364 GRS(777)=-2.4158142188 GRS(778)=2.4148192328
>  ss= 0.9995884753 GRS(778)=2.4148192328 GRS(779)=-2.4138254751
>  ss= 0.9995888138 GRS(779)=-2.4138254751 GRS(780)=2.4128329433
>  ss= 0.9995891516 GRS(780)=2.4128329433 GRS(781)=-2.4118416349
>  ss= 0.9995894890 GRS(781)=-2.4118416349 GRS(782)=2.4108515473
>  ss= 0.9995898257 GRS(782)=2.4108515473 GRS(783)=-2.4098626780
>  ss= 0.9995901619 GRS(783)=-2.4098626780 GRS(784)=2.4088750245
>  ss= 0.9995904976 GRS(784)=2.4088750245 GRS(785)=-2.4078885844
>  ss= 0.9995908327 GRS(785)=-2.4078885844 GRS(786)=2.4069033551
>  ss= 0.9995911673 GRS(786)=2.4069033551 GRS(787)=-2.4059193342
>  ss= 0.9995915013 GRS(787)=-2.4059193342 GRS(788)=2.4049365193
>  ss= 0.9995918347 GRS(788)=2.4049365193 GRS(789)=-2.4039549078
>  ss= 0.9995921677 GRS(789)=-2.4039549078 GRS(790)=2.4029744972
>  ss= 0.9995925001 GRS(790)=2.4029744972 GRS(791)=-2.4019952853
>  ss= 0.9995928319 GRS(791)=-2.4019952853 GRS(792)=2.4010172694
>  ss= 0.9995931632 GRS(792)=2.4010172694 GRS(793)=-2.4000404472
>  ss= 0.9995934940 GRS(793)=-2.4000404472 GRS(794)=2.3990648163
>  ss= 0.9995938242 GRS(794)=2.3990648163 GRS(795)=-2.3980903742
>  ss= 0.9995941539 GRS(795)=-2.3980903742 GRS(796)=2.3971171186
>  ss= 0.9995944830 GRS(796)=2.3971171186 GRS(797)=-2.3961450469
>  ss= 0.9995948117 GRS(797)=-2.3961450469 GRS(798)=2.3951741569
>  ss= 0.9995951397 GRS(798)=2.3951741569 GRS(799)=-2.3942044461
>  ss= 0.9995954673 GRS(799)=-2.3942044461 GRS(800)=2.3932359121
>  ss= 0.9995957943 GRS(800)=2.3932359121 GRS(801)=-2.3922685526
>  ss= 0.9995961208 GRS(801)=-2.3922685526 GRS(802)=2.3913023651
>  ss= 0.9995964468 GRS(802)=2.3913023651 GRS(803)=-2.3903373474
>  ss= 0.9995967723 GRS(803)=-2.3903373474 GRS(804)=2.3893734971
>  ss= 0.9995970972 GRS(804)=2.3893734971 GRS(805)=-2.3884108118
>  ss= 0.9995974216 GRS(805)=-2.3884108118 GRS(806)=2.3874492891
>  ss= 0.9995977455 GRS(806)=2.3874492891 GRS(807)=-2.3864889268
>  ss= 0.9995980688 GRS(807)=-2.3864889268 GRS(808)=2.3855297225
>  ss= 0.9995983917 GRS(808)=2.3855297225 GRS(809)=-2.3845716738
>  ss= 0.9995987140 GRS(809)=-2.3845716738 GRS(810)=2.3836147785
>  ss= 0.9995990358 GRS(810)=2.3836147785 GRS(811)=-2.3826590343
>  ss= 0.9995993571 GRS(811)=-2.3826590343 GRS(812)=2.3817044388
>  ss= 0.9995996778 GRS(812)=2.3817044388 GRS(813)=-2.3807509897
>  ss= 0.9995999981 GRS(813)=-2.3807509897 GRS(814)=2.3797986848
>  ss= 0.9996003178 GRS(814)=2.3797986848 GRS(815)=-2.3788475217
>  ss= 0.9996006371 GRS(815)=-2.3788475217 GRS(816)=2.3778974982
>  ss= 0.9996009558 GRS(816)=2.3778974982 GRS(817)=-2.3769486120
>  ss= 0.9996012740 GRS(817)=-2.3769486120 GRS(818)=2.3760008608
>  ss= 0.9996015917 GRS(818)=2.3760008608 GRS(819)=-2.3750542425
>  ss= 0.9996019089 GRS(819)=-2.3750542425 GRS(820)=2.3741087546
>  ss= 0.9996022256 GRS(820)=2.3741087546 GRS(821)=-2.3731643950
>  ss= 0.9996025418 GRS(821)=-2.3731643950 GRS(822)=2.3722211615
>  ss= 0.9996028575 GRS(822)=2.3722211615 GRS(823)=-2.3712790517
>  ss= 0.9996031727 GRS(823)=-2.3712790517 GRS(824)=2.3703380635
>  ss= 0.9996034874 GRS(824)=2.3703380635 GRS(825)=-2.3693981946
>  ss= 0.9996038016 GRS(825)=-2.3693981946 GRS(826)=2.3684594429
>  ss= 0.9996041153 GRS(826)=2.3684594429 GRS(827)=-2.3675218061
>  ss= 0.9996044285 GRS(827)=-2.3675218061 GRS(828)=2.3665852820
>  ss= 0.9996047412 GRS(828)=2.3665852820 GRS(829)=-2.3656498684
>  ss= 0.9996050534 GRS(829)=-2.3656498684 GRS(830)=2.3647155631
>  ss= 0.9996053652 GRS(830)=2.3647155631 GRS(831)=-2.3637823639
>  ss= 0.9996056764 GRS(831)=-2.3637823639 GRS(832)=2.3628502687
>  ss= 0.9996059871 GRS(832)=2.3628502687 GRS(833)=-2.3619192752
>  ss= 0.9996062974 GRS(833)=-2.3619192752 GRS(834)=2.3609893814
>  ss= 0.9996066071 GRS(834)=2.3609893814 GRS(835)=-2.3600605850
>  ss= 0.9996069164 GRS(835)=-2.3600605850 GRS(836)=2.3591328839
>  ss= 0.9996072252 GRS(836)=2.3591328839 GRS(837)=-2.3582062759
>  ss= 0.9996075335 GRS(837)=-2.3582062759 GRS(838)=2.3572807589
>  ss= 0.9996078413 GRS(838)=2.3572807589 GRS(839)=-2.3563563308
>  ss= 0.9996081486 GRS(839)=-2.3563563308 GRS(840)=2.3554329893
>  ss= 0.9996084555 GRS(840)=2.3554329893 GRS(841)=-2.3545107325
>  ss= 0.9996087619 GRS(841)=-2.3545107325 GRS(842)=2.3535895581
>  ss= 0.9996090678 GRS(842)=2.3535895581 GRS(843)=-2.3526694641
>  ss= 0.9996093732 GRS(843)=-2.3526694641 GRS(844)=2.3517504483
>  ss= 0.9996096781 GRS(844)=2.3517504483 GRS(845)=-2.3508325087
>  ss= 0.9996099826 GRS(845)=-2.3508325087 GRS(846)=2.3499156431
>  ss= 0.9996102866 GRS(846)=2.3499156431 GRS(847)=-2.3489998494
>  ss= 0.9996105901 GRS(847)=-2.3489998494 GRS(848)=2.3480851256
>  ss= 0.9996108931 GRS(848)=2.3480851256 GRS(849)=-2.3471714695
>  ss= 0.9996111957 GRS(849)=-2.3471714695 GRS(850)=2.3462588792
>  ss= 0.9996114978 GRS(850)=2.3462588792 GRS(851)=-2.3453473525
>  ss= 0.9996117994 GRS(851)=-2.3453473525 GRS(852)=2.3444368874
>  ss= 0.9996121006 GRS(852)=2.3444368874 GRS(853)=-2.3435274817
>  ss= 0.9996124013 GRS(853)=-2.3435274817 GRS(854)=2.3426191336
>  ss= 0.9996127015 GRS(854)=2.3426191336 GRS(855)=-2.3417118408
>  ss= 0.9996130013 GRS(855)=-2.3417118408 GRS(856)=2.3408056014
>  ss= 0.9996133006 GRS(856)=2.3408056014 GRS(857)=-2.3399004133
>  ss= 0.9996135995 GRS(857)=-2.3399004133 GRS(858)=2.3389962745
>  ss= 0.9996138978 GRS(858)=2.3389962745 GRS(859)=-2.3380931830
>  ss= 0.9996141958 GRS(859)=-2.3380931830 GRS(860)=2.3371911367
>  ss= 0.9996144932 GRS(860)=2.3371911367 GRS(861)=-2.3362901337
>  ss= 0.9996147902 GRS(861)=-2.3362901337 GRS(862)=2.3353901719
>  ss= 0.9996150868 GRS(862)=2.3353901719 GRS(863)=-2.3344912493
>  ss= 0.9996153829 GRS(863)=-2.3344912493 GRS(864)=2.3335933639
>  ss= 0.9996156785 GRS(864)=2.3335933639 GRS(865)=-2.3326965138
>  ss= 0.9996159737 GRS(865)=-2.3326965138 GRS(866)=2.3318006969
>  ss= 0.9996162684 GRS(866)=2.3318006969 GRS(867)=-2.3309059113
>  ss= 0.9996165627 GRS(867)=-2.3309059113 GRS(868)=2.3300121549
>  ss= 0.9996168565 GRS(868)=2.3300121549 GRS(869)=-2.3291194259
>  ss= 0.9996171499 GRS(869)=-2.3291194259 GRS(870)=2.3282277222
>  ss= 0.9996174428 GRS(870)=2.3282277222 GRS(871)=-2.3273370419
>  ss= 0.9996177353 GRS(871)=-2.3273370419 GRS(872)=2.3264473830
>  ss= 0.9996180273 GRS(872)=2.3264473830 GRS(873)=-2.3255587436
>  ss= 0.9996183189 GRS(873)=-2.3255587436 GRS(874)=2.3246711217
>  ss= 0.9996186100 GRS(874)=2.3246711217 GRS(875)=-2.3237845155
>  ss= 0.9996189007 GRS(875)=-2.3237845155 GRS(876)=2.3228989228
>  ss= 0.9996191910 GRS(876)=2.3228989228 GRS(877)=-2.3220143419
>  ss= 0.9996194808 GRS(877)=-2.3220143419 GRS(878)=2.3211307708
>  ss= 0.9996197701 GRS(878)=2.3211307708 GRS(879)=-2.3202482076
>  ss= 0.9996200591 GRS(879)=-2.3202482076 GRS(880)=2.3193666503
>  ss= 0.9996203476 GRS(880)=2.3193666503 GRS(881)=-2.3184860971
>  ss= 0.9996206356 GRS(881)=-2.3184860971 GRS(882)=2.3176065461
>  ss= 0.9996209232 GRS(882)=2.3176065461 GRS(883)=-2.3167279953
>  ss= 0.9996212104 GRS(883)=-2.3167279953 GRS(884)=2.3158504428
>  ss= 0.9996214972 GRS(884)=2.3158504428 GRS(885)=-2.3149738869
>  ss= 0.9996217835 GRS(885)=-2.3149738869 GRS(886)=2.3140983255
>  ss= 0.9996220693 GRS(886)=2.3140983255 GRS(887)=-2.3132237568
>  ss= 0.9996223548 GRS(887)=-2.3132237568 GRS(888)=2.3123501789
>  ss= 0.9996226398 GRS(888)=2.3123501789 GRS(889)=-2.3114775900
>  ss= 0.9996229244 GRS(889)=-2.3114775900 GRS(890)=2.3106059882
>  ss= 0.9996232086 GRS(890)=2.3106059882 GRS(891)=-2.3097353717
>  ss= 0.9996234923 GRS(891)=-2.3097353717 GRS(892)=2.3088657385
>  ss= 0.9996237756 GRS(892)=2.3088657385 GRS(893)=-2.3079970869
>  ss= 0.9996240585 GRS(893)=-2.3079970869 GRS(894)=2.3071294149
>  ss= 0.9996243409 GRS(894)=2.3071294149 GRS(895)=-2.3062627208
>  ss= 0.9996246229 GRS(895)=-2.3062627208 GRS(896)=2.3053970027
>  ss= 0.9996249046 GRS(896)=2.3053970027 GRS(897)=-2.3045322587
>  ss= 0.9996251857 GRS(897)=-2.3045322587 GRS(898)=2.3036684872
>  ss= 0.9996254665 GRS(898)=2.3036684872 GRS(899)=-2.3028056861
>  ss= 0.9996257468 GRS(899)=-2.3028056861 GRS(900)=2.3019438538
>  ss= 0.9996260268 GRS(900)=2.3019438538 GRS(901)=-2.3010829884
>  ss= 0.9996263063 GRS(901)=-2.3010829884 GRS(902)=2.3002230881
>  ss= 0.9996265853 GRS(902)=2.3002230881 GRS(903)=-2.2993641511
>  ss= 0.9996268640 GRS(903)=-2.2993641511 GRS(904)=2.2985061756
>  ss= 0.9996271423 GRS(904)=2.2985061756 GRS(905)=-2.2976491598
>  ss= 0.9996274201 GRS(905)=-2.2976491598 GRS(906)=2.2967931019
>  ss= 0.9996276975 GRS(906)=2.2967931019 GRS(907)=-2.2959380002
>  ss= 0.9996279745 GRS(907)=-2.2959380002 GRS(908)=2.2950838528
>  ss= 0.9996282511 GRS(908)=2.2950838528 GRS(909)=-2.2942306580
>  ss= 0.9996285273 GRS(909)=-2.2942306580 GRS(910)=2.2933784140
>  ss= 0.9996288031 GRS(910)=2.2933784140 GRS(911)=-2.2925271191
>  ss= 0.9996290785 GRS(911)=-2.2925271191 GRS(912)=2.2916767714
>  ss= 0.9996293534 GRS(912)=2.2916767714 GRS(913)=-2.2908273693
>  ss= 0.9996296280 GRS(913)=-2.2908273693 GRS(914)=2.2899789110
>  ss= 0.9996299021 GRS(914)=2.2899789110 GRS(915)=-2.2891313947
>  ss= 0.9996301759 GRS(915)=-2.2891313947 GRS(916)=2.2882848187
>  ss= 0.9996304492 GRS(916)=2.2882848187 GRS(917)=-2.2874391813
>  ss= 0.9996307222 GRS(917)=-2.2874391813 GRS(918)=2.2865944807
>  ss= 0.9996309947 GRS(918)=2.2865944807 GRS(919)=-2.2857507152
>  ss= 0.9996312668 GRS(919)=-2.2857507152 GRS(920)=2.2849078830
>  ss= 0.9996315385 GRS(920)=2.2849078830 GRS(921)=-2.2840659826
>  ss= 0.9996318099 GRS(921)=-2.2840659826 GRS(922)=2.2832250120
>  ss= 0.9996320808 GRS(922)=2.2832250120 GRS(923)=-2.2823849697
>  ss= 0.9996323513 GRS(923)=-2.2823849697 GRS(924)=2.2815458539
>  ss= 0.9996326215 GRS(924)=2.2815458539 GRS(925)=-2.2807076629
>  ss= 0.9996328912 GRS(925)=-2.2807076629 GRS(926)=2.2798703951
>  ss= 0.9996331605 GRS(926)=2.2798703951 GRS(927)=-2.2790340487
>  ss= 0.9996334295 GRS(927)=-2.2790340487 GRS(928)=2.2781986220
>  ss= 0.9996336980 GRS(928)=2.2781986220 GRS(929)=-2.2773641134
>  ss= 0.9996339662 GRS(929)=-2.2773641134 GRS(930)=2.2765305211
>  ss= 0.9996342340 GRS(930)=2.2765305211 GRS(931)=-2.2756978435
>  ss= 0.9996345013 GRS(931)=-2.2756978435 GRS(932)=2.2748660790
>  ss= 0.9996347683 GRS(932)=2.2748660790 GRS(933)=-2.2740352258
>  ss= 0.9996350349 GRS(933)=-2.2740352258 GRS(934)=2.2732052824
>  ss= 0.9996353011 GRS(934)=2.2732052824 GRS(935)=-2.2723762470
>  ss= 0.9996355669 GRS(935)=-2.2723762470 GRS(936)=2.2715481179
>  ss= 0.9996358324 GRS(936)=2.2715481179 GRS(937)=-2.2707208936
>  ss= 0.9996360974 GRS(937)=-2.2707208936 GRS(938)=2.2698945724
>  ss= 0.9996363621 GRS(938)=2.2698945724 GRS(939)=-2.2690691526
>  ss= 0.9996366263 GRS(939)=-2.2690691526 GRS(940)=2.2682446326
>  ss= 0.9996368902 GRS(940)=2.2682446326 GRS(941)=-2.2674210108
>  ss= 0.9996371537 GRS(941)=-2.2674210108 GRS(942)=2.2665982856
>  ss= 0.9996374169 GRS(942)=2.2665982856 GRS(943)=-2.2657764552
>  ss= 0.9996376796 GRS(943)=-2.2657764552 GRS(944)=2.2649555182
>  ss= 0.9996379420 GRS(944)=2.2649555182 GRS(945)=-2.2641354728
>  ss= 0.9996382039 GRS(945)=-2.2641354728 GRS(946)=2.2633163175
>  ss= 0.9996384655 GRS(946)=2.2633163175 GRS(947)=-2.2624980507
>  ss= 0.9996387268 GRS(947)=-2.2624980507 GRS(948)=2.2616806707
>  ss= 0.9996389876 GRS(948)=2.2616806707 GRS(949)=-2.2608641760
>  ss= 0.9996392481 GRS(949)=-2.2608641760 GRS(950)=2.2600485649
>  ss= 0.9996395082 GRS(950)=2.2600485649 GRS(951)=-2.2592338359
>  ss= 0.9996397679 GRS(951)=-2.2592338359 GRS(952)=2.2584199874
>  ss= 0.9996400273 GRS(952)=2.2584199874 GRS(953)=-2.2576070177
>  ss= 0.9996402862 GRS(953)=-2.2576070177 GRS(954)=2.2567949254
>  ss= 0.9996405448 GRS(954)=2.2567949254 GRS(955)=-2.2559837088
>  ss= 0.9996408031 GRS(955)=-2.2559837088 GRS(956)=2.2551733664
>  ss= 0.9996410609 GRS(956)=2.2551733664 GRS(957)=-2.2543638965
>  ss= 0.9996413184 GRS(957)=-2.2543638965 GRS(958)=2.2535552977
>  ss= 0.9996415755 GRS(958)=2.2535552977 GRS(959)=-2.2527475683
>  ss= 0.9996418323 GRS(959)=-2.2527475683 GRS(960)=2.2519407069
>  ss= 0.9996420887 GRS(960)=2.2519407069 GRS(961)=-2.2511347118
>  ss= 0.9996423447 GRS(961)=-2.2511347118 GRS(962)=2.2503295815
>  ss= 0.9996426003 GRS(962)=2.2503295815 GRS(963)=-2.2495253145
>  ss= 0.9996428556 GRS(963)=-2.2495253145 GRS(964)=2.2487219092
>  ss= 0.9996431105 GRS(964)=2.2487219092 GRS(965)=-2.2479193640
>  ss= 0.9996433651 GRS(965)=-2.2479193640 GRS(966)=2.2471176775
>  ss= 0.9996436193 GRS(966)=2.2471176775 GRS(967)=-2.2463168482
>  ss= 0.9996438731 GRS(967)=-2.2463168482 GRS(968)=2.2455168744
>  ss= 0.9996441266 GRS(968)=2.2455168744 GRS(969)=-2.2447177547
>  ss= 0.9996443797 GRS(969)=-2.2447177547 GRS(970)=2.2439194875
>  ss= 0.9996446325 GRS(970)=2.2439194875 GRS(971)=-2.2431220714
>  ss= 0.9996448849 GRS(971)=-2.2431220714 GRS(972)=2.2423255048
>  ss= 0.9996451369 GRS(972)=2.2423255048 GRS(973)=-2.2415297862
>  ss= 0.9996453886 GRS(973)=-2.2415297862 GRS(974)=2.2407349141
>  ss= 0.9996456399 GRS(974)=2.2407349141 GRS(975)=-2.2399408871
>  ss= 0.9996458909 GRS(975)=-2.2399408871 GRS(976)=2.2391477035
>  ss= 0.9996461415 GRS(976)=2.2391477035 GRS(977)=-2.2383553620
>  ss= 0.9996463917 GRS(977)=-2.2383553620 GRS(978)=2.2375638610
>  ss= 0.9996466416 GRS(978)=2.2375638610 GRS(979)=-2.2367731991
>  ss= 0.9996468912 GRS(979)=-2.2367731991 GRS(980)=2.2359833747
>  ss= 0.9996471404 GRS(980)=2.2359833747 GRS(981)=-2.2351943865
>  ss= 0.9996473892 GRS(981)=-2.2351943865 GRS(982)=2.2344062328
>  ss= 0.9996476377 GRS(982)=2.2344062328 GRS(983)=-2.2336189123
>  ss= 0.9996478858 GRS(983)=-2.2336189123 GRS(984)=2.2328324235
>  ss= 0.9996481336 GRS(984)=2.2328324235 GRS(985)=-2.2320467649
>  ss= 0.9996483811 GRS(985)=-2.2320467649 GRS(986)=2.2312619350
>  ss= 0.9996486282 GRS(986)=2.2312619350 GRS(987)=-2.2304779324
>  ss= 0.9996488749 GRS(987)=-2.2304779324 GRS(988)=2.2296947557
>  ss= 0.9996491213 GRS(988)=2.2296947557 GRS(989)=-2.2289124034
>  ss= 0.9996493674 GRS(989)=-2.2289124034 GRS(990)=2.2281308741
>  ss= 0.9996496131 GRS(990)=2.2281308741 GRS(991)=-2.2273501663
>  ss= 0.9996498585 GRS(991)=-2.2273501663 GRS(992)=2.2265702785
>  ss= 0.9996501035 GRS(992)=2.2265702785 GRS(993)=-2.2257912094
>  ss= 0.9996503482 GRS(993)=-2.2257912094 GRS(994)=2.2250129575
>  ss= 0.9996505925 GRS(994)=2.2250129575 GRS(995)=-2.2242355213
>  ss= 0.9996508365 GRS(995)=-2.2242355213 GRS(996)=2.2234588996
>  ss= 0.9996510802 GRS(996)=2.2234588996 GRS(997)=-2.2226830907
>  ss= 0.9996513235 GRS(997)=-2.2226830907 GRS(998)=2.2219080934
>  ss= 0.9996515665 GRS(998)=2.2219080934 GRS(999)=-2.2211339062
>  ss= 0.9996518091 GRS(999)=-2.2211339062 GRS(1000)=2.2203605277
>
>


-- 
Lisandro Dalc??n
---------------
Centro Internacional de M??todos Computacionales en Ingenier??a (CIMEC)
Instituto de Desarrollo Tecnol??gico para la Industria Qu??mica (INTEC)
Consejo Nacional de Investigaciones Cient??ficas y T??cnicas (CONICET)
PTLC - G??emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From dalcinl at gmail.com  Tue Feb  5 08:43:32 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 5 Feb 2008 11:43:32 -0300
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A86B5F.5010503@gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>
	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com>
Message-ID: <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>

Ben, some time ago I was doing some testing with PETSc for solving
incompressible NS eqs with fractional step method. I've found that in
our software and hardware setup, the best way to solve the pressure
problem was by using HYPRE BoomerAMG. This preconditioner usually have
some heavy setup, but if your Poison matrix does not change, then the
sucessive solves at each time step are really fast.

If you still want to use a direct method, you should use the
combination '-ksp_type preonly -pc_type lu' (by default, this will
only work on sequential mode, unless you build PETSc with an external
package like MUMPS). This way, PETSc computes the LU factorization
only once, and at each time step, the call to KSPSolve end-up only
doing the triangular solvers.

The nice thing about PETSc is that, if you next realize the
factorization take a long time (as it usually take in big problems),
you can switch BoomerAMG by only passing in the command line
'-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
all, you do not need to change your code. And more, depending on your
problem you can choose the direct solvers or algebraic multigrid as
you want, by simply pass the appropriate combination options in the
command line (or a options file, using the -options_file option).

Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
to know about your experience.

Regards,

On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
> Hi everyone,
>
> I was reading about the topic abt inversing a sparse matrix. I have to
> solve a poisson eqn for my CFD code. Usually, I form a system of linear
> eqns and solve Ax=b. The "A" is always the same and only the "b" changes
> every timestep. Does it mean that if I'm able to get the inverse matrix
> A^(-1), in order to get x at every timestep, I only need to do a simple
> matrix multiplication ie x=A^(-1)*b ?
>
> Hi Timothy, if the above is true, can you email me your Fortran code
> template? I'm also programming in fortran 90. Thank you very much
>
> Regards.
>
> Timothy Stitt wrote:
> > Yes Yujie, I was able to put together a parallel code to invert a
> > large sparse matrix with the help of the PETSc developers. If you need
> > any help or maybe a Fortran code template just let me know.
> >
> > Best,
> >
> > Tim.
> >
> > Waad Subber wrote:
> >> Hi
> >> There was a discussion between Tim Stitt and petsc developers about
> >> matrix inversion, and it was really helpful. That was in last Nov.
> >> You can check the emails archive
> >>
> >> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
> >>
> >>
> >> Waad
> >>
> >> */Yujie <recrusader at gmail.com>/* wrote:
> >>
> >>     what is the difference between sequantial and parallel AIJ matrix?
> >>     Assuming there is a matrix A, if
> >>     I partitaion this matrix into A1, A2, Ai... An.
> >>     A is a parallel AIJ matrix at the whole view, Ai
> >>     is a sequential AIJ matrix? I want to operate Ai at each node.
> >>     In addition, whether is it possible to get general inverse using
> >>     MatMatSolve() if the matrix is not square? Thanks a lot.
> >>
> >>     Regards,
> >>     Yujie
> >>
> >>
> >>     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
> >>     <mailto:bsmith at mcs.anl.gov>> wrote:
> >>
> >>
> >>             For sequential AIJ matrices you can fill the B matrix
> >> with the
> >>         identity and then use
> >>         MatMatSolve().
> >>
> >>             Note since the inverse of a sparse matrix is dense the B
> >>         matrix is
> >>         a SeqDense matrix.
> >>
> >>             Barry
> >>
> >>         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
> >>
> >>         > Hi,
> >>         > Now, I want to inverse a sparse matrix. I have browsed the
> >>         manual,
> >>         > however, I can't find some information. could you give me
> >>         some advice?
> >>         >
> >>         > thanks a lot.
> >>         >
> >>         > Regards,
> >>         > Yujie
> >>         >
> >>
> >>
> >>
> >> ------------------------------------------------------------------------
> >> Looking for last minute shopping deals? Find them fast with Yahoo!
> >> Search.
> >> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping>
> >
> >
> >
> >
>
>


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From recrusader at gmail.com  Tue Feb  5 11:26:35 2008
From: recrusader at gmail.com (Yujie)
Date: Tue, 5 Feb 2008 09:26:35 -0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A85573.4090607@cscs.ch>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>
	 <47A85573.4090607@cscs.ch>
Message-ID: <7ff0ee010802050926q1f13235dgf4ddea8393586493@mail.gmail.com>

Hi, Tim

Thank you for your help. I am really glad to get your help.
According to what you said, if the matrix A has been divided into several
nodes in the cluster, you may use your parallel code to inverse A?
My problem is that what is the distribution of the results?
thanks a lot.

Regards,
Yujie

On 2/5/08, Timothy Stitt <tstitt at cscs.ch> wrote:
>
> Yes Yujie, I was able to put together a parallel code to invert a large
> sparse matrix with the help of the PETSc developers. If you need any
> help or maybe a Fortran code template just let me know.
>
> Best,
>
> Tim.
>
> Waad Subber wrote:
> > Hi
> > There was a discussion between Tim Stitt and petsc developers about
> > matrix inversion, and it was really helpful. That was in last Nov. You
> > can check the emails archive
> >
> >
> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
> >
> > Waad
> >
> > */Yujie <recrusader at gmail.com>/* wrote:
> >
> >     what is the difference between sequantial and parallel AIJ matrix?
> >     Assuming there is a matrix A, if
> >     I partitaion this matrix into A1, A2, Ai... An.
> >     A is a parallel AIJ matrix at the whole view, Ai
> >     is a sequential AIJ matrix? I want to operate Ai at each node.
> >     In addition, whether is it possible to get general inverse using
> >     MatMatSolve() if the matrix is not square? Thanks a lot.
> >
> >     Regards,
> >     Yujie
> >
> >
> >     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
> >     <mailto:bsmith at mcs.anl.gov>> wrote:
> >
> >
> >             For sequential AIJ matrices you can fill the B matrix with
> the
> >         identity and then use
> >         MatMatSolve().
> >
> >             Note since the inverse of a sparse matrix is dense the B
> >         matrix is
> >         a SeqDense matrix.
> >
> >             Barry
> >
> >         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
> >
> >         > Hi,
> >         > Now, I want to inverse a sparse matrix. I have browsed the
> >         manual,
> >         > however, I can't find some information. could you give me
> >         some advice?
> >         >
> >         > thanks a lot.
> >         >
> >         > Regards,
> >         > Yujie
> >         >
> >
> >
> >
> > ------------------------------------------------------------------------
> > Looking for last minute shopping deals? Find them fast with Yahoo!
> > Search.
> > <
> http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping
> >
>
>
>
> --
> Timothy Stitt
> HPC Applications Analyst
>
> Swiss National Supercomputing Centre (CSCS)
> Galleria 2 - Via Cantonale
> CH-6928 Manno, Switzerland
>
> +41 (0) 91 610 8233
> stitt at cscs.ch
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080205/ef5eb69c/attachment.htm>

From recrusader at gmail.com  Tue Feb  5 11:32:52 2008
From: recrusader at gmail.com (Yujie)
Date: Tue, 5 Feb 2008 09:32:52 -0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>
	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com>
	 <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>
Message-ID: <7ff0ee010802050932q5661bcek51be3259a6332550@mail.gmail.com>

Hi, Lisandro

I have tried to use BoomerAMG for my problem. My problem is a set of
elliptic-type linear PDEs. They are strong
coupled. The convergence was bad. I tried to adjust some parameters, the
convergence had some improvements and was always bad.
I have little knowledge about your problem. I have discussed my
problem with Hypre developers,
they told me that if the PDEs are strong coupled, it is difficult to use
BoomerAMG.

Regards,
Yujie

On 2/5/08, Lisandro Dalcin <dalcinl at gmail.com> wrote:
>
> Ben, some time ago I was doing some testing with PETSc for solving
> incompressible NS eqs with fractional step method. I've found that in
> our software and hardware setup, the best way to solve the pressure
> problem was by using HYPRE BoomerAMG. This preconditioner usually have
> some heavy setup, but if your Poison matrix does not change, then the
> sucessive solves at each time step are really fast.
>
> If you still want to use a direct method, you should use the
> combination '-ksp_type preonly -pc_type lu' (by default, this will
> only work on sequential mode, unless you build PETSc with an external
> package like MUMPS). This way, PETSc computes the LU factorization
> only once, and at each time step, the call to KSPSolve end-up only
> doing the triangular solvers.
>
> The nice thing about PETSc is that, if you next realize the
> factorization take a long time (as it usually take in big problems),
> you can switch BoomerAMG by only passing in the command line
> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
> all, you do not need to change your code. And more, depending on your
> problem you can choose the direct solvers or algebraic multigrid as
> you want, by simply pass the appropriate combination options in the
> command line (or a options file, using the -options_file option).
>
> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
> to know about your experience.
>
> Regards,
>
> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
> > Hi everyone,
> >
> > I was reading about the topic abt inversing a sparse matrix. I have to
> > solve a poisson eqn for my CFD code. Usually, I form a system of linear
> > eqns and solve Ax=b. The "A" is always the same and only the "b" changes
> > every timestep. Does it mean that if I'm able to get the inverse matrix
> > A^(-1), in order to get x at every timestep, I only need to do a simple
> > matrix multiplication ie x=A^(-1)*b ?
> >
> > Hi Timothy, if the above is true, can you email me your Fortran code
> > template? I'm also programming in fortran 90. Thank you very much
> >
> > Regards.
> >
> > Timothy Stitt wrote:
> > > Yes Yujie, I was able to put together a parallel code to invert a
> > > large sparse matrix with the help of the PETSc developers. If you need
> > > any help or maybe a Fortran code template just let me know.
> > >
> > > Best,
> > >
> > > Tim.
> > >
> > > Waad Subber wrote:
> > >> Hi
> > >> There was a discussion between Tim Stitt and petsc developers about
> > >> matrix inversion, and it was really helpful. That was in last Nov.
> > >> You can check the emails archive
> > >>
> > >>
> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
> > >>
> > >>
> > >> Waad
> > >>
> > >> */Yujie <recrusader at gmail.com>/* wrote:
> > >>
> > >>     what is the difference between sequantial and parallel AIJ
> matrix?
> > >>     Assuming there is a matrix A, if
> > >>     I partitaion this matrix into A1, A2, Ai... An.
> > >>     A is a parallel AIJ matrix at the whole view, Ai
> > >>     is a sequential AIJ matrix? I want to operate Ai at each node.
> > >>     In addition, whether is it possible to get general inverse using
> > >>     MatMatSolve() if the matrix is not square? Thanks a lot.
> > >>
> > >>     Regards,
> > >>     Yujie
> > >>
> > >>
> > >>     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
> > >>     <mailto:bsmith at mcs.anl.gov>> wrote:
> > >>
> > >>
> > >>             For sequential AIJ matrices you can fill the B matrix
> > >> with the
> > >>         identity and then use
> > >>         MatMatSolve().
> > >>
> > >>             Note since the inverse of a sparse matrix is dense the B
> > >>         matrix is
> > >>         a SeqDense matrix.
> > >>
> > >>             Barry
> > >>
> > >>         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
> > >>
> > >>         > Hi,
> > >>         > Now, I want to inverse a sparse matrix. I have browsed the
> > >>         manual,
> > >>         > however, I can't find some information. could you give me
> > >>         some advice?
> > >>         >
> > >>         > thanks a lot.
> > >>         >
> > >>         > Regards,
> > >>         > Yujie
> > >>         >
> > >>
> > >>
> > >>
> > >>
> ------------------------------------------------------------------------
> > >> Looking for last minute shopping deals? Find them fast with Yahoo!
> > >> Search.
> > >> <
> http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping
> >
> > >
> > >
> > >
> > >
> >
> >
>
>
> --
> Lisandro Dalc?n
> ---------------
> Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
> Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
> Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
> PTLC - G?emes 3450, (3000) Santa Fe, Argentina
> Tel/Fax: +54-(0)342-451.1594
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080205/d54f4472/attachment.htm>

From dalcinl at gmail.com  Tue Feb  5 12:43:06 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Tue, 5 Feb 2008 15:43:06 -0300
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <7ff0ee010802050932q5661bcek51be3259a6332550@mail.gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>
	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com>
	 <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>
	 <7ff0ee010802050932q5661bcek51be3259a6332550@mail.gmail.com>
Message-ID: <e7ba66e40802051043m386fd5bcha825db8464c051b0@mail.gmail.com>

Yujie,

My recommendation was mainly directed to Ben Tay, as he (like me) only
needs to solve a simple scalar elliptic PDE. In your case, as you
said, BoomerAMG is hard to use.

Any way, to put things completelly clear, I believe you should NEVER
try to build and explicit inverse matrix (unless you problem is really
small and perhaps only for the shake of debug something). Instead, you
have to use LU factorization through the combination of options
'-ksp_type preonly -pc_type lu'.


On 2/5/08, Yujie <recrusader at gmail.com> wrote:
> Hi, Lisandro
>
> I have tried to use BoomerAMG for my problem. My problem is a set of
> elliptic-type linear PDEs. They are strong coupled. The convergence
> was bad. I tried to adjust some parameters, the convergence
> had some improvements and was always bad. I have little knowledge
> about your problem. I have discussed my
> problem with Hypre developers,
>  they told me that if the PDEs are strong coupled, it is difficult to use
> BoomerAMG.
>
> Regards,
> Yujie
>
>
> On 2/5/08, Lisandro Dalcin <dalcinl at gmail.com> wrote:
> > Ben, some time ago I was doing some testing with PETSc for solving
> > incompressible NS eqs with fractional step method. I've found that in
> > our software and hardware setup, the best way to solve the pressure
> > problem was by using HYPRE BoomerAMG. This preconditioner usually have
> > some heavy setup, but if your Poison matrix does not change, then the
> > sucessive solves at each time step are really fast.
> >
> > If you still want to use a direct method, you should use the
> > combination '-ksp_type preonly -pc_type lu' (by default, this will
> > only work on sequential mode, unless you build PETSc with an external
> > package like MUMPS). This way, PETSc computes the LU factorization
> > only once, and at each time step, the call to KSPSolve end-up only
> > doing the triangular solvers.
> >
> > The nice thing about PETSc is that, if you next realize the
> > factorization take a long time (as it usually take in big problems),
> > you can switch BoomerAMG by only passing in the command line
> > '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
> > all, you do not need to change your code. And more, depending on your
> > problem you can choose the direct solvers or algebraic multigrid as
> > you want, by simply pass the appropriate combination options in the
> > command line (or a options file, using the -options_file option).
> >
> > Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
> > to know about your experience.
> >
> > Regards,
> >
> > On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
> > > Hi everyone,
> > >
> > > I was reading about the topic abt inversing a sparse matrix. I have to
> > > solve a poisson eqn for my CFD code. Usually, I form a system of linear
> > > eqns and solve Ax=b. The "A" is always the same and only the "b" changes
> > > every timestep. Does it mean that if I'm able to get the inverse matrix
> > > A^(-1), in order to get x at every timestep, I only need to do a simple
> > > matrix multiplication ie x=A^(-1)*b ?
> > >
> > > Hi Timothy, if the above is true, can you email me your Fortran code
> > > template? I'm also programming in fortran 90. Thank you very much
> > >
> > > Regards.
> > >
> > > Timothy Stitt wrote:
> > > > Yes Yujie, I was able to put together a parallel code to invert a
> > > > large sparse matrix with the help of the PETSc developers. If you need
> > > > any help or maybe a Fortran code template just let me know.
> > > >
> > > > Best,
> > > >
> > > > Tim.
> > > >
> > > > Waad Subber wrote:
> > > >> Hi
> > > >> There was a discussion between Tim Stitt and petsc developers about
> > > >> matrix inversion, and it was really helpful. That was in last Nov.
> > > >> You can check the emails archive
> > > >>
> > > >>
> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
> > > >>
> > > >>
> > > >> Waad
> > > >>
> > > >> */Yujie <recrusader at gmail.com>/* wrote:
> > > >>
> > > >>     what is the difference between sequantial and parallel AIJ
> matrix?
> > > >>     Assuming there is a matrix A, if
> > > >>     I partitaion this matrix into A1, A2, Ai... An.
> > > >>     A is a parallel AIJ matrix at the whole view, Ai
> > > >>     is a sequential AIJ matrix? I want to operate Ai at each node.
> > > >>     In addition, whether is it possible to get general inverse using
> > > >>     MatMatSolve() if the matrix is not square? Thanks a lot.
> > > >>
> > > >>     Regards,
> > > >>     Yujie
> > > >>
> > > >>
> > > >>     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
> > > >>     <mailto:bsmith at mcs.anl.gov>> wrote:
> > > >>
> > > >>
> > > >>             For sequential AIJ matrices you can fill the B matrix
> > > >> with the
> > > >>         identity and then use
> > > >>         MatMatSolve().
> > > >>
> > > >>             Note since the inverse of a sparse matrix is dense the B
> > > >>         matrix is
> > > >>         a SeqDense matrix.
> > > >>
> > > >>             Barry
> > > >>
> > > >>         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
> > > >>
> > > >>         > Hi,
> > > >>         > Now, I want to inverse a sparse matrix. I have browsed the
> > > >>         manual,
> > > >>         > however, I can't find some information. could you give me
> > > >>         some advice?
> > > >>         >
> > > >>         > thanks a lot.
> > > >>         >
> > > >>         > Regards,
> > > >>         > Yujie
> > > >>         >
> > > >>
> > > >>
> > > >>
> > > >>
> ------------------------------------------------------------------------
> > > >> Looking for last minute shopping deals? Find them fast with Yahoo!
> > > >> Search.
> > > >>
> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping>
> > > >
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > Lisandro Dalc?n
> > ---------------
> > Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
> > Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
> > Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
> > PTLC - G?emes 3450, (3000) Santa Fe, Argentina
> > Tel/Fax: +54-(0)342-451.1594
> >
> >
>
>


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From pflath at ices.utexas.edu  Tue Feb  5 13:50:51 2008
From: pflath at ices.utexas.edu (Pearl Flath)
Date: Tue, 5 Feb 2008 13:50:51 -0600
Subject: Trouble with DA, multiple degrees of freedom
Message-ID: <d91228d00802051150j1b78d612hd8f926893bef5200@mail.gmail.com>

Dear All,
I have a code where the velocity  (three components) and pressure are all
stored in a distributed array with 4 degrees of freedom per node. I'd like
to take one component of the velocity and multiply it by -1, but I am having
trouble figuring out how to access that. I believe it must involve
DAVecGetArrayDOF or DAVecGetArray, but I haven't managed to get either to
work. I've attached a code fragment where it loads the velocity. Could
someone suggest how to do this or point me to where I can find additional
discussion of this? I've read the users manual on DA already.
Sincerely,
Pearl Flath
ICES, UT Austin
---------------------------------
DACreate3d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_BOX,m,n,p,
              PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,
              4,1,PETSC_NULL,PETSC_NULL,PETSC_NULL,&daV);

  DACreateGlobalVector(daV, &vel);

  // Set the velocity file to read from
  PetscTruth flg ;
  PetscViewer view_u;
  char velocityfile[1024] ;
  PetscOptionsGetString(0,"-velocityfile",velocityfile,1023,&flg);

  PetscViewerBinaryOpen(PETSC_COMM_WORLD, velocityfile,
                        FILE_MODE_READ, &view_u);
  VecLoadIntoVector(view_u, vel);
  PetscViewerDestroy(view_u);
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080205/f67468bf/attachment.htm>

From knepley at gmail.com  Tue Feb  5 13:58:58 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 5 Feb 2008 13:58:58 -0600
Subject: Trouble with DA, multiple degrees of freedom
In-Reply-To: <d91228d00802051150j1b78d612hd8f926893bef5200@mail.gmail.com>
References: <d91228d00802051150j1b78d612hd8f926893bef5200@mail.gmail.com>
Message-ID: <a9f269830802051158i5b7e7b01q73d68c524be3e315@mail.gmail.com>

The easiest thing to do in C is to declare a struct:

typedef struct {
  PetscScalar v[3];
  PetscScalar p;
} Space;

and then cast pointers

  Space ***array;

  DAVecGetArray(da, u, (void *) &array);

     array[k][j][i].v *= -1.0;

  Thanks,

     Matt

On Feb 5, 2008 1:50 PM, Pearl Flath <pflath at ices.utexas.edu> wrote:
> Dear All,
> I have a code where the velocity  (three components) and pressure are all
> stored in a distributed array with 4 degrees of freedom per node. I'd like
> to take one component of the velocity and multiply it by -1, but I am having
> trouble figuring out how to access that. I believe it must involve
> DAVecGetArrayDOF or DAVecGetArray, but I haven't managed to get either to
> work. I've attached a code fragment where it loads the velocity. Could
> someone suggest how to do this or point me to where I can find additional
> discussion of this? I've read the users manual on DA already.
>  Sincerely,
> Pearl Flath
> ICES, UT Austin
>  ---------------------------------
> DACreate3d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_BOX,m,n,p,
>               PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,
>               4,1,PETSC_NULL,PETSC_NULL,PETSC_NULL,&daV);
>
>   DACreateGlobalVector(daV, &vel);
>
>   // Set the velocity file to read from
>   PetscTruth flg ;
>   PetscViewer view_u;
>   char velocityfile[1024] ;
>   PetscOptionsGetString(0,"-velocityfile",velocityfile,1023,&flg);
>
>   PetscViewerBinaryOpen(PETSC_COMM_WORLD, velocityfile,
>                          FILE_MODE_READ, &view_u);
>   VecLoadIntoVector(view_u, vel);
>   PetscViewerDestroy(view_u);
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From bsmith at mcs.anl.gov  Tue Feb  5 15:39:02 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 5 Feb 2008 15:39:02 -0600
Subject: Trouble with DA, multiple degrees of freedom
In-Reply-To: <d91228d00802051150j1b78d612hd8f926893bef5200@mail.gmail.com>
References: <d91228d00802051150j1b78d612hd8f926893bef5200@mail.gmail.com>
Message-ID: <3D23BA07-59A6-4B04-90D5-21C081A8B9BE@mcs.anl.gov>


   VecStrideScale() is the easiest way to do this.

    Barry

On Feb 5, 2008, at 1:50 PM, Pearl Flath wrote:

> Dear All,
> I have a code where the velocity  (three components) and pressure  
> are all stored in a distributed array with 4 degrees of freedom per  
> node. I'd like to take one component of the velocity and multiply it  
> by -1, but I am having trouble figuring out how to access that. I  
> believe it must involve DAVecGetArrayDOF or DAVecGetArray, but I  
> haven't managed to get either to work. I've attached a code fragment  
> where it loads the velocity. Could someone suggest how to do this or  
> point me to where I can find additional discussion of this? I've  
> read the users manual on DA already.
> Sincerely,
> Pearl Flath
> ICES, UT Austin
> ---------------------------------
> DACreate3d(PETSC_COMM_WORLD,DA_NONPERIODIC,DA_STENCIL_BOX,m,n,p,
>               PETSC_DECIDE,PETSC_DECIDE,PETSC_DECIDE,
>               4,1,PETSC_NULL,PETSC_NULL,PETSC_NULL,&daV);
>
>   DACreateGlobalVector(daV, &vel);
>
>   // Set the velocity file to read from
>   PetscTruth flg ;
>   PetscViewer view_u;
>   char velocityfile[1024] ;
>   PetscOptionsGetString(0,"-velocityfile",velocityfile,1023,&flg);
>
>   PetscViewerBinaryOpen(PETSC_COMM_WORLD, velocityfile,
>                         FILE_MODE_READ, &view_u);
>   VecLoadIntoVector(view_u, vel);
>   PetscViewerDestroy(view_u);
>
>


From zonexo at gmail.com  Tue Feb  5 20:04:27 2008
From: zonexo at gmail.com (Ben Tay)
Date: Wed, 06 Feb 2008 10:04:27 +0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>
Message-ID: <47A915AB.9010006@gmail.com>

Hi Lisandro,

I'm using the fractional step mtd to solve the NS eqns as well. I've 
tried the direct mtd and also boomerAMG in solving the poisson eqn. 
Experience shows that for smaller matrix, direct mtd is slightly faster 
but if the matrix increases in size, boomerAMG  is faster. Btw, if I'm 
not wrong, the default solver will be GMRES. I've also tried using the 
"Struct" interface solely under Hypre. It's even faster for big matrix, 
although the improvement doesn't seem to be a lot. I need to do more 
tests to confirm though.

I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a 
while to solve the eqns. I'm wondering if it'll be faster if I get the 
inverse and then do matrix multiplication. Or just calling KSPSolve is 
actually doing something similar and there'll not be any speed 
difference. Hope someone can enlighten...

Thanks!

Lisandro Dalcin wrote:
> Ben, some time ago I was doing some testing with PETSc for solving
> incompressible NS eqs with fractional step method. I've found that in
> our software and hardware setup, the best way to solve the pressure
> problem was by using HYPRE BoomerAMG. This preconditioner usually have
> some heavy setup, but if your Poison matrix does not change, then the
> sucessive solves at each time step are really fast.
>
> If you still want to use a direct method, you should use the
> combination '-ksp_type preonly -pc_type lu' (by default, this will
> only work on sequential mode, unless you build PETSc with an external
> package like MUMPS). This way, PETSc computes the LU factorization
> only once, and at each time step, the call to KSPSolve end-up only
> doing the triangular solvers.
>
> The nice thing about PETSc is that, if you next realize the
> factorization take a long time (as it usually take in big problems),
> you can switch BoomerAMG by only passing in the command line
> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
> all, you do not need to change your code. And more, depending on your
> problem you can choose the direct solvers or algebraic multigrid as
> you want, by simply pass the appropriate combination options in the
> command line (or a options file, using the -options_file option).
>
> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
> to know about your experience.
>
> Regards,
>
> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
>   
>> Hi everyone,
>>
>> I was reading about the topic abt inversing a sparse matrix. I have to
>> solve a poisson eqn for my CFD code. Usually, I form a system of linear
>> eqns and solve Ax=b. The "A" is always the same and only the "b" changes
>> every timestep. Does it mean that if I'm able to get the inverse matrix
>> A^(-1), in order to get x at every timestep, I only need to do a simple
>> matrix multiplication ie x=A^(-1)*b ?
>>
>> Hi Timothy, if the above is true, can you email me your Fortran code
>> template? I'm also programming in fortran 90. Thank you very much
>>
>> Regards.
>>
>> Timothy Stitt wrote:
>>     
>>> Yes Yujie, I was able to put together a parallel code to invert a
>>> large sparse matrix with the help of the PETSc developers. If you need
>>> any help or maybe a Fortran code template just let me know.
>>>
>>> Best,
>>>
>>> Tim.
>>>
>>> Waad Subber wrote:
>>>       
>>>> Hi
>>>> There was a discussion between Tim Stitt and petsc developers about
>>>> matrix inversion, and it was really helpful. That was in last Nov.
>>>> You can check the emails archive
>>>>
>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
>>>>
>>>>
>>>> Waad
>>>>
>>>> */Yujie <recrusader at gmail.com>/* wrote:
>>>>
>>>>     what is the difference between sequantial and parallel AIJ matrix?
>>>>     Assuming there is a matrix A, if
>>>>     I partitaion this matrix into A1, A2, Ai... An.
>>>>     A is a parallel AIJ matrix at the whole view, Ai
>>>>     is a sequential AIJ matrix? I want to operate Ai at each node.
>>>>     In addition, whether is it possible to get general inverse using
>>>>     MatMatSolve() if the matrix is not square? Thanks a lot.
>>>>
>>>>     Regards,
>>>>     Yujie
>>>>
>>>>
>>>>     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>>>>     <mailto:bsmith at mcs.anl.gov>> wrote:
>>>>
>>>>
>>>>             For sequential AIJ matrices you can fill the B matrix
>>>> with the
>>>>         identity and then use
>>>>         MatMatSolve().
>>>>
>>>>             Note since the inverse of a sparse matrix is dense the B
>>>>         matrix is
>>>>         a SeqDense matrix.
>>>>
>>>>             Barry
>>>>
>>>>         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>>>
>>>>         > Hi,
>>>>         > Now, I want to inverse a sparse matrix. I have browsed the
>>>>         manual,
>>>>         > however, I can't find some information. could you give me
>>>>         some advice?
>>>>         >
>>>>         > thanks a lot.
>>>>         >
>>>>         > Regards,
>>>>         > Yujie
>>>>         >
>>>>
>>>>
>>>>
>>>> ------------------------------------------------------------------------
>>>> Looking for last minute shopping deals? Find them fast with Yahoo!
>>>> Search.
>>>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping>
>>>>         
>>>
>>>       
>>     
>
>
>   


From bsmith at mcs.anl.gov  Tue Feb  5 20:16:18 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 5 Feb 2008 20:16:18 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A915AB.9010006@gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com> <47A915AB.9010006@gmail.com>
Message-ID: <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov>


On Feb 5, 2008, at 8:04 PM, Ben Tay wrote:

> Hi Lisandro,
>
> I'm using the fractional step mtd to solve the NS eqns as well. I've  
> tried the direct mtd and also boomerAMG in solving the poisson eqn.  
> Experience shows that for smaller matrix, direct mtd is slightly  
> faster but if the matrix increases in size, boomerAMG  is faster.  
> Btw, if I'm not wrong, the default solver will be GMRES. I've also  
> tried using the "Struct" interface solely under Hypre. It's even  
> faster for big matrix, although the improvement doesn't seem to be a  
> lot. I need to do more tests to confirm though.
>
> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a  
> while to solve the eqns. I'm wondering if it'll be faster if I get  
> the inverse and then do matrix multiplication. Or just calling  
> KSPSolve is actually doing something similar and there'll not be any  
> speed difference. Hope someone can enlighten...
>
> Thanks!
>
    Ben,

      Forming the inverse explicitly will be a complete failure.  
Because it is dense it will have (1400x2000)^2 values and
each multiply will take 2*(1400x2000)^2 floating point operations,  
while boomerAMG should take only O(1400x2000).

      BTW: if this is a constant coefficient Poisson operator with  
Neumann or Dirchelet boundary conditions then
likely a parallel FFT based algorithm would be fastest. Alas we do not  
yet have this in PETSc. It looks like FFTW finally
has an updated MPI version so we need to do the PETSc interface for  
that.


    Barry


> Lisandro Dalcin wrote:
>> Ben, some time ago I was doing some testing with PETSc for solving
>> incompressible NS eqs with fractional step method. I've found that in
>> our software and hardware setup, the best way to solve the pressure
>> problem was by using HYPRE BoomerAMG. This preconditioner usually  
>> have
>> some heavy setup, but if your Poison matrix does not change, then the
>> sucessive solves at each time step are really fast.
>>
>> If you still want to use a direct method, you should use the
>> combination '-ksp_type preonly -pc_type lu' (by default, this will
>> only work on sequential mode, unless you build PETSc with an external
>> package like MUMPS). This way, PETSc computes the LU factorization
>> only once, and at each time step, the call to KSPSolve end-up only
>> doing the triangular solvers.
>>
>> The nice thing about PETSc is that, if you next realize the
>> factorization take a long time (as it usually take in big problems),
>> you can switch BoomerAMG by only passing in the command line
>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
>> all, you do not need to change your code. And more, depending on your
>> problem you can choose the direct solvers or algebraic multigrid as
>> you want, by simply pass the appropriate combination options in the
>> command line (or a options file, using the -options_file option).
>>
>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
>> to know about your experience.
>>
>> Regards,
>>
>> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
>>
>>> Hi everyone,
>>>
>>> I was reading about the topic abt inversing a sparse matrix. I  
>>> have to
>>> solve a poisson eqn for my CFD code. Usually, I form a system of  
>>> linear
>>> eqns and solve Ax=b. The "A" is always the same and only the "b"  
>>> changes
>>> every timestep. Does it mean that if I'm able to get the inverse  
>>> matrix
>>> A^(-1), in order to get x at every timestep, I only need to do a  
>>> simple
>>> matrix multiplication ie x=A^(-1)*b ?
>>>
>>> Hi Timothy, if the above is true, can you email me your Fortran code
>>> template? I'm also programming in fortran 90. Thank you very much
>>>
>>> Regards.
>>>
>>> Timothy Stitt wrote:
>>>
>>>> Yes Yujie, I was able to put together a parallel code to invert a
>>>> large sparse matrix with the help of the PETSc developers. If you  
>>>> need
>>>> any help or maybe a Fortran code template just let me know.
>>>>
>>>> Best,
>>>>
>>>> Tim.
>>>>
>>>> Waad Subber wrote:
>>>>
>>>>> Hi
>>>>> There was a discussion between Tim Stitt and petsc developers  
>>>>> about
>>>>> matrix inversion, and it was really helpful. That was in last Nov.
>>>>> You can check the emails archive
>>>>>
>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
>>>>>
>>>>>
>>>>> Waad
>>>>>
>>>>> */Yujie <recrusader at gmail.com>/* wrote:
>>>>>
>>>>>    what is the difference between sequantial and parallel AIJ  
>>>>> matrix?
>>>>>    Assuming there is a matrix A, if
>>>>>    I partitaion this matrix into A1, A2, Ai... An.
>>>>>    A is a parallel AIJ matrix at the whole view, Ai
>>>>>    is a sequential AIJ matrix? I want to operate Ai at each node.
>>>>>    In addition, whether is it possible to get general inverse  
>>>>> using
>>>>>    MatMatSolve() if the matrix is not square? Thanks a lot.
>>>>>
>>>>>    Regards,
>>>>>    Yujie
>>>>>
>>>>>
>>>>>    On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>>>>>    <mailto:bsmith at mcs.anl.gov>> wrote:
>>>>>
>>>>>
>>>>>            For sequential AIJ matrices you can fill the B matrix
>>>>> with the
>>>>>        identity and then use
>>>>>        MatMatSolve().
>>>>>
>>>>>            Note since the inverse of a sparse matrix is dense  
>>>>> the B
>>>>>        matrix is
>>>>>        a SeqDense matrix.
>>>>>
>>>>>            Barry
>>>>>
>>>>>        On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>>>>
>>>>>        > Hi,
>>>>>        > Now, I want to inverse a sparse matrix. I have browsed  
>>>>> the
>>>>>        manual,
>>>>>        > however, I can't find some information. could you give me
>>>>>        some advice?
>>>>>        >
>>>>>        > thanks a lot.
>>>>>        >
>>>>>        > Regards,
>>>>>        > Yujie
>>>>>        >
>>>>>
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------
>>>>> Looking for last minute shopping deals? Find them fast with Yahoo!
>>>>> Search.
>>>>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping 
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>>
>>
>


From zonexo at gmail.com  Tue Feb  5 20:48:38 2008
From: zonexo at gmail.com (Ben Tay)
Date: Wed, 06 Feb 2008 10:48:38 +0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com> <47A915AB.9010006@gmail.com> <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov>
Message-ID: <47A92006.20108@gmail.com>

Thank you Barry for your enlightenment. I'll just continue to use 
BoomerAMG for the poisson eqn. I'll also check up on FFTW. Last time, I 
recalled that there seemed to be some restrictions for FFT on solving 
poisson eqn. It seems that the grids must be constant in at least 1 
dimension. I wonder if that is true? If that's the case, then it's not 
possible for me to use it, although it's a constant coefficient Poisson 
operator with Neumann or Dirchelet boundary conditions.

thank you.

Barry Smith wrote:
>
> On Feb 5, 2008, at 8:04 PM, Ben Tay wrote:
>
>> Hi Lisandro,
>>
>> I'm using the fractional step mtd to solve the NS eqns as well. I've 
>> tried the direct mtd and also boomerAMG in solving the poisson eqn. 
>> Experience shows that for smaller matrix, direct mtd is slightly 
>> faster but if the matrix increases in size, boomerAMG  is faster. 
>> Btw, if I'm not wrong, the default solver will be GMRES. I've also 
>> tried using the "Struct" interface solely under Hypre. It's even 
>> faster for big matrix, although the improvement doesn't seem to be a 
>> lot. I need to do more tests to confirm though.
>>
>> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a 
>> while to solve the eqns. I'm wondering if it'll be faster if I get 
>> the inverse and then do matrix multiplication. Or just calling 
>> KSPSolve is actually doing something similar and there'll not be any 
>> speed difference. Hope someone can enlighten...
>>
>> Thanks!
>>
>    Ben,
>
>      Forming the inverse explicitly will be a complete failure. 
> Because it is dense it will have (1400x2000)^2 values and
> each multiply will take 2*(1400x2000)^2 floating point operations, 
> while boomerAMG should take only O(1400x2000).
>
>      BTW: if this is a constant coefficient Poisson operator with 
> Neumann or Dirchelet boundary conditions then
> likely a parallel FFT based algorithm would be fastest. Alas we do not 
> yet have this in PETSc. It looks like FFTW finally
> has an updated MPI version so we need to do the PETSc interface for that.
>
>
>    Barry
>
>
>> Lisandro Dalcin wrote:
>>> Ben, some time ago I was doing some testing with PETSc for solving
>>> incompressible NS eqs with fractional step method. I've found that in
>>> our software and hardware setup, the best way to solve the pressure
>>> problem was by using HYPRE BoomerAMG. This preconditioner usually have
>>> some heavy setup, but if your Poison matrix does not change, then the
>>> sucessive solves at each time step are really fast.
>>>
>>> If you still want to use a direct method, you should use the
>>> combination '-ksp_type preonly -pc_type lu' (by default, this will
>>> only work on sequential mode, unless you build PETSc with an external
>>> package like MUMPS). This way, PETSc computes the LU factorization
>>> only once, and at each time step, the call to KSPSolve end-up only
>>> doing the triangular solvers.
>>>
>>> The nice thing about PETSc is that, if you next realize the
>>> factorization take a long time (as it usually take in big problems),
>>> you can switch BoomerAMG by only passing in the command line
>>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
>>> all, you do not need to change your code. And more, depending on your
>>> problem you can choose the direct solvers or algebraic multigrid as
>>> you want, by simply pass the appropriate combination options in the
>>> command line (or a options file, using the -options_file option).
>>>
>>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
>>> to know about your experience.
>>>
>>> Regards,
>>>
>>> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> I was reading about the topic abt inversing a sparse matrix. I have to
>>>> solve a poisson eqn for my CFD code. Usually, I form a system of 
>>>> linear
>>>> eqns and solve Ax=b. The "A" is always the same and only the "b" 
>>>> changes
>>>> every timestep. Does it mean that if I'm able to get the inverse 
>>>> matrix
>>>> A^(-1), in order to get x at every timestep, I only need to do a 
>>>> simple
>>>> matrix multiplication ie x=A^(-1)*b ?
>>>>
>>>> Hi Timothy, if the above is true, can you email me your Fortran code
>>>> template? I'm also programming in fortran 90. Thank you very much
>>>>
>>>> Regards.
>>>>
>>>> Timothy Stitt wrote:
>>>>
>>>>> Yes Yujie, I was able to put together a parallel code to invert a
>>>>> large sparse matrix with the help of the PETSc developers. If you 
>>>>> need
>>>>> any help or maybe a Fortran code template just let me know.
>>>>>
>>>>> Best,
>>>>>
>>>>> Tim.
>>>>>
>>>>> Waad Subber wrote:
>>>>>
>>>>>> Hi
>>>>>> There was a discussion between Tim Stitt and petsc developers about
>>>>>> matrix inversion, and it was really helpful. That was in last Nov.
>>>>>> You can check the emails archive
>>>>>>
>>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html 
>>>>>>
>>>>>>
>>>>>>
>>>>>> Waad
>>>>>>
>>>>>> */Yujie <recrusader at gmail.com>/* wrote:
>>>>>>
>>>>>>    what is the difference between sequantial and parallel AIJ 
>>>>>> matrix?
>>>>>>    Assuming there is a matrix A, if
>>>>>>    I partitaion this matrix into A1, A2, Ai... An.
>>>>>>    A is a parallel AIJ matrix at the whole view, Ai
>>>>>>    is a sequential AIJ matrix? I want to operate Ai at each node.
>>>>>>    In addition, whether is it possible to get general inverse using
>>>>>>    MatMatSolve() if the matrix is not square? Thanks a lot.
>>>>>>
>>>>>>    Regards,
>>>>>>    Yujie
>>>>>>
>>>>>>
>>>>>>    On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>>>>>>    <mailto:bsmith at mcs.anl.gov>> wrote:
>>>>>>
>>>>>>
>>>>>>            For sequential AIJ matrices you can fill the B matrix
>>>>>> with the
>>>>>>        identity and then use
>>>>>>        MatMatSolve().
>>>>>>
>>>>>>            Note since the inverse of a sparse matrix is dense the B
>>>>>>        matrix is
>>>>>>        a SeqDense matrix.
>>>>>>
>>>>>>            Barry
>>>>>>
>>>>>>        On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>>>>>
>>>>>>        > Hi,
>>>>>>        > Now, I want to inverse a sparse matrix. I have browsed the
>>>>>>        manual,
>>>>>>        > however, I can't find some information. could you give me
>>>>>>        some advice?
>>>>>>        >
>>>>>>        > thanks a lot.
>>>>>>        >
>>>>>>        > Regards,
>>>>>>        > Yujie
>>>>>>        >
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------ 
>>>>>>
>>>>>> Looking for last minute shopping deals? Find them fast with Yahoo!
>>>>>> Search.
>>>>>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping> 
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>
>
>


From bsmith at mcs.anl.gov  Tue Feb  5 21:21:46 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 5 Feb 2008 21:21:46 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A92006.20108@gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com> <47A915AB.9010006@gmail.com> <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> <47A92006.20108@gmail.com>
Message-ID: <79724F06-DB17-4794-9539-E07A37DC469F@mcs.anl.gov>


On Feb 5, 2008, at 8:48 PM, Ben Tay wrote:

> Thank you Barry for your enlightenment. I'll just continue to use  
> BoomerAMG for the poisson eqn. I'll also check up on FFTW. Last  
> time, I recalled that there seemed to be some restrictions for FFT  
> on solving poisson eqn. It seems that the grids must be constant in  
> at least 1 dimension.

    Yes. Then it decouples into a bunch of tridiagonal solves;  
Basically if you can do separation of variables you can
use FFTs.

    Barry

> I wonder if that is true? If that's the case, then it's not possible  
> for me to use it, although it's a constant coefficient Poisson  
> operator with Neumann or Dirchelet boundary conditions.
>
> thank you.
>
> Barry Smith wrote:
>>
>> On Feb 5, 2008, at 8:04 PM, Ben Tay wrote:
>>
>>> Hi Lisandro,
>>>
>>> I'm using the fractional step mtd to solve the NS eqns as well.  
>>> I've tried the direct mtd and also boomerAMG in solving the  
>>> poisson eqn. Experience shows that for smaller matrix, direct mtd  
>>> is slightly faster but if the matrix increases in size, boomerAMG   
>>> is faster. Btw, if I'm not wrong, the default solver will be  
>>> GMRES. I've also tried using the "Struct" interface solely under  
>>> Hypre. It's even faster for big matrix, although the improvement  
>>> doesn't seem to be a lot. I need to do more tests to confirm though.
>>>
>>> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite  
>>> a while to solve the eqns. I'm wondering if it'll be faster if I  
>>> get the inverse and then do matrix multiplication. Or just calling  
>>> KSPSolve is actually doing something similar and there'll not be  
>>> any speed difference. Hope someone can enlighten...
>>>
>>> Thanks!
>>>
>>   Ben,
>>
>>     Forming the inverse explicitly will be a complete failure.  
>> Because it is dense it will have (1400x2000)^2 values and
>> each multiply will take 2*(1400x2000)^2 floating point operations,  
>> while boomerAMG should take only O(1400x2000).
>>
>>     BTW: if this is a constant coefficient Poisson operator with  
>> Neumann or Dirchelet boundary conditions then
>> likely a parallel FFT based algorithm would be fastest. Alas we do  
>> not yet have this in PETSc. It looks like FFTW finally
>> has an updated MPI version so we need to do the PETSc interface for  
>> that.
>>
>>
>>   Barry
>>
>>
>>> Lisandro Dalcin wrote:
>>>> Ben, some time ago I was doing some testing with PETSc for solving
>>>> incompressible NS eqs with fractional step method. I've found  
>>>> that in
>>>> our software and hardware setup, the best way to solve the pressure
>>>> problem was by using HYPRE BoomerAMG. This preconditioner usually  
>>>> have
>>>> some heavy setup, but if your Poison matrix does not change, then  
>>>> the
>>>> sucessive solves at each time step are really fast.
>>>>
>>>> If you still want to use a direct method, you should use the
>>>> combination '-ksp_type preonly -pc_type lu' (by default, this will
>>>> only work on sequential mode, unless you build PETSc with an  
>>>> external
>>>> package like MUMPS). This way, PETSc computes the LU factorization
>>>> only once, and at each time step, the call to KSPSolve end-up only
>>>> doing the triangular solvers.
>>>>
>>>> The nice thing about PETSc is that, if you next realize the
>>>> factorization take a long time (as it usually take in big  
>>>> problems),
>>>> you can switch BoomerAMG by only passing in the command line
>>>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
>>>> all, you do not need to change your code. And more, depending on  
>>>> your
>>>> problem you can choose the direct solvers or algebraic multigrid as
>>>> you want, by simply pass the appropriate combination options in the
>>>> command line (or a options file, using the -options_file option).
>>>>
>>>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would  
>>>> like
>>>> to know about your experience.
>>>>
>>>> Regards,
>>>>
>>>> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> I was reading about the topic abt inversing a sparse matrix. I  
>>>>> have to
>>>>> solve a poisson eqn for my CFD code. Usually, I form a system of  
>>>>> linear
>>>>> eqns and solve Ax=b. The "A" is always the same and only the "b"  
>>>>> changes
>>>>> every timestep. Does it mean that if I'm able to get the inverse  
>>>>> matrix
>>>>> A^(-1), in order to get x at every timestep, I only need to do a  
>>>>> simple
>>>>> matrix multiplication ie x=A^(-1)*b ?
>>>>>
>>>>> Hi Timothy, if the above is true, can you email me your Fortran  
>>>>> code
>>>>> template? I'm also programming in fortran 90. Thank you very much
>>>>>
>>>>> Regards.
>>>>>
>>>>> Timothy Stitt wrote:
>>>>>
>>>>>> Yes Yujie, I was able to put together a parallel code to invert a
>>>>>> large sparse matrix with the help of the PETSc developers. If  
>>>>>> you need
>>>>>> any help or maybe a Fortran code template just let me know.
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Tim.
>>>>>>
>>>>>> Waad Subber wrote:
>>>>>>
>>>>>>> Hi
>>>>>>> There was a discussion between Tim Stitt and petsc developers  
>>>>>>> about
>>>>>>> matrix inversion, and it was really helpful. That was in last  
>>>>>>> Nov.
>>>>>>> You can check the emails archive
>>>>>>>
>>>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
>>>>>>>
>>>>>>>
>>>>>>> Waad
>>>>>>>
>>>>>>> */Yujie <recrusader at gmail.com>/* wrote:
>>>>>>>
>>>>>>>   what is the difference between sequantial and parallel AIJ  
>>>>>>> matrix?
>>>>>>>   Assuming there is a matrix A, if
>>>>>>>   I partitaion this matrix into A1, A2, Ai... An.
>>>>>>>   A is a parallel AIJ matrix at the whole view, Ai
>>>>>>>   is a sequential AIJ matrix? I want to operate Ai at each node.
>>>>>>>   In addition, whether is it possible to get general inverse  
>>>>>>> using
>>>>>>>   MatMatSolve() if the matrix is not square? Thanks a lot.
>>>>>>>
>>>>>>>   Regards,
>>>>>>>   Yujie
>>>>>>>
>>>>>>>
>>>>>>>   On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>>>>>>>   <mailto:bsmith at mcs.anl.gov>> wrote:
>>>>>>>
>>>>>>>
>>>>>>>           For sequential AIJ matrices you can fill the B matrix
>>>>>>> with the
>>>>>>>       identity and then use
>>>>>>>       MatMatSolve().
>>>>>>>
>>>>>>>           Note since the inverse of a sparse matrix is dense  
>>>>>>> the B
>>>>>>>       matrix is
>>>>>>>       a SeqDense matrix.
>>>>>>>
>>>>>>>           Barry
>>>>>>>
>>>>>>>       On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>>>>>>
>>>>>>>       > Hi,
>>>>>>>       > Now, I want to inverse a sparse matrix. I have browsed  
>>>>>>> the
>>>>>>>       manual,
>>>>>>>       > however, I can't find some information. could you give  
>>>>>>> me
>>>>>>>       some advice?
>>>>>>>       >
>>>>>>>       > thanks a lot.
>>>>>>>       >
>>>>>>>       > Regards,
>>>>>>>       > Yujie
>>>>>>>       >
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> ------------------------------------------------------------------------
>>>>>>> Looking for last minute shopping deals? Find them fast with  
>>>>>>> Yahoo!
>>>>>>> Search.
>>>>>>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping 
>>>>>>> >
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>


From zonexo at gmail.com  Tue Feb  5 21:39:41 2008
From: zonexo at gmail.com (Ben Tay)
Date: Wed, 06 Feb 2008 11:39:41 +0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <79724F06-DB17-4794-9539-E07A37DC469F@mcs.anl.gov>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com> <47A915AB.9010006@gmail.com> <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> <47A92006.20108@gmail.com> <79724F06-DB17-4794-9539-E07A37DC469F@mcs.anl.gov>
Message-ID: <47A92BFD.1040205@gmail.com>

Sorry Barry, I just would like to confirm that as long as it's a 
constant constant coefficient Poisson eqn with Neumann or Dirchelet 
boundary conditions, I can use FFT. It doesn't matter if the grids are 
uniform or not. Is that correct? Thanks.

Barry Smith wrote:
>
> On Feb 5, 2008, at 8:48 PM, Ben Tay wrote:
>
>> Thank you Barry for your enlightenment. I'll just continue to use 
>> BoomerAMG for the poisson eqn. I'll also check up on FFTW. Last time, 
>> I recalled that there seemed to be some restrictions for FFT on 
>> solving poisson eqn. It seems that the grids must be constant in at 
>> least 1 dimension.
>
>    Yes. Then it decouples into a bunch of tridiagonal solves; 
> Basically if you can do separation of variables you can
> use FFTs.
>
>    Barry
>
>> I wonder if that is true? If that's the case, then it's not possible 
>> for me to use it, although it's a constant coefficient Poisson 
>> operator with Neumann or Dirchelet boundary conditions.
>>
>> thank you.
>>
>> Barry Smith wrote:
>>>
>>> On Feb 5, 2008, at 8:04 PM, Ben Tay wrote:
>>>
>>>> Hi Lisandro,
>>>>
>>>> I'm using the fractional step mtd to solve the NS eqns as well. 
>>>> I've tried the direct mtd and also boomerAMG in solving the poisson 
>>>> eqn. Experience shows that for smaller matrix, direct mtd is 
>>>> slightly faster but if the matrix increases in size, boomerAMG  is 
>>>> faster. Btw, if I'm not wrong, the default solver will be GMRES. 
>>>> I've also tried using the "Struct" interface solely under Hypre. 
>>>> It's even faster for big matrix, although the improvement doesn't 
>>>> seem to be a lot. I need to do more tests to confirm though.
>>>>
>>>> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite 
>>>> a while to solve the eqns. I'm wondering if it'll be faster if I 
>>>> get the inverse and then do matrix multiplication. Or just calling 
>>>> KSPSolve is actually doing something similar and there'll not be 
>>>> any speed difference. Hope someone can enlighten...
>>>>
>>>> Thanks!
>>>>
>>>   Ben,
>>>
>>>     Forming the inverse explicitly will be a complete failure. 
>>> Because it is dense it will have (1400x2000)^2 values and
>>> each multiply will take 2*(1400x2000)^2 floating point operations, 
>>> while boomerAMG should take only O(1400x2000).
>>>
>>>     BTW: if this is a constant coefficient Poisson operator with 
>>> Neumann or Dirchelet boundary conditions then
>>> likely a parallel FFT based algorithm would be fastest. Alas we do 
>>> not yet have this in PETSc. It looks like FFTW finally
>>> has an updated MPI version so we need to do the PETSc interface for 
>>> that.
>>>
>>>
>>>   Barry
>>>
>>>
>>>> Lisandro Dalcin wrote:
>>>>> Ben, some time ago I was doing some testing with PETSc for solving
>>>>> incompressible NS eqs with fractional step method. I've found that in
>>>>> our software and hardware setup, the best way to solve the pressure
>>>>> problem was by using HYPRE BoomerAMG. This preconditioner usually 
>>>>> have
>>>>> some heavy setup, but if your Poison matrix does not change, then the
>>>>> sucessive solves at each time step are really fast.
>>>>>
>>>>> If you still want to use a direct method, you should use the
>>>>> combination '-ksp_type preonly -pc_type lu' (by default, this will
>>>>> only work on sequential mode, unless you build PETSc with an external
>>>>> package like MUMPS). This way, PETSc computes the LU factorization
>>>>> only once, and at each time step, the call to KSPSolve end-up only
>>>>> doing the triangular solvers.
>>>>>
>>>>> The nice thing about PETSc is that, if you next realize the
>>>>> factorization take a long time (as it usually take in big problems),
>>>>> you can switch BoomerAMG by only passing in the command line
>>>>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
>>>>> all, you do not need to change your code. And more, depending on your
>>>>> problem you can choose the direct solvers or algebraic multigrid as
>>>>> you want, by simply pass the appropriate combination options in the
>>>>> command line (or a options file, using the -options_file option).
>>>>>
>>>>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
>>>>> to know about your experience.
>>>>>
>>>>> Regards,
>>>>>
>>>>> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
>>>>>
>>>>>> Hi everyone,
>>>>>>
>>>>>> I was reading about the topic abt inversing a sparse matrix. I 
>>>>>> have to
>>>>>> solve a poisson eqn for my CFD code. Usually, I form a system of 
>>>>>> linear
>>>>>> eqns and solve Ax=b. The "A" is always the same and only the "b" 
>>>>>> changes
>>>>>> every timestep. Does it mean that if I'm able to get the inverse 
>>>>>> matrix
>>>>>> A^(-1), in order to get x at every timestep, I only need to do a 
>>>>>> simple
>>>>>> matrix multiplication ie x=A^(-1)*b ?
>>>>>>
>>>>>> Hi Timothy, if the above is true, can you email me your Fortran code
>>>>>> template? I'm also programming in fortran 90. Thank you very much
>>>>>>
>>>>>> Regards.
>>>>>>
>>>>>> Timothy Stitt wrote:
>>>>>>
>>>>>>> Yes Yujie, I was able to put together a parallel code to invert a
>>>>>>> large sparse matrix with the help of the PETSc developers. If 
>>>>>>> you need
>>>>>>> any help or maybe a Fortran code template just let me know.
>>>>>>>
>>>>>>> Best,
>>>>>>>
>>>>>>> Tim.
>>>>>>>
>>>>>>> Waad Subber wrote:
>>>>>>>
>>>>>>>> Hi
>>>>>>>> There was a discussion between Tim Stitt and petsc developers 
>>>>>>>> about
>>>>>>>> matrix inversion, and it was really helpful. That was in last Nov.
>>>>>>>> You can check the emails archive
>>>>>>>>
>>>>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html 
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Waad
>>>>>>>>
>>>>>>>> */Yujie <recrusader at gmail.com>/* wrote:
>>>>>>>>
>>>>>>>>   what is the difference between sequantial and parallel AIJ 
>>>>>>>> matrix?
>>>>>>>>   Assuming there is a matrix A, if
>>>>>>>>   I partitaion this matrix into A1, A2, Ai... An.
>>>>>>>>   A is a parallel AIJ matrix at the whole view, Ai
>>>>>>>>   is a sequential AIJ matrix? I want to operate Ai at each node.
>>>>>>>>   In addition, whether is it possible to get general inverse using
>>>>>>>>   MatMatSolve() if the matrix is not square? Thanks a lot.
>>>>>>>>
>>>>>>>>   Regards,
>>>>>>>>   Yujie
>>>>>>>>
>>>>>>>>
>>>>>>>>   On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>>>>>>>>   <mailto:bsmith at mcs.anl.gov>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>           For sequential AIJ matrices you can fill the B matrix
>>>>>>>> with the
>>>>>>>>       identity and then use
>>>>>>>>       MatMatSolve().
>>>>>>>>
>>>>>>>>           Note since the inverse of a sparse matrix is dense the B
>>>>>>>>       matrix is
>>>>>>>>       a SeqDense matrix.
>>>>>>>>
>>>>>>>>           Barry
>>>>>>>>
>>>>>>>>       On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>>>>>>>
>>>>>>>>       > Hi,
>>>>>>>>       > Now, I want to inverse a sparse matrix. I have browsed the
>>>>>>>>       manual,
>>>>>>>>       > however, I can't find some information. could you give me
>>>>>>>>       some advice?
>>>>>>>>       >
>>>>>>>>       > thanks a lot.
>>>>>>>>       >
>>>>>>>>       > Regards,
>>>>>>>>       > Yujie
>>>>>>>>       >
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ------------------------------------------------------------------------ 
>>>>>>>>
>>>>>>>> Looking for last minute shopping deals? Find them fast with Yahoo!
>>>>>>>> Search.
>>>>>>>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping> 
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>


From bsmith at mcs.anl.gov  Wed Feb  6 07:52:39 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 6 Feb 2008 07:52:39 -0600
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A92BFD.1040205@gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com> <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com> <47A915AB.9010006@gmail.com> <3612E197-7A2D-4A1C-8B46-360F5EC3D0CE@mcs.anl.gov> <47A92006.20108@gmail.com> <79724F06-DB17-4794-9539-E07A37DC469F@mcs.anl.gov> <47A92BFD.1040205@gmail.com>
Message-ID: <E0C60EC7-03EF-4E6C-B872-278F7D95BA0D@mcs.anl.gov>


    Whoops, actually the grids in each direction would need to be  
uniform.

     Barry

On Feb 5, 2008, at 9:39 PM, Ben Tay wrote:

> Sorry Barry, I just would like to confirm that as long as it's a  
> constant constant coefficient Poisson eqn with Neumann or Dirchelet  
> boundary conditions, I can use FFT. It doesn't matter if the grids  
> are uniform or not. Is that correct? Thanks.
>
> Barry Smith wrote:
>>
>> On Feb 5, 2008, at 8:48 PM, Ben Tay wrote:
>>
>>> Thank you Barry for your enlightenment. I'll just continue to use  
>>> BoomerAMG for the poisson eqn. I'll also check up on FFTW. Last  
>>> time, I recalled that there seemed to be some restrictions for FFT  
>>> on solving poisson eqn. It seems that the grids must be constant  
>>> in at least 1 dimension.
>>
>>   Yes. Then it decouples into a bunch of tridiagonal solves;  
>> Basically if you can do separation of variables you can
>> use FFTs.
>>
>>   Barry
>>
>>> I wonder if that is true? If that's the case, then it's not  
>>> possible for me to use it, although it's a constant coefficient  
>>> Poisson operator with Neumann or Dirchelet boundary conditions.
>>>
>>> thank you.
>>>
>>> Barry Smith wrote:
>>>>
>>>> On Feb 5, 2008, at 8:04 PM, Ben Tay wrote:
>>>>
>>>>> Hi Lisandro,
>>>>>
>>>>> I'm using the fractional step mtd to solve the NS eqns as well.  
>>>>> I've tried the direct mtd and also boomerAMG in solving the  
>>>>> poisson eqn. Experience shows that for smaller matrix, direct  
>>>>> mtd is slightly faster but if the matrix increases in size,  
>>>>> boomerAMG  is faster. Btw, if I'm not wrong, the default solver  
>>>>> will be GMRES. I've also tried using the "Struct" interface  
>>>>> solely under Hypre. It's even faster for big matrix, although  
>>>>> the improvement doesn't seem to be a lot. I need to do more  
>>>>> tests to confirm though.
>>>>>
>>>>> I'm now doing 2D simulation with 1400x2000 grids. It's takes  
>>>>> quite a while to solve the eqns. I'm wondering if it'll be  
>>>>> faster if I get the inverse and then do matrix multiplication.  
>>>>> Or just calling KSPSolve is actually doing something similar and  
>>>>> there'll not be any speed difference. Hope someone can  
>>>>> enlighten...
>>>>>
>>>>> Thanks!
>>>>>
>>>>  Ben,
>>>>
>>>>    Forming the inverse explicitly will be a complete failure.  
>>>> Because it is dense it will have (1400x2000)^2 values and
>>>> each multiply will take 2*(1400x2000)^2 floating point  
>>>> operations, while boomerAMG should take only O(1400x2000).
>>>>
>>>>    BTW: if this is a constant coefficient Poisson operator with  
>>>> Neumann or Dirchelet boundary conditions then
>>>> likely a parallel FFT based algorithm would be fastest. Alas we  
>>>> do not yet have this in PETSc. It looks like FFTW finally
>>>> has an updated MPI version so we need to do the PETSc interface  
>>>> for that.
>>>>
>>>>
>>>>  Barry
>>>>
>>>>
>>>>> Lisandro Dalcin wrote:
>>>>>> Ben, some time ago I was doing some testing with PETSc for  
>>>>>> solving
>>>>>> incompressible NS eqs with fractional step method. I've found  
>>>>>> that in
>>>>>> our software and hardware setup, the best way to solve the  
>>>>>> pressure
>>>>>> problem was by using HYPRE BoomerAMG. This preconditioner  
>>>>>> usually have
>>>>>> some heavy setup, but if your Poison matrix does not change,  
>>>>>> then the
>>>>>> sucessive solves at each time step are really fast.
>>>>>>
>>>>>> If you still want to use a direct method, you should use the
>>>>>> combination '-ksp_type preonly -pc_type lu' (by default, this  
>>>>>> will
>>>>>> only work on sequential mode, unless you build PETSc with an  
>>>>>> external
>>>>>> package like MUMPS). This way, PETSc computes the LU  
>>>>>> factorization
>>>>>> only once, and at each time step, the call to KSPSolve end-up  
>>>>>> only
>>>>>> doing the triangular solvers.
>>>>>>
>>>>>> The nice thing about PETSc is that, if you next realize the
>>>>>> factorization take a long time (as it usually take in big  
>>>>>> problems),
>>>>>> you can switch BoomerAMG by only passing in the command line
>>>>>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And  
>>>>>> that's
>>>>>> all, you do not need to change your code. And more, depending  
>>>>>> on your
>>>>>> problem you can choose the direct solvers or algebraic  
>>>>>> multigrid as
>>>>>> you want, by simply pass the appropriate combination options in  
>>>>>> the
>>>>>> command line (or a options file, using the -options_file option).
>>>>>>
>>>>>> Please, if you ever try HYPRE BoomerAMG preconditioners, I  
>>>>>> would like
>>>>>> to know about your experience.
>>>>>>
>>>>>> Regards,
>>>>>>
>>>>>> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> I was reading about the topic abt inversing a sparse matrix. I  
>>>>>>> have to
>>>>>>> solve a poisson eqn for my CFD code. Usually, I form a system  
>>>>>>> of linear
>>>>>>> eqns and solve Ax=b. The "A" is always the same and only the  
>>>>>>> "b" changes
>>>>>>> every timestep. Does it mean that if I'm able to get the  
>>>>>>> inverse matrix
>>>>>>> A^(-1), in order to get x at every timestep, I only need to do  
>>>>>>> a simple
>>>>>>> matrix multiplication ie x=A^(-1)*b ?
>>>>>>>
>>>>>>> Hi Timothy, if the above is true, can you email me your  
>>>>>>> Fortran code
>>>>>>> template? I'm also programming in fortran 90. Thank you very  
>>>>>>> much
>>>>>>>
>>>>>>> Regards.
>>>>>>>
>>>>>>> Timothy Stitt wrote:
>>>>>>>
>>>>>>>> Yes Yujie, I was able to put together a parallel code to  
>>>>>>>> invert a
>>>>>>>> large sparse matrix with the help of the PETSc developers. If  
>>>>>>>> you need
>>>>>>>> any help or maybe a Fortran code template just let me know.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Tim.
>>>>>>>>
>>>>>>>> Waad Subber wrote:
>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>> There was a discussion between Tim Stitt and petsc  
>>>>>>>>> developers about
>>>>>>>>> matrix inversion, and it was really helpful. That was in  
>>>>>>>>> last Nov.
>>>>>>>>> You can check the emails archive
>>>>>>>>>
>>>>>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Waad
>>>>>>>>>
>>>>>>>>> */Yujie <recrusader at gmail.com>/* wrote:
>>>>>>>>>
>>>>>>>>>  what is the difference between sequantial and parallel AIJ  
>>>>>>>>> matrix?
>>>>>>>>>  Assuming there is a matrix A, if
>>>>>>>>>  I partitaion this matrix into A1, A2, Ai... An.
>>>>>>>>>  A is a parallel AIJ matrix at the whole view, Ai
>>>>>>>>>  is a sequential AIJ matrix? I want to operate Ai at each  
>>>>>>>>> node.
>>>>>>>>>  In addition, whether is it possible to get general inverse  
>>>>>>>>> using
>>>>>>>>>  MatMatSolve() if the matrix is not square? Thanks a lot.
>>>>>>>>>
>>>>>>>>>  Regards,
>>>>>>>>>  Yujie
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>  On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>>>>>>>>>  <mailto:bsmith at mcs.anl.gov>> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>          For sequential AIJ matrices you can fill the B matrix
>>>>>>>>> with the
>>>>>>>>>      identity and then use
>>>>>>>>>      MatMatSolve().
>>>>>>>>>
>>>>>>>>>          Note since the inverse of a sparse matrix is dense  
>>>>>>>>> the B
>>>>>>>>>      matrix is
>>>>>>>>>      a SeqDense matrix.
>>>>>>>>>
>>>>>>>>>          Barry
>>>>>>>>>
>>>>>>>>>      On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>>>>>>>>
>>>>>>>>>      > Hi,
>>>>>>>>>      > Now, I want to inverse a sparse matrix. I have  
>>>>>>>>> browsed the
>>>>>>>>>      manual,
>>>>>>>>>      > however, I can't find some information. could you  
>>>>>>>>> give me
>>>>>>>>>      some advice?
>>>>>>>>>      >
>>>>>>>>>      > thanks a lot.
>>>>>>>>>      >
>>>>>>>>>      > Regards,
>>>>>>>>>      > Yujie
>>>>>>>>>      >
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> ------------------------------------------------------------------------
>>>>>>>>> Looking for last minute shopping deals? Find them fast with  
>>>>>>>>> Yahoo!
>>>>>>>>> Search.
>>>>>>>>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping 
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>
>>
>


From erlend.pedersen at holberger.com  Wed Feb  6 09:49:32 2008
From: erlend.pedersen at holberger.com (Erlend Pedersen :.)
Date: Wed, 06 Feb 2008 16:49:32 +0100
Subject: Overdetermined, non-linear
In-Reply-To: <a9f269830802050531u55bc29b2uf1b52694d060adcd@mail.gmail.com>
References: <1201866864.6394.25.camel@erlend-ws.in.holberger.com>
	 <a9f269830802031759l1edba6e2rba838012b99a445d@mail.gmail.com>
	 <1202203616.27733.50.camel@erlend-ws.in.holberger.com>
	 <a9f269830802050531u55bc29b2uf1b52694d060adcd@mail.gmail.com>
Message-ID: <1202312972.9921.15.camel@erlend-ws.in.holberger.com>


On Tue, 2008-02-05 at 07:31 -0600, Matthew Knepley wrote:
> On Feb 5, 2008 3:26 AM, Erlend Pedersen :.
> <erlend.pedersen at holberger.com> wrote:
> > On Sun, 2008-02-03 at 19:59 -0600, Matthew Knepley wrote:
> > > On Feb 1, 2008 5:54 AM, Erlend Pedersen :.
> > > <erlend.pedersen at holberger.com> wrote:
> > > > I am attempting to use the PETSc nonlinear solver on an overdetermined
> > > > system of non-linear equations. Hence, the Jacobian is not square, and
> > > > so far we have unfortunately not succeeded with any combination of snes,
> > > > ksp and pc.
> > > >
> > > > Could you confirm that snes actually works for overdetermined systems,
> > > > and if so, is there an application example we could look at in order to
> > > > make sure there is nothing wrong with our test-setup?
> > > >
> > > > We have previously used the MINPACK routine LMDER very successfully, but
> > > > for our current problem sizes we rely on the use of sparse matrix
> > > > representations and parallel architectures. PETSc's abstractions and
> > > > automatic MPI makes this system very attractive for us, and we have
> > > > already used the PETSc LSQR solver with great success.
> > >
> > > So in the sense that SNES is really just an iteration with an embedded solve,
> > > yes it can solve non-square nonlinear systems. However, the user has to
> > > understand what is meant by the Function and Jacobian evaluation methods.
> > > I suggest implementing the simplest algorithm for non-square systems:
> > >
> > > http://en.wikipedia.org/wiki/Gauss-Newton_algorithm
> > >
> > > By implement, I mean your Function and Jacobian methods should return the
> > > correct terms. I believe the reason you have not seen convergence is that
> > > the result of the solve does not "mean" the correct thing for the iteration
> > > in your current setup.
> > >
> > >    Matt
> >
> > Thanks. Good to know that I should be able to get a working setup. Are
> > there by any chance any code examples that I could use to clue myself in
> > on how to transform my m equations of n unknonwns into a correct
> > function for the Gauss-Newton algorithm?
> 
> We do not have any nonlinear least-squares examples, unfortunately. At that
> point, most users have gone over to formulating their problem directly as
> an optimization problem (which allows more flexibility than least squares) and
> have moved to TAO (http://www-unix.mcs.anl.gov/tao/) which does have
> examples, I believe, for optimization of this kind.
> 
> If you know that you only ever want to do least squares, and you want to solve
> the biggest, parallel problems, than stick with PETSc and build a nice
> Gauss-Newton
> (or Levenberg-Marquadt) solver. However, if you really want to solve a more
> general optimization problem, I recommend reformulating it now and moving
> to TAO. It is at least worth reading up on it.

Reformulating as an optimization problem does seem like the easier route
for now. I kept away from TAO in order to Keep It Simple, but now I see
that the opposite might be the case. I should be able to provide it with
a gradient, if not necessarily a Hessian. Thanks again :)

- Erlend :.

> 
>   Thanks,
> 
>      Matt
> 
> > - Erlend :.


From dalcinl at gmail.com  Wed Feb  6 09:53:48 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Wed, 6 Feb 2008 12:53:48 -0300
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A915AB.9010006@gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>
	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com>
	 <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>
	 <47A915AB.9010006@gmail.com>
Message-ID: <e7ba66e40802060753h3be3fafeu4f5a53be41bea59f@mail.gmail.com>

Well, after taking into accout Barry's comments, you have have the
following choices.

* You can use a direct method based on LU factorization using
'-ksp_type preonly -pc_type lu' . This way, PETSc will compute the LU
factors the fist time they are needed; after that, every call to
KSPSolve will reuse those factors. This will work only in sequential
with a default PETSc build, but you could also build PETSc with MUMPS,
and it will let you do the parallel factorization. For MUMPS  to
actually work in your matrix, I believe you have to add the following
line:

MatConvert(A, MATAIJMUMPS, MAT_REUSE_MATRIX, &A);

after assembling (ie. MatAssembleBegin/End calls) your Poisson matrix.


* You can use CG with '-ksp_type cg' (I assume your matrix is SPD, as
it is in a standard fractional step method), and a preconditioner. And
then, I believe the best choice for your application will bee
BoomerAMG. It has a rather high setup cost, but solves are fast. Or
your could use ML, it has less setup costs, but the solvers are a bit
slower. So if you make many timesteps, I would say that BoomerAMG will
pay.

Finally, if you use the last option, perhaps you can try Paul Fischer
tricks. I tried to add this to KSP's some time ago, but I stoped for
many reasons (the main one, lack of time). You can take a look at
this:

http://citeseer.ist.psu.edu/492082.html

A similar (equivalent?) approach is this other one (perhaps a bit
easier to implement, depending on your taste)
doi.wiley.com/10.1002/cnm.743


On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
> Hi Lisandro,
>
> I'm using the fractional step mtd to solve the NS eqns as well. I've
> tried the direct mtd and also boomerAMG in solving the poisson eqn.
> Experience shows that for smaller matrix, direct mtd is slightly faster
> but if the matrix increases in size, boomerAMG  is faster. Btw, if I'm
> not wrong, the default solver will be GMRES. I've also tried using the
> "Struct" interface solely under Hypre. It's even faster for big matrix,
> although the improvement doesn't seem to be a lot. I need to do more
> tests to confirm though.
>
> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a
> while to solve the eqns. I'm wondering if it'll be faster if I get the
> inverse and then do matrix multiplication. Or just calling KSPSolve is
> actually doing something similar and there'll not be any speed
> difference. Hope someone can enlighten...
>
> Thanks!
>
> Lisandro Dalcin wrote:
> > Ben, some time ago I was doing some testing with PETSc for solving
> > incompressible NS eqs with fractional step method. I've found that in
> > our software and hardware setup, the best way to solve the pressure
> > problem was by using HYPRE BoomerAMG. This preconditioner usually have
> > some heavy setup, but if your Poison matrix does not change, then the
> > sucessive solves at each time step are really fast.
> >
> > If you still want to use a direct method, you should use the
> > combination '-ksp_type preonly -pc_type lu' (by default, this will
> > only work on sequential mode, unless you build PETSc with an external
> > package like MUMPS). This way, PETSc computes the LU factorization
> > only once, and at each time step, the call to KSPSolve end-up only
> > doing the triangular solvers.
> >
> > The nice thing about PETSc is that, if you next realize the
> > factorization take a long time (as it usually take in big problems),
> > you can switch BoomerAMG by only passing in the command line
> > '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
> > all, you do not need to change your code. And more, depending on your
> > problem you can choose the direct solvers or algebraic multigrid as
> > you want, by simply pass the appropriate combination options in the
> > command line (or a options file, using the -options_file option).
> >
> > Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
> > to know about your experience.
> >
> > Regards,
> >
> > On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
> >
> >> Hi everyone,
> >>
> >> I was reading about the topic abt inversing a sparse matrix. I have to
> >> solve a poisson eqn for my CFD code. Usually, I form a system of linear
> >> eqns and solve Ax=b. The "A" is always the same and only the "b" changes
> >> every timestep. Does it mean that if I'm able to get the inverse matrix
> >> A^(-1), in order to get x at every timestep, I only need to do a simple
> >> matrix multiplication ie x=A^(-1)*b ?
> >>
> >> Hi Timothy, if the above is true, can you email me your Fortran code
> >> template? I'm also programming in fortran 90. Thank you very much
> >>
> >> Regards.
> >>
> >> Timothy Stitt wrote:
> >>
> >>> Yes Yujie, I was able to put together a parallel code to invert a
> >>> large sparse matrix with the help of the PETSc developers. If you need
> >>> any help or maybe a Fortran code template just let me know.
> >>>
> >>> Best,
> >>>
> >>> Tim.
> >>>
> >>> Waad Subber wrote:
> >>>
> >>>> Hi
> >>>> There was a discussion between Tim Stitt and petsc developers about
> >>>> matrix inversion, and it was really helpful. That was in last Nov.
> >>>> You can check the emails archive
> >>>>
> >>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
> >>>>
> >>>>
> >>>> Waad
> >>>>
> >>>> */Yujie <recrusader at gmail.com>/* wrote:
> >>>>
> >>>>     what is the difference between sequantial and parallel AIJ matrix?
> >>>>     Assuming there is a matrix A, if
> >>>>     I partitaion this matrix into A1, A2, Ai... An.
> >>>>     A is a parallel AIJ matrix at the whole view, Ai
> >>>>     is a sequential AIJ matrix? I want to operate Ai at each node.
> >>>>     In addition, whether is it possible to get general inverse using
> >>>>     MatMatSolve() if the matrix is not square? Thanks a lot.
> >>>>
> >>>>     Regards,
> >>>>     Yujie
> >>>>
> >>>>
> >>>>     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
> >>>>     <mailto:bsmith at mcs.anl.gov>> wrote:
> >>>>
> >>>>
> >>>>             For sequential AIJ matrices you can fill the B matrix
> >>>> with the
> >>>>         identity and then use
> >>>>         MatMatSolve().
> >>>>
> >>>>             Note since the inverse of a sparse matrix is dense the B
> >>>>         matrix is
> >>>>         a SeqDense matrix.
> >>>>
> >>>>             Barry
> >>>>
> >>>>         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
> >>>>
> >>>>         > Hi,
> >>>>         > Now, I want to inverse a sparse matrix. I have browsed the
> >>>>         manual,
> >>>>         > however, I can't find some information. could you give me
> >>>>         some advice?
> >>>>         >
> >>>>         > thanks a lot.
> >>>>         >
> >>>>         > Regards,
> >>>>         > Yujie
> >>>>         >
> >>>>
> >>>>
> >>>>
> >>>> ------------------------------------------------------------------------
> >>>> Looking for last minute shopping deals? Find them fast with Yahoo!
> >>>> Search.
> >>>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping>
> >>>>
> >>>
> >>>
> >>
> >
> >
> >
>
>


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From zonexo at gmail.com  Wed Feb  6 10:24:20 2008
From: zonexo at gmail.com (Ben Tay)
Date: Thu, 07 Feb 2008 00:24:20 +0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <e7ba66e40802060753h3be3fafeu4f5a53be41bea59f@mail.gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com>	 <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>	 <47A915AB.9010006@gmail.com> <e7ba66e40802060753h3be3fafeu4f5a53be41bea59f@mail.gmail.com>
Message-ID: <47A9DF34.6030903@gmail.com>

Hi Lisandro,

Thanks for your recommendation. Btw, does the poisson eqn arising from 
fractional step gives a matrix which is SPD? Because my grid's are 
non-uniform in both x,y directions. Shouldn't that result in a 
non-symmetric matrix? But I think it's still PD, positive definite. 
Correct me if I'm wrong.

Thanks

Lisandro Dalcin wrote:
> Well, after taking into accout Barry's comments, you have have the
> following choices.
>
> * You can use a direct method based on LU factorization using
> '-ksp_type preonly -pc_type lu' . This way, PETSc will compute the LU
> factors the fist time they are needed; after that, every call to
> KSPSolve will reuse those factors. This will work only in sequential
> with a default PETSc build, but you could also build PETSc with MUMPS,
> and it will let you do the parallel factorization. For MUMPS  to
> actually work in your matrix, I believe you have to add the following
> line:
>
> MatConvert(A, MATAIJMUMPS, MAT_REUSE_MATRIX, &A);
>
> after assembling (ie. MatAssembleBegin/End calls) your Poisson matrix.
>
>
> * You can use CG with '-ksp_type cg' (I assume your matrix is SPD, as
> it is in a standard fractional step method), and a preconditioner. And
> then, I believe the best choice for your application will bee
> BoomerAMG. It has a rather high setup cost, but solves are fast. Or
> your could use ML, it has less setup costs, but the solvers are a bit
> slower. So if you make many timesteps, I would say that BoomerAMG will
> pay.
>
> Finally, if you use the last option, perhaps you can try Paul Fischer
> tricks. I tried to add this to KSP's some time ago, but I stoped for
> many reasons (the main one, lack of time). You can take a look at
> this:
>
> http://citeseer.ist.psu.edu/492082.html
>
> A similar (equivalent?) approach is this other one (perhaps a bit
> easier to implement, depending on your taste)
> doi.wiley.com/10.1002/cnm.743
>
>
> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
>   
>> Hi Lisandro,
>>
>> I'm using the fractional step mtd to solve the NS eqns as well. I've
>> tried the direct mtd and also boomerAMG in solving the poisson eqn.
>> Experience shows that for smaller matrix, direct mtd is slightly faster
>> but if the matrix increases in size, boomerAMG  is faster. Btw, if I'm
>> not wrong, the default solver will be GMRES. I've also tried using the
>> "Struct" interface solely under Hypre. It's even faster for big matrix,
>> although the improvement doesn't seem to be a lot. I need to do more
>> tests to confirm though.
>>
>> I'm now doing 2D simulation with 1400x2000 grids. It's takes quite a
>> while to solve the eqns. I'm wondering if it'll be faster if I get the
>> inverse and then do matrix multiplication. Or just calling KSPSolve is
>> actually doing something similar and there'll not be any speed
>> difference. Hope someone can enlighten...
>>
>> Thanks!
>>
>> Lisandro Dalcin wrote:
>>     
>>> Ben, some time ago I was doing some testing with PETSc for solving
>>> incompressible NS eqs with fractional step method. I've found that in
>>> our software and hardware setup, the best way to solve the pressure
>>> problem was by using HYPRE BoomerAMG. This preconditioner usually have
>>> some heavy setup, but if your Poison matrix does not change, then the
>>> sucessive solves at each time step are really fast.
>>>
>>> If you still want to use a direct method, you should use the
>>> combination '-ksp_type preonly -pc_type lu' (by default, this will
>>> only work on sequential mode, unless you build PETSc with an external
>>> package like MUMPS). This way, PETSc computes the LU factorization
>>> only once, and at each time step, the call to KSPSolve end-up only
>>> doing the triangular solvers.
>>>
>>> The nice thing about PETSc is that, if you next realize the
>>> factorization take a long time (as it usually take in big problems),
>>> you can switch BoomerAMG by only passing in the command line
>>> '-ksp_type cg -pc_type hypre -pc_hypre_type boomeramg'. And that's
>>> all, you do not need to change your code. And more, depending on your
>>> problem you can choose the direct solvers or algebraic multigrid as
>>> you want, by simply pass the appropriate combination options in the
>>> command line (or a options file, using the -options_file option).
>>>
>>> Please, if you ever try HYPRE BoomerAMG preconditioners, I would like
>>> to know about your experience.
>>>
>>> Regards,
>>>
>>> On 2/5/08, Ben Tay <zonexo at gmail.com> wrote:
>>>
>>>       
>>>> Hi everyone,
>>>>
>>>> I was reading about the topic abt inversing a sparse matrix. I have to
>>>> solve a poisson eqn for my CFD code. Usually, I form a system of linear
>>>> eqns and solve Ax=b. The "A" is always the same and only the "b" changes
>>>> every timestep. Does it mean that if I'm able to get the inverse matrix
>>>> A^(-1), in order to get x at every timestep, I only need to do a simple
>>>> matrix multiplication ie x=A^(-1)*b ?
>>>>
>>>> Hi Timothy, if the above is true, can you email me your Fortran code
>>>> template? I'm also programming in fortran 90. Thank you very much
>>>>
>>>> Regards.
>>>>
>>>> Timothy Stitt wrote:
>>>>
>>>>         
>>>>> Yes Yujie, I was able to put together a parallel code to invert a
>>>>> large sparse matrix with the help of the PETSc developers. If you need
>>>>> any help or maybe a Fortran code template just let me know.
>>>>>
>>>>> Best,
>>>>>
>>>>> Tim.
>>>>>
>>>>> Waad Subber wrote:
>>>>>
>>>>>           
>>>>>> Hi
>>>>>> There was a discussion between Tim Stitt and petsc developers about
>>>>>> matrix inversion, and it was really helpful. That was in last Nov.
>>>>>> You can check the emails archive
>>>>>>
>>>>>> http://www-unix.mcs.anl.gov/web-mail-archive/lists/petsc-users/2007/11/threads.html
>>>>>>
>>>>>>
>>>>>> Waad
>>>>>>
>>>>>> */Yujie <recrusader at gmail.com>/* wrote:
>>>>>>
>>>>>>     what is the difference between sequantial and parallel AIJ matrix?
>>>>>>     Assuming there is a matrix A, if
>>>>>>     I partitaion this matrix into A1, A2, Ai... An.
>>>>>>     A is a parallel AIJ matrix at the whole view, Ai
>>>>>>     is a sequential AIJ matrix? I want to operate Ai at each node.
>>>>>>     In addition, whether is it possible to get general inverse using
>>>>>>     MatMatSolve() if the matrix is not square? Thanks a lot.
>>>>>>
>>>>>>     Regards,
>>>>>>     Yujie
>>>>>>
>>>>>>
>>>>>>     On 2/4/08, *Barry Smith* <bsmith at mcs.anl.gov
>>>>>>     <mailto:bsmith at mcs.anl.gov>> wrote:
>>>>>>
>>>>>>
>>>>>>             For sequential AIJ matrices you can fill the B matrix
>>>>>> with the
>>>>>>         identity and then use
>>>>>>         MatMatSolve().
>>>>>>
>>>>>>             Note since the inverse of a sparse matrix is dense the B
>>>>>>         matrix is
>>>>>>         a SeqDense matrix.
>>>>>>
>>>>>>             Barry
>>>>>>
>>>>>>         On Feb 4, 2008, at 12:37 AM, Yujie wrote:
>>>>>>
>>>>>>         > Hi,
>>>>>>         > Now, I want to inverse a sparse matrix. I have browsed the
>>>>>>         manual,
>>>>>>         > however, I can't find some information. could you give me
>>>>>>         some advice?
>>>>>>         >
>>>>>>         > thanks a lot.
>>>>>>         >
>>>>>>         > Regards,
>>>>>>         > Yujie
>>>>>>         >
>>>>>>
>>>>>>
>>>>>>
>>>>>> ------------------------------------------------------------------------
>>>>>> Looking for last minute shopping deals? Find them fast with Yahoo!
>>>>>> Search.
>>>>>> <http://us.rd.yahoo.com/evt=51734/*http://tools.search.yahoo.com/newsearch/category.php?category=shopping>
>>>>>>
>>>>>>             
>>>>>           
>>>
>>>       
>>     
>
>
>   


From dalcinl at gmail.com  Wed Feb  6 11:10:01 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Wed, 6 Feb 2008 14:10:01 -0300
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <47A9DF34.6030903@gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>
	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com>
	 <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>
	 <47A915AB.9010006@gmail.com>
	 <e7ba66e40802060753h3be3fafeu4f5a53be41bea59f@mail.gmail.com>
	 <47A9DF34.6030903@gmail.com>
Message-ID: <e7ba66e40802060910r2719e304me5879e79f1ef9345@mail.gmail.com>

On 2/6/08, Ben Tay <zonexo at gmail.com> wrote:
Because my grid's are
> non-uniform in both x,y directions. Shouldn't that result in a
> non-symmetric matrix? But I think it's still PD, positive definite.
> Correct me if I'm wrong.

I believe you are wrong, unless you are using a non-standart spatial
discretization method. Is your Poisson equation using some additional
terms than the usual Laplace operator? For standard finite elements
and finite diferences, your matrix should be symmetric. Of course,
symmetry can be lost if you use the common trick of zeroing-out rows
for boundary conditions (using MatZeroRows and related). But even in
that case, I believe you can still use CG.

-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From zonexo at gmail.com  Thu Feb  7 00:18:51 2008
From: zonexo at gmail.com (Ben Tay)
Date: Thu, 07 Feb 2008 14:18:51 +0800
Subject: how to inverse a sparse matrix in Petsc?
In-Reply-To: <e7ba66e40802060910r2719e304me5879e79f1ef9345@mail.gmail.com>
References: <602426.95557.qm@web38210.mail.mud.yahoo.com>	 <47A85573.4090607@cscs.ch> <47A86B5F.5010503@gmail.com>	 <e7ba66e40802050643y5f7944e5k4e79bf94beb6dd46@mail.gmail.com>	 <47A915AB.9010006@gmail.com>	 <e7ba66e40802060753h3be3fafeu4f5a53be41bea59f@mail.gmail.com>	 <47A9DF34.6030903@gmail.com> <e7ba66e40802060910r2719e304me5879e79f1ef9345@mail.gmail.com>
Message-ID: <47AAA2CB.6050705@gmail.com>

Ya, Lisandro, it's my mistake. It is indeed SPD. Thank you for your 
reminder!

Lisandro Dalcin wrote:
> On 2/6/08, Ben Tay <zonexo at gmail.com> wrote:
> Because my grid's are
>   
>> non-uniform in both x,y directions. Shouldn't that result in a
>> non-symmetric matrix? But I think it's still PD, positive definite.
>> Correct me if I'm wrong.
>>     
>
> I believe you are wrong, unless you are using a non-standart spatial
> discretization method. Is your Poisson equation using some additional
> terms than the usual Laplace operator? For standard finite elements
> and finite diferences, your matrix should be symmetric. Of course,
> symmetry can be lost if you use the common trick of zeroing-out rows
> for boundary conditions (using MatZeroRows and related). But even in
> that case, I believe you can still use CG.
>
>   


From tstitt at cscs.ch  Thu Feb  7 09:06:39 2008
From: tstitt at cscs.ch (Timothy Stitt)
Date: Thu, 07 Feb 2008 16:06:39 +0100
Subject: Legendre Transform
Message-ID: <47AB1E7F.1070907@cscs.ch>

Hi all,

I am not sure if this query is directly related to the PETSc library per 
se, but I was wondering if anyone knows of an efficient implementation 
of the Legendre Transform. A parallel implementation would be even more 
ideal.
I don't know that much about the implementation of integral 
transforms...could it be done in PETSc for instance, if no suitable 
library exists? A web search doesn't throw up to many relevant hits.

Thanks in advance for any guidance given,

Best,

Tim.

-- 
Timothy Stitt
HPC Applications Analyst

Swiss National Supercomputing Centre (CSCS)
Galleria 2 - Via Cantonale
CH-6928 Manno, Switzerland

+41 (0) 91 610 8233
stitt at cscs.ch


From sanjay at ce.berkeley.edu  Thu Feb  7 09:22:01 2008
From: sanjay at ce.berkeley.edu (Sanjay Govindjee)
Date: Thu, 07 Feb 2008 16:22:01 +0100
Subject: Legendre Transform
In-Reply-To: <47AB1E7F.1070907@cscs.ch>
References: <47AB1E7F.1070907@cscs.ch>
Message-ID: <47AB2219.6020401@ce.berkeley.edu>

Can you define what you mean by Legendre transform?  The usual 
definition in physics
is L{f}(p)  = max_x (p.x - f(x)) where f(x) is convex.    When you say 
parallel, then I presume that
x lies in a very high dimensional space.  In which case you are simply 
looking at a root
finding problem in high dimensions and you could certainly use PETSc to 
solve the intermediate
linear solves of a Newton scheme (or just invoke SNES).

-sg

Timothy Stitt wrote:
> Hi all,
>
> I am not sure if this query is directly related to the PETSc library 
> per se, but I was wondering if anyone knows of an efficient 
> implementation of the Legendre Transform. A parallel implementation 
> would be even more ideal.
> I don't know that much about the implementation of integral 
> transforms...could it be done in PETSc for instance, if no suitable 
> library exists? A web search doesn't throw up to many relevant hits.
>
> Thanks in advance for any guidance given,
>
> Best,
>
> Tim.
>


From tstitt at cscs.ch  Thu Feb  7 11:03:06 2008
From: tstitt at cscs.ch (Timothy Stitt)
Date: Thu, 07 Feb 2008 18:03:06 +0100
Subject: Legendre Transform
In-Reply-To: <47AB2219.6020401@ce.berkeley.edu>
References: <47AB1E7F.1070907@cscs.ch> <47AB2219.6020401@ce.berkeley.edu>
Message-ID: <47AB39CA.9050908@cscs.ch>

Thanks Sanjay for the reply...I apologise for my ambiguous definition 
but it is more to do with my unfamiliarity with the topic.

I have been asked to help a group optimise their quasi-spectral 
geophysical MHD code. From what I gather the significant computation 
occurs when it flips back and forth between spectral and real space. The 
Fourier and Chebyshev transforms are quite efficient but not the 
Legendre transform. The full forward and  inverse Legendre transforms 
are calculated on each process (using direct summation and 
Gauss-Legendre quadrature on the inverse) but performance suffers as the 
resolution becomes high.

Ideally I would like to be make sure I am implementing the most 
efficient Legendre transform algorithm available and more importantly if 
it can be performed in parallel, hence my call for existing libraries 
and maybe PETSc library support. I apologise if the application area is 
not directly related to your own field but I hope you were able to 
follow the general idea. Again I would appreciate your comments.

Thanks,

Tim.

Sanjay Govindjee wrote:
> Can you define what you mean by Legendre transform?  The usual 
> definition in physics
> is L{f}(p)  = max_x (p.x - f(x)) where f(x) is convex.    When you say 
> parallel, then I presume that
> x lies in a very high dimensional space.  In which case you are simply 
> looking at a root
> finding problem in high dimensions and you could certainly use PETSc 
> to solve the intermediate
> linear solves of a Newton scheme (or just invoke SNES).
>
> -sg
>
> Timothy Stitt wrote:
>> Hi all,
>>
>> I am not sure if this query is directly related to the PETSc library 
>> per se, but I was wondering if anyone knows of an efficient 
>> implementation of the Legendre Transform. A parallel implementation 
>> would be even more ideal.
>> I don't know that much about the implementation of integral 
>> transforms...could it be done in PETSc for instance, if no suitable 
>> library exists? A web search doesn't throw up to many relevant hits.
>>
>> Thanks in advance for any guidance given,
>>
>> Best,
>>
>> Tim.
>>
>
>


-- 
Timothy Stitt
HPC Applications Analyst

Swiss National Supercomputing Centre (CSCS)
Galleria 2 - Via Cantonale
CH-6928 Manno, Switzerland

+41 (0) 91 610 8233
stitt at cscs.ch


From vijay.m at gmail.com  Fri Feb  8 00:11:17 2008
From: vijay.m at gmail.com (Vijay S. Mahadevan)
Date: Fri, 8 Feb 2008 00:11:17 -0600
Subject: PCGetType question
Message-ID: <00aa01c86a19$6cbcac50$b63010ac@neutron>

Hi all,

 
I?ve been trying to figure out how exactly to find out the PCType for a
given PC context. Here?s the sample code I?ve been trying to execute but to
no avail.

 
Method 1:

 
PetscTruth isshell ;

                        
PetscTypeCompare((PetscObject)pc, PCSHELL, &isshell);

                        
PetscPrintf(PETSC_COMM_SELF, " PETSC_IS_SHELL = %D", isshell) ;

 
Result: The isshell variable is always false even when I set ?pc_type shell
option. Why ?

 
Method 2:

 
PCType currpcType ;

 
ierr = PCGetType(pc, &currpcType) ;

 
if(currpcType == PCSHELL)

    isshell = PETSC_TRUE ;

 
PetscPrintf(PETSC_COMM_SELF, " PETSC_IS_SHELL = %D", isshell) ;

 
Result: Again, the isshell variable is always false even when I set ?pc_type
shell option. Also, my currpcType string is a null string. 

 
Any ideas on what I am doing wrong on either one of these cases. I just
spent a while trying to figure out if there was a bug in some other part of
my code while the isshell variable is never set in the first place. Any help
would be appreciated.

 
Thanks,

Vijay 

 
No virus found in this outgoing message.
Checked by AVG Free Edition. 
Version: 7.5.516 / Virus Database: 269.19.21/1265 - Release Date: 2/7/2008
11:17 AM
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080208/c8fd0f3c/attachment.htm>

From dalcinl at gmail.com  Fri Feb  8 07:52:46 2008
From: dalcinl at gmail.com (Lisandro Dalcin)
Date: Fri, 8 Feb 2008 10:52:46 -0300
Subject: PCGetType question
In-Reply-To: <00aa01c86a19$6cbcac50$b63010ac@neutron>
References: <00aa01c86a19$6cbcac50$b63010ac@neutron>
Message-ID: <e7ba66e40802080552u2a5565beucf569ee9af6fed9c@mail.gmail.com>

On 2/8/08, Vijay S. Mahadevan <vijay.m at gmail.com> wrote:
> PetscTruth isshell ;
> PetscTypeCompare((PetscObject)pc, PCSHELL, &isshell);
> PetscPrintf(PETSC_COMM_SELF, " PETSC_IS_SHELL = %D", isshell) ;
> Result: The isshell variable is always false even when I set ?pc_type shell
> option. Why ?

Did you call PCSetFromOptions (or KSPSetFromOptions) before calling
PetscTypeCompare()? If not, the option '-pc_type' will not be used

> Method 2:
> PCType currpcType ;
> ierr = PCGetType(pc, &currpcType) ;
> if(currpcType == PCSHELL)
>     isshell = PETSC_TRUE ;
> PetscPrintf(PETSC_COMM_SELF, " PETSC_IS_SHELL = %D", isshell) ;
> Result: Again, the isshell variable is always false even when I set ?pc_type
> shell option. Also, my currpcType string is a null string.

PCType is a 'const char*', so you should never compare that with '=='
opertor!!. Again, if you got a null pointer from PCGetType(), my guess
is that you forgot to call XXXSetFromOptions, where XXX is PC, or a
KSP object containing it, or a SNES cointaing the KSP in turn
containing the PC.


-- 
Lisandro Dalc?n
---------------
Centro Internacional de M?todos Computacionales en Ingenier?a (CIMEC)
Instituto de Desarrollo Tecnol?gico para la Industria Qu?mica (INTEC)
Consejo Nacional de Investigaciones Cient?ficas y T?cnicas (CONICET)
PTLC - G?emes 3450, (3000) Santa Fe, Argentina
Tel/Fax: +54-(0)342-451.1594


From Andrew.Barker at Colorado.EDU  Fri Feb  8 11:46:09 2008
From: Andrew.Barker at Colorado.EDU (Andrew T Barker)
Date: Fri,  8 Feb 2008 10:46:09 -0700 (MST)
Subject: sub_pc lu zero pivot
Message-ID: <20080208104609.AAW99320@batman.int.colorado.edu>


For my code on a single processor, using the options

-ksp_type gmres -pc_type lu

works fine, but

-ksp_type gmres -pc_type asm -sub_pc_type lu

produces a zero pivot.  On one processor I would expect these two to be identical, and in fact when I print out the asm submatrices (there is only one) in the second case and the original matrix in the first case, they are identical.  Even stranger is that

-ksp_type gmres -pc_type lu -sub_pc_type lu

also produces a zero pivot.  I would expect it to ignore the sub_pc in this case.

Thanks for any help,

Andrew

---
Andrew T. Barker
andrew.barker at colorado.edu
Applied Math Department
University of Colorado, Boulder
526 UCB, Boulder, CO 80309-0526


From knepley at gmail.com  Fri Feb  8 12:02:45 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 8 Feb 2008 12:02:45 -0600
Subject: sub_pc lu zero pivot
In-Reply-To: <20080208104609.AAW99320@batman.int.colorado.edu>
References: <20080208104609.AAW99320@batman.int.colorado.edu>
Message-ID: <a9f269830802081002w2cd742ceie75832496940eae4@mail.gmail.com>

We would like to see:

  1) The complete error message

  2) The output of -snes_view so we can see exactly what solver setup you have

  Thanks,

     Matt

On Feb 8, 2008 11:46 AM, Andrew T Barker <Andrew.Barker at colorado.edu> wrote:
>
> For my code on a single processor, using the options
>
> -ksp_type gmres -pc_type lu
>
> works fine, but
>
> -ksp_type gmres -pc_type asm -sub_pc_type lu
>
> produces a zero pivot.  On one processor I would expect these two to be identical, and in fact when I print out the asm submatrices (there is only one) in the second case and the original matrix in the first case, they are identical.  Even stranger is that
>
> -ksp_type gmres -pc_type lu -sub_pc_type lu
>
> also produces a zero pivot.  I would expect it to ignore the sub_pc in this case.
>
> Thanks for any help,
>
> Andrew
>
> ---
> Andrew T. Barker
> andrew.barker at colorado.edu
> Applied Math Department
> University of Colorado, Boulder
> 526 UCB, Boulder, CO 80309-0526
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From Andrew.Barker at Colorado.EDU  Fri Feb  8 12:34:27 2008
From: Andrew.Barker at Colorado.EDU (Andrew T Barker)
Date: Fri,  8 Feb 2008 11:34:27 -0700 (MST)
Subject: sub_pc lu zero pivot
Message-ID: <20080208113427.AAX01992@batman.int.colorado.edu>

I apparently was explicitly changing the PC in my code after PCSetFromOptions, so the options weren't getting set properly (which I found out by using -snes_view).  Sorry to bother.

Andrew

---- Original message ----
>Date: Fri, 8 Feb 2008 12:02:45 -0600
>From: "Matthew Knepley" <knepley at gmail.com>  
>Subject: Re: sub_pc lu zero pivot  
>To: petsc-users at mcs.anl.gov
>
>We would like to see:
>
>  1) The complete error message
>
>  2) The output of -snes_view so we can see exactly what solver setup you have
>
>  Thanks,
>
>     Matt
>
>On Feb 8, 2008 11:46 AM, Andrew T Barker <Andrew.Barker at colorado.edu> wrote:
>>
>> For my code on a single processor, using the options
>>
>> -ksp_type gmres -pc_type lu
>>
>> works fine, but
>>
>> -ksp_type gmres -pc_type asm -sub_pc_type lu
>>
>> produces a zero pivot.  On one processor I would expect these two to be identical, and in fact when I print out the asm submatrices (there is only one) in the second case and the original matrix in the first case, they are identical.  Even stranger is that
>>
>> -ksp_type gmres -pc_type lu -sub_pc_type lu
>>
>> also produces a zero pivot.  I would expect it to ignore the sub_pc in this case.
>>
>> Thanks for any help,
>>
>> Andrew
>>
>> ---
>> Andrew T. Barker
>> andrew.barker at colorado.edu
>> Applied Math Department
>> University of Colorado, Boulder
>> 526 UCB, Boulder, CO 80309-0526
>>
>>
>
>
>
>-- 
>What most experimenters take for granted before they begin their
>experiments is infinitely more interesting than any results to which
>their experiments lead.
>-- Norbert Wiener
>


From recrusader at gmail.com  Tue Feb 12 13:06:51 2008
From: recrusader at gmail.com (Yujie)
Date: Tue, 12 Feb 2008 11:06:51 -0800
Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()?
In-Reply-To: <a9f269830801231247m45fa1d60p5b7467a7ac9d465d@mail.gmail.com>
References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com>
	 <a9f269830801221925q721dde06ua29e4f2496d9cc72@mail.gmail.com>
	 <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com>
	 <a9f269830801230553r7c206040p5764d79db38a7fb6@mail.gmail.com>
	 <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com>
	 <a9f269830801231247m45fa1d60p5b7467a7ac9d465d@mail.gmail.com>
Message-ID: <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com>

hi, Matt

If I output matrix with binary format, should the file format obtained from
sequential output is the same with that from parallel output? I mean that I
don't need to consider whether MatView_***_Binary() is parallel or
sequential when I use the matrix file.

thanks a lot.

Regards,
Yujie

On 1/23/08, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Jan 23, 2008 2:18 PM, Yujie <recrusader at gmail.com> wrote:
> > Thank you for your further explanation. I just want to use this data in
> > other packages. I think that ASCII file is likely better. Because I
> don't
> > know the format of the binary file? how to find it?
>
> Look at MatView_SeqAIJ_Binary() in src/mat/impls/aij/seq/aij.c. The
> format is pretty simple.
>
> > In addition, do you have any better methods to save the sparsity
> structure
> > picture of the matrix? Now, I use "-mat_view_draw" to do this. However,
> the
> > speed is very slow and the picture is small. I want to get a big picture
> and
> > directly save it to the disk?
> >  could you give me some advice? thanks a lot.
>
> We do not have a better way to make the sparsity picture. I assume you
> could
> write something that decides how many pixels to use, calculates an average
> occupancy per pixel, and writes a BMP or something.
>
>   Matt
>
> > Regards,
> > Yujie
> >
> > On 1/23/08, Matthew Knepley <knepley at gmail.com > wrote:
> > > On Jan 22, 2008 11:01 PM, Yujie <recrusader at gmail.com> wrote:
> > > > Dear Matt:
> > > >
> > > > thank you for your reply. Do you have any method to generate an
> ascii
> > file
> > > > of the huge sparse matrix? thanks
> > >
> > > I think you miss my point. The PETSc function is not a bad way to
> generate
> > > ASCII matrices. ASCII matrices make "no sense" for large operators.
> > >
> > >    Matt
> > >
> > > > Regards,
> > > > Yujie
> > > >
> > > >
> > > >
> > > > On 1/23/08, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > On Jan 22, 2008 8:50 PM, Yujie < recrusader at gmail.com> wrote:
> > > > > > Hi everyone:
> > > > > >
> > > > > > #include "petsc.h"
> > > > > >  PetscErrorCode PetscViewerASCIIOpen(MPI_Comm comm,const char
> > > > > > name[],PetscViewer *lab)
> > > > > >
> > > > > > #include "petsc.h"
> > > > > >  PetscErrorCode PetscViewerBinaryOpen(MPI_Comm comm,const char
> > > > > > name[],PetscFileMode type,PetscViewer *binv)
> > > > > >
> > > > > > if the difference between them is that one for ASCII output and
> the
> > > > other
> > > > > > for Binary output, why are there different parameters?
> > > > >
> > > > > It is historical. If you want to be generic, you should use
> > > > >
> > > > >   PetscViewerCreate()
> > > > >   PetscViewerSetType()
> > > > >   PetscViewerFileSetMode()
> > > > >   PetscViewerFileSetName()
> > > > >
> > > > > which can create both.
> > > > >
> > > > > > The speed to output matrix is very fast when I use
> > > > PetscViewerBinaryOpen.
> > > > > > However, when I use PetscViewerASCIIOpen, I can't get the matrix
> > output.
> > > > the
> > > > > > code always is running and it has taken about one day! what's
> the
> > > > problem?
> > > > > > thank you.
> > > > >
> > > > > ASCII files do not make sense for large matrices. You should use
> > binary
> > > > files.
> > > > >
> > > > >    Matt
> > > > >
> > > > > > Regards,
> > > > > > Yujie
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > What most experimenters take for granted before they begin their
> > > > > experiments is infinitely more interesting than any results to
> which
> > > > > their experiments lead.
> > > > > -- Norbert Wiener
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > What most experimenters take for granted before they begin their
> > > experiments is infinitely more interesting than any results to which
> > > their experiments lead.
> > > -- Norbert Wiener
> > >
> > >
> >
> >
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080212/1270915f/attachment.htm>

From bsmith at mcs.anl.gov  Tue Feb 12 14:15:00 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 12 Feb 2008 14:15:00 -0600
Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()?
In-Reply-To: <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com>
References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com> <a9f269830801221925q721dde06ua29e4f2496d9cc72@mail.gmail.com> <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com> <a9f269830801230553r7c206040p5764d79db38a7fb6@mail.gmail.com> <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com> <a9f269830801231247m45fa1d60p5b7467a7ac9d465d@mail.gmail.com> <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com>
Message-ID: <3E498C1D-EE34-4744-8D0E-C0EE9119741C@mcs.anl.gov>


    The files should be the same.

    Barry

1) If the matrix is generated differently in parallel then sequential  
then rounding of the floating point
operations will result in very slightly different numerical values in  
the matrix.

2) If the ordering of the the unknowns is different in the parallel  
and sequential then the matrices
will, of course, be permutations of each other.


On Feb 12, 2008, at 1:06 PM, Yujie wrote:

> hi, Matt
>
> If I output matrix with binary format, should the file format  
> obtained from sequential output is the same with that from parallel  
> output? I mean that I don't need to consider whether  
> MatView_***_Binary() is parallel or sequential when I use the matrix  
> file.
>
> thanks a lot.
>
> Regards,
> Yujie
>
> On 1/23/08, Matthew Knepley <knepley at gmail.com> wrote:
> On Jan 23, 2008 2:18 PM, Yujie <recrusader at gmail.com> wrote:
> > Thank you for your further explanation. I just want to use this  
> data in
> > other packages. I think that ASCII file is likely better. Because  
> I don't
> > know the format of the binary file? how to find it?
>
> Look at MatView_SeqAIJ_Binary() in src/mat/impls/aij/seq/aij.c. The
> format is pretty simple.
>
> > In addition, do you have any better methods to save the sparsity  
> structure
> > picture of the matrix? Now, I use "-mat_view_draw" to do this.  
> However, the
> > speed is very slow and the picture is small. I want to get a big  
> picture and
> > directly save it to the disk?
> >  could you give me some advice? thanks a lot.
>
> We do not have a better way to make the sparsity picture. I assume  
> you could
> write something that decides how many pixels to use, calculates an  
> average
> occupancy per pixel, and writes a BMP or something.
>
>   Matt
>
> > Regards,
> > Yujie
> >
> > On 1/23/08, Matthew Knepley <knepley at gmail.com > wrote:
> > > On Jan 22, 2008 11:01 PM, Yujie <recrusader at gmail.com> wrote:
> > > > Dear Matt:
> > > >
> > > > thank you for your reply. Do you have any method to generate  
> an ascii
> > file
> > > > of the huge sparse matrix? thanks
> > >
> > > I think you miss my point. The PETSc function is not a bad way  
> to generate
> > > ASCII matrices. ASCII matrices make "no sense" for large  
> operators.
> > >
> > >    Matt
> > >
> > > > Regards,
> > > > Yujie
> > > >
> > > >
> > > >
> > > > On 1/23/08, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > On Jan 22, 2008 8:50 PM, Yujie < recrusader at gmail.com> wrote:
> > > > > > Hi everyone:
> > > > > >
> > > > > > #include "petsc.h"
> > > > > >  PetscErrorCode PetscViewerASCIIOpen(MPI_Comm comm,const  
> char
> > > > > > name[],PetscViewer *lab)
> > > > > >
> > > > > > #include "petsc.h"
> > > > > >  PetscErrorCode PetscViewerBinaryOpen(MPI_Comm comm,const  
> char
> > > > > > name[],PetscFileMode type,PetscViewer *binv)
> > > > > >
> > > > > > if the difference between them is that one for ASCII  
> output and the
> > > > other
> > > > > > for Binary output, why are there different parameters?
> > > > >
> > > > > It is historical. If you want to be generic, you should use
> > > > >
> > > > >   PetscViewerCreate()
> > > > >   PetscViewerSetType()
> > > > >   PetscViewerFileSetMode()
> > > > >   PetscViewerFileSetName()
> > > > >
> > > > > which can create both.
> > > > >
> > > > > > The speed to output matrix is very fast when I use
> > > > PetscViewerBinaryOpen.
> > > > > > However, when I use PetscViewerASCIIOpen, I can't get the  
> matrix
> > output.
> > > > the
> > > > > > code always is running and it has taken about one day!  
> what's the
> > > > problem?
> > > > > > thank you.
> > > > >
> > > > > ASCII files do not make sense for large matrices. You should  
> use
> > binary
> > > > files.
> > > > >
> > > > >    Matt
> > > > >
> > > > > > Regards,
> > > > > > Yujie
> > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > What most experimenters take for granted before they begin  
> their
> > > > > experiments is infinitely more interesting than any results  
> to which
> > > > > their experiments lead.
> > > > > -- Norbert Wiener
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > > What most experimenters take for granted before they begin their
> > > experiments is infinitely more interesting than any results to  
> which
> > > their experiments lead.
> > > -- Norbert Wiener
> > >
> > >
> >
> >
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080212/04cf4ff8/attachment.htm>

From balay at mcs.anl.gov  Tue Feb 12 14:23:38 2008
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 12 Feb 2008 14:23:38 -0600 (CST)
Subject: what's the difference between PetscViewerASCIIOpen() and
 PetscViewerBinaryOpen()?
In-Reply-To: <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com>
References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com>  <a9f269830801221925q721dde06ua29e4f2496d9cc72@mail.gmail.com>  <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com>  <a9f269830801230553r7c206040p5764d79db38a7fb6@mail.gmail.com>
  <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com>  <a9f269830801231247m45fa1d60p5b7467a7ac9d465d@mail.gmail.com> <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com>
Message-ID: <alpine.LFD.1.00.0802121409060.9540@asterix>

On Tue, 12 Feb 2008, Yujie wrote:

> On 1/23/08, Matthew Knepley <knepley at gmail.com> wrote:

> > > In addition, do you have any better methods to save the sparsity
> > > structure picture of the matrix? Now, I use "-mat_view_draw" to
> > > do this. However, the speed is very slow and the picture is
> > > small. I want to get a big picture and directly save it to the
> > > disk?  could you give me some advice? thanks a lot.

> > We do not have a better way to make the sparsity picture. I assume
> >you could write something that decides how many pixels to use,
> >calculates an average occupancy per pixel, and writes a BMP or
> >something.

Couple of notes on this.

- -mat_view_draw can be slow for parallel runs [because all the data
  is moved to proc-0, from where its displayed]. If you wish to speed
  up, you can either:
  * run it sequentially [depending upon your code, the matrix generated
  could be different - so its not suitable]
  * do a binary dump [with MatView() on a binary viewer] - and
  then reload this matrix with a sequential code and then do mat_view
  [check mat/examples/tests/ex33.c,ex43.c]

- you can use the option '-draw_pause -1' to make the window not
  disappear. Now you can zoom-in & zoom-out [with mouse-left or
  mouse-right click]

- Take the snapshot of this window with xv or gnome-screenshot or
other screen-dump tool [ like 'xwd | xpr -device ps > dump.ps']

- Alternatively you can dump the matrix is matlab format - and use
Matlab visualization tools.

Satish


From bsmith at mcs.anl.gov  Tue Feb 12 14:29:49 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 12 Feb 2008 14:29:49 -0600
Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()?
In-Reply-To: <alpine.LFD.1.00.0802121409060.9540@asterix>
References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com>  <a9f269830801221925q721dde06ua29e4f2496d9cc72@mail.gmail.com>  <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com>  <a9f269830801230553r7c206040p5764d79db38a7fb6@mail.gmail.com> <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com>  <a9f269830801231247m45fa1d60p5b7467a7ac9d465d@mail.gmail.com> <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com> <alpine.LFD.1.00.0802121409060.9540@asterix>
Message-ID: <0AB82F4A-4114-4763-AC47-69EDE8E4E2DA@mcs.anl.gov>


   It is important to remember what the PETSc users manual
says

"PETSc graphics library is not intended to compete with
high-quality graphics packages.  Instead, it is intended to be
easy to use interactively with PETSc programs. We urge users
to generate their publication-quality graphics using a
professional graphics package."

   We are not graphics experts, nor do we want to be, or could be.

    Barry


On Feb 12, 2008, at 2:23 PM, Satish Balay wrote:

> On Tue, 12 Feb 2008, Yujie wrote:
>
>> On 1/23/08, Matthew Knepley <knepley at gmail.com> wrote:
>
>>>> In addition, do you have any better methods to save the sparsity
>>>> structure picture of the matrix? Now, I use "-mat_view_draw" to
>>>> do this. However, the speed is very slow and the picture is
>>>> small. I want to get a big picture and directly save it to the
>>>> disk?  could you give me some advice? thanks a lot.
>
>>> We do not have a better way to make the sparsity picture. I assume
>>> you could write something that decides how many pixels to use,
>>> calculates an average occupancy per pixel, and writes a BMP or
>>> something.
>
> Couple of notes on this.
>
> - -mat_view_draw can be slow for parallel runs [because all the data
>  is moved to proc-0, from where its displayed]. If you wish to speed
>  up, you can either:
>  * run it sequentially [depending upon your code, the matrix generated
>  could be different - so its not suitable]
>  * do a binary dump [with MatView() on a binary viewer] - and
>  then reload this matrix with a sequential code and then do mat_view
>  [check mat/examples/tests/ex33.c,ex43.c]
>
> - you can use the option '-draw_pause -1' to make the window not
>  disappear. Now you can zoom-in & zoom-out [with mouse-left or
>  mouse-right click]
>
> - Take the snapshot of this window with xv or gnome-screenshot or
> other screen-dump tool [ like 'xwd | xpr -device ps > dump.ps']
>
> - Alternatively you can dump the matrix is matlab format - and use
> Matlab visualization tools.
>
> Satish
>
>


From recrusader at gmail.com  Tue Feb 12 16:08:49 2008
From: recrusader at gmail.com (Yujie)
Date: Tue, 12 Feb 2008 14:08:49 -0800
Subject: what's the difference between PetscViewerASCIIOpen() and PetscViewerBinaryOpen()?
In-Reply-To: <0AB82F4A-4114-4763-AC47-69EDE8E4E2DA@mcs.anl.gov>
References: <7ff0ee010801221850h4a4161efu6ec3190f3525346d@mail.gmail.com>
	 <a9f269830801221925q721dde06ua29e4f2496d9cc72@mail.gmail.com>
	 <7ff0ee010801222101k471b3c00xe22e933efacb7939@mail.gmail.com>
	 <a9f269830801230553r7c206040p5764d79db38a7fb6@mail.gmail.com>
	 <7ff0ee010801231218xc0281b2n8a7db71713ede8fb@mail.gmail.com>
	 <a9f269830801231247m45fa1d60p5b7467a7ac9d465d@mail.gmail.com>
	 <7ff0ee010802121106n3f1bb609m37d81e9454fd806@mail.gmail.com>
	 <alpine.LFD.1.00.0802121409060.9540@asterix>
	 <0AB82F4A-4114-4763-AC47-69EDE8E4E2DA@mcs.anl.gov>
Message-ID: <7ff0ee010802121408s6d7ec5eau6a7ea8582ae59a26@mail.gmail.com>

thanks a lot, everyone :).

On 2/12/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
>    It is important to remember what the PETSc users manual
> says
>
> "PETSc graphics library is not intended to compete with
> high-quality graphics packages.  Instead, it is intended to be
> easy to use interactively with PETSc programs. We urge users
> to generate their publication-quality graphics using a
> professional graphics package."
>
>    We are not graphics experts, nor do we want to be, or could be.
>
>     Barry
>
>
>
>
> On Feb 12, 2008, at 2:23 PM, Satish Balay wrote:
>
> > On Tue, 12 Feb 2008, Yujie wrote:
> >
> >> On 1/23/08, Matthew Knepley <knepley at gmail.com> wrote:
> >
> >>>> In addition, do you have any better methods to save the sparsity
> >>>> structure picture of the matrix? Now, I use "-mat_view_draw" to
> >>>> do this. However, the speed is very slow and the picture is
> >>>> small. I want to get a big picture and directly save it to the
> >>>> disk?  could you give me some advice? thanks a lot.
> >
> >>> We do not have a better way to make the sparsity picture. I assume
> >>> you could write something that decides how many pixels to use,
> >>> calculates an average occupancy per pixel, and writes a BMP or
> >>> something.
> >
> > Couple of notes on this.
> >
> > - -mat_view_draw can be slow for parallel runs [because all the data
> >  is moved to proc-0, from where its displayed]. If you wish to speed
> >  up, you can either:
> >  * run it sequentially [depending upon your code, the matrix generated
> >  could be different - so its not suitable]
> >  * do a binary dump [with MatView() on a binary viewer] - and
> >  then reload this matrix with a sequential code and then do mat_view
> >  [check mat/examples/tests/ex33.c,ex43.c]
> >
> > - you can use the option '-draw_pause -1' to make the window not
> >  disappear. Now you can zoom-in & zoom-out [with mouse-left or
> >  mouse-right click]
> >
> > - Take the snapshot of this window with xv or gnome-screenshot or
> > other screen-dump tool [ like 'xwd | xpr -device ps > dump.ps']
> >
> > - Alternatively you can dump the matrix is matlab format - and use
> > Matlab visualization tools.
> >
> > Satish
> >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080212/d6bd4650/attachment.htm>

From Stephen.R.Ball at awe.co.uk  Wed Feb 13 06:12:03 2008
From: Stephen.R.Ball at awe.co.uk (Stephen R Ball)
Date: Wed, 13 Feb 2008 12:12:03 -0000
Subject: References for preconditioners and solver methods.
Message-ID: <82DCAO020874@awe.co.uk>


Hi

I am writing a paper that references PETSc and the preconditioners and
linear solvers that it uses. I would like to include references for
these. I have searched and found references for quite a few but am
struggling to find references for the following solver methods:

BICG
CGNE
CHEBYCHEV
CR (Conjugate Residuals)
QCG
RICHARDSON
TCQMR

Could you send me suitable references for these methods?

I'm not sure if they exist, but could you also send me suitable
references for the following preconditioners:

ASM
BJACOBI
ILU
ICC

Much appreciated

Stephen
--
_______________________________________________________________________________

The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited.  Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer.  While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected.

AWE Plc
Registered in England and Wales
Registration No 02763902
AWE, Aldermaston, Reading, RG7 4PR


From bsmith at mcs.anl.gov  Wed Feb 13 14:40:30 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 13 Feb 2008 14:40:30 -0600
Subject: References for preconditioners and solver methods.
In-Reply-To: <82DCAO020874@awe.co.uk>
References: <82DCAO020874@awe.co.uk>
Message-ID: <F56214AC-4BF9-45B9-81AB-9EF4245014AB@mcs.anl.gov>


   I've started adding them to the manual pages. Here are the ones I  
have so far

On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote:

>
> Hi
>
> I am writing a paper that references PETSc and the preconditioners and
> linear solvers that it uses. I would like to include references for
> these. I have searched and found references for quite a few but am
> struggling to find references for the following solver methods:
>
> BICG


>
> CGNE

   This is just CG applied to the normal equations; it is not an idea  
worthing of a
publication.

>
> CHEBYCHEV


>
> CR (Conjugate Residuals)

    Methods of Conjugate Gradients for Solving Linear Systems, Magnus  
R. Hestenes and Eduard Stiefel,
    Journal of Research of the National Bureau of Standards Vol. 49,  
No. 6, December 1952 Research Paper 2379
    pp. 409--436.

>
> QCG

    The Conjugate Gradient Method and Trust Regions in Large Scale  
Optimization, Trond Steihaug
    SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983),  
pp. 626-637

>
> RICHARDSON


>
> TCQMR

   Transpose-free formulations of Lanczos-type methods for  
nonsymmetric linear systems,
   Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical  
Algorithms, 	
   Volume 17, Numbers 1-2 / May, 1998 pp. 51-66.
>
>
> Could you send me suitable references for these methods?
>
> I'm not sure if they exist, but could you also send me suitable
> references for the following preconditioners:
>
> ASM
     An additive variant of the Schwarz alternating method for the  
case of many subregions
     M Dryja, OB Widlund - Courant Institute, New York University  
Technical report

     Domain Decompositions: Parallel Multilevel Methods for Elliptic  
Partial Differential Equations,
     Barry Smith, Petter Bjorstad, and William Gropp, Cambridge  
University Press, ISBN 0-521-49589-X.

>
> BJACOBI

    Any iterative solver book, this is just Jacobi's method
>
> ILU
> ICC
>

   Both ICC and ILU the review article

APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A.  
VAN DER VORST
       http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf 
  chapter in Parallel Numerical
       Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan,  
ICASE/LaRC Interdisciplinary Series in
       Science and Engineering, Kluwer, pp. 167--202.

It is difficult to determine the publications where the FIRST use of  
ILU/ICC appeared since the did not
call them that originally.

If anyone has references to the original Chebychev and Bi-CG  
algorithms please let us know.

    Barry

> Much appreciated
>
> Stephen
> --
> _______________________________________________________________________________
>
> The information in this email and in any attachment(s) is commercial  
> in confidence. If you are not the named addressee(s) or if you  
> receive this email in error then any distribution, copying or use of  
> this communication or the information in it is strictly prohibited.   
> Please notify us immediately by email at  
> admin.internet(at)awe.co.uk, and then delete this message from your  
> computer.  While attachments are virus checked, AWE plc does not  
> accept any liability in respect of any virus which is not detected.
>
> AWE Plc
> Registered in England and Wales
> Registration No 02763902
> AWE, Aldermaston, Reading, RG7 4PR
>


From Stephen.R.Ball at awe.co.uk  Thu Feb 14 10:30:45 2008
From: Stephen.R.Ball at awe.co.uk (Stephen R Ball)
Date: Thu, 14 Feb 2008 16:30:45 -0000
Subject: References for preconditioners and solver methods.
Message-ID: <82EGUi224985@awe.co.uk>


Hi

Thanks for your suggestions. You have given a reference for CR
(Conjugate Residuals) as:

Methods of Conjugate Gradients for Solving Linear Systems, Magnus R.
Hestenes and Eduard Stiefel, Journal of Research of the National Bureau
of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp.
409--436.

However the PETSc user manual says this is the reference for CG
(Conjugate Gradient). Can you clarify which is the case? If it is not
for CR do you know of a reference for CR?

If anyone can provide references for the Bi-CG, Chebychev, CR (Conjugate
Residuals), QCG (Quadratic CG) and Richardson solvers that would be very
much appreciated.

Regards

Stephen


-----Original Message-----
From: owner-petsc-users at mcs.anl.gov
[mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
Sent: 13 February 2008 20:41
To: petsc-users at mcs.anl.gov
Subject: EXTERNAL: Re: References for preconditioners and solver
methods.


   I've started adding them to the manual pages. Here are the ones I  
have so far

On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote:

>
> Hi
>
> I am writing a paper that references PETSc and the preconditioners and
> linear solvers that it uses. I would like to include references for
> these. I have searched and found references for quite a few but am
> struggling to find references for the following solver methods:
>
> BICG


>
> CGNE

   This is just CG applied to the normal equations; it is not an idea  
worthing of a
publication.

>
> CHEBYCHEV


>
> CR (Conjugate Residuals)

    Methods of Conjugate Gradients for Solving Linear Systems, Magnus  
R. Hestenes and Eduard Stiefel,
    Journal of Research of the National Bureau of Standards Vol. 49,  
No. 6, December 1952 Research Paper 2379
    pp. 409--436.

>
> QCG

    The Conjugate Gradient Method and Trust Regions in Large Scale  
Optimization, Trond Steihaug
    SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983),  
pp. 626-637

>
> RICHARDSON


>
> TCQMR

   Transpose-free formulations of Lanczos-type methods for  
nonsymmetric linear systems,
   Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical  
Algorithms, 	
   Volume 17, Numbers 1-2 / May, 1998 pp. 51-66.
>
>
> Could you send me suitable references for these methods?
>
> I'm not sure if they exist, but could you also send me suitable
> references for the following preconditioners:
>
> ASM
     An additive variant of the Schwarz alternating method for the  
case of many subregions
     M Dryja, OB Widlund - Courant Institute, New York University  
Technical report

     Domain Decompositions: Parallel Multilevel Methods for Elliptic  
Partial Differential Equations,
     Barry Smith, Petter Bjorstad, and William Gropp, Cambridge  
University Press, ISBN 0-521-49589-X.

>
> BJACOBI

    Any iterative solver book, this is just Jacobi's method
>
> ILU
> ICC
>

   Both ICC and ILU the review article

APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A.  
VAN DER VORST
 
http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf 
  chapter in Parallel Numerical
       Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan,  
ICASE/LaRC Interdisciplinary Series in
       Science and Engineering, Kluwer, pp. 167--202.

It is difficult to determine the publications where the FIRST use of  
ILU/ICC appeared since the did not
call them that originally.

If anyone has references to the original Chebychev and Bi-CG  
algorithms please let us know.

    Barry

> Much appreciated
>
> Stephen
> --
>
________________________________________________________________________
_______
>
> The information in this email and in any attachment(s) is commercial  
> in confidence. If you are not the named addressee(s) or if you  
> receive this email in error then any distribution, copying or use of  
> this communication or the information in it is strictly prohibited.   
> Please notify us immediately by email at  
> admin.internet(at)awe.co.uk, and then delete this message from your  
> computer.  While attachments are virus checked, AWE plc does not  
> accept any liability in respect of any virus which is not detected.
>
> AWE Plc
> Registered in England and Wales
> Registration No 02763902
> AWE, Aldermaston, Reading, RG7 4PR
>
--
_______________________________________________________________________________

The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited.  Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer.  While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected.

AWE Plc
Registered in England and Wales
Registration No 02763902
AWE, Aldermaston, Reading, RG7 4PR


From knepley at gmail.com  Thu Feb 14 12:56:30 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 14 Feb 2008 12:56:30 -0600
Subject: References for preconditioners and solver methods.
In-Reply-To: <82EGUi224985@awe.co.uk>
References: <82EGUi224985@awe.co.uk>
Message-ID: <a9f269830802141056j7cc03246ubb02f97d99408694@mail.gmail.com>

On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball
<Stephen.R.Ball at awe.co.uk> wrote:
>
>
>  Hi
>
>  Thanks for your suggestions. You have given a reference for CR
>  (Conjugate Residuals) as:
>
>  Methods of Conjugate Gradients for Solving Linear Systems, Magnus R.
>  Hestenes and Eduard Stiefel, Journal of Research of the National Bureau
>  of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp.
>  409--436.

I get this:

The Conjugate Residual Method for Constrained Minimization Problems
David G. Luenberger
SIAM Journal on Numerical Analysis, Vol. 7, No. 3 (Sep., 1970), pp. 390-398

Barry, do you agree?

    Matt

>  However the PETSc user manual says this is the reference for CG
>  (Conjugate Gradient). Can you clarify which is the case? If it is not
>  for CR do you know of a reference for CR?
>
>  If anyone can provide references for the Bi-CG, Chebychev, CR (Conjugate
>  Residuals), QCG (Quadratic CG) and Richardson solvers that would be very
>  much appreciated.
>
>  Regards
>
>  Stephen
>
>
>
>
>  -----Original Message-----
>  From: owner-petsc-users at mcs.anl.gov
>  [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
>  Sent: 13 February 2008 20:41
>  To: petsc-users at mcs.anl.gov
>  Subject: EXTERNAL: Re: References for preconditioners and solver
>  methods.
>
>
>    I've started adding them to the manual pages. Here are the ones I
>  have so far
>
>  On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote:
>
>  >
>  > Hi
>  >
>  > I am writing a paper that references PETSc and the preconditioners and
>  > linear solvers that it uses. I would like to include references for
>  > these. I have searched and found references for quite a few but am
>  > struggling to find references for the following solver methods:
>  >
>  > BICG
>
>
>  >
>  > CGNE
>
>    This is just CG applied to the normal equations; it is not an idea
>  worthing of a
>  publication.
>
>  >
>  > CHEBYCHEV
>
>
>
>  >
>  > CR (Conjugate Residuals)
>
>     Methods of Conjugate Gradients for Solving Linear Systems, Magnus
>  R. Hestenes and Eduard Stiefel,
>     Journal of Research of the National Bureau of Standards Vol. 49,
>  No. 6, December 1952 Research Paper 2379
>     pp. 409--436.
>
>  >
>  > QCG
>
>     The Conjugate Gradient Method and Trust Regions in Large Scale
>  Optimization, Trond Steihaug
>     SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983),
>  pp. 626-637
>
>  >
>  > RICHARDSON
>
>
>  >
>  > TCQMR
>
>    Transpose-free formulations of Lanczos-type methods for
>  nonsymmetric linear systems,
>    Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical
>  Algorithms,
>    Volume 17, Numbers 1-2 / May, 1998 pp. 51-66.
>  >
>  >
>  > Could you send me suitable references for these methods?
>  >
>  > I'm not sure if they exist, but could you also send me suitable
>  > references for the following preconditioners:
>  >
>  > ASM
>      An additive variant of the Schwarz alternating method for the
>  case of many subregions
>      M Dryja, OB Widlund - Courant Institute, New York University
>  Technical report
>
>      Domain Decompositions: Parallel Multilevel Methods for Elliptic
>  Partial Differential Equations,
>      Barry Smith, Petter Bjorstad, and William Gropp, Cambridge
>  University Press, ISBN 0-521-49589-X.
>
>  >
>  > BJACOBI
>
>     Any iterative solver book, this is just Jacobi's method
>  >
>  > ILU
>  > ICC
>  >
>
>    Both ICC and ILU the review article
>
>  APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A.
>  VAN DER VORST
>
>  http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf
>   chapter in Parallel Numerical
>        Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan,
>  ICASE/LaRC Interdisciplinary Series in
>        Science and Engineering, Kluwer, pp. 167--202.
>
>  It is difficult to determine the publications where the FIRST use of
>  ILU/ICC appeared since the did not
>  call them that originally.
>
>  If anyone has references to the original Chebychev and Bi-CG
>  algorithms please let us know.
>
>     Barry
>
>  > Much appreciated
>  >
>  > Stephen
>  > --
>  >
>  ________________________________________________________________________
>  _______
>  >
>  > The information in this email and in any attachment(s) is commercial
>  > in confidence. If you are not the named addressee(s) or if you
>  > receive this email in error then any distribution, copying or use of
>  > this communication or the information in it is strictly prohibited.
>  > Please notify us immediately by email at
>  > admin.internet(at)awe.co.uk, and then delete this message from your
>  > computer.  While attachments are virus checked, AWE plc does not
>  > accept any liability in respect of any virus which is not detected.
>  >
>  > AWE Plc
>  > Registered in England and Wales
>  > Registration No 02763902
>  > AWE, Aldermaston, Reading, RG7 4PR
>  >
>  --
>  _______________________________________________________________________________
>
>  The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited.  Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer.  While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected.
>
>  AWE Plc
>  Registered in England and Wales
>  Registration No 02763902
>  AWE, Aldermaston, Reading, RG7 4PR
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From bsmith at mcs.anl.gov  Thu Feb 14 13:02:53 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 14 Feb 2008 13:02:53 -0600
Subject: References for preconditioners and solver methods.
In-Reply-To: <a9f269830802141056j7cc03246ubb02f97d99408694@mail.gmail.com>
References: <82EGUi224985@awe.co.uk> <a9f269830802141056j7cc03246ubb02f97d99408694@mail.gmail.com>
Message-ID: <CFD2B0A2-EE7B-4F34-BC48-47325259DCD8@mcs.anl.gov>


On Feb 14, 2008, at 12:56 PM, Matthew Knepley wrote:

> On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball
> <Stephen.R.Ball at awe.co.uk> wrote:
>>
>>
>> Hi
>>
>> Thanks for your suggestions. You have given a reference for CR
>> (Conjugate Residuals) as:
>>
>> Methods of Conjugate Gradients for Solving Linear Systems, Magnus R.
>> Hestenes and Eduard Stiefel, Journal of Research of the National  
>> Bureau
>> of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp.
>> 409--436.
>
> I get this:
>
> The Conjugate Residual Method for Constrained Minimization Problems
> David G. Luenberger
> SIAM Journal on Numerical Analysis, Vol. 7, No. 3 (Sep., 1970), pp.  
> 390-398
>
> Barry, do you agree?

   I took at a look at Hestenes and Stiefel, though they don't use the  
term "conjugate residuals"
I would argue that the algorithm is essentially there and so we should  
not give
credit to someone else.

    Barry

>
>
>    Matt
>
>> However the PETSc user manual says this is the reference for CG
>> (Conjugate Gradient). Can you clarify which is the case? If it is not
>> for CR do you know of a reference for CR?
>>
>> If anyone can provide references for the Bi-CG, Chebychev, CR  
>> (Conjugate
>> Residuals), QCG (Quadratic CG) and Richardson solvers that would be  
>> very
>> much appreciated.
>>
>> Regards
>>
>> Stephen
>>
>>
>>
>>
>> -----Original Message-----
>> From: owner-petsc-users at mcs.anl.gov
>> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
>> Sent: 13 February 2008 20:41
>> To: petsc-users at mcs.anl.gov
>> Subject: EXTERNAL: Re: References for preconditioners and solver
>> methods.
>>
>>
>>   I've started adding them to the manual pages. Here are the ones I
>> have so far
>>
>> On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote:
>>
>>>
>>> Hi
>>>
>>> I am writing a paper that references PETSc and the preconditioners  
>>> and
>>> linear solvers that it uses. I would like to include references for
>>> these. I have searched and found references for quite a few but am
>>> struggling to find references for the following solver methods:
>>>
>>> BICG
>>
>>
>>>
>>> CGNE
>>
>>   This is just CG applied to the normal equations; it is not an idea
>> worthing of a
>> publication.
>>
>>>
>>> CHEBYCHEV
>>
>>
>>
>>>
>>> CR (Conjugate Residuals)
>>
>>    Methods of Conjugate Gradients for Solving Linear Systems, Magnus
>> R. Hestenes and Eduard Stiefel,
>>    Journal of Research of the National Bureau of Standards Vol. 49,
>> No. 6, December 1952 Research Paper 2379
>>    pp. 409--436.
>>
>>>
>>> QCG
>>
>>    The Conjugate Gradient Method and Trust Regions in Large Scale
>> Optimization, Trond Steihaug
>>    SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983),
>> pp. 626-637
>>
>>>
>>> RICHARDSON
>>
>>
>>>
>>> TCQMR
>>
>>   Transpose-free formulations of Lanczos-type methods for
>> nonsymmetric linear systems,
>>   Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical
>> Algorithms,
>>   Volume 17, Numbers 1-2 / May, 1998 pp. 51-66.
>>>
>>>
>>> Could you send me suitable references for these methods?
>>>
>>> I'm not sure if they exist, but could you also send me suitable
>>> references for the following preconditioners:
>>>
>>> ASM
>>     An additive variant of the Schwarz alternating method for the
>> case of many subregions
>>     M Dryja, OB Widlund - Courant Institute, New York University
>> Technical report
>>
>>     Domain Decompositions: Parallel Multilevel Methods for Elliptic
>> Partial Differential Equations,
>>     Barry Smith, Petter Bjorstad, and William Gropp, Cambridge
>> University Press, ISBN 0-521-49589-X.
>>
>>>
>>> BJACOBI
>>
>>    Any iterative solver book, this is just Jacobi's method
>>>
>>> ILU
>>> ICC
>>>
>>
>>   Both ICC and ILU the review article
>>
>> APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A.
>> VAN DER VORST
>>
>> http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf
>>  chapter in Parallel Numerical
>>       Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan,
>> ICASE/LaRC Interdisciplinary Series in
>>       Science and Engineering, Kluwer, pp. 167--202.
>>
>> It is difficult to determine the publications where the FIRST use of
>> ILU/ICC appeared since the did not
>> call them that originally.
>>
>> If anyone has references to the original Chebychev and Bi-CG
>> algorithms please let us know.
>>
>>    Barry
>>
>>> Much appreciated
>>>
>>> Stephen
>>> --
>>>
>> ________________________________________________________________________
>> _______
>>>
>>> The information in this email and in any attachment(s) is commercial
>>> in confidence. If you are not the named addressee(s) or if you
>>> receive this email in error then any distribution, copying or use of
>>> this communication or the information in it is strictly prohibited.
>>> Please notify us immediately by email at
>>> admin.internet(at)awe.co.uk, and then delete this message from your
>>> computer.  While attachments are virus checked, AWE plc does not
>>> accept any liability in respect of any virus which is not detected.
>>>
>>> AWE Plc
>>> Registered in England and Wales
>>> Registration No 02763902
>>> AWE, Aldermaston, Reading, RG7 4PR
>>>
>> --
>> _______________________________________________________________________________
>>
>> The information in this email and in any attachment(s) is  
>> commercial in confidence. If you are not the named addressee(s) or  
>> if you receive this email in error then any distribution, copying  
>> or use of this communication or the information in it is strictly  
>> prohibited.  Please notify us immediately by email at  
>> admin.internet(at)awe.co.uk, and then delete this message from your  
>> computer.  While attachments are virus checked, AWE plc does not  
>> accept any liability in respect of any virus which is not detected.
>>
>> AWE Plc
>> Registered in England and Wales
>> Registration No 02763902
>> AWE, Aldermaston, Reading, RG7 4PR
>>
>>
>
>
>
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>


From knepley at gmail.com  Thu Feb 14 13:15:42 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 14 Feb 2008 13:15:42 -0600
Subject: References for preconditioners and solver methods.
In-Reply-To: <82EGUi224985@awe.co.uk>
References: <82EGUi224985@awe.co.uk>
Message-ID: <a9f269830802141115p179a9c6bj8444dbb8baef8961@mail.gmail.com>

On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball
<Stephen.R.Ball at awe.co.uk> wrote:
>  If anyone can provide references for the Bi-CG, Chebychev, CR (Conjugate
>  Residuals), QCG (Quadratic CG) and Richardson solvers that would be very
>  much appreciated.
>  > QCG
>
>     The Conjugate Gradient Method and Trust Regions in Large Scale
>  Optimization, Trond Steihaug
>     SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983),
>  pp. 626-637

and I put Richardson in the source yesterday, so it should be up on
the developer
documentation online today.

   Matt

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From Stephen.R.Ball at awe.co.uk  Fri Feb 15 04:44:11 2008
From: Stephen.R.Ball at awe.co.uk (Stephen R Ball)
Date: Fri, 15 Feb 2008 10:44:11 -0000
Subject: References for preconditioners and solver methods.
Message-ID: <82FAjs014703@awe.co.uk>


Hi

Can you tell me where I can get hold of the developer documentation?

Regards

Stephen


-----Original Message-----
From: owner-petsc-users at mcs.anl.gov
[mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley
Sent: 14 February 2008 19:16
To: petsc-users at mcs.anl.gov
Subject: EXTERNAL: Re: References for preconditioners and solver
methods.

On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball
<Stephen.R.Ball at awe.co.uk> wrote:
>  If anyone can provide references for the Bi-CG, Chebychev, CR
(Conjugate
>  Residuals), QCG (Quadratic CG) and Richardson solvers that would be
very
>  much appreciated.
>  > QCG
>
>     The Conjugate Gradient Method and Trust Regions in Large Scale
>  Optimization, Trond Steihaug
>     SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983),
>  pp. 626-637

and I put Richardson in the source yesterday, so it should be up on
the developer
documentation online today.

   Matt

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
--
_______________________________________________________________________________

The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited.  Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer.  While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected.

AWE Plc
Registered in England and Wales
Registration No 02763902
AWE, Aldermaston, Reading, RG7 4PR


From Stephen.R.Ball at awe.co.uk  Fri Feb 15 04:42:33 2008
From: Stephen.R.Ball at awe.co.uk (Stephen R Ball)
Date: Fri, 15 Feb 2008 10:42:33 -0000
Subject: References for preconditioners and solver methods.
Message-ID: <82FAe7014420@awe.co.uk>


Hi

So to clarify then I should use reference:

Methods of Conjugate Gradients for Solving Linear Systems, Magnus R.
Hestenes and Eduard Stiefel, Journal of Research of the National Bureau
of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp.
409--436.

For both CG and CR?

Regards

Stephen


-----Original Message-----
From: owner-petsc-users at mcs.anl.gov
[mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
Sent: 14 February 2008 19:03
To: petsc-users at mcs.anl.gov
Subject: EXTERNAL: Re: References for preconditioners and solver
methods.

On Feb 14, 2008, at 12:56 PM, Matthew Knepley wrote:

> On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball
> <Stephen.R.Ball at awe.co.uk> wrote:
>>
>>
>> Hi
>>
>> Thanks for your suggestions. You have given a reference for CR
>> (Conjugate Residuals) as:
>>
>> Methods of Conjugate Gradients for Solving Linear Systems, Magnus R.
>> Hestenes and Eduard Stiefel, Journal of Research of the National  
>> Bureau
>> of Standards Vol. 49, No. 6, December 1952 Research Paper 2379 pp.
>> 409--436.
>
> I get this:
>
> The Conjugate Residual Method for Constrained Minimization Problems
> David G. Luenberger
> SIAM Journal on Numerical Analysis, Vol. 7, No. 3 (Sep., 1970), pp.  
> 390-398
>
> Barry, do you agree?

   I took at a look at Hestenes and Stiefel, though they don't use the  
term "conjugate residuals"
I would argue that the algorithm is essentially there and so we should  
not give
credit to someone else.

    Barry

>
>
>    Matt
>
>> However the PETSc user manual says this is the reference for CG
>> (Conjugate Gradient). Can you clarify which is the case? If it is not
>> for CR do you know of a reference for CR?
>>
>> If anyone can provide references for the Bi-CG, Chebychev, CR  
>> (Conjugate
>> Residuals), QCG (Quadratic CG) and Richardson solvers that would be  
>> very
>> much appreciated.
>>
>> Regards
>>
>> Stephen
>>
>>
>>
>>
>> -----Original Message-----
>> From: owner-petsc-users at mcs.anl.gov
>> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
>> Sent: 13 February 2008 20:41
>> To: petsc-users at mcs.anl.gov
>> Subject: EXTERNAL: Re: References for preconditioners and solver
>> methods.
>>
>>
>>   I've started adding them to the manual pages. Here are the ones I
>> have so far
>>
>> On Feb 13, 2008, at 6:12 AM, Stephen R Ball wrote:
>>
>>>
>>> Hi
>>>
>>> I am writing a paper that references PETSc and the preconditioners  
>>> and
>>> linear solvers that it uses. I would like to include references for
>>> these. I have searched and found references for quite a few but am
>>> struggling to find references for the following solver methods:
>>>
>>> BICG
>>
>>
>>>
>>> CGNE
>>
>>   This is just CG applied to the normal equations; it is not an idea
>> worthing of a
>> publication.
>>
>>>
>>> CHEBYCHEV
>>
>>
>>
>>>
>>> CR (Conjugate Residuals)
>>
>>    Methods of Conjugate Gradients for Solving Linear Systems, Magnus
>> R. Hestenes and Eduard Stiefel,
>>    Journal of Research of the National Bureau of Standards Vol. 49,
>> No. 6, December 1952 Research Paper 2379
>>    pp. 409--436.
>>
>>>
>>> QCG
>>
>>    The Conjugate Gradient Method and Trust Regions in Large Scale
>> Optimization, Trond Steihaug
>>    SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983),
>> pp. 626-637
>>
>>>
>>> RICHARDSON
>>
>>
>>>
>>> TCQMR
>>
>>   Transpose-free formulations of Lanczos-type methods for
>> nonsymmetric linear systems,
>>   Tony F. Chan, Lisette de Pillis, and Henk van der Vorst, Numerical
>> Algorithms,
>>   Volume 17, Numbers 1-2 / May, 1998 pp. 51-66.
>>>
>>>
>>> Could you send me suitable references for these methods?
>>>
>>> I'm not sure if they exist, but could you also send me suitable
>>> references for the following preconditioners:
>>>
>>> ASM
>>     An additive variant of the Schwarz alternating method for the
>> case of many subregions
>>     M Dryja, OB Widlund - Courant Institute, New York University
>> Technical report
>>
>>     Domain Decompositions: Parallel Multilevel Methods for Elliptic
>> Partial Differential Equations,
>>     Barry Smith, Petter Bjorstad, and William Gropp, Cambridge
>> University Press, ISBN 0-521-49589-X.
>>
>>>
>>> BJACOBI
>>
>>    Any iterative solver book, this is just Jacobi's method
>>>
>>> ILU
>>> ICC
>>>
>>
>>   Both ICC and ILU the review article
>>
>> APPROXIMATE AND INCOMPLETE FACTORIZATIONS, TONY F. CHAN AND HENK A.
>> VAN DER VORST
>>
>> http://igitur-archive.library.uu.nl/math/2001-0621-115821/proc.pdf
>>  chapter in Parallel Numerical
>>       Algorithms, edited by D. Keyes, A. Semah, V. Venkatakrishnan,
>> ICASE/LaRC Interdisciplinary Series in
>>       Science and Engineering, Kluwer, pp. 167--202.
>>
>> It is difficult to determine the publications where the FIRST use of
>> ILU/ICC appeared since the did not
>> call them that originally.
>>
>> If anyone has references to the original Chebychev and Bi-CG
>> algorithms please let us know.
>>
>>    Barry
>>
>>> Much appreciated
>>>
>>> Stephen
>>> --
>>>
>>
________________________________________________________________________
>> _______
>>>
>>> The information in this email and in any attachment(s) is commercial
>>> in confidence. If you are not the named addressee(s) or if you
>>> receive this email in error then any distribution, copying or use of
>>> this communication or the information in it is strictly prohibited.
>>> Please notify us immediately by email at
>>> admin.internet(at)awe.co.uk, and then delete this message from your
>>> computer.  While attachments are virus checked, AWE plc does not
>>> accept any liability in respect of any virus which is not detected.
>>>
>>> AWE Plc
>>> Registered in England and Wales
>>> Registration No 02763902
>>> AWE, Aldermaston, Reading, RG7 4PR
>>>
>> --
>>
________________________________________________________________________
_______
>>
>> The information in this email and in any attachment(s) is  
>> commercial in confidence. If you are not the named addressee(s) or  
>> if you receive this email in error then any distribution, copying  
>> or use of this communication or the information in it is strictly  
>> prohibited.  Please notify us immediately by email at  
>> admin.internet(at)awe.co.uk, and then delete this message from your  
>> computer.  While attachments are virus checked, AWE plc does not  
>> accept any liability in respect of any virus which is not detected.
>>
>> AWE Plc
>> Registered in England and Wales
>> Registration No 02763902
>> AWE, Aldermaston, Reading, RG7 4PR
>>
>>
>
>
>
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
--
_______________________________________________________________________________

The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited.  Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer.  While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected.

AWE Plc
Registered in England and Wales
Registration No 02763902
AWE, Aldermaston, Reading, RG7 4PR


From knutert at stud.ntnu.no  Fri Feb 15 07:00:09 2008
From: knutert at stud.ntnu.no (knutert at stud.ntnu.no)
Date: Fri, 15 Feb 2008 14:00:09 +0100
Subject: Poor performance with BoomerAMG?
In-Reply-To: <82FAjs014703@awe.co.uk>
References: <82FAjs014703@awe.co.uk>
Message-ID: <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no>

Hello,

I am trying to use the hypre multigrid solver to solve a Poisson equation.
However, on a test case with grid size 257x257 it takes 40 seconds to converge
on one processor when I run with
./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg

Using the DMMG framework, the same problem takes less than a second,
and the default gmres solver uses only four seconds.

Am I somehow using the solver the wrong way, or is this performance expected?

Regards
Knut Erik Teigen


From zonexo at gmail.com  Fri Feb 15 07:47:49 2008
From: zonexo at gmail.com (Ben Tay)
Date: Fri, 15 Feb 2008 21:47:49 +0800
Subject: Poor performance with BoomerAMG?
In-Reply-To: <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no>
References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no>
Message-ID: <47B59805.1070306@gmail.com>

Hi Knut,

I'm currently using boomeramg to solve my poisson eqn too. I'm using it 
on my structured C-grid. I found it to be faster than LU, especially as 
the grid size increases. However I use it as a preconditioner with GMRES 
as the solver. Have you tried this option? Although it's faster, the 
speed increase is usually less than double. It seems to be worse if 
there is a lot of stretching in the grid.

Btw, your mention using the DMMG framework and it takes less than a sec. 
What solver or preconditioner did you use? It's 4 times faster than GMRES...

thanks!

knutert at stud.ntnu.no wrote:
> Hello,
>
> I am trying to use the hypre multigrid solver to solve a Poisson 
> equation.
> However, on a test case with grid size 257x257 it takes 40 seconds to 
> converge
> on one processor when I run with
> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>
> Using the DMMG framework, the same problem takes less than a second,
> and the default gmres solver uses only four seconds.
>
> Am I somehow using the solver the wrong way, or is this performance 
> expected?
>
> Regards
> Knut Erik Teigen
>
>


From knepley at gmail.com  Fri Feb 15 08:35:46 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 15 Feb 2008 08:35:46 -0600
Subject: References for preconditioners and solver methods.
In-Reply-To: <82FAjs014703@awe.co.uk>
References: <82FAjs014703@awe.co.uk>
Message-ID: <a9f269830802150635p209fe643ud083e6611b98a743@mail.gmail.com>

On Fri, Feb 15, 2008 at 4:44 AM, Stephen R Ball
<Stephen.R.Ball at awe.co.uk> wrote:
>
>  Hi
>
>  Can you tell me where I can get hold of the developer documentation?

http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-dev/docs/index.html

   Matt

>  Regards
>
>  Stephen
>
>
>
>
>
>  -----Original Message-----
>  From: owner-petsc-users at mcs.anl.gov
>  [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley
>  Sent: 14 February 2008 19:16
>  To: petsc-users at mcs.anl.gov
>  Subject: EXTERNAL: Re: References for preconditioners and solver
>  methods.
>
>  On Thu, Feb 14, 2008 at 10:30 AM, Stephen R Ball
>  <Stephen.R.Ball at awe.co.uk> wrote:
>  >  If anyone can provide references for the Bi-CG, Chebychev, CR
>  (Conjugate
>  >  Residuals), QCG (Quadratic CG) and Richardson solvers that would be
>  very
>  >  much appreciated.
>  >  > QCG
>  >
>  >     The Conjugate Gradient Method and Trust Regions in Large Scale
>  >  Optimization, Trond Steihaug
>  >     SIAM Journal on Numerical Analysis, Vol. 20, No. 3 (Jun., 1983),
>  >  pp. 626-637
>
>  and I put Richardson in the source yesterday, so it should be up on
>  the developer
>  documentation online today.
>
>    Matt
>
>  --
>  What most experimenters take for granted before they begin their
>  experiments is infinitely more interesting than any results to which
>  their experiments lead.
>  -- Norbert Wiener
>  --
>  _______________________________________________________________________________
>
>  The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited.  Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer.  While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected.
>
>  AWE Plc
>  Registered in England and Wales
>  Registration No 02763902
>  AWE, Aldermaston, Reading, RG7 4PR
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From knutert at stud.ntnu.no  Fri Feb 15 08:36:35 2008
From: knutert at stud.ntnu.no (knutert at stud.ntnu.no)
Date: Fri, 15 Feb 2008 15:36:35 +0100
Subject: Poor performance with BoomerAMG?
In-Reply-To: <47B59805.1070306@gmail.com>
References: <82FAjs014703@awe.co.uk>
	<20080215140009.lbc7d9t52scok08g@webmail.ntnu.no>
	<47B59805.1070306@gmail.com>
Message-ID: <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no>

Hi Ben,

Thank you for answering. With gmres and boomeramg I get a run time of
2s, so that is much better. However, if I increase the grid size to
513x513, I get a run time of one minute. With richardson, it fails to  
converge.
LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for  
the 513x513 problem.

When using the DMMG framework, I just used the default solvers.
I use the Galerkin process to generate the coarse matrices for
the multigrid cycle.

Best,
Knut

Siterer Ben Tay <zonexo at gmail.com>:

> Hi Knut,
>
> I'm currently using boomeramg to solve my poisson eqn too. I'm using it
> on my structured C-grid. I found it to be faster than LU, especially as
> the grid size increases. However I use it as a preconditioner with
> GMRES as the solver. Have you tried this option? Although it's faster,
> the speed increase is usually less than double. It seems to be worse if
> there is a lot of stretching in the grid.
>
> Btw, your mention using the DMMG framework and it takes less than a
> sec. What solver or preconditioner did you use? It's 4 times faster
> than GMRES...
>
> thanks!
>
> knutert at stud.ntnu.no wrote:
>> Hello,
>>
>> I am trying to use the hypre multigrid solver to solve a Poisson equation.
>> However, on a test case with grid size 257x257 it takes 40 seconds   
>> to converge
>> on one processor when I run with
>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>
>> Using the DMMG framework, the same problem takes less than a second,
>> and the default gmres solver uses only four seconds.
>>
>> Am I somehow using the solver the wrong way, or is this performance  
>>  expected?
>>
>> Regards
>> Knut Erik Teigen
>>
>>


From bsmith at mcs.anl.gov  Fri Feb 15 11:13:28 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 15 Feb 2008 11:13:28 -0600
Subject: Poor performance with BoomerAMG?
In-Reply-To: <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no>
References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no>
Message-ID: <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov>


   Run with the DMMG solver with the option -pc_type hypre
What happens? Then run again with the additional option -ksp_type  
richardson

Is hypre taking many, many iterations which is causing the slow speed?

I expect there is something wrong with your code that does not use DMMG.
Be careful how you handle boundary conditions; you need to make sure
they have the same scaling as the other equations.

    Barry


On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:

> Hi Ben,
>
> Thank you for answering. With gmres and boomeramg I get a run time of
> 2s, so that is much better. However, if I increase the grid size to
> 513x513, I get a run time of one minute. With richardson, it fails  
> to converge.
> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for  
> the 513x513 problem.
>
> When using the DMMG framework, I just used the default solvers.
> I use the Galerkin process to generate the coarse matrices for
> the multigrid cycle.
>
> Best,
> Knut
>
> Siterer Ben Tay <zonexo at gmail.com>:
>
>> Hi Knut,
>>
>> I'm currently using boomeramg to solve my poisson eqn too. I'm  
>> using it
>> on my structured C-grid. I found it to be faster than LU,  
>> especially as
>> the grid size increases. However I use it as a preconditioner with
>> GMRES as the solver. Have you tried this option? Although it's  
>> faster,
>> the speed increase is usually less than double. It seems to be  
>> worse if
>> there is a lot of stretching in the grid.
>>
>> Btw, your mention using the DMMG framework and it takes less than a
>> sec. What solver or preconditioner did you use? It's 4 times faster
>> than GMRES...
>>
>> thanks!
>>
>> knutert at stud.ntnu.no wrote:
>>> Hello,
>>>
>>> I am trying to use the hypre multigrid solver to solve a Poisson  
>>> equation.
>>> However, on a test case with grid size 257x257 it takes 40  
>>> seconds  to converge
>>> on one processor when I run with
>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>>
>>> Using the DMMG framework, the same problem takes less than a second,
>>> and the default gmres solver uses only four seconds.
>>>
>>> Am I somehow using the solver the wrong way, or is this  
>>> performance  expected?
>>>
>>> Regards
>>> Knut Erik Teigen
>>>
>>>
>
>
>


From a.albiniana.crespo at gmail.com  Fri Feb 15 14:23:36 2008
From: a.albiniana.crespo at gmail.com (=?ISO-8859-1?Q?antonio_albi=F1ana_crespo?=)
Date: Fri, 15 Feb 2008 21:23:36 +0100
Subject: Installing Petsc
Message-ID: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com>

Hi!

I'm new in this world, and I'm trying to install Petsc in order to develop
my final studies project. I have received a few error messages and I have
tried to study the configure.log file:

              Popping language C
*********************************************************************************
         UNABLE to CONFIGURE with GIVEN OPTIONS    (see configure.log for
details):
---------------------------------------------------------------------------------------
Downloaded mpi could not be used. Please check install in
/new/proyecto/PETSc/petsc-2.3.3-p8/externalpackages/mpich2-1.0.5p4
/linux-gnu-c-debug
*********************************************************************************
  File "./config/configure.py", line 190, in petsc_configure
    framework.configure(out = sys.stdout)
  File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/framework.py",
line 878, in configure
    child.configure()
  File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py",
line 380, in configure
    self.executeTest(self.configureLibrary)
  File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/base.py",
line 93, in executeTest
    return apply(test, args,kargs)
  File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/packages/MPI.py",
line 548, in configureLibrary
    config.package.Package.configureLibrary(self)
  File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py",
line 331, in configureLibrary
    for location, directory, lib, incl in self.generateGuesses():
  File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py",
line 160, in generateGuesses
    raise RuntimeError('Downloaded '+self.package+' could not be used.
Please check install in '+d+'\n')

I have seen many messages with:

Possible ERROR while running linker: gcc: opci?n '-PIC' no reconocida
gcc: opci?n '-PIC' no reconocida

What am I doing wrongly?

I thank you beforehand for your attention.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080215/da82aab6/attachment.htm>

From balay at mcs.anl.gov  Fri Feb 15 16:14:00 2008
From: balay at mcs.anl.gov (Satish Balay)
Date: Fri, 15 Feb 2008 16:14:00 -0600 (CST)
Subject: Installing Petsc
In-Reply-To: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com>
References: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com>
Message-ID: <alpine.LFD.1.00.0802151611420.5839@asterix>


Can you send such installation issues to petsc-maint at mcs.anl.gov with
the complete configure.log file as attachment?

>>
> Possible ERROR while running linker: gcc: opci?n '-PIC' no reconocida
> gcc: opci?n '-PIC' no reconocida
<<

Perhaps configure misbehaves with non-english output from some of the
tools it invokes..

Satish


On Fri, 15 Feb 2008, antonio albi?ana crespo wrote:

> Hi!
> 
> I'm new in this world, and I'm trying to install Petsc in order to develop
> my final studies project. I have received a few error messages and I have
> tried to study the configure.log file:
> 
>               Popping language C
> *********************************************************************************
>          UNABLE to CONFIGURE with GIVEN OPTIONS    (see configure.log for
> details):
> ---------------------------------------------------------------------------------------
> Downloaded mpi could not be used. Please check install in
> /new/proyecto/PETSc/petsc-2.3.3-p8/externalpackages/mpich2-1.0.5p4
> /linux-gnu-c-debug
> *********************************************************************************
>   File "./config/configure.py", line 190, in petsc_configure
>     framework.configure(out = sys.stdout)
>   File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/framework.py",
> line 878, in configure
>     child.configure()
>   File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py",
> line 380, in configure
>     self.executeTest(self.configureLibrary)
>   File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/base.py",
> line 93, in executeTest
>     return apply(test, args,kargs)
>   File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/packages/MPI.py",
> line 548, in configureLibrary
>     config.package.Package.configureLibrary(self)
>   File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py",
> line 331, in configureLibrary
>     for location, directory, lib, incl in self.generateGuesses():
>   File "/home/ubuntu/Escritorio/PETSc/petsc-2.3.3-p8/python/BuildSystem/config/package.py",
> line 160, in generateGuesses
>     raise RuntimeError('Downloaded '+self.package+' could not be used.
> Please check install in '+d+'\n')
> 
> I have seen many messages with:
> 
> Possible ERROR while running linker: gcc: opci?n '-PIC' no reconocida
> gcc: opci?n '-PIC' no reconocida
> 
> What am I doing wrongly?
> 
> I thank you beforehand for your attention.
> 

From tyoung at ippt.gov.pl  Fri Feb 15 17:00:58 2008
From: tyoung at ippt.gov.pl (Toby D. Young)
Date: Sat, 16 Feb 2008 00:00:58 +0100 (CET)
Subject: Installing Petsc
In-Reply-To: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com>
References: <268924010802151223k2fd93fabp1afce2359cdb5b18@mail.gmail.com>
Message-ID: <Pine.LNX.4.58.0802152357070.19910@brama.ippt.gov.pl>


> Possible ERROR while running linker: gcc: opci?n '-PIC' no reconocida
> gcc: opci?n '-PIC' no reconocida

Try compiling with option `-fPIC' (not `-PIC').

Best,
	Toby

-----

Toby D. Young - Adiunkt (Assistant Professor)
Department of Computational Science
Institute of Fundamental Technological Research
Polish Academy of Sciences
Room 206, ul. Swietokrzyska 21
00-049 Warszawa, POLAND


From rlmackie862 at gmail.com  Fri Feb 15 17:23:40 2008
From: rlmackie862 at gmail.com (Randall Mackie)
Date: Fri, 15 Feb 2008 15:23:40 -0800
Subject: Question on the ordering for a 3D Distributed Array Vector with 3
 degrees of freedom
Message-ID: <47B61EFC.7000506@gmail.com>

I am using a 3D distributed array with 3 degrees of freedom, where each
degree of freedom refers to, for example, the model value in the x, y, and
z directions (the model properties are diagonally anisotropic).

If I scatter the DA vector to a natural vector on the zero processor,
and then use VecGetArray to access it:

	Call VecGetArray(vseq,xx_v,xx_i,ierr)

	do i=1,3*mx*my*mz
	  v=xx_a(i)
	end do

	call VecRestoreArray(vseq,xx_v,xx_i,ierr)


Is the natural ordering with the vector v then v(mx,my,mz,dof)?

So then if I want to get the model values in the x direction and write
them to a file, then it would be the first mx*my*mz values, and so forth?


Randy


From Andrew.Barker at Colorado.EDU  Fri Feb 15 17:36:00 2008
From: Andrew.Barker at Colorado.EDU (Andrew T Barker)
Date: Fri, 15 Feb 2008 16:36:00 -0700 (MST)
Subject: Poor performance with BoomerAMG?
Message-ID: <20080215163600.ABA57782@batman.int.colorado.edu>


>Be careful how you handle boundary conditions; you need to make sure
>they have the same scaling as the other equations.

Could you clarify what you mean?  Is boomerAMG sensitive to scaling of matrix rows in a way that other solvers/preconditioners are not?

Andrew

>
>On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>
>> Hi Ben,
>>
>> Thank you for answering. With gmres and boomeramg I get a run time of
>> 2s, so that is much better. However, if I increase the grid size to
>> 513x513, I get a run time of one minute. With richardson, it fails  
>> to converge.
>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for  
>> the 513x513 problem.
>>
>> When using the DMMG framework, I just used the default solvers.
>> I use the Galerkin process to generate the coarse matrices for
>> the multigrid cycle.
>>
>> Best,
>> Knut
>>
>> Siterer Ben Tay <zonexo at gmail.com>:
>>
>>> Hi Knut,
>>>
>>> I'm currently using boomeramg to solve my poisson eqn too. I'm  
>>> using it
>>> on my structured C-grid. I found it to be faster than LU,  
>>> especially as
>>> the grid size increases. However I use it as a preconditioner with
>>> GMRES as the solver. Have you tried this option? Although it's  
>>> faster,
>>> the speed increase is usually less than double. It seems to be  
>>> worse if
>>> there is a lot of stretching in the grid.
>>>
>>> Btw, your mention using the DMMG framework and it takes less than a
>>> sec. What solver or preconditioner did you use? It's 4 times faster
>>> than GMRES...
>>>
>>> thanks!
>>>
>>> knutert at stud.ntnu.no wrote:
>>>> Hello,
>>>>
>>>> I am trying to use the hypre multigrid solver to solve a Poisson  
>>>> equation.
>>>> However, on a test case with grid size 257x257 it takes 40  
>>>> seconds  to converge
>>>> on one processor when I run with
>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>>>
>>>> Using the DMMG framework, the same problem takes less than a second,
>>>> and the default gmres solver uses only four seconds.
>>>>
>>>> Am I somehow using the solver the wrong way, or is this  
>>>> performance  expected?
>>>>
>>>> Regards
>>>> Knut Erik Teigen
>>>>
>>>>
>>
>>
>>
>


From bsmith at mcs.anl.gov  Sat Feb 16 11:39:13 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 16 Feb 2008 11:39:13 -0600
Subject: Question on the ordering for a 3D Distributed Array Vector with 3 degrees of freedom
In-Reply-To: <47B61EFC.7000506@gmail.com>
References: <47B61EFC.7000506@gmail.com>
Message-ID: <DE03F847-466D-44BC-92D9-4ED7C8D3C0A4@mcs.anl.gov>


    In Fortran array indexing it is v(dof,mx,my,mz)

    Barry

On Feb 15, 2008, at 5:23 PM, Randall Mackie wrote:

> I am using a 3D distributed array with 3 degrees of freedom, where  
> each
> degree of freedom refers to, for example, the model value in the x,  
> y, and
> z directions (the model properties are diagonally anisotropic).
>
> If I scatter the DA vector to a natural vector on the zero processor,
> and then use VecGetArray to access it:
>
> 	Call VecGetArray(vseq,xx_v,xx_i,ierr)
>
> 	do i=1,3*mx*my*mz
> 	  v=xx_a(i)
> 	end do
>
> 	call VecRestoreArray(vseq,xx_v,xx_i,ierr)
>
>
> Is the natural ordering with the vector v then v(mx,my,mz,dof)?
>
> So then if I want to get the model values in the x direction and write
> them to a file, then it would be the first mx*my*mz values, and so  
> forth?
>
>
> Randy
>


From bsmith at mcs.anl.gov  Sat Feb 16 11:49:04 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 16 Feb 2008 11:49:04 -0600
Subject: Poor performance with BoomerAMG?
In-Reply-To: <20080215163600.ABA57782@batman.int.colorado.edu>
References: <20080215163600.ABA57782@batman.int.colorado.edu>
Message-ID: <30BAACA0-5EAE-4A7C-8BD7-2E34071CAE57@mcs.anl.gov>


    All multigrid solvers depend on proper scaling of the variables.  
For example
for a  Laplacian operator the matrix entries are

         \integral \grad \phi_i dot \grad \phi_j

now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms
in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the  
volume is O(h^3)
meaning the matrix entries are O(h).  Now say you impose a Dirichlet  
boundary
conditions by just saying u_k    =  g_k. In 2d this is ok but in 3d  
you need to
use h*u_k = h*g_k otherwise when you restrict to the coarser grid the
resulting matrix entries for the boundary are "out of whack" with the  
matrix
entries for the interior of the domain.

Actually most preconditioners and Krylov methods behavior does depend
on the row scaling; multigrid is just particularly sensitive.

    Barry


On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote:

>
>
>> Be careful how you handle boundary conditions; you need to make sure
>> they have the same scaling as the other equations.
>
> Could you clarify what you mean?  Is boomerAMG sensitive to scaling  
> of matrix rows in a way that other solvers/preconditioners are not?
>
> Andrew
>
>>
>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>>
>>> Hi Ben,
>>>
>>> Thank you for answering. With gmres and boomeramg I get a run time  
>>> of
>>> 2s, so that is much better. However, if I increase the grid size to
>>> 513x513, I get a run time of one minute. With richardson, it fails
>>> to converge.
>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for
>>> the 513x513 problem.
>>>
>>> When using the DMMG framework, I just used the default solvers.
>>> I use the Galerkin process to generate the coarse matrices for
>>> the multigrid cycle.
>>>
>>> Best,
>>> Knut
>>>
>>> Siterer Ben Tay <zonexo at gmail.com>:
>>>
>>>> Hi Knut,
>>>>
>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm
>>>> using it
>>>> on my structured C-grid. I found it to be faster than LU,
>>>> especially as
>>>> the grid size increases. However I use it as a preconditioner with
>>>> GMRES as the solver. Have you tried this option? Although it's
>>>> faster,
>>>> the speed increase is usually less than double. It seems to be
>>>> worse if
>>>> there is a lot of stretching in the grid.
>>>>
>>>> Btw, your mention using the DMMG framework and it takes less than a
>>>> sec. What solver or preconditioner did you use? It's 4 times faster
>>>> than GMRES...
>>>>
>>>> thanks!
>>>>
>>>> knutert at stud.ntnu.no wrote:
>>>>> Hello,
>>>>>
>>>>> I am trying to use the hypre multigrid solver to solve a Poisson
>>>>> equation.
>>>>> However, on a test case with grid size 257x257 it takes 40
>>>>> seconds  to converge
>>>>> on one processor when I run with
>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>>>>
>>>>> Using the DMMG framework, the same problem takes less than a  
>>>>> second,
>>>>> and the default gmres solver uses only four seconds.
>>>>>
>>>>> Am I somehow using the solver the wrong way, or is this
>>>>> performance  expected?
>>>>>
>>>>> Regards
>>>>> Knut Erik Teigen
>>>>>
>>>>>
>>>
>>>
>>>
>>
>


From rlmackie862 at gmail.com  Sat Feb 16 13:28:21 2008
From: rlmackie862 at gmail.com (Randall Mackie)
Date: Sat, 16 Feb 2008 11:28:21 -0800
Subject: Question on DA's and VecScatters
Message-ID: <47B73955.4070507@gmail.com>

In my 3D distributed array, I created global vectors, local vectors,
and one natural vector (because I want to access and output to a file
some of these values).

If you create a scatter context, using VecScatterCreateToZero, does
it matter whether or not I specify a global vector or the natural vector
to create the context?

In other words, does it matter in this vecscattercreate call whether or not
I use vnat (natural vector) or vsol (global vector)

	call VecScatterCreateToZero(vnat,vToZero,vseq,ierr)


if later I make these calls:


	call DaGlobalToNaturalBegin(da,vsol,INSERT_VALUES,vnat,ierr)
	call DaGlobalToNaturalEnd(da,vsol,INSERT_VALUES,vnat,ierr)

	call VecScatterBegin(vToZero,vnat,vseq....)
	call VecScatterEnd(vToZero,vnat,vseq....)


	call VecGetArray(vseq....)

Or does it keep track because it knows what type of vectors are being dealt with?


Thanks, Randy


From bsmith at mcs.anl.gov  Sat Feb 16 16:10:52 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 16 Feb 2008 16:10:52 -0600
Subject: Question on DA's and VecScatters
In-Reply-To: <47B73955.4070507@gmail.com>
References: <47B73955.4070507@gmail.com>
Message-ID: <FF561277-B7A0-438A-BD98-123447655DF7@mcs.anl.gov>


   It most definitely matters. The VecScatterCreateToZero() simply  
concatenates
the values from all the processes together so they need to be in the  
natural
order before collecting on zero.

   Barry

BTW:  The VecView() to a binary file for DA global vectors does the  
mapping
automatically, so the file is always in the natural ordering.


On Feb 16, 2008, at 1:28 PM, Randall Mackie wrote:

> In my 3D distributed array, I created global vectors, local vectors,
> and one natural vector (because I want to access and output to a file
> some of these values).
>
> If you create a scatter context, using VecScatterCreateToZero, does
> it matter whether or not I specify a global vector or the natural  
> vector
> to create the context?
>
> In other words, does it matter in this vecscattercreate call whether  
> or not
> I use vnat (natural vector) or vsol (global vector)
>
> 	call VecScatterCreateToZero(vnat,vToZero,vseq,ierr)
>
>
> if later I make these calls:
>
>
> 	call DaGlobalToNaturalBegin(da,vsol,INSERT_VALUES,vnat,ierr)
> 	call DaGlobalToNaturalEnd(da,vsol,INSERT_VALUES,vnat,ierr)
>
> 	call VecScatterBegin(vToZero,vnat,vseq....)
> 	call VecScatterEnd(vToZero,vnat,vseq....)
>
>
> 	call VecGetArray(vseq....)
>
> Or does it keep track because it knows what type of vectors are  
> being dealt with?
>
>
> Thanks, Randy
>
>


From knutert at stud.ntnu.no  Mon Feb 18 01:57:34 2008
From: knutert at stud.ntnu.no (knutert at stud.ntnu.no)
Date: Mon, 18 Feb 2008 08:57:34 +0100
Subject: Poor performance with BoomerAMG?
In-Reply-To: <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov>
References: <82FAjs014703@awe.co.uk>
	<20080215140009.lbc7d9t52scok08g@webmail.ntnu.no>
	<47B59805.1070306@gmail.com>
	<20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no>
	<73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov>
Message-ID: <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no>

Thank you for the reply, Barry.

The same thing happens if I use hypre with the DMMG solver.
As you say, with hypre, the convergence is extremely slow, requiring
a lot of iterations, 1413 iterations (1820 if I use richardson) for a 257x257
problem, while the default only needs 5.

I use the same way of handling boundary conditions in the two codes.
I've also compared the coeff matrix and rhs, and they are equal.

-Knut Erik-

Siterer Barry Smith <bsmith at mcs.anl.gov>:

>
>   Run with the DMMG solver with the option -pc_type hypre
> What happens? Then run again with the additional option -ksp_type richardson
>
> Is hypre taking many, many iterations which is causing the slow speed?
>
> I expect there is something wrong with your code that does not use DMMG.
> Be careful how you handle boundary conditions; you need to make sure
> they have the same scaling as the other equations.
>
>    Barry
>
>
>
> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>
>> Hi Ben,
>>
>> Thank you for answering. With gmres and boomeramg I get a run time of
>> 2s, so that is much better. However, if I increase the grid size to
>> 513x513, I get a run time of one minute. With richardson, it fails   
>> to converge.
>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for  
>>  the 513x513 problem.
>>
>> When using the DMMG framework, I just used the default solvers.
>> I use the Galerkin process to generate the coarse matrices for
>> the multigrid cycle.
>>
>> Best,
>> Knut
>>
>> Siterer Ben Tay <zonexo at gmail.com>:
>>
>>> Hi Knut,
>>>
>>> I'm currently using boomeramg to solve my poisson eqn too. I'm using it
>>> on my structured C-grid. I found it to be faster than LU, especially as
>>> the grid size increases. However I use it as a preconditioner with
>>> GMRES as the solver. Have you tried this option? Although it's faster,
>>> the speed increase is usually less than double. It seems to be worse if
>>> there is a lot of stretching in the grid.
>>>
>>> Btw, your mention using the DMMG framework and it takes less than a
>>> sec. What solver or preconditioner did you use? It's 4 times faster
>>> than GMRES...
>>>
>>> thanks!
>>>
>>> knutert at stud.ntnu.no wrote:
>>>> Hello,
>>>>
>>>> I am trying to use the hypre multigrid solver to solve a Poisson equation.
>>>> However, on a test case with grid size 257x257 it takes 40   
>>>> seconds  to converge
>>>> on one processor when I run with
>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>>>
>>>> Using the DMMG framework, the same problem takes less than a second,
>>>> and the default gmres solver uses only four seconds.
>>>>
>>>> Am I somehow using the solver the wrong way, or is this   
>>>> performance  expected?
>>>>
>>>> Regards
>>>> Knut Erik Teigen
>>>>
>>>>
>>
>>
>>


From bsmith at mcs.anl.gov  Mon Feb 18 12:10:04 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 18 Feb 2008 12:10:04 -0600
Subject: Poor performance with BoomerAMG?
In-Reply-To: <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no>
References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov> <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no>
Message-ID: <0D9E98F1-82D0-4453-8581-DF454A2C10FC@mcs.anl.gov>


    Please send the code to petsc-maint at mcs.anl.gov

    Something is not right.

    Barry

On Feb 18, 2008, at 1:57 AM, knutert at stud.ntnu.no wrote:

> Thank you for the reply, Barry.
>
> The same thing happens if I use hypre with the DMMG solver.
> As you say, with hypre, the convergence is extremely slow, requiring
> a lot of iterations, 1413 iterations (1820 if I use richardson) for  
> a 257x257
> problem, while the default only needs 5.
>
> I use the same way of handling boundary conditions in the two codes.
> I've also compared the coeff matrix and rhs, and they are equal.
>
> -Knut Erik-
>
> Siterer Barry Smith <bsmith at mcs.anl.gov>:
>
>>
>>  Run with the DMMG solver with the option -pc_type hypre
>> What happens? Then run again with the additional option -ksp_type  
>> richardson
>>
>> Is hypre taking many, many iterations which is causing the slow  
>> speed?
>>
>> I expect there is something wrong with your code that does not use  
>> DMMG.
>> Be careful how you handle boundary conditions; you need to make sure
>> they have the same scaling as the other equations.
>>
>>   Barry
>>
>>
>>
>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>>
>>> Hi Ben,
>>>
>>> Thank you for answering. With gmres and boomeramg I get a run time  
>>> of
>>> 2s, so that is much better. However, if I increase the grid size to
>>> 513x513, I get a run time of one minute. With richardson, it  
>>> fails  to converge.
>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s  
>>> for  the 513x513 problem.
>>>
>>> When using the DMMG framework, I just used the default solvers.
>>> I use the Galerkin process to generate the coarse matrices for
>>> the multigrid cycle.
>>>
>>> Best,
>>> Knut
>>>
>>> Siterer Ben Tay <zonexo at gmail.com>:
>>>
>>>> Hi Knut,
>>>>
>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm  
>>>> using it
>>>> on my structured C-grid. I found it to be faster than LU,  
>>>> especially as
>>>> the grid size increases. However I use it as a preconditioner with
>>>> GMRES as the solver. Have you tried this option? Although it's  
>>>> faster,
>>>> the speed increase is usually less than double. It seems to be  
>>>> worse if
>>>> there is a lot of stretching in the grid.
>>>>
>>>> Btw, your mention using the DMMG framework and it takes less than a
>>>> sec. What solver or preconditioner did you use? It's 4 times faster
>>>> than GMRES...
>>>>
>>>> thanks!
>>>>
>>>> knutert at stud.ntnu.no wrote:
>>>>> Hello,
>>>>>
>>>>> I am trying to use the hypre multigrid solver to solve a Poisson  
>>>>> equation.
>>>>> However, on a test case with grid size 257x257 it takes 40   
>>>>> seconds  to converge
>>>>> on one processor when I run with
>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>>>>
>>>>> Using the DMMG framework, the same problem takes less than a  
>>>>> second,
>>>>> and the default gmres solver uses only four seconds.
>>>>>
>>>>> Am I somehow using the solver the wrong way, or is this   
>>>>> performance  expected?
>>>>>
>>>>> Regards
>>>>> Knut Erik Teigen
>>>>>
>>>>>
>>>
>>>
>>>
>
>
>


From keita at cray.com  Mon Feb 18 12:27:10 2008
From: keita at cray.com (Keita Teranishi)
Date: Mon, 18 Feb 2008 12:27:10 -0600
Subject: Support for Howell-Rutherford or Howell-Boeing Sparse matrix format
Message-ID: <925346A443D4E340BEB20248BAFCDBDF04245686@CFEVS1-IP.americas.cray.com>

Hi,

 
I am wondering if there is any PETSc routine that loads Howell-Rutherford or Howell-Boeing format, and converts it to AIJ format automatically.  Since there is a huge collection of sparse matrices at University of Florida, such routine is very useful for benchmarking KSP and PC. 

 
Thanks,

================================
 Keita Teranishi
 Math Software Group
 Cray, Inc.
 keita at cray.com
================================

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080218/52ab2573/attachment.htm>

From bsmith at mcs.anl.gov  Mon Feb 18 16:00:39 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 18 Feb 2008 16:00:39 -0600
Subject: Support for Howell-Rutherford or Howell-Boeing Sparse matrix format
In-Reply-To: <925346A443D4E340BEB20248BAFCDBDF04245686@CFEVS1-IP.americas.cray.com>
References: <925346A443D4E340BEB20248BAFCDBDF04245686@CFEVS1-IP.americas.cray.com>
Message-ID: <39BFB7E1-6579-4FA0-9C91-FD385AD182CC@mcs.anl.gov>


    Keita,

     We have not provided supported library routines for this since  
we've found that actual ASCII file formats often
require slightly different readers. And thus we could not provide  
robust readers. Instead we have some example
routines in src/mat/examples/tutorials/, specifically ex78.c that the  
user may modify for their exact needs.

   I've added this information to our FAQ.


    Barry

On Feb 18, 2008, at 12:27 PM, Keita Teranishi wrote:

> Hi,
>
> I am wondering if there is any PETSc routine that loads Howell- 
> Rutherford or Howell-Boeing format, and converts it to AIJ format  
> automatically.  Since there is a huge collection of sparse matrices  
> at University of Florida, such routine is very useful for  
> benchmarking KSP and PC.
>
> Thanks,
> ================================
>  Keita Teranishi
>  Math Software Group
>  Cray, Inc.
>  keita at cray.com
> ================================
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080218/14624f84/attachment.htm>

From bernhard.kubicek at arsenal.ac.at  Tue Feb 19 04:12:45 2008
From: bernhard.kubicek at arsenal.ac.at (Bernhard Kubicek)
Date: Tue, 19 Feb 2008 11:12:45 +0100
Subject: Parallel matrix assembly - SetValues Problem?
Message-ID: <1203415965.7200.108.camel@node99>

Dear List,

sorry to bother you  but I just finished reading the whole archive and
couldn't find a solution to a problem of mine that keeps on bothering me
now for 7+ days.

The problem is that my code produces different matrices if run in
parallel or single cpu.

I do a manual partitioning of the mesh by using metis by hand.
Thereafter, there is a list of finite-volume elements that I want to be
stored on the individual cpu and a renumbering that is manged somehow.
I create my matrix with

MatCreateMPIAIJ(PETSC_COMM_WORLD,mycount,mycount,
PETSC_DETERMINE,PETSC_DETERMINE,50,PETSC_NULL,50,PETSC_NULL,&A),

where mycount is different on each cpu, and is the mentioned number of
elements I wish to have there locally.


for each local row/element at the same time let a user calculate the
matrix elements and column positions, and the right hand side values for
this row. I output those for debugging. Within the loop for the local
rows, I call
MatSetValues(A,1,&i,nrEntries,entries,v,INSERT_VALUES) 
VecSetValue(rhs,i,rhsval,INSERT_VALUES)
in this order

When I run on one cpu, everything works nicely. A 3d mesh of a
10-element long bar with each element having volume 1, creates the
following matrix:
 -2.  1.  0.  0.  0.  0.  0.  0.  0.  1. 
 1.  -2.  1.  0.  0.  0.  0.  0.  0.  0. 
 0.  1.  -2.  1.  0.  0.  0.  0.  0.  0. 
 0.  0.  1.  -2.  1.  0.  0.  0.  0.  0. 
 0.  0.  0.  1.  -3.  0.  0.  0.  0.  0. 
 0.  0.  0.  0.  0.  -3.  1.  0.  0.  0. 
 0.  0.  0.  0.  0.  1.  -2.  1.  0.  0. 
 0.  0.  0.  0.  0.  0.  1.  -2.  1.  0. 
 0.  0.  0.  0.  0.  0.  0.  1.  -2.  1. 
 1.  0.  0.  0.  0.  0.  0.  0.  1.  -2. 

rhs
0
0
0
0
0
-2
0
0
0
0
The CPU sets the matrix and rhs like this ( global matrix row:
column/value ...column/value | rhs-value )
Row 0 cols:0/-2	9/1 1/1		|	0
Row 1 cols:1/-2	0/1 2/1		|	0
Row 2 cols:2/-2	1/1 3/1		|	0
Row 3 cols:3/-2	2/1 4/1		|	0
Row 4 cols:4/-3	3/1	|	0
Row 5 cols:5/-3	6/1	|	-2
Row 6 cols:6/-2	5/1 7/1		|	0
Row 7 cols:7/-2	6/1 8/1		|	0
Row 8 cols:8/-2	7/1 9/1		|	0
Row 9 cols:9/-2	0/1 8/1		|	0
because of the meshing the central rows in the matrix are the most
exterior elements, on which wall boundary condition 0 and 1 are set
(laplace equation).

one 2 cpus,
the matrix looses is different, although the global-local element
renumbering is defacto nonexisting (cpu 0: rows 0-4, cpu 1: rows 5-9):
 1.  1.  0.  0.  0.  0.  0.  0.  0.  0. 
 1.  -2.  1.  0.  0.  0.  0.  0.  0.  0. 
 0.  1.  -2.  1.  0.  0.  0.  0.  0.  0. 
 0.  0.  1.  -2.  1.  0.  0.  0.  0.  0. 
 0.  0.  0.  1.  -3.  0.  0.  0.  0.  0. 
 0.  0.  0.  0.  0.  -3.  1.  0.  0.  0. 
 0.  0.  0.  0.  0.  1.  -2.  1.  0.  0. 
 0.  0.  0.  0.  0.  0.  1.  -2.  1.  0. 
 0.  0.  0.  0.  0.  0.  0.  1.  -2.  1. 
 1.  0.  0.  0.  0.  0.  0.  0.  1.  -2. 
Process [0]
0
0
0
0
0
Process [1]
-2
0
0
0
0
here cpu 0 sets:
Row 0 cols:0/-2	9/1 1/1	|0
Row 1 cols:1/-2	0/1 2/1	|0
Row 2 cols:2/-2	1/1 3/1	|0
Row 3 cols:3/-2	2/1 4/1	|0
Row 4 cols:4/-3	3/1 |0

and cpu 1:
Row 5 cols:5/-3	6/1|-2
Row 6 cols:6/-2	5/1 7/1	|0
Row 7 cols:7/-2	6/1 8/1	|0
Row 8 cols:8/-2	7/1 9/1	|0
Row 9 cols:9/-2	0/1 8/1	|0
I triple verified that ***SetValues is called with the exactly same
values as on one cpu, and that nothing is set twice, and that every cpu
sets it's correct columns. Also for more sophisticated renumberings

Attached are the outputs when run with with -info. 

My current guess is that I create the matrix falsely, or that I cannot
mix the setting of Vec and Mat values before their
respective ???AssemblyBegin/Ends.

If anyone has any idea where the problem is, is would be extremely nice
to help me here.

Thank you very much, even for the slightest help
 Bernhard Kubicek

------
Physics Doctorate Student Techn. University of Vienna, Austria
Freelancer arsenal research, Vienna Austria


-------------- next part --------------
[1] PetscInitialize(): PETSc successfully started: number of processors = 2
[1] PetscGetHostName(): Rejecting domainname, likely is NIS node99.(none)
[1] PetscInitialize(): Running on machine: node99
Trying to read Gmsh .msh file "stab.gmsh"
Gmsh2 file format recognised
[0] PetscInitialize(): PETSc successfully started: number of processors = 2
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node99.(none)
[0] PetscInitialize(): Running on machine: node99
Read 44 nodes.
alltogether number of elements including faces, ignored:61
Trying to read Gmsh .msh file "stab.gmsh"
Gmsh2 file format recognised
Read 44 nodes.
alltogether number of elements including faces, ignored:61
[1] PetscCommDuplicate(): Duplicating a communicator 91 141 max tags = 1073741823
[1] PetscCommDuplicate():   returning tag 1073741823
[1] PetscCommDuplicate(): Duplicating a communicator 92 143 max tags = 1073741823
[1] PetscCommDuplicate():   returning tag 1073741823
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741822
[1] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[1] PetscCommDuplicate():   returning tag 1073741820
[1] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741819
[0] PetscCommDuplicate():   returning tag 1073741814
[1] PetscCommDuplicate():   returning tag 1073741809
[1] MatStashScatterBegin_Private(): No of messages: 0 
[1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[1] MatStashScatterBegin_Private(): No of messages: 0 
[1] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 5 X 5; storage space: 237 unneeded,13 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3
[1] Mat_CheckInode(): Found 5 nodes out of 5 rows. Not using Inode routines
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741821
[0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
[1] PetscCommDuplicate():   returning tag 1073741802
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741820
[1] PetscCommDuplicate():   returning tag 1073741801
[1] PetscCommDuplicate():   returning tag 1073741796
[1] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter
[0] VecScatterCreate(): General case: MPI to Seq
[0] MatSetOption_Inode(): Not using Inode routines due to MatSetOption(MAT_DO_NOT_USE_INODES
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5 X 1; storage space: 249 unneeded,1 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 4)/(num_localrows 5) > 0.6. Use CompressedRow routines.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 5 X 5; storage space: 0 unneeded,13 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741819
[0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
[1] PetscCommDuplicate():   returning tag 1073741793
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741818
[1] PetscCommDuplicate():   returning tag 1073741792
[1] PetscCommDuplicate():   returning tag 1073741787
[0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter
[0] VecScatterLocalOptimizeCopy_Private(): Local scatter is a copy, optimizing for it
[0] VecScatterCreate(): General case: MPI to Seq
[1] MatSetOption_Inode(): Not using Inode routines due to MatSetOption(MAT_DO_NOT_USE_INODES
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 5 X 1; storage space: 0 unneeded,1 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] Mat_CheckCompressedRow(): Skip check. m: 5, n: 1,M: 5, N: 1,nrows: 1, ii: 0x89153a0, type: seqaij
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
[0] Mat_CheckCompressedRow(): Skip check. m: 5, n: 1,M: 5, N: 1,nrows: 1, ii: 0x8917588, type: seqaij
[1] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs.
[1] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs.
[1] VecAssemblyBegin_MPI(): Stash has 0 entries, uses 0 mallocs.
[1] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs.
[1] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741784
[1] PetscCommDuplicate():   returning tag 1073741783
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741817
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741816
[1] MatStashScatterBegin_Private(): No of messages: 1 
[0] MatAssemblyBegin_MPIAIJ(): Stash has 0 entries, uses 0 mallocs.
[1] MatAssemblyBegin_MPIAIJ(): Stash has 14 entries, uses 0 mallocs.
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 133 unneeded,27 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 10
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3
[0] Mat_CheckInode(): Found 10 nodes out of 10 rows. Not using Inode routines
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 0 X 0; storage space: 0 unneeded,0 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0
[1] Mat_CheckInode(): Found 0 nodes of 0. Limit used: 5. Using Inode routines
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741815
[0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
[0] PetscCommDuplicate():   returning tag 1073741779
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741814
[1] PetscCommDuplicate():   returning tag 1073741814
[0] PetscCommDuplicate():   returning tag 1073741778
[1] PetscCommDuplicate():   returning tag 1073741773
[1] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter
[0] VecScatterCreate(): General case: MPI to Seq
[1] MatSetOption_Inode(): Not using Inode routines due to MatSetOption(MAT_DO_NOT_USE_INODES
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 0 X 0; storage space: 0 unneeded,0 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0
[0] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 10)/(num_localrows 10) > 0.6. Use CompressedRow routines.
[1] Mat_CheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 0) > 0.6. Use CompressedRow routines.
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741813
[1] PetscCommDuplicate():   returning tag 1073741813
 1.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -3.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  -3.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00 
 1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00 
[1] PetscCommDuplicate():   returning tag 1073741770
Process [0]
0
0
0
0
0
Process [1]
-2
0
0
0
0
[0] PetscCommDuplicate():   returning tag 1073741769
[1] PetscCommDuplicate():   returning tag 1073741769
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741768
[1] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741767
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[1] PetscCommDuplicate():   returning tag 1073741766
[1] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[1] PetscCommDuplicate():   returning tag 1073741765
[0] PetscCommDuplicate():   returning tag 1073741764
[1] PetscCommDuplicate():   returning tag 1073741763
[1] PetscCommDuplicate():   returning tag 1073741762
[0] PetscCommDuplicate():   returning tag 1073741761
[1] PetscCommDuplicate():   returning tag 1073741756
[0] PetscCommDuplicate():   returning tag 1073741751
[0] PetscCommDuplicate():   returning tag 1073741746
[0] PetscCommDuplicate():   returning tag 1073741741
[1] PetscCommDuplicate():   returning tag 1073741736
[0] PCSetUp(): Setting up new PC
[1] PetscCommDuplicate():   returning tag 1073741731
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741812
[1] MatIncreaseOverlap_MPIAIJ_Receive(): Allocated 1 bytes, required 3 bytes, no of mallocs = 0
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741811
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741810
[1] PetscCommDuplicate():   returning tag 1073741809
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741808
[1] PetscCommDuplicate():   returning tag 1073741722
[0] VecScatterCreateCommon_PtoS(): Using blocksize 1 scatter
[0] VecScatterLocalOptimizeCopy_Private(): Local scatter is a copy, optimizing for it
[1] VecScatterLocalOptimizeCopy_Private(): Local scatter is a copy, optimizing for it
[0] VecScatterCreate(): General case: MPI to Seq
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741807
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741806
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741805
[1] PetscCommDuplicate():   returning tag 1073741805
[1] MatAssemblyEnd_SeqAIJ(): Matrix size: 6 X 6; storage space: 0 unneeded,16 used
[1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3
[1] Mat_CheckInode(): Found 6 nodes out of 6 rows. Not using Inode routines
[0] PCSetUp(): Setting up new PC
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741804
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741803
[1] PetscCommDuplicate():   returning tag 1073741803
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[0] PetscCommDuplicate():   returning tag 1073741802
[1] PetscCommDuplicate():   returning tag 1073741802
[1] MatILUFactorSymbolic_SeqAIJ(): Reallocs 0 Fill ratio:given 50 needed 0.9375
[1] MatILUFactorSymbolic_SeqAIJ(): Run with -[sub_]pc_factor_fill 0.9375 or use 
[0] MatILUFactorSymbolic_SeqAIJ(): PCFactorSetFill([sub]pc,0.928571);
[0] MatILUFactorSymbolic_SeqAIJ(): for best performance.
[1] MatILUFactorSymbolic_SeqAIJ(): for best performance.
[0] PetscCommDuplicate():   returning tag 1073741801
[1] PetscCommDuplicate():   returning tag 1073741801
[1] Mat_CheckInode(): Found 6 nodes out of 6 rows. Not using Inode routines
[1]PETSC ERROR: --------------------- Error Message ------------------------------------
[1]PETSC ERROR: Detected zero pivot in LU factorization
see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot!
[1]PETSC ERROR: Zero pivot row 5 value 0 tolerance 0 * rowsum 0!
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
[1]PETSC ERROR: See docs/changes/index.html for recent updates.
[1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[1]PETSC ERROR: See docs/index.html for manual pages.
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: /media/sdb1/bernhard/svnl/bogen/Cpp/newmesh/bin/./main on a omp_deb_m named node99 by bkubicek Tue Feb 19 11:04:15 2008
[1]PETSC ERROR: Libraries linked from /home/bkubicek/750/Software/petsc-2.3.3-p8/lib/omp_deb_mpi_cxx
[1]PETSC ERROR: Configure run at Thu Jan 31 10:02:09 2008
[1]PETSC ERROR: Configure options --with-clanguage=c++ --with-x=0 --with-debugging=1 --with-shared=0 --with-default-arch=0 --with-mpi=1 COPTFLAGS=' -O2 -march=pentium4 -mtune=pentium4 ' FOPTFLAGS='-I -O2 -march=pentium4 -mtune=pentium4 '
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 529 in src/mat/impls/aij/seq/aijfact.c
[1]PETSC ERROR: MatLUFactorNumeric() line 2227 in src/mat/interface/matrix.c
[1]PETSC ERROR: PCSetUp_ILU() line 564 in src/ksp/pc/impls/factor/ilu/ilu.c
[1]PETSC ERROR: PCSetUp() line 787 in src/ksp/pc/interface/precon.c
[1]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c
[1]PETSC ERROR: PCSetUpOnBlocks_ASM() line 224 in src/ksp/pc/impls/asm/asm.c
[1]PETSC ERROR: PCSetUpOnBlocks() line 820 in src/ksp/pc/interface/precon.c
[1]PETSC ERROR: KSPSetUpOnBlocks() line 158 in src/ksp/ksp/interface/itfunc.c
[1]PETSC ERROR: KSPSolve() line 348 in src/ksp/ksp/interface/itfunc.c
[1] PCSetUp(): Setting up new PC
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741800
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741799
[1] PetscCommDuplicate(): Using internal PETSc communicator 92 143
[1] PetscCommDuplicate():   returning tag 1073741798
[1] MatILUFactorSymbolic_SeqAIJ(): Reallocs 0 Fill ratio:given 50 needed 0.9375
[1] MatILUFactorSymbolic_SeqAIJ(): Run with -[sub_]pc_factor_fill 0.9375 or use 
[1] MatILUFactorSymbolic_SeqAIJ(): PCFactorSetFill([sub]pc,0.9375);
[1] MatILUFactorSymbolic_SeqAIJ(): for best performance.
[1] PetscCommDuplicate():   returning tag 1073741797
[1] Mat_CheckInode(): Found 6 nodes out of 6 rows. Not using Inode routines
[1]PETSC ERROR: --------------------- Error Message ------------------------------------
[1]PETSC ERROR: Detected zero pivot in LU factorization
see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot!
[1]PETSC ERROR: Zero pivot row 5 value 0 tolerance 0 * rowsum 0!
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40 CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
[1]PETSC ERROR: See docs/changes/index.html for recent updates.
[1]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[1]PETSC ERROR: See docs/index.html for manual pages.
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: /media/sdb1/bernhard/svnl/bogen/Cpp/newmesh/bin/./main on a omp_deb_m named node99 by bkubicek Tue Feb 19 11:04:15 2008
[1]PETSC ERROR: Libraries linked from /home/bkubicek/750/Software/petsc-2.3.3-p8/lib/omp_deb_mpi_cxx
[1]PETSC ERROR: Configure run at Thu Jan 31 10:02:09 2008
[1]PETSC ERROR: Configure options --with-clanguage=c++ --with-x=0 --with-debugging=1 --with-shared=0 --with-default-arch=0 --with-mpi=1 COPTFLAGS=' -O2 -march=pentium4 -mtune=pentium4 ' FOPTFLAGS='-I -O2 -march=pentium4 -mtune=pentium4 '
[1]PETSC ERROR: ------------------------------------------------------------------------
[1]PETSC ERROR: MatLUFactorNumeric_SeqAIJ() line 529 in src/mat/impls/aij/seq/aijfact.c
[1]PETSC ERROR: MatLUFactorNumeric() line 2227 in src/mat/interface/matrix.c
[1]PETSC ERROR: PCSetUp_ILU() line 564 in src/ksp/pc/impls/factor/ilu/ilu.c
[1]PETSC ERROR: PCSetUp() line 787 in src/ksp/pc/interface/precon.c
[1]PETSC ERROR: KSPSetUp() line 234 in src/ksp/ksp/interface/itfunc.c
[1]PETSC ERROR: PCSetUpOnBlocks_ASM() line 224 in src/ksp/pc/impls/asm/asm.c
[1]PETSC ERROR: PCSetUpOnBlocks() line 820 in src/ksp/pc/interface/precon.c
[1]PETSC ERROR: KSPSetUpOnBlocks() line 158 in src/ksp/ksp/interface/itfunc.c
[1]PETSC ERROR: KSPSolve() line 348 in src/ksp/ksp/interface/itfunc.c
PETSC_ERROR: Line 250 File: matrix.cpp
Child process exited unexpectedly 0
Aborted (core dumped)
-------------- next part --------------
[0] PetscInitialize(): PETSc successfully started: number of processors = 1
[0] PetscGetHostName(): Rejecting domainname, likely is NIS node99.(none)
[0] PetscInitialize(): Running on machine: node99
Trying to read Gmsh .msh file "stab.gmsh"
Gmsh2 file format recognised
Read 44 nodes.
alltogether number of elements including faces, ignored:61
[0] PetscCommDuplicate(): Duplicating a communicator 91 141 max tags = 1073741823
[0] PetscCommDuplicate():   returning tag 1073741823
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741822
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741821
[0] PetscCommDuplicate():   returning tag 1073741820
[0] PetscCommDuplicate():   returning tag 1073741819
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 472 unneeded,28 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3
[0] Mat_CheckInode(): Found 10 nodes out of 10 rows. Not using Inode routines
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0 unneeded,28 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741818
 -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00 
 1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -3.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  -3.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00  0.00000e+00 
 0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00  1.00000e+00 
 1.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  0.00000e+00  1.00000e+00  -2.00000e+00 
[0] PetscCommDuplicate():   returning tag 1073741817
0
0
0
0
0
-2
0
0
0
0
[0] PetscCommDuplicate():   returning tag 1073741816
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741815
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741814
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741813
[0] PetscCommDuplicate(): Using internal PETSc communicator 91 141
[0] PetscCommDuplicate():   returning tag 1073741812
[0] PetscCommDuplicate():   returning tag 1073741811
[0] PetscCommDuplicate():   returning tag 1073741810
[0] PetscCommDuplicate():   returning tag 1073741809
[0] PetscCommDuplicate():   returning tag 1073741808
[0] PetscCommDuplicate():   returning tag 1073741807
[0] PetscCommDuplicate():   returning tag 1073741806
[0] PetscCommDuplicate():   returning tag 1073741805
[0] PetscCommDuplicate():   returning tag 1073741804
[0] PetscCommDuplicate():   returning tag 1073741803
[0] PCSetUp(): Setting up new PC
[0] PetscCommDuplicate():   returning tag 1073741802
[0] PetscCommDuplicate(): Duplicating a communicator 92 149 max tags = 1073741823
[0] PetscCommDuplicate():   returning tag 1073741823
[0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 92
[0] PetscCommDestroy(): Deleting PETSc MPI_Comm 149
[0] Petsc_DelComm(): Deleting PETSc communicator imbedded in a user MPI_Comm 149
[0] Petsc_DelTag(): Deleting tag data in an MPI_Comm 149
[0] PetscCommDuplicate(): Duplicating a communicator 92 149 max tags = 1073741823
[0] PetscCommDuplicate():   returning tag 1073741823
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 149
[0] PetscCommDuplicate():   returning tag 1073741822
[0] PetscCommDuplicate():   returning tag 1073741821
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 149
[0] PetscCommDuplicate():   returning tag 1073741820
[0] PetscCommDuplicate():   returning tag 1073741801
[0] VecScatterCreate(): Special case: sequential vector general to stride
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 149
[0] PetscCommDuplicate():   returning tag 1073741819
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 149
[0] PetscCommDuplicate():   returning tag 1073741818
[0] PetscCommDuplicate():   returning tag 1073741800
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 10 X 10; storage space: 0 unneeded,28 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 3
[0] Mat_CheckInode(): Found 10 nodes out of 10 rows. Not using Inode routines
[0] PCSetUp(): Setting up new PC
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 149
[0] PetscCommDuplicate():   returning tag 1073741817
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 149
[0] PetscCommDuplicate():   returning tag 1073741816
[0] PetscCommDuplicate(): Using internal PETSc communicator 92 149
[0] PetscCommDuplicate():   returning tag 1073741815
[0] MatILUFactorSymbolic_SeqAIJ(): Reallocs 0 Fill ratio:given 50 needed 1
[0] MatILUFactorSymbolic_SeqAIJ(): Run with -[sub_]pc_factor_fill 1 or use 
[0] MatILUFactorSymbolic_SeqAIJ(): PCFactorSetFill([sub]pc,1);
[0] MatILUFactorSymbolic_SeqAIJ(): for best performance.
[0] PetscCommDuplicate():   returning tag 1073741799
[0] Mat_CheckInode(): Found 10 nodes out of 10 rows. Not using Inode routines
[0] PetscCommDuplicate():   returning tag 1073741798
[0] KSPDefaultConverged(): user has provided nonzero initial guess, computing 2-norm of preconditioned RHS
  0 KSP Residual norm 3.146479381785e+02 
  1 KSP Residual norm 5.527758587503e-14 
  2 KSP Residual norm 2.957120339776e-14 
[0] PetscCommDuplicate():   returning tag 1073741797
[0] PetscCommDuplicate():   returning tag 1073741796
[0] PetscCommDuplicate():   returning tag 1073741795
[0] PetscCommDuplicate():   returning tag 1073741794
[0] PetscCommDuplicate():   returning tag 1073741793
[0] PetscCommDuplicate():   returning tag 1073741792
[0] PetscCommDuplicate():   returning tag 1073741791
[0] PetscCommDuplicate():   returning tag 1073741790
[0] PetscCommDuplicate():   returning tag 1073741789
[0] PetscCommDuplicate():   returning tag 1073741788
  3 KSP Residual norm 2.311953842658e-14 
  4 KSP Residual norm 1.949598161624e-14 
  5 KSP Residual norm 1.718379006549e-14 
  6 KSP Residual norm 1.553657802016e-14 
  7 KSP Residual norm 1.428735337435e-14 
  8 KSP Residual norm 1.329791896231e-14 
  9 KSP Residual norm 1.248915022855e-14 
 10 KSP Residual norm 1.181200860607e-14 
 11 KSP Residual norm 1.123427218264e-14 
 12 KSP Residual norm 1.073378041180e-14 
[0] PetscCommDuplicate():   returning tag 1073741787
[0] PetscCommDuplicate():   returning tag 1073741786
[0] PetscCommDuplicate():   returning tag 1073741785
[0] PetscCommDuplicate():   returning tag 1073741784
[0] PetscCommDuplicate():   returning tag 1073741783
[0] PetscCommDuplicate():   returning tag 1073741782
[0] PetscCommDuplicate():   returning tag 1073741781
[0] PetscCommDuplicate():   returning tag 1073741780
[0] PetscCommDuplicate():   returning tag 1073741779
[0] PetscCommDuplicate():   returning tag 1073741778
 13 KSP Residual norm 1.029472464285e-14 
 14 KSP Residual norm 9.905483966479e-15 
 15 KSP Residual norm 9.557299004528e-15 
 16 KSP Residual norm 9.243425362697e-15 
 17 KSP Residual norm 8.958574339974e-15 
 18 KSP Residual norm 8.698532377314e-15 
 19 KSP Residual norm 8.459895437148e-15 
 20 KSP Residual norm 8.239879423284e-15 
 21 KSP Residual norm 8.036182185186e-15 
 22 KSP Residual norm 7.846881299150e-15 
[0] PetscCommDuplicate():   returning tag 1073741777
[0] PetscCommDuplicate():   returning tag 1073741776
[0] PetscCommDuplicate():   returning tag 1073741775
[0] PetscCommDuplicate():   returning tag 1073741774
[0] PetscCommDuplicate():   returning tag 1073741773
[0] PetscCommDuplicate():   returning tag 1073741772
[0] PetscCommDuplicate():   returning tag 1073741771
[0] PetscCommDuplicate():   returning tag 1073741770
[0] PetscCommDuplicate():   returning tag 1073741769
[0] PetscCommDuplicate():   returning tag 1073741768
 23 KSP Residual norm 7.670357156941e-15 
 24 KSP Residual norm 7.505234275465e-15 
 25 KSP Residual norm 7.350335936204e-15 
 26 KSP Residual norm 7.204648718167e-15 
 27 KSP Residual norm 7.067294471298e-15 
 28 KSP Residual norm 6.937507953310e-15 
 29 KSP Residual norm 6.814618825309e-15 
 30 KSP Residual norm 6.698037036492e-15 
 31 KSP Residual norm 6.587240868887e-15 
 32 KSP Residual norm 6.481767088291e-15 
[0] PetscCommDuplicate():   returning tag 1073741767
[0] PetscCommDuplicate():   returning tag 1073741766
[0] PetscCommDuplicate():   returning tag 1073741765
[0] PetscCommDuplicate():   returning tag 1073741764
 33 KSP Residual norm 6.381202776463e-15 
 34 KSP Residual norm 6.285178515610e-15 
 35 KSP Residual norm 8.829047469026e-14 
[0] KSPDefaultConverged(): Linear solver has converged. Residual norm 2.7398e-30 is less than absolute tolerance 1e-15 at iteration 36
 36 KSP Residual norm 2.739801485312e-30 
KSP Object:
  type: gmres
    GMRES: restart=35, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=1000
  tolerances:  relative=1e-15, absolute=1e-15, divergence=10000
  left preconditioning
PC Object:
  type: asm
    Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1
    Additive Schwarz: restriction/interpolation type - RESTRICT
    Local solve is same for all blocks, in the following KSP and PC objects:
    KSP Object:(sub_)
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(sub_)
      type: ilu
        ILU: 15 levels of fill
        ILU: factor fill ratio allocated 50
        ILU: tolerance for zero pivot 1e-12
             out-of-place factorization
             matrix ordering: rcm
        ILU: factor fill ratio needed 1
             Factored matrix follows
            Matrix Object:
              type=seqaij, rows=10, cols=10
              total: nonzeros=28, allocated nonzeros=28
                not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=seqaij, rows=10, cols=10
        total: nonzeros=28, allocated nonzeros=28
          not using I-node routines
[0] PetscCommDuplicate():   returning tag 1073741763
  linear system matrix = precond matrix:
  Matrix Object:
    type=seqaij, rows=10, cols=10
    total: nonzeros=28, allocated nonzeros=500
      not using I-node routines
[0] PetscCommDuplicate():   returning tag 1073741762
[0] PetscCommDuplicate():   returning tag 1073741761
[0] KSPDefaultConverged(): user has provided nonzero initial guess, computing 2-norm of preconditioned RHS
[0] KSPDefaultConverged(): Linear solver has converged. Residual norm 5.27447e-16 is less than absolute tolerance 1e-15 at iteration 0
  0 KSP Residual norm 5.274472300304e-16 
KSP Object:
  type: gmres
    GMRES: restart=35, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement
    GMRES: happy breakdown tolerance 1e-30
  maximum iterations=1000
  tolerances:  relative=1e-15, absolute=1e-15, divergence=10000
  left preconditioning
PC Object:
  type: asm
    Additive Schwarz: total subdomain blocks = 1, amount of overlap = 1
    Additive Schwarz: restriction/interpolation type - RESTRICT
    Local solve is same for all blocks, in the following KSP and PC objects:
    KSP Object:(sub_)
      type: preonly
      maximum iterations=10000, initial guess is zero
      tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
      left preconditioning
    PC Object:(sub_)
      type: ilu
        ILU: 15 levels of fill
        ILU: factor fill ratio allocated 50
        ILU: tolerance for zero pivot 1e-12
             out-of-place factorization
             matrix ordering: rcm
        ILU: factor fill ratio needed 1
             Factored matrix follows
            Matrix Object:
              type=seqaij, rows=10, cols=10
              total: nonzeros=28, allocated nonzeros=28
                not using I-node routines
      linear system matrix = precond matrix:
      Matrix Object:
        type=seqaij, rows=10, cols=10
        total: nonzeros=28, allocated nonzeros=28
          not using I-node routines
[0] PetscCommDuplicate():   returning tag 1073741760
  linear system matrix = precond matrix:
  Matrix Object:
    type=seqaij, rows=10, cols=10
    total: nonzeros=28, allocated nonzeros=500
      not using I-node routines
[0] PetscCommDuplicate():   returning tag 1073741759
Bnds: 2 NBnds:58
[0] PetscCommDuplicate():   returning tag 1073741758
[0] Petsc_DelViewer(): Deleting viewer data in an MPI_Comm 141
[0] PetscCommDuplicate():   returning tag 1073741757
OptionTable: -info
OptionTable: -ksp_atol 1.e-15
OptionTable: -ksp_gmres_restart 35
OptionTable: -ksp_max_it 1000
OptionTable: -ksp_monitor
OptionTable: -ksp_rtol 1.e-15
OptionTable: -ksp_view
OptionTable: -options_left
OptionTable: -pc_type asm
OptionTable: -sub_pc_factor_fill 50
OptionTable: -sub_pc_factor_levels 15
OptionTable: -sub_pc_factor_mat_ordering_type rcm
OptionTable: -sub_pc_type ilu
There are no unused options.

From bsmith at mcs.anl.gov  Tue Feb 19 07:19:38 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 19 Feb 2008 07:19:38 -0600
Subject: Parallel matrix assembly - SetValues Problem?
In-Reply-To: <1203415965.7200.108.camel@node99>
References: <1203415965.7200.108.camel@node99>
Message-ID: <60948BCF-4D67-439C-9DB2-171BF6158571@mcs.anl.gov>


    Send the code to petsc-maint at mcs.anl.gov and we'll take a look
at it.

    Barry

On Feb 19, 2008, at 4:12 AM, Bernhard Kubicek wrote:

> Dear List,
>
> sorry to bother you  but I just finished reading the whole archive and
> couldn't find a solution to a problem of mine that keeps on  
> bothering me
> now for 7+ days.
>
> The problem is that my code produces different matrices if run in
> parallel or single cpu.
>
> I do a manual partitioning of the mesh by using metis by hand.
> Thereafter, there is a list of finite-volume elements that I want to  
> be
> stored on the individual cpu and a renumbering that is manged somehow.
> I create my matrix with
>
> MatCreateMPIAIJ(PETSC_COMM_WORLD,mycount,mycount,
> PETSC_DETERMINE,PETSC_DETERMINE,50,PETSC_NULL,50,PETSC_NULL,&A),
>
> where mycount is different on each cpu, and is the mentioned number of
> elements I wish to have there locally.
>
>
> for each local row/element at the same time let a user calculate the
> matrix elements and column positions, and the right hand side values  
> for
> this row. I output those for debugging. Within the loop for the local
> rows, I call
> MatSetValues(A,1,&i,nrEntries,entries,v,INSERT_VALUES)
> VecSetValue(rhs,i,rhsval,INSERT_VALUES)
> in this order
>
> When I run on one cpu, everything works nicely. A 3d mesh of a
> 10-element long bar with each element having volume 1, creates the
> following matrix:
> -2.  1.  0.  0.  0.  0.  0.  0.  0.  1.
> 1.  -2.  1.  0.  0.  0.  0.  0.  0.  0.
> 0.  1.  -2.  1.  0.  0.  0.  0.  0.  0.
> 0.  0.  1.  -2.  1.  0.  0.  0.  0.  0.
> 0.  0.  0.  1.  -3.  0.  0.  0.  0.  0.
> 0.  0.  0.  0.  0.  -3.  1.  0.  0.  0.
> 0.  0.  0.  0.  0.  1.  -2.  1.  0.  0.
> 0.  0.  0.  0.  0.  0.  1.  -2.  1.  0.
> 0.  0.  0.  0.  0.  0.  0.  1.  -2.  1.
> 1.  0.  0.  0.  0.  0.  0.  0.  1.  -2.
>
> rhs
> 0
> 0
> 0
> 0
> 0
> -2
> 0
> 0
> 0
> 0
> The CPU sets the matrix and rhs like this ( global matrix row:
> column/value ...column/value | rhs-value )
> Row 0 cols:0/-2	9/1 1/1		|	0
> Row 1 cols:1/-2	0/1 2/1		|	0
> Row 2 cols:2/-2	1/1 3/1		|	0
> Row 3 cols:3/-2	2/1 4/1		|	0
> Row 4 cols:4/-3	3/1	|	0
> Row 5 cols:5/-3	6/1	|	-2
> Row 6 cols:6/-2	5/1 7/1		|	0
> Row 7 cols:7/-2	6/1 8/1		|	0
> Row 8 cols:8/-2	7/1 9/1		|	0
> Row 9 cols:9/-2	0/1 8/1		|	0
> because of the meshing the central rows in the matrix are the most
> exterior elements, on which wall boundary condition 0 and 1 are set
> (laplace equation).
>
> one 2 cpus,
> the matrix looses is different, although the global-local element
> renumbering is defacto nonexisting (cpu 0: rows 0-4, cpu 1: rows 5-9):
> 1.  1.  0.  0.  0.  0.  0.  0.  0.  0.
> 1.  -2.  1.  0.  0.  0.  0.  0.  0.  0.
> 0.  1.  -2.  1.  0.  0.  0.  0.  0.  0.
> 0.  0.  1.  -2.  1.  0.  0.  0.  0.  0.
> 0.  0.  0.  1.  -3.  0.  0.  0.  0.  0.
> 0.  0.  0.  0.  0.  -3.  1.  0.  0.  0.
> 0.  0.  0.  0.  0.  1.  -2.  1.  0.  0.
> 0.  0.  0.  0.  0.  0.  1.  -2.  1.  0.
> 0.  0.  0.  0.  0.  0.  0.  1.  -2.  1.
> 1.  0.  0.  0.  0.  0.  0.  0.  1.  -2.
> Process [0]
> 0
> 0
> 0
> 0
> 0
> Process [1]
> -2
> 0
> 0
> 0
> 0
> here cpu 0 sets:
> Row 0 cols:0/-2	9/1 1/1	|0
> Row 1 cols:1/-2	0/1 2/1	|0
> Row 2 cols:2/-2	1/1 3/1	|0
> Row 3 cols:3/-2	2/1 4/1	|0
> Row 4 cols:4/-3	3/1 |0
>
> and cpu 1:
> Row 5 cols:5/-3	6/1|-2
> Row 6 cols:6/-2	5/1 7/1	|0
> Row 7 cols:7/-2	6/1 8/1	|0
> Row 8 cols:8/-2	7/1 9/1	|0
> Row 9 cols:9/-2	0/1 8/1	|0
> I triple verified that ***SetValues is called with the exactly same
> values as on one cpu, and that nothing is set twice, and that every  
> cpu
> sets it's correct columns. Also for more sophisticated renumberings
>
> Attached are the outputs when run with with -info.
>
> My current guess is that I create the matrix falsely, or that I cannot
> mix the setting of Vec and Mat values before their
> respective ???AssemblyBegin/Ends.
>
> If anyone has any idea where the problem is, is would be extremely  
> nice
> to help me here.
>
> Thank you very much, even for the slightest help
> Bernhard Kubicek
>
> ------
> Physics Doctorate Student Techn. University of Vienna, Austria
> Freelancer arsenal research, Vienna Austria
>
>
>
>
>
> <par.txt><seq.txt>


From jens.madsen at risoe.dk  Tue Feb 19 08:21:15 2008
From: jens.madsen at risoe.dk (jens.madsen at risoe.dk)
Date: Tue, 19 Feb 2008 15:21:15 +0100
Subject: Poor performance with BoomerAMG?
In-Reply-To: <30BAACA0-5EAE-4A7C-8BD7-2E34071CAE57@mcs.anl.gov>
References: <20080215163600.ABA57782@batman.int.colorado.edu>
 <30BAACA0-5EAE-4A7C-8BD7-2E34071CAE57@mcs.anl.gov>
Message-ID: <CA703655D571CF49A2E6DAED3FF5A49301AB3AB6@EXCHG-VS1.risoe.dk>

Hi Barry

Two questions. 

1) What do you mean with "volume" and "wrong scaling"? Could translate this to some other terms? I have a book by Ulrich Trottenberg "Multigrid" and the book by Saad, but could not find similar.

2) Do you know of any summerschools in scientific computing, focusing on Krylov methods, multigrids and preconditioning(all parallel)?

Kind Regards 

Jens Madsen
Ph.d.-studerende
Phone direct +45 4677 4560
Mobile 
jens.madsen at risoe.dk

Optics and Plasma Research Department
Ris? National Laboratory
Technical University of Denmark - DTU
Building 128, P.O. Box 49
DK-4000 Roskilde, Denmark
Tel +45 4677 4500
Fax +45 4677 4565
www.risoe.dk 

>From 1 January 2007, Ris? National Laboratory, the Danish Institute for Food and Veterinary Research,
the Danish Institute for Fisheries Research, the Danish National Space Center and
the Danish Transport Research Institute have been merged with
the Technical University of Denmark (DTU) with DTU as the continuing unit.
-----Original Message-----
From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Saturday, February 16, 2008 6:49 PM
To: petsc-users at mcs.anl.gov
Subject: Re: Poor performance with BoomerAMG?


    All multigrid solvers depend on proper scaling of the variables.  
For example
for a  Laplacian operator the matrix entries are

         \integral \grad \phi_i dot \grad \phi_j

now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms
in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the  
volume is O(h^3)
meaning the matrix entries are O(h).  Now say you impose a Dirichlet  
boundary
conditions by just saying u_k    =  g_k. In 2d this is ok but in 3d  
you need to
use h*u_k = h*g_k otherwise when you restrict to the coarser grid the
resulting matrix entries for the boundary are "out of whack" with the  
matrix
entries for the interior of the domain.

Actually most preconditioners and Krylov methods behavior does depend
on the row scaling; multigrid is just particularly sensitive.

    Barry


On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote:

>
>
>> Be careful how you handle boundary conditions; you need to make sure
>> they have the same scaling as the other equations.
>
> Could you clarify what you mean?  Is boomerAMG sensitive to scaling  
> of matrix rows in a way that other solvers/preconditioners are not?
>
> Andrew
>
>>
>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>>
>>> Hi Ben,
>>>
>>> Thank you for answering. With gmres and boomeramg I get a run time  
>>> of
>>> 2s, so that is much better. However, if I increase the grid size to
>>> 513x513, I get a run time of one minute. With richardson, it fails
>>> to converge.
>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for
>>> the 513x513 problem.
>>>
>>> When using the DMMG framework, I just used the default solvers.
>>> I use the Galerkin process to generate the coarse matrices for
>>> the multigrid cycle.
>>>
>>> Best,
>>> Knut
>>>
>>> Siterer Ben Tay <zonexo at gmail.com>:
>>>
>>>> Hi Knut,
>>>>
>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm
>>>> using it
>>>> on my structured C-grid. I found it to be faster than LU,
>>>> especially as
>>>> the grid size increases. However I use it as a preconditioner with
>>>> GMRES as the solver. Have you tried this option? Although it's
>>>> faster,
>>>> the speed increase is usually less than double. It seems to be
>>>> worse if
>>>> there is a lot of stretching in the grid.
>>>>
>>>> Btw, your mention using the DMMG framework and it takes less than a
>>>> sec. What solver or preconditioner did you use? It's 4 times faster
>>>> than GMRES...
>>>>
>>>> thanks!
>>>>
>>>> knutert at stud.ntnu.no wrote:
>>>>> Hello,
>>>>>
>>>>> I am trying to use the hypre multigrid solver to solve a Poisson
>>>>> equation.
>>>>> However, on a test case with grid size 257x257 it takes 40
>>>>> seconds  to converge
>>>>> on one processor when I run with
>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>>>>
>>>>> Using the DMMG framework, the same problem takes less than a  
>>>>> second,
>>>>> and the default gmres solver uses only four seconds.
>>>>>
>>>>> Am I somehow using the solver the wrong way, or is this
>>>>> performance  expected?
>>>>>
>>>>> Regards
>>>>> Knut Erik Teigen
>>>>>
>>>>>
>>>
>>>
>>>
>>
>


From bsmith at mcs.anl.gov  Tue Feb 19 15:56:18 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 19 Feb 2008 15:56:18 -0600
Subject: Poor performance with BoomerAMG?
In-Reply-To: <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no>
References: <82FAjs014703@awe.co.uk> <20080215140009.lbc7d9t52scok08g@webmail.ntnu.no> <47B59805.1070306@gmail.com> <20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no> <73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov> <20080218085734.br098xx1k4ggw8os@webmail.ntnu.no>
Message-ID: <CD546084-7E1E-4E1B-B705-D8559DA18905@mcs.anl.gov>


    BoomerAMG works like a charm. Likely you forgot the -pc_hypre_type  
boomeramg

    Hmm, I think I'll change the default solver to boomeramg

    Barry

barry-smiths-macbook-pro-17:ksp/examples/tutorials] bsmith% ./ex1f - 
ksp_monitor -pc_type hypre -pc_hypre_type boomeramg -m 513 -n 513 - 
ksp_type richardson -ksp_view
  p=           1
   0 KSP Residual norm 4.213878296084e+03
   1 KSP Residual norm 2.135189837330e+02
   2 KSP Residual norm 1.225934028865e+01
   3 KSP Residual norm 7.255859884400e-01
   4 KSP Residual norm 4.353504737395e-02
   5 KSP Residual norm 2.643035146258e-03
   6 KSP Residual norm 1.628271972668e-04
KSP Object:
   type: richardson
     Richardson: damping factor=1
   maximum iterations=10000, initial guess is zero
   tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
   left preconditioning
PC Object:
   type: hypre
     HYPRE BoomerAMG preconditioning
     HYPRE BoomerAMG: Cycle type V
     HYPRE BoomerAMG: Maximum number of levels 25
     HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
     HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
     HYPRE BoomerAMG: Threshold for strong coupling 0.25
     HYPRE BoomerAMG: Interpolation truncation factor 0
     HYPRE BoomerAMG: Interpolation: max elements per row 0
     HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
     HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
     HYPRE BoomerAMG: Maximum row sums 0.9
     HYPRE BoomerAMG: Sweeps down         1
     HYPRE BoomerAMG: Sweeps up           1
     HYPRE BoomerAMG: Sweeps on coarse    1
     HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
     HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
     HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
     HYPRE BoomerAMG: Relax weight  (all)      1
     HYPRE BoomerAMG: Outer relax weight (all) 1
     HYPRE BoomerAMG: Using CF-relaxation
     HYPRE BoomerAMG: Measure type        local
     HYPRE BoomerAMG: Coarsen type        Falgout
     HYPRE BoomerAMG: Interpolation type  classical
   linear system matrix = precond matrix:
   Matrix Object:
     type=seqaij, rows=263169, cols=263169
     total: nonzeros=1313793, allocated nonzeros=1315845
       not using I-node routines
  Iterations:           7

[barry-smiths-macbook-pro-17:ksp/examples/tutorials] bsmith% ./ex1f - 
ksp_monitor -pc_type hypre -pc_hypre_type boomeramg -m 513 -n 513 - 
ksp_type gmres -ksp_view
  p=           1
   0 KSP Residual norm 4.213878296084e+03
   1 KSP Residual norm 5.272381634094e+01
   2 KSP Residual norm 8.107668116258e-01
   3 KSP Residual norm 1.807380875232e-02
   4 KSP Residual norm 4.068259191532e-04
KSP Object:
   type: gmres
     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt  
Orthogonalization with no iterative refinement
     GMRES: happy breakdown tolerance 1e-30
   maximum iterations=10000, initial guess is zero
   tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
   left preconditioning
PC Object:
   type: hypre
     HYPRE BoomerAMG preconditioning
     HYPRE BoomerAMG: Cycle type V
     HYPRE BoomerAMG: Maximum number of levels 25
     HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
     HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
     HYPRE BoomerAMG: Threshold for strong coupling 0.25
     HYPRE BoomerAMG: Interpolation truncation factor 0
     HYPRE BoomerAMG: Interpolation: max elements per row 0
     HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
     HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
     HYPRE BoomerAMG: Maximum row sums 0.9
     HYPRE BoomerAMG: Sweeps down         1
     HYPRE BoomerAMG: Sweeps up           1
     HYPRE BoomerAMG: Sweeps on coarse    1
     HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
     HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
     HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
     HYPRE BoomerAMG: Relax weight  (all)      1
     HYPRE BoomerAMG: Outer relax weight (all) 1
     HYPRE BoomerAMG: Using CF-relaxation
     HYPRE BoomerAMG: Measure type        local
     HYPRE BoomerAMG: Coarsen type        Falgout
     HYPRE BoomerAMG: Interpolation type  classical
   linear system matrix = precond matrix:
   Matrix Object:
     type=seqaij, rows=263169, cols=263169
     total: nonzeros=1313793, allocated nonzeros=1315845
       not using I-node routines
  Iterations:           4


On Feb 18, 2008, at 1:57 AM, knutert at stud.ntnu.no wrote:

> Thank you for the reply, Barry.
>
> The same thing happens if I use hypre with the DMMG solver.
> As you say, with hypre, the convergence is extremely slow, requiring
> a lot of iterations, 1413 iterations (1820 if I use richardson) for  
> a 257x257
> problem, while the default only needs 5.
>
> I use the same way of handling boundary conditions in the two codes.
> I've also compared the coeff matrix and rhs, and they are equal.
>
> -Knut Erik-
>
> Siterer Barry Smith <bsmith at mcs.anl.gov>:
>
>>
>>  Run with the DMMG solver with the option -pc_type hypre
>> What happens? Then run again with the additional option -ksp_type  
>> richardson
>>
>> Is hypre taking many, many iterations which is causing the slow  
>> speed?
>>
>> I expect there is something wrong with your code that does not use  
>> DMMG.
>> Be careful how you handle boundary conditions; you need to make sure
>> they have the same scaling as the other equations.
>>
>>   Barry
>>
>>
>>
>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>>
>>> Hi Ben,
>>>
>>> Thank you for answering. With gmres and boomeramg I get a run time  
>>> of
>>> 2s, so that is much better. However, if I increase the grid size to
>>> 513x513, I get a run time of one minute. With richardson, it  
>>> fails  to converge.
>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s  
>>> for  the 513x513 problem.
>>>
>>> When using the DMMG framework, I just used the default solvers.
>>> I use the Galerkin process to generate the coarse matrices for
>>> the multigrid cycle.
>>>
>>> Best,
>>> Knut
>>>
>>> Siterer Ben Tay <zonexo at gmail.com>:
>>>
>>>> Hi Knut,
>>>>
>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm  
>>>> using it
>>>> on my structured C-grid. I found it to be faster than LU,  
>>>> especially as
>>>> the grid size increases. However I use it as a preconditioner with
>>>> GMRES as the solver. Have you tried this option? Although it's  
>>>> faster,
>>>> the speed increase is usually less than double. It seems to be  
>>>> worse if
>>>> there is a lot of stretching in the grid.
>>>>
>>>> Btw, your mention using the DMMG framework and it takes less than a
>>>> sec. What solver or preconditioner did you use? It's 4 times faster
>>>> than GMRES...
>>>>
>>>> thanks!
>>>>
>>>> knutert at stud.ntnu.no wrote:
>>>>> Hello,
>>>>>
>>>>> I am trying to use the hypre multigrid solver to solve a Poisson  
>>>>> equation.
>>>>> However, on a test case with grid size 257x257 it takes 40   
>>>>> seconds  to converge
>>>>> on one processor when I run with
>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>>>>
>>>>> Using the DMMG framework, the same problem takes less than a  
>>>>> second,
>>>>> and the default gmres solver uses only four seconds.
>>>>>
>>>>> Am I somehow using the solver the wrong way, or is this   
>>>>> performance  expected?
>>>>>
>>>>> Regards
>>>>> Knut Erik Teigen
>>>>>
>>>>>
>>>
>>>
>>>
>
>
>


From bsmith at mcs.anl.gov  Tue Feb 19 17:04:14 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 19 Feb 2008 17:04:14 -0600
Subject: Poor performance with BoomerAMG?
In-Reply-To: <CA703655D571CF49A2E6DAED3FF5A49301AB3AB6@EXCHG-VS1.risoe.dk>
References: <20080215163600.ABA57782@batman.int.colorado.edu> <30BAACA0-5EAE-4A7C-8BD7-2E34071CAE57@mcs.anl.gov> <CA703655D571CF49A2E6DAED3FF5A49301AB3AB6@EXCHG-VS1.risoe.dk>
Message-ID: <9CC301B6-B41A-49EF-B9E2-43CF7890CF10@mcs.anl.gov>


   Trottenberg has a discussion page 178; see the box that begins at  
the bottom of the page and
continues onto the next one). See also the discussion at the bottom of  
page 182 with equations
5.6.14 and 5.6.15,

   I totally disagree with his suggestion of interpolating boundary  
nodes differently from
interior nodes. It makes the code unnecessarily complicated. So long  
as you have the
boundary equations suitably scaled you can simply interpolate  
everywhere identically.

   Barry


On Feb 19, 2008, at 8:21 AM, jens.madsen at risoe.dk wrote:

> Hi Barry
>
> Two questions.
>
> 1) What do you mean with "volume" and "wrong scaling"? Could  
> translate this to some other terms? I have a book by Ulrich  
> Trottenberg "Multigrid" and the book by Saad, but could not find  
> similar.
>
> 2) Do you know of any summerschools in scientific computing,  
> focusing on Krylov methods, multigrids and preconditioning(all  
> parallel)?
>
> Kind Regards
>
> Jens Madsen
> Ph.d.-studerende
> Phone direct +45 4677 4560
> Mobile
> jens.madsen at risoe.dk
>
> Optics and Plasma Research Department
> Ris? National Laboratory
> Technical University of Denmark - DTU
> Building 128, P.O. Box 49
> DK-4000 Roskilde, Denmark
> Tel +45 4677 4500
> Fax +45 4677 4565
> www.risoe.dk
>
> From 1 January 2007, Ris? National Laboratory, the Danish Institute  
> for Food and Veterinary Research,
> the Danish Institute for Fisheries Research, the Danish National  
> Space Center and
> the Danish Transport Research Institute have been merged with
> the Technical University of Denmark (DTU) with DTU as the continuing  
> unit.
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov 
> ] On Behalf Of Barry Smith
> Sent: Saturday, February 16, 2008 6:49 PM
> To: petsc-users at mcs.anl.gov
> Subject: Re: Poor performance with BoomerAMG?
>
>
>    All multigrid solvers depend on proper scaling of the variables.
> For example
> for a  Laplacian operator the matrix entries are
>
>         \integral \grad \phi_i dot \grad \phi_j
>
> now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms
> in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the
> volume is O(h^3)
> meaning the matrix entries are O(h).  Now say you impose a Dirichlet
> boundary
> conditions by just saying u_k    =  g_k. In 2d this is ok but in 3d
> you need to
> use h*u_k = h*g_k otherwise when you restrict to the coarser grid the
> resulting matrix entries for the boundary are "out of whack" with the
> matrix
> entries for the interior of the domain.
>
> Actually most preconditioners and Krylov methods behavior does depend
> on the row scaling; multigrid is just particularly sensitive.
>
>    Barry
>
>
> On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote:
>
>>
>>
>>> Be careful how you handle boundary conditions; you need to make sure
>>> they have the same scaling as the other equations.
>>
>> Could you clarify what you mean?  Is boomerAMG sensitive to scaling
>> of matrix rows in a way that other solvers/preconditioners are not?
>>
>> Andrew
>>
>>>
>>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>>>
>>>> Hi Ben,
>>>>
>>>> Thank you for answering. With gmres and boomeramg I get a run time
>>>> of
>>>> 2s, so that is much better. However, if I increase the grid size to
>>>> 513x513, I get a run time of one minute. With richardson, it fails
>>>> to converge.
>>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for
>>>> the 513x513 problem.
>>>>
>>>> When using the DMMG framework, I just used the default solvers.
>>>> I use the Galerkin process to generate the coarse matrices for
>>>> the multigrid cycle.
>>>>
>>>> Best,
>>>> Knut
>>>>
>>>> Siterer Ben Tay <zonexo at gmail.com>:
>>>>
>>>>> Hi Knut,
>>>>>
>>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm
>>>>> using it
>>>>> on my structured C-grid. I found it to be faster than LU,
>>>>> especially as
>>>>> the grid size increases. However I use it as a preconditioner with
>>>>> GMRES as the solver. Have you tried this option? Although it's
>>>>> faster,
>>>>> the speed increase is usually less than double. It seems to be
>>>>> worse if
>>>>> there is a lot of stretching in the grid.
>>>>>
>>>>> Btw, your mention using the DMMG framework and it takes less  
>>>>> than a
>>>>> sec. What solver or preconditioner did you use? It's 4 times  
>>>>> faster
>>>>> than GMRES...
>>>>>
>>>>> thanks!
>>>>>
>>>>> knutert at stud.ntnu.no wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am trying to use the hypre multigrid solver to solve a Poisson
>>>>>> equation.
>>>>>> However, on a test case with grid size 257x257 it takes 40
>>>>>> seconds  to converge
>>>>>> on one processor when I run with
>>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre  
>>>>>> boomeramg
>>>>>>
>>>>>> Using the DMMG framework, the same problem takes less than a
>>>>>> second,
>>>>>> and the default gmres solver uses only four seconds.
>>>>>>
>>>>>> Am I somehow using the solver the wrong way, or is this
>>>>>> performance  expected?
>>>>>>
>>>>>> Regards
>>>>>> Knut Erik Teigen
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>


From knutert at stud.ntnu.no  Wed Feb 20 01:47:22 2008
From: knutert at stud.ntnu.no (knutert at stud.ntnu.no)
Date: Wed, 20 Feb 2008 08:47:22 +0100
Subject: Poor performance with BoomerAMG?
In-Reply-To: <CD546084-7E1E-4E1B-B705-D8559DA18905@mcs.anl.gov>
References: <82FAjs014703@awe.co.uk>
	<20080215140009.lbc7d9t52scok08g@webmail.ntnu.no>
	<47B59805.1070306@gmail.com>
	<20080215153635.1y2s2a1uslcwswws@webmail.ntnu.no>
	<73C20BE7-ABAD-45CF-8FDE-5972C4891527@mcs.anl.gov>
	<20080218085734.br098xx1k4ggw8os@webmail.ntnu.no>
	<CD546084-7E1E-4E1B-B705-D8559DA18905@mcs.anl.gov>
Message-ID: <20080220084722.xljveqv9zc44gsw4@webmail.ntnu.no>

Wow, that is embarrassing...I had put -pc_type_hypre instead of  
_pc_hypre_type.
Thanks!

-Knut Erik-

Siterer Barry Smith <bsmith at mcs.anl.gov>:

>
>    BoomerAMG works like a charm. Likely you forgot the -pc_hypre_type
> boomeramg
>
>    Hmm, I think I'll change the default solver to boomeramg
>
>    Barry
>
> barry-smiths-macbook-pro-17:ksp/examples/tutorials] bsmith% ./ex1f
> -ksp_monitor -pc_type hypre -pc_hypre_type boomeramg -m 513 -n 513
> -ksp_type richardson -ksp_view
>  p=           1
>   0 KSP Residual norm 4.213878296084e+03
>   1 KSP Residual norm 2.135189837330e+02
>   2 KSP Residual norm 1.225934028865e+01
>   3 KSP Residual norm 7.255859884400e-01
>   4 KSP Residual norm 4.353504737395e-02
>   5 KSP Residual norm 2.643035146258e-03
>   6 KSP Residual norm 1.628271972668e-04
> KSP Object:
>   type: richardson
>     Richardson: damping factor=1
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
>   left preconditioning
> PC Object:
>   type: hypre
>     HYPRE BoomerAMG preconditioning
>     HYPRE BoomerAMG: Cycle type V
>     HYPRE BoomerAMG: Maximum number of levels 25
>     HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
>     HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
>     HYPRE BoomerAMG: Threshold for strong coupling 0.25
>     HYPRE BoomerAMG: Interpolation truncation factor 0
>     HYPRE BoomerAMG: Interpolation: max elements per row 0
>     HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
>     HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
>     HYPRE BoomerAMG: Maximum row sums 0.9
>     HYPRE BoomerAMG: Sweeps down         1
>     HYPRE BoomerAMG: Sweeps up           1
>     HYPRE BoomerAMG: Sweeps on coarse    1
>     HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
>     HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
>     HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
>     HYPRE BoomerAMG: Relax weight  (all)      1
>     HYPRE BoomerAMG: Outer relax weight (all) 1
>     HYPRE BoomerAMG: Using CF-relaxation
>     HYPRE BoomerAMG: Measure type        local
>     HYPRE BoomerAMG: Coarsen type        Falgout
>     HYPRE BoomerAMG: Interpolation type  classical
>   linear system matrix = precond matrix:
>   Matrix Object:
>     type=seqaij, rows=263169, cols=263169
>     total: nonzeros=1313793, allocated nonzeros=1315845
>       not using I-node routines
>  Iterations:           7
>
> [barry-smiths-macbook-pro-17:ksp/examples/tutorials] bsmith% ./ex1f
> -ksp_monitor -pc_type hypre -pc_hypre_type boomeramg -m 513 -n 513
> -ksp_type gmres -ksp_view
>  p=           1
>   0 KSP Residual norm 4.213878296084e+03
>   1 KSP Residual norm 5.272381634094e+01
>   2 KSP Residual norm 8.107668116258e-01
>   3 KSP Residual norm 1.807380875232e-02
>   4 KSP Residual norm 4.068259191532e-04
> KSP Object:
>   type: gmres
>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt
> Orthogonalization with no iterative refinement
>     GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=1e-07, absolute=1e-50, divergence=10000
>   left preconditioning
> PC Object:
>   type: hypre
>     HYPRE BoomerAMG preconditioning
>     HYPRE BoomerAMG: Cycle type V
>     HYPRE BoomerAMG: Maximum number of levels 25
>     HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1
>     HYPRE BoomerAMG: Convergence tolerance PER hypre call 0
>     HYPRE BoomerAMG: Threshold for strong coupling 0.25
>     HYPRE BoomerAMG: Interpolation truncation factor 0
>     HYPRE BoomerAMG: Interpolation: max elements per row 0
>     HYPRE BoomerAMG: Number of levels of aggressive coarsening 0
>     HYPRE BoomerAMG: Number of paths for aggressive coarsening 1
>     HYPRE BoomerAMG: Maximum row sums 0.9
>     HYPRE BoomerAMG: Sweeps down         1
>     HYPRE BoomerAMG: Sweeps up           1
>     HYPRE BoomerAMG: Sweeps on coarse    1
>     HYPRE BoomerAMG: Relax down          symmetric-SOR/Jacobi
>     HYPRE BoomerAMG: Relax up            symmetric-SOR/Jacobi
>     HYPRE BoomerAMG: Relax on coarse     Gaussian-elimination
>     HYPRE BoomerAMG: Relax weight  (all)      1
>     HYPRE BoomerAMG: Outer relax weight (all) 1
>     HYPRE BoomerAMG: Using CF-relaxation
>     HYPRE BoomerAMG: Measure type        local
>     HYPRE BoomerAMG: Coarsen type        Falgout
>     HYPRE BoomerAMG: Interpolation type  classical
>   linear system matrix = precond matrix:
>   Matrix Object:
>     type=seqaij, rows=263169, cols=263169
>     total: nonzeros=1313793, allocated nonzeros=1315845
>       not using I-node routines
>  Iterations:           4
>
>
> On Feb 18, 2008, at 1:57 AM, knutert at stud.ntnu.no wrote:
>
>> Thank you for the reply, Barry.
>>
>> The same thing happens if I use hypre with the DMMG solver.
>> As you say, with hypre, the convergence is extremely slow, requiring
>> a lot of iterations, 1413 iterations (1820 if I use richardson) for  
>>  a 257x257
>> problem, while the default only needs 5.
>>
>> I use the same way of handling boundary conditions in the two codes.
>> I've also compared the coeff matrix and rhs, and they are equal.
>>
>> -Knut Erik-
>>
>> Siterer Barry Smith <bsmith at mcs.anl.gov>:
>>
>>>
>>> Run with the DMMG solver with the option -pc_type hypre
>>> What happens? Then run again with the additional option -ksp_type   
>>> richardson
>>>
>>> Is hypre taking many, many iterations which is causing the slow speed?
>>>
>>> I expect there is something wrong with your code that does not use DMMG.
>>> Be careful how you handle boundary conditions; you need to make sure
>>> they have the same scaling as the other equations.
>>>
>>>  Barry
>>>
>>>
>>>
>>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>>>
>>>> Hi Ben,
>>>>
>>>> Thank you for answering. With gmres and boomeramg I get a run time of
>>>> 2s, so that is much better. However, if I increase the grid size to
>>>> 513x513, I get a run time of one minute. With richardson, it   
>>>> fails  to converge.
>>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s   
>>>> for  the 513x513 problem.
>>>>
>>>> When using the DMMG framework, I just used the default solvers.
>>>> I use the Galerkin process to generate the coarse matrices for
>>>> the multigrid cycle.
>>>>
>>>> Best,
>>>> Knut
>>>>
>>>> Siterer Ben Tay <zonexo at gmail.com>:
>>>>
>>>>> Hi Knut,
>>>>>
>>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm using it
>>>>> on my structured C-grid. I found it to be faster than LU, especially as
>>>>> the grid size increases. However I use it as a preconditioner with
>>>>> GMRES as the solver. Have you tried this option? Although it's faster,
>>>>> the speed increase is usually less than double. It seems to be worse if
>>>>> there is a lot of stretching in the grid.
>>>>>
>>>>> Btw, your mention using the DMMG framework and it takes less than a
>>>>> sec. What solver or preconditioner did you use? It's 4 times faster
>>>>> than GMRES...
>>>>>
>>>>> thanks!
>>>>>
>>>>> knutert at stud.ntnu.no wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am trying to use the hypre multigrid solver to solve a   
>>>>>> Poisson equation.
>>>>>> However, on a test case with grid size 257x257 it takes 40    
>>>>>> seconds  to converge
>>>>>> on one processor when I run with
>>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre boomeramg
>>>>>>
>>>>>> Using the DMMG framework, the same problem takes less than a second,
>>>>>> and the default gmres solver uses only four seconds.
>>>>>>
>>>>>> Am I somehow using the solver the wrong way, or is this    
>>>>>> performance  expected?
>>>>>>
>>>>>> Regards
>>>>>> Knut Erik Teigen
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>
>>
>>


From jens.madsen at risoe.dk  Wed Feb 20 14:54:18 2008
From: jens.madsen at risoe.dk (jens.madsen at risoe.dk)
Date: Wed, 20 Feb 2008 21:54:18 +0100
Subject: Poor performance with BoomerAMG?
In-Reply-To: <9CC301B6-B41A-49EF-B9E2-43CF7890CF10@mcs.anl.gov>
Message-ID: <CA703655D571CF49A2E6DAED3FF5A4930105FD05@EXCHG-VS1.risoe.dk>

Thank you Barry. I'll take a look at it:-) 

Did you have any summerschool suggestions? 

Kind Regards 

Jens

-----Original Message-----
From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Wednesday, February 20, 2008 12:04 AM
To: petsc-users at mcs.anl.gov
Subject: Re: Poor performance with BoomerAMG?


   Trottenberg has a discussion page 178; see the box that begins at  
the bottom of the page and
continues onto the next one). See also the discussion at the bottom of  
page 182 with equations
5.6.14 and 5.6.15,

   I totally disagree with his suggestion of interpolating boundary  
nodes differently from
interior nodes. It makes the code unnecessarily complicated. So long  
as you have the
boundary equations suitably scaled you can simply interpolate  
everywhere identically.

   Barry


On Feb 19, 2008, at 8:21 AM, jens.madsen at risoe.dk wrote:

> Hi Barry
>
> Two questions.
>
> 1) What do you mean with "volume" and "wrong scaling"? Could  
> translate this to some other terms? I have a book by Ulrich  
> Trottenberg "Multigrid" and the book by Saad, but could not find  
> similar.
>
> 2) Do you know of any summerschools in scientific computing,  
> focusing on Krylov methods, multigrids and preconditioning(all  
> parallel)?
>
> Kind Regards
>
> Jens Madsen
> Ph.d.-studerende
> Phone direct +45 4677 4560
> Mobile
> jens.madsen at risoe.dk
>
> Optics and Plasma Research Department
> Ris? National Laboratory
> Technical University of Denmark - DTU
> Building 128, P.O. Box 49
> DK-4000 Roskilde, Denmark
> Tel +45 4677 4500
> Fax +45 4677 4565
> www.risoe.dk
>
> From 1 January 2007, Ris? National Laboratory, the Danish Institute  
> for Food and Veterinary Research,
> the Danish Institute for Fisheries Research, the Danish National  
> Space Center and
> the Danish Transport Research Institute have been merged with
> the Technical University of Denmark (DTU) with DTU as the continuing  
> unit.
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov 
> ] On Behalf Of Barry Smith
> Sent: Saturday, February 16, 2008 6:49 PM
> To: petsc-users at mcs.anl.gov
> Subject: Re: Poor performance with BoomerAMG?
>
>
>    All multigrid solvers depend on proper scaling of the variables.
> For example
> for a  Laplacian operator the matrix entries are
>
>         \integral \grad \phi_i dot \grad \phi_j
>
> now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms
> in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the
> volume is O(h^3)
> meaning the matrix entries are O(h).  Now say you impose a Dirichlet
> boundary
> conditions by just saying u_k    =  g_k. In 2d this is ok but in 3d
> you need to
> use h*u_k = h*g_k otherwise when you restrict to the coarser grid the
> resulting matrix entries for the boundary are "out of whack" with the
> matrix
> entries for the interior of the domain.
>
> Actually most preconditioners and Krylov methods behavior does depend
> on the row scaling; multigrid is just particularly sensitive.
>
>    Barry
>
>
> On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote:
>
>>
>>
>>> Be careful how you handle boundary conditions; you need to make sure
>>> they have the same scaling as the other equations.
>>
>> Could you clarify what you mean?  Is boomerAMG sensitive to scaling
>> of matrix rows in a way that other solvers/preconditioners are not?
>>
>> Andrew
>>
>>>
>>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>>>
>>>> Hi Ben,
>>>>
>>>> Thank you for answering. With gmres and boomeramg I get a run time
>>>> of
>>>> 2s, so that is much better. However, if I increase the grid size to
>>>> 513x513, I get a run time of one minute. With richardson, it fails
>>>> to converge.
>>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s for
>>>> the 513x513 problem.
>>>>
>>>> When using the DMMG framework, I just used the default solvers.
>>>> I use the Galerkin process to generate the coarse matrices for
>>>> the multigrid cycle.
>>>>
>>>> Best,
>>>> Knut
>>>>
>>>> Siterer Ben Tay <zonexo at gmail.com>:
>>>>
>>>>> Hi Knut,
>>>>>
>>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm
>>>>> using it
>>>>> on my structured C-grid. I found it to be faster than LU,
>>>>> especially as
>>>>> the grid size increases. However I use it as a preconditioner with
>>>>> GMRES as the solver. Have you tried this option? Although it's
>>>>> faster,
>>>>> the speed increase is usually less than double. It seems to be
>>>>> worse if
>>>>> there is a lot of stretching in the grid.
>>>>>
>>>>> Btw, your mention using the DMMG framework and it takes less  
>>>>> than a
>>>>> sec. What solver or preconditioner did you use? It's 4 times  
>>>>> faster
>>>>> than GMRES...
>>>>>
>>>>> thanks!
>>>>>
>>>>> knutert at stud.ntnu.no wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I am trying to use the hypre multigrid solver to solve a Poisson
>>>>>> equation.
>>>>>> However, on a test case with grid size 257x257 it takes 40
>>>>>> seconds  to converge
>>>>>> on one processor when I run with
>>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre  
>>>>>> boomeramg
>>>>>>
>>>>>> Using the DMMG framework, the same problem takes less than a
>>>>>> second,
>>>>>> and the default gmres solver uses only four seconds.
>>>>>>
>>>>>> Am I somehow using the solver the wrong way, or is this
>>>>>> performance  expected?
>>>>>>
>>>>>> Regards
>>>>>> Knut Erik Teigen
>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>


From bsmith at mcs.anl.gov  Wed Feb 20 18:57:57 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 20 Feb 2008 18:57:57 -0600
Subject: Poor performance with BoomerAMG?
In-Reply-To: <CA703655D571CF49A2E6DAED3FF5A4930105FD05@EXCHG-VS1.risoe.dk>
References: <CA703655D571CF49A2E6DAED3FF5A4930105FD05@EXCHG-VS1.risoe.dk>
Message-ID: <B0A9DD44-3E8E-4287-A49E-599CE2ED4765@mcs.anl.gov>


On Feb 20, 2008, at 2:54 PM, jens.madsen at risoe.dk wrote:

> Thank you Barry. I'll take a look at it:-)
>
> Did you have any summerschool suggestions?

   Sorry I don't know of any,

    Barry

>
>
> Kind Regards
>
> Jens
>
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov 
> ] On Behalf Of Barry Smith
> Sent: Wednesday, February 20, 2008 12:04 AM
> To: petsc-users at mcs.anl.gov
> Subject: Re: Poor performance with BoomerAMG?
>
>
>   Trottenberg has a discussion page 178; see the box that begins at
> the bottom of the page and
> continues onto the next one). See also the discussion at the bottom of
> page 182 with equations
> 5.6.14 and 5.6.15,
>
>   I totally disagree with his suggestion of interpolating boundary
> nodes differently from
> interior nodes. It makes the code unnecessarily complicated. So long
> as you have the
> boundary equations suitably scaled you can simply interpolate
> everywhere identically.
>
>   Barry
>
>
>
> On Feb 19, 2008, at 8:21 AM, jens.madsen at risoe.dk wrote:
>
>> Hi Barry
>>
>> Two questions.
>>
>> 1) What do you mean with "volume" and "wrong scaling"? Could
>> translate this to some other terms? I have a book by Ulrich
>> Trottenberg "Multigrid" and the book by Saad, but could not find
>> similar.
>>
>> 2) Do you know of any summerschools in scientific computing,
>> focusing on Krylov methods, multigrids and preconditioning(all
>> parallel)?
>>
>> Kind Regards
>>
>> Jens Madsen
>> Ph.d.-studerende
>> Phone direct +45 4677 4560
>> Mobile
>> jens.madsen at risoe.dk
>>
>> Optics and Plasma Research Department
>> Ris? National Laboratory
>> Technical University of Denmark - DTU
>> Building 128, P.O. Box 49
>> DK-4000 Roskilde, Denmark
>> Tel +45 4677 4500
>> Fax +45 4677 4565
>> www.risoe.dk
>>
>> From 1 January 2007, Ris? National Laboratory, the Danish Institute
>> for Food and Veterinary Research,
>> the Danish Institute for Fisheries Research, the Danish National
>> Space Center and
>> the Danish Transport Research Institute have been merged with
>> the Technical University of Denmark (DTU) with DTU as the continuing
>> unit.
>> -----Original Message-----
>> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov
>> ] On Behalf Of Barry Smith
>> Sent: Saturday, February 16, 2008 6:49 PM
>> To: petsc-users at mcs.anl.gov
>> Subject: Re: Poor performance with BoomerAMG?
>>
>>
>>   All multigrid solvers depend on proper scaling of the variables.
>> For example
>> for a  Laplacian operator the matrix entries are
>>
>>        \integral \grad \phi_i dot \grad \phi_j
>>
>> now in 2d \grad \phi is O(1/h) and the volume is O(h^2) so the terms
>> in the matrix are O(1). In 3d \grad \phi is still O(1/h) but the
>> volume is O(h^3)
>> meaning the matrix entries are O(h).  Now say you impose a Dirichlet
>> boundary
>> conditions by just saying u_k    =  g_k. In 2d this is ok but in 3d
>> you need to
>> use h*u_k = h*g_k otherwise when you restrict to the coarser grid the
>> resulting matrix entries for the boundary are "out of whack" with the
>> matrix
>> entries for the interior of the domain.
>>
>> Actually most preconditioners and Krylov methods behavior does depend
>> on the row scaling; multigrid is just particularly sensitive.
>>
>>   Barry
>>
>>
>> On Feb 15, 2008, at 5:36 PM, Andrew T Barker wrote:
>>
>>>
>>>
>>>> Be careful how you handle boundary conditions; you need to make  
>>>> sure
>>>> they have the same scaling as the other equations.
>>>
>>> Could you clarify what you mean?  Is boomerAMG sensitive to scaling
>>> of matrix rows in a way that other solvers/preconditioners are not?
>>>
>>> Andrew
>>>
>>>>
>>>> On Feb 15, 2008, at 8:36 AM, knutert at stud.ntnu.no wrote:
>>>>
>>>>> Hi Ben,
>>>>>
>>>>> Thank you for answering. With gmres and boomeramg I get a run time
>>>>> of
>>>>> 2s, so that is much better. However, if I increase the grid size  
>>>>> to
>>>>> 513x513, I get a run time of one minute. With richardson, it fails
>>>>> to converge.
>>>>> LU gives 6 seconds, CG and ICC gives 7s, and the DMMG solver 3s  
>>>>> for
>>>>> the 513x513 problem.
>>>>>
>>>>> When using the DMMG framework, I just used the default solvers.
>>>>> I use the Galerkin process to generate the coarse matrices for
>>>>> the multigrid cycle.
>>>>>
>>>>> Best,
>>>>> Knut
>>>>>
>>>>> Siterer Ben Tay <zonexo at gmail.com>:
>>>>>
>>>>>> Hi Knut,
>>>>>>
>>>>>> I'm currently using boomeramg to solve my poisson eqn too. I'm
>>>>>> using it
>>>>>> on my structured C-grid. I found it to be faster than LU,
>>>>>> especially as
>>>>>> the grid size increases. However I use it as a preconditioner  
>>>>>> with
>>>>>> GMRES as the solver. Have you tried this option? Although it's
>>>>>> faster,
>>>>>> the speed increase is usually less than double. It seems to be
>>>>>> worse if
>>>>>> there is a lot of stretching in the grid.
>>>>>>
>>>>>> Btw, your mention using the DMMG framework and it takes less
>>>>>> than a
>>>>>> sec. What solver or preconditioner did you use? It's 4 times
>>>>>> faster
>>>>>> than GMRES...
>>>>>>
>>>>>> thanks!
>>>>>>
>>>>>> knutert at stud.ntnu.no wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I am trying to use the hypre multigrid solver to solve a Poisson
>>>>>>> equation.
>>>>>>> However, on a test case with grid size 257x257 it takes 40
>>>>>>> seconds  to converge
>>>>>>> on one processor when I run with
>>>>>>> ./run -ksp_type richardson -pc_type hypre -pc_type_hypre
>>>>>>> boomeramg
>>>>>>>
>>>>>>> Using the DMMG framework, the same problem takes less than a
>>>>>>> second,
>>>>>>> and the default gmres solver uses only four seconds.
>>>>>>>
>>>>>>> Am I somehow using the solver the wrong way, or is this
>>>>>>> performance  expected?
>>>>>>>
>>>>>>> Regards
>>>>>>> Knut Erik Teigen
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>
>


From recrusader at gmail.com  Sun Feb 24 14:07:28 2008
From: recrusader at gmail.com (Yujie)
Date: Mon, 25 Feb 2008 04:07:28 +0800
Subject: about MatMat*() functions
Message-ID: <7ff0ee010802241207g728a3edcia610ef7e8f16e492@mail.gmail.com>

hi,

I am wondering whether all the MatMat*() only are suitable for sequential
matrix. I know MatMatSolve() is for sequential matrix. How about
MatMatMult()? Thanks a lot.

Regards,
Yujie
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080225/b56d2153/attachment.htm>

From dave.mayhem23 at gmail.com  Sun Feb 24 16:00:19 2008
From: dave.mayhem23 at gmail.com (Dave May)
Date: Mon, 25 Feb 2008 09:00:19 +1100
Subject: about MatMat*() functions
In-Reply-To: <7ff0ee010802241207g728a3edcia610ef7e8f16e492@mail.gmail.com>
References: <7ff0ee010802241207g728a3edcia610ef7e8f16e492@mail.gmail.com>
Message-ID: <956373f0802241400w10d13fauf96b0bf47ab51966@mail.gmail.com>

Hey,
    MatMatMult() will work for MPIAIJ matrices. So will MatPtAP(). If you
are
ever in doubt, the easiest way (I find) to check whether I certain operation
is
supported is to just look at the source and see which ops. are defined.

I can usually find the answer with the online docs. In this case, starting
with the
type (MATMPIAIJ), and then searching mpiaij.c for MatMatMult. If you see
something like
your desired op. in  struct _MatOps (i.e. MatMatMult_MPIAIJ_MPIAIJ) then
it's supported.
Now you know the function defining your operation and you can search for it
(with grep as it might be in another file not online) to find out exactly
what it does.


Cheers,
    Dave


On Mon, Feb 25, 2008 at 7:07 AM, Yujie <recrusader at gmail.com> wrote:

> hi,
>
> I am wondering whether all the MatMat*() only are suitable for sequential
> matrix. I know MatMatSolve() is for sequential matrix. How about
> MatMatMult()? Thanks a lot.
>
> Regards,
> Yujie
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080225/f5440a5e/attachment.htm>

From recrusader at gmail.com  Tue Feb 26 19:46:59 2008
From: recrusader at gmail.com (Yujie)
Date: Tue, 26 Feb 2008 17:46:59 -0800
Subject: any examples to demonstrate how to Spooles package?
Message-ID: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>

Hi, everyone

I have compiled PETSc with spooles. However, I try to find how to use this
package in PETSc directory. I can't find any examples for it. Could you give
me some advice? I want to use spooles to inverse a sparse matrix. thanks a
lot.

Regards,
Yujie
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080226/60174049/attachment.htm>

From sanjay at ce.berkeley.edu  Wed Feb 27 01:10:31 2008
From: sanjay at ce.berkeley.edu (Sanjay Govindjee)
Date: Wed, 27 Feb 2008 08:10:31 +0100
Subject: any examples to demonstrate how to Spooles package?
In-Reply-To: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
Message-ID: <47C50CE7.5060703@ce.berkeley.edu>

from my make file

-@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly 
-ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary 
-on_error_attach_debugger -mat_spooles_symmetryflag 0  -options_left

-sg

Yujie wrote:
> Hi, everyone
>
> I have compiled PETSc with spooles. However, I try to find how to use 
> this package in PETSc directory. I can't find any examples for it. 
> Could you give me some advice? I want to use spooles to inverse a 
> sparse matrix. thanks a lot.
>
> Regards,
> Yujie


From amjad11 at gmail.com  Wed Feb 27 01:11:14 2008
From: amjad11 at gmail.com (amjad ali)
Date: Wed, 27 Feb 2008 12:11:14 +0500
Subject: few questions
Message-ID: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com>

Hello all,

Please answer the following,

1) What is the difference between static and dynamic versions of petsc?

2) How to check that which version (static or dynamic) is installed on a
system?

3) Plz comment on if there is any effect of static/dynamic version while
using/calling petsc from some external package?

4) how to update an already installed petsc version with newerer/latest
version of petsc?

Thanks to all.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080227/5462ec32/attachment.htm>

From aja2111 at columbia.edu  Wed Feb 27 02:05:24 2008
From: aja2111 at columbia.edu (Aron Ahmadia)
Date: Wed, 27 Feb 2008 03:05:24 -0500
Subject: few questions
In-Reply-To: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com>
References: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com>
Message-ID: <37604ab40802270005y22b27007o247d0bae0c8817fd@mail.gmail.com>

On Wed, Feb 27, 2008 at 2:11 AM, amjad ali <amjad11 at gmail.com> wrote:
> Hello all,
>
> Please answer the following,
>
> 1) What is the difference between static and dynamic versions of petsc?
>

Start here: http://en.wikipedia.org/wiki/Library_(computer_science)#Static_libraries

In PETSc the primary differences end up being the size and link-time.
Statically-linked executables need all the possible code that they
could contain in the actual file, so they can be up to several MB in
size.  Dynamically-linked executables are much leaner for the small
price of a little extra load time.

Now if you're talking about Dynamically-Loaded code, that's a bit hairier...

> 2) How to check that which version (static or dynamic) is installed on a
> system?
>

The fastest way is probably to look in $PETSC_DIR/$PETSC_ARCH/

If you see .a files, you've got static libraries, if you see .so or
.dylib files you've got dynamic libraries.

> 3) Plz comment on if there is any effect of static/dynamic version while
> using/calling petsc from some external package?
>

I'm not sure what you're asking here.  If you mean "Is there a
difference between calling dynamically compiled PETSc from statically
compiled PETSc" the answer is no.  There are differences in how you
compile and link the two version but your actual code would look the
same.

Again, if we're talking about dynamically loaded code (using something
like dl_open), then your code will look different.

> 4) how to update an already installed petsc version with newerer/latest
> version of petsc?
>

Doing this in place is more trouble than it's worth if you're not
using a development copy .  I just grab the latest copy of PETSc from
their webpage, then re-build and re-install.

~A


From tstitt at cscs.ch  Wed Feb 27 04:09:30 2008
From: tstitt at cscs.ch (Timothy Stitt)
Date: Wed, 27 Feb 2008 11:09:30 +0100
Subject: Error with -log_history
Message-ID: <47C536DA.2030002@cscs.ch>

Hi PETSc users/developers,

I am having some difficulties with the -log_history option on example 
PETSc codes at my local installation (FYI: Cray XT architecture). When 
executing the ex2.c code (for example) on multiple processors with the 
-log_history option I keep getting:

Signal number 11 SEGV: Segmentation Violation, probably memory access 
out of range

The log history file is created though but contains the following:

Petsc Release Version 2.3.3, Patch 3, Fri Jun 15 16:51:25 CDT 2007 HG 
revision: f051789beadcd36f77fb6111d20225e26ed7cc0d Wed Feb 27 10:35:55 2008
[8191]PETSC ERROR: [18945200]PETSC ERROR: [118163408]PETSC ERROR: 
[118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: 
[118163408]PETSC ERROR: [59]PETSC ER
ROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC 
ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC 
ERROR: [118163408]PETSC ERROR: [
23118368]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: 
[118163408]PETSC ERROR: [118163408]PETSC ERROR: [19014780]PETSC ERROR:

Without the -log_history option the code runs as expected.

Is this a build/architecture issue at my end?

Thanks in advance for any advice given.

Tim.

-- 
Timothy Stitt
HPC Applications Analyst

Swiss National Supercomputing Centre (CSCS)
Galleria 2 - Via Cantonale
CH-6928 Manno, Switzerland

+41 (0) 91 610 8233
stitt at cscs.ch


From bsmith at mcs.anl.gov  Wed Feb 27 11:29:27 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 27 Feb 2008 11:29:27 -0600
Subject: Error with -log_history
In-Reply-To: <47C536DA.2030002@cscs.ch>
References: <47C536DA.2030002@cscs.ch>
Message-ID: <FF752851-D29F-4ECF-AC0A-890013E222F5@mcs.anl.gov>


    Was the file opened? That is did you still get an empty file or a  
file with a few lines (if so please send
the lines to petsc-maint at mcs.anl.gov). Is there are problem on ONE  
process?

    Barry

On Feb 27, 2008, at 4:09 AM, Timothy Stitt wrote:

> Hi PETSc users/developers,
>
> I am having some difficulties with the -log_history option on  
> example PETSc codes at my local installation (FYI: Cray XT  
> architecture). When executing the ex2.c code (for example) on  
> multiple processors with the -log_history option I keep getting:
>
> Signal number 11 SEGV: Segmentation Violation, probably memory  
> access out of range
>
> The log history file is created though but contains the following:
>
> Petsc Release Version 2.3.3, Patch 3, Fri Jun 15 16:51:25 CDT 2007  
> HG revision: f051789beadcd36f77fb6111d20225e26ed7cc0d Wed Feb 27  
> 10:35:55 2008
> [8191]PETSC ERROR: [18945200]PETSC ERROR: [118163408]PETSC ERROR:  
> [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC  
> ERROR: [118163408]PETSC ERROR: [59]PETSC ER
> ROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR:  
> [118163408]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC  
> ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR: [
> 23118368]PETSC ERROR: [118163408]PETSC ERROR: [118163408]PETSC  
> ERROR: [118163408]PETSC ERROR: [118163408]PETSC ERROR:  
> [19014780]PETSC ERROR:
>
> Without the -log_history option the code runs as expected.
>
> Is this a build/architecture issue at my end?
>
> Thanks in advance for any advice given.
>
> Tim.
>
> -- 
> Timothy Stitt
> HPC Applications Analyst
>
> Swiss National Supercomputing Centre (CSCS)
> Galleria 2 - Via Cantonale
> CH-6928 Manno, Switzerland
>
> +41 (0) 91 610 8233
> stitt at cscs.ch
>


From recrusader at gmail.com  Wed Feb 27 11:05:47 2008
From: recrusader at gmail.com (Yujie)
Date: Wed, 27 Feb 2008 09:05:47 -0800
Subject: any examples to demonstrate how to Spooles package?
In-Reply-To: <47C50CE7.5060703@ce.berkeley.edu>
References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
	 <47C50CE7.5060703@ce.berkeley.edu>
Message-ID: <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com>

Dear Sanjay:

Thank you for your reply. I don't understand what you said. Now, I want to
use spooles package to inverse a sparse SPD matrix. I have further checked
the inferface about spooles in PETSc. I find although spooles can deal with
AX=B (B may be a dense matrix) with parallel LU factorization.
However, PETSc only provide the following:
51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
I don't set b to a matrix even if I use
178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
*info,Mat *F) for LU factorization.

Could you have any suggestions about this? thanks a lot.

Regards,
Yujie

On 2/26/08, Sanjay Govindjee <sanjay at ce.berkeley.edu> wrote:
>
> from my make file
>
> -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly
> -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary
> -on_error_attach_debugger -mat_spooles_symmetryflag 0  -options_left
>
>
> -sg
>
>
> Yujie wrote:
> > Hi, everyone
> >
> > I have compiled PETSc with spooles. However, I try to find how to use
> > this package in PETSc directory. I can't find any examples for it.
> > Could you give me some advice? I want to use spooles to inverse a
> > sparse matrix. thanks a lot.
> >
> > Regards,
> > Yujie
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080227/3b4a6a70/attachment.htm>

From knepley at gmail.com  Wed Feb 27 13:12:54 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 27 Feb 2008 13:12:54 -0600
Subject: any examples to demonstrate how to Spooles package?
In-Reply-To: <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com>
References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
	 <47C50CE7.5060703@ce.berkeley.edu>
	 <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com>
Message-ID: <a9f269830802271112s7f0317bga500479c381feae0@mail.gmail.com>

On Wed, Feb 27, 2008 at 11:05 AM, Yujie <recrusader at gmail.com> wrote:
> Dear Sanjay:
>
> Thank you for your reply. I don't understand what you said. Now, I want to
> use spooles package to inverse a sparse SPD matrix. I have further checked
> the inferface about spooles in PETSc. I find although spooles can deal with
> AX=B (B may be a dense matrix) with parallel LU factorization.
>  However, PETSc only provide the following:
> 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
>  I don't set b to a matrix even if I use
>  178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
> *info,Mat *F) for LU factorization.
>
> Could you have any suggestions about this? thanks a lot.

MatMatSolve()

http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html

   Matt

> Regards,
> Yujie
>
> On 2/26/08, Sanjay Govindjee <sanjay at ce.berkeley.edu> wrote:
> > from my make file
> >
> > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly
> > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary
> > -on_error_attach_debugger -mat_spooles_symmetryflag 0  -options_left
> >
> >
> > -sg
> >
> >
> > Yujie wrote:
> > > Hi, everyone
> > >
> > > I have compiled PETSc with spooles. However, I try to find how to use
> > > this package in PETSc directory. I can't find any examples for it.
> > > Could you give me some advice? I want to use spooles to inverse a
> > > sparse matrix. thanks a lot.
> > >
> > > Regards,
> > > Yujie
> >
> >
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From recrusader at gmail.com  Wed Feb 27 13:21:39 2008
From: recrusader at gmail.com (Yujie)
Date: Wed, 27 Feb 2008 11:21:39 -0800
Subject: any examples to demonstrate how to Spooles package?
In-Reply-To: <a9f269830802271112s7f0317bga500479c381feae0@mail.gmail.com>
References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
	 <47C50CE7.5060703@ce.berkeley.edu>
	 <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com>
	 <a9f269830802271112s7f0317bga500479c381feae0@mail.gmail.com>
Message-ID: <7ff0ee010802271121l4075e3cbl65bcd4e7a104e4f0@mail.gmail.com>

Dear Matt:

I checked the codes about MatMatSolve(). However, currently, PETSc didn't
realize its parallel version. Is it right? I want to inverse the matrix
parallelly. could you give me some examples about it? thanks a lot.

Regards,
Yujie

On 2/27/08, Matthew Knepley <knepley at gmail.com> wrote:
>
> On Wed, Feb 27, 2008 at 11:05 AM, Yujie <recrusader at gmail.com> wrote:
> > Dear Sanjay:
> >
> > Thank you for your reply. I don't understand what you said. Now, I want
> to
> > use spooles package to inverse a sparse SPD matrix. I have further
> checked
> > the inferface about spooles in PETSc. I find although spooles can deal
> with
> > AX=B (B may be a dense matrix) with parallel LU factorization.
> >  However, PETSc only provide the following:
> > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
> >  I don't set b to a matrix even if I use
> >  178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
> > *info,Mat *F) for LU factorization.
> >
> > Could you have any suggestions about this? thanks a lot.
>
>
> MatMatSolve()
>
>
> http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html
>
>    Matt
>
>
> > Regards,
> > Yujie
> >
> > On 2/26/08, Sanjay Govindjee <sanjay at ce.berkeley.edu> wrote:
> > > from my make file
> > >
> > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly
> > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary
> > > -on_error_attach_debugger -mat_spooles_symmetryflag 0  -options_left
> > >
> > >
> > > -sg
> > >
> > >
> > > Yujie wrote:
> > > > Hi, everyone
> > > >
> > > > I have compiled PETSc with spooles. However, I try to find how to
> use
> > > > this package in PETSc directory. I can't find any examples for it.
> > > > Could you give me some advice? I want to use spooles to inverse a
> > > > sparse matrix. thanks a lot.
> > > >
> > > > Regards,
> > > > Yujie
> > >
> > >
> >
> >
>
>
>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080227/4e26a12a/attachment.htm>

From aja2111 at columbia.edu  Wed Feb 27 13:26:08 2008
From: aja2111 at columbia.edu (Aron Ahmadia)
Date: Wed, 27 Feb 2008 14:26:08 -0500
Subject: any examples to demonstrate how to Spooles package?
In-Reply-To: <a9f269830802271112s7f0317bga500479c381feae0@mail.gmail.com>
References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
	 <47C50CE7.5060703@ce.berkeley.edu>
	 <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com>
	 <a9f269830802271112s7f0317bga500479c381feae0@mail.gmail.com>
Message-ID: <37604ab40802271126l7773640axba21eaf0e3d3e928@mail.gmail.com>

Hey Matt,

You should probably clean up the documentation for MatMatSolve while
you're at it, it's indicating that x and b are vectors...  Also,
should you reference the factor routine you need to use to get a
factored matrix?

~A

On Wed, Feb 27, 2008 at 2:12 PM, Matthew Knepley <knepley at gmail.com> wrote:
> On Wed, Feb 27, 2008 at 11:05 AM, Yujie <recrusader at gmail.com> wrote:
>  > Dear Sanjay:
>  >
>  > Thank you for your reply. I don't understand what you said. Now, I want to
>  > use spooles package to inverse a sparse SPD matrix. I have further checked
>  > the inferface about spooles in PETSc. I find although spooles can deal with
>  > AX=B (B may be a dense matrix) with parallel LU factorization.
>  >  However, PETSc only provide the following:
>  > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
>  >  I don't set b to a matrix even if I use
>  >  178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
>  > *info,Mat *F) for LU factorization.
>  >
>  > Could you have any suggestions about this? thanks a lot.
>
>  MatMatSolve()
>
>  http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html
>
>    Matt
>
>
>
>  > Regards,
>  > Yujie
>  >
>  > On 2/26/08, Sanjay Govindjee <sanjay at ce.berkeley.edu> wrote:
>  > > from my make file
>  > >
>  > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly
>  > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary
>  > > -on_error_attach_debugger -mat_spooles_symmetryflag 0  -options_left
>  > >
>  > >
>  > > -sg
>  > >
>  > >
>  > > Yujie wrote:
>  > > > Hi, everyone
>  > > >
>  > > > I have compiled PETSc with spooles. However, I try to find how to use
>  > > > this package in PETSc directory. I can't find any examples for it.
>  > > > Could you give me some advice? I want to use spooles to inverse a
>  > > > sparse matrix. thanks a lot.
>  > > >
>  > > > Regards,
>  > > > Yujie
>  > >
>  > >
>  >
>  >
>
>
>
>  --
>  What most experimenters take for granted before they begin their
>  experiments is infinitely more interesting than any results to which
>  their experiments lead.
>  -- Norbert Wiener
>
>


From jens.madsen at risoe.dk  Wed Feb 27 13:31:43 2008
From: jens.madsen at risoe.dk (jens.madsen at risoe.dk)
Date: Wed, 27 Feb 2008 20:31:43 +0100
Subject: MG question
Message-ID: <CA703655D571CF49A2E6DAED3FF5A4930105FD0C@EXCHG-VS1.risoe.dk>

Hi 

 
I hope that this question is not outside the scope of this mailinglist.

 
As far as I understand PETSc uses preconditioned GMRES(or another KSP
method) as pre- and postsmoother on all multigrid levels? I was just
wondering why and where in the literature I can read about that method?
I thought that a fast method would be to use MG (with Gauss-Seidel
RB/zebra smothers) as a preconditioner for GMRES? I have looked at
papers written by Oosterlee etc.

 
Kind Regards 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080227/466a53bb/attachment.htm>

From knepley at gmail.com  Wed Feb 27 13:32:11 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 27 Feb 2008 13:32:11 -0600
Subject: any examples to demonstrate how to Spooles package?
In-Reply-To: <7ff0ee010802271121l4075e3cbl65bcd4e7a104e4f0@mail.gmail.com>
References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
	 <47C50CE7.5060703@ce.berkeley.edu>
	 <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com>
	 <a9f269830802271112s7f0317bga500479c381feae0@mail.gmail.com>
	 <7ff0ee010802271121l4075e3cbl65bcd4e7a104e4f0@mail.gmail.com>
Message-ID: <a9f269830802271132s53a9c773v105c9f90a7abb207@mail.gmail.com>

On Wed, Feb 27, 2008 at 1:21 PM, Yujie <recrusader at gmail.com> wrote:
> Dear Matt:
>
> I checked the codes about MatMatSolve(). However, currently, PETSc didn't
> realize its parallel version. Is it right? I want to inverse the matrix
> parallelly. could you give me some examples about it? thanks a lot.

Thats right. The parallel version is not implemented. It looks like this would
take significant work.

  Matt

> Regards,
> Yujie
>
>
>
> On 2/27/08, Matthew Knepley <knepley at gmail.com> wrote:
> > On Wed, Feb 27, 2008 at 11:05 AM, Yujie <recrusader at gmail.com> wrote:
> > > Dear Sanjay:
> > >
> > > Thank you for your reply. I don't understand what you said. Now, I want
> to
> > > use spooles package to inverse a sparse SPD matrix. I have further
> checked
> > > the inferface about spooles in PETSc. I find although spooles can deal
> with
> > > AX=B (B may be a dense matrix) with parallel LU factorization.
> > >  However, PETSc only provide the following:
> > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
> > >  I don't set b to a matrix even if I use
> > >  178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
> > > *info,Mat *F) for LU factorization.
> > >
> > > Could you have any suggestions about this? thanks a lot.
> >
> >
> > MatMatSolve()
> >
> >
> http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html
> >
> >    Matt
> >
> >
> > > Regards,
> > > Yujie
> > >
> > > On 2/26/08, Sanjay Govindjee <sanjay at ce.berkeley.edu> wrote:
> > > > from my make file
> > > >
> > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly
> > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary
> > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0  -options_left
> > > >
> > > >
> > > > -sg
> > > >
> > > >
> > > > Yujie wrote:
> > > > > Hi, everyone
> > > > >
> > > > > I have compiled PETSc with spooles. However, I try to find how to
> use
> > > > > this package in PETSc directory. I can't find any examples for it.
> > > > > Could you give me some advice? I want to use spooles to inverse a
> > > > > sparse matrix. thanks a lot.
> > > > >
> > > > > Regards,
> > > > > Yujie
> > > >
> > > >
> > >
> > >
> >
> >
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> > experiments is infinitely more interesting than any results to which
> > their experiments lead.
> > -- Norbert Wiener
> >
> >
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From knepley at gmail.com  Wed Feb 27 13:33:57 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 27 Feb 2008 13:33:57 -0600
Subject: any examples to demonstrate how to Spooles package?
In-Reply-To: <37604ab40802271126l7773640axba21eaf0e3d3e928@mail.gmail.com>
References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
	 <47C50CE7.5060703@ce.berkeley.edu>
	 <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com>
	 <a9f269830802271112s7f0317bga500479c381feae0@mail.gmail.com>
	 <37604ab40802271126l7773640axba21eaf0e3d3e928@mail.gmail.com>
Message-ID: <a9f269830802271133p518c47dexa156b8c3bb5c8f6e@mail.gmail.com>

On Wed, Feb 27, 2008 at 1:26 PM, Aron Ahmadia <aja2111 at columbia.edu> wrote:
> Hey Matt,
>
>  You should probably clean up the documentation for MatMatSolve while
>  you're at it, it's indicating that x and b are vectors...  Also,
>  should you reference the factor routine you need to use to get a
>  factored matrix?

The dev has the correct args. I added links to LU and Cholesky.

  Matt

>  ~A
>
>
>
>  On Wed, Feb 27, 2008 at 2:12 PM, Matthew Knepley <knepley at gmail.com> wrote:
>  > On Wed, Feb 27, 2008 at 11:05 AM, Yujie <recrusader at gmail.com> wrote:
>  >  > Dear Sanjay:
>  >  >
>  >  > Thank you for your reply. I don't understand what you said. Now, I want to
>  >  > use spooles package to inverse a sparse SPD matrix. I have further checked
>  >  > the inferface about spooles in PETSc. I find although spooles can deal with
>  >  > AX=B (B may be a dense matrix) with parallel LU factorization.
>  >  >  However, PETSc only provide the following:
>  >  > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
>  >  >  I don't set b to a matrix even if I use
>  >  >  178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
>  >  > *info,Mat *F) for LU factorization.
>  >  >
>  >  > Could you have any suggestions about this? thanks a lot.
>  >
>  >  MatMatSolve()
>  >
>  >  http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html
>  >
>  >    Matt
>  >
>  >
>  >
>  >  > Regards,
>  >  > Yujie
>  >  >
>  >  > On 2/26/08, Sanjay Govindjee <sanjay at ce.berkeley.edu> wrote:
>  >  > > from my make file
>  >  > >
>  >  > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly
>  >  > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles -log_summary
>  >  > > -on_error_attach_debugger -mat_spooles_symmetryflag 0  -options_left
>  >  > >
>  >  > >
>  >  > > -sg
>  >  > >
>  >  > >
>  >  > > Yujie wrote:
>  >  > > > Hi, everyone
>  >  > > >
>  >  > > > I have compiled PETSc with spooles. However, I try to find how to use
>  >  > > > this package in PETSc directory. I can't find any examples for it.
>  >  > > > Could you give me some advice? I want to use spooles to inverse a
>  >  > > > sparse matrix. thanks a lot.
>  >  > > >
>  >  > > > Regards,
>  >  > > > Yujie
>  >  > >
>  >  > >
>  >  >
>  >  >
>  >
>  >
>  >
>  >  --
>  >  What most experimenters take for granted before they begin their
>  >  experiments is infinitely more interesting than any results to which
>  >  their experiments lead.
>  >  -- Norbert Wiener
>  >
>  >
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From recrusader at gmail.com  Wed Feb 27 13:40:52 2008
From: recrusader at gmail.com (Yujie)
Date: Wed, 27 Feb 2008 11:40:52 -0800
Subject: any examples to demonstrate how to Spooles package?
In-Reply-To: <a9f269830802271132s53a9c773v105c9f90a7abb207@mail.gmail.com>
References: <7ff0ee010802261746g243bb9cobffbfee858229969@mail.gmail.com>
	 <47C50CE7.5060703@ce.berkeley.edu>
	 <7ff0ee010802270905nf416452u7f85e136504ffe4f@mail.gmail.com>
	 <a9f269830802271112s7f0317bga500479c381feae0@mail.gmail.com>
	 <7ff0ee010802271121l4075e3cbl65bcd4e7a104e4f0@mail.gmail.com>
	 <a9f269830802271132s53a9c773v105c9f90a7abb207@mail.gmail.com>
Message-ID: <7ff0ee010802271140l3852e586sbc45180cf2ea23ba@mail.gmail.com>

This is why I have recompiled PETSc with spooles. spooles can deal with
AX=Y(Y is a matrix). However, PETSc only provide the following:
51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
I don't set b to a matrix even if I use
178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
*info,Mat *F) for LU factorization

Could you give me some advice or examples? thanks a lot.

Regards,
Yujie

On Wed, Feb 27, 2008 at 11:32 AM, Matthew Knepley <knepley at gmail.com> wrote:

> On Wed, Feb 27, 2008 at 1:21 PM, Yujie <recrusader at gmail.com> wrote:
> > Dear Matt:
> >
> > I checked the codes about MatMatSolve(). However, currently, PETSc
> didn't
> > realize its parallel version. Is it right? I want to inverse the matrix
> > parallelly. could you give me some examples about it? thanks a lot.
>
> Thats right. The parallel version is not implemented. It looks like this
> would
> take significant work.
>
>  Matt
>
> > Regards,
> > Yujie
> >
> >
> >
> > On 2/27/08, Matthew Knepley <knepley at gmail.com> wrote:
> > > On Wed, Feb 27, 2008 at 11:05 AM, Yujie <recrusader at gmail.com> wrote:
> > > > Dear Sanjay:
> > > >
> > > > Thank you for your reply. I don't understand what you said. Now, I
> want
> > to
> > > > use spooles package to inverse a sparse SPD matrix. I have further
> > checked
> > > > the inferface about spooles in PETSc. I find although spooles can
> deal
> > with
> > > > AX=B (B may be a dense matrix) with parallel LU factorization.
> > > >  However, PETSc only provide the following:
> > > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
> > > >  I don't set b to a matrix even if I use
> > > >  178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
> > > > *info,Mat *F) for LU factorization.
> > > >
> > > > Could you have any suggestions about this? thanks a lot.
> > >
> > >
> > > MatMatSolve()
> > >
> > >
> >
> http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html
> > >
> > >    Matt
> > >
> > >
> > > > Regards,
> > > > Yujie
> > > >
> > > > On 2/26/08, Sanjay Govindjee <sanjay at ce.berkeley.edu> wrote:
> > > > > from my make file
> > > > >
> > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly
> > > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles
> -log_summary
> > > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0
>  -options_left
> > > > >
> > > > >
> > > > > -sg
> > > > >
> > > > >
> > > > > Yujie wrote:
> > > > > > Hi, everyone
> > > > > >
> > > > > > I have compiled PETSc with spooles. However, I try to find how
> to
> > use
> > > > > > this package in PETSc directory. I can't find any examples for
> it.
> > > > > > Could you give me some advice? I want to use spooles to inverse
> a
> > > > > > sparse matrix. thanks a lot.
> > > > > >
> > > > > > Regards,
> > > > > > Yujie
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > >
> > > --
> > > What most experimenters take for granted before they begin their
> > > experiments is infinitely more interesting than any results to which
> > > their experiments lead.
> > > -- Norbert Wiener
> > >
> > >
> >
> >
>
>
>
> --
>  What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080227/0c8d87bf/attachment.htm>

From knepley at gmail.com  Wed Feb 27 13:40:59 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 27 Feb 2008 13:40:59 -0600
Subject: MG question
In-Reply-To: <CA703655D571CF49A2E6DAED3FF5A4930105FD0C@EXCHG-VS1.risoe.dk>
References: <CA703655D571CF49A2E6DAED3FF5A4930105FD0C@EXCHG-VS1.risoe.dk>
Message-ID: <a9f269830802271140v2f7c1f09pfe741d00b40f796a@mail.gmail.com>

On Wed, Feb 27, 2008 at 1:31 PM,  <jens.madsen at risoe.dk> wrote:
> Hi
>
> I hope that this question is not outside the scope of this mailinglist.
>
> As far as I understand PETSc uses preconditioned GMRES(or another KSP
> method) as pre- and postsmoother on all multigrid levels? I was just

This is the default. However, you can use any combination of KSP/PC on any
given level with options. For instance,

  -mg_level_ksp_type richardson -mg_level_pc_type sor

gives "regulation" MG. We default to GMRES because it is more robust.

> wondering why and where in the literature I can read about that method? I
> thought that a fast method would be to use MG (with Gauss-Seidel RB/zebra
> smothers) as a preconditioner for GMRES? I have looked at papers written by
> Oosterlee etc.

In order to prove something about GMRES/MG, you would need to prove something
about the convergence of GMRES on the operators at each level. Good luck. GMRES
is the enemy of all convergence proofs. See paper by Greenbaum, Strakos, & Ptak.
If SOR works, great and it is much faster. However, GMRES/ILU(0) tends
to be more
robust.

   Matt

> Kind Regards
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From bsmith at mcs.anl.gov  Wed Feb 27 13:48:33 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 27 Feb 2008 13:48:33 -0600
Subject: MG question
In-Reply-To: <a9f269830802271140v2f7c1f09pfe741d00b40f796a@mail.gmail.com>
References: <CA703655D571CF49A2E6DAED3FF5A4930105FD0C@EXCHG-VS1.risoe.dk> <a9f269830802271140v2f7c1f09pfe741d00b40f796a@mail.gmail.com>
Message-ID: <32E0E8CE-D7AF-4B90-89A9-D43EFD17556B@mcs.anl.gov>


   The reason we default to these "very strong" (gmres + ILU(0))  
smoothers is robustness, we'd rather have
the solver "just work" for our users and be a little bit slower than  
have it often fail but be optimal
for special cases.

    Most of the MG community has a mental block about using Krylov  
methods, this is
why you find few papers that discuss their use with multigrid. Note  
also that using several iterations
of GMRES (with or without ILU(0)) is still order n work so you still  
get the optimal convergence of
mutligrid methods (when they work, of course).

    Barry


On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote:

> On Wed, Feb 27, 2008 at 1:31 PM,  <jens.madsen at risoe.dk> wrote:
>> Hi
>>
>> I hope that this question is not outside the scope of this  
>> mailinglist.
>>
>> As far as I understand PETSc uses preconditioned GMRES(or another KSP
>> method) as pre- and postsmoother on all multigrid levels? I was just
>
> This is the default. However, you can use any combination of KSP/PC  
> on any
> given level with options. For instance,
>
>  -mg_level_ksp_type richardson -mg_level_pc_type sor
>
> gives "regulation" MG. We default to GMRES because it is more robust.
>
>> wondering why and where in the literature I can read about that  
>> method? I
>> thought that a fast method would be to use MG (with Gauss-Seidel RB/ 
>> zebra
>> smothers) as a preconditioner for GMRES? I have looked at papers  
>> written by
>> Oosterlee etc.
>
> In order to prove something about GMRES/MG, you would need to prove  
> something
> about the convergence of GMRES on the operators at each level. Good  
> luck. GMRES
> is the enemy of all convergence proofs. See paper by Greenbaum,  
> Strakos, & Ptak.
> If SOR works, great and it is much faster. However, GMRES/ILU(0) tends
> to be more
> robust.
>
>   Matt
>
>> Kind Regards
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>


From jens.madsen at risoe.dk  Wed Feb 27 14:22:02 2008
From: jens.madsen at risoe.dk (jens.madsen at risoe.dk)
Date: Wed, 27 Feb 2008 21:22:02 +0100
Subject: MG question
In-Reply-To: <32E0E8CE-D7AF-4B90-89A9-D43EFD17556B@mcs.anl.gov>
Message-ID: <CA703655D571CF49A2E6DAED3FF5A4930105FD0D@EXCHG-VS1.risoe.dk>

Ok 

Thanks Matthew and Barry

First I solve 2d boundary value problems of size 512^2 - 2048^2. 

Typically either kind of problem(solve for phi)

I) poisson type equation:

\nabla^2 \phi(x,y) = f(x,y)

II)

\nabla \cdot (g(x,y) \nabla\phi(x,y))  = f(x,y)

Successively with new f and g functions 


Do you know where to read about the smoothing properties of GMRES and
CG? All refs that I find are only describing smoothing with GS-RB etc.

My vague idea on how a fast solver is to use a (preconditioned ILU?)
krylov (CG for spd ie. problem I, GMRES for II)) method with additional
MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)?

As my problems are not that big I fear that I will get no MG speedup if
I use krylov methods as smoothers?

Kind Regards Jens 


-----Original Message-----
From: owner-petsc-users at mcs.anl.gov
[mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Wednesday, February 27, 2008 8:49 PM
To: petsc-users at mcs.anl.gov
Subject: Re: MG question


   The reason we default to these "very strong" (gmres + ILU(0))  
smoothers is robustness, we'd rather have
the solver "just work" for our users and be a little bit slower than  
have it often fail but be optimal
for special cases.

    Most of the MG community has a mental block about using Krylov  
methods, this is
why you find few papers that discuss their use with multigrid. Note  
also that using several iterations
of GMRES (with or without ILU(0)) is still order n work so you still  
get the optimal convergence of
mutligrid methods (when they work, of course).

    Barry


On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote:

> On Wed, Feb 27, 2008 at 1:31 PM,  <jens.madsen at risoe.dk> wrote:
>> Hi
>>
>> I hope that this question is not outside the scope of this  
>> mailinglist.
>>
>> As far as I understand PETSc uses preconditioned GMRES(or another KSP
>> method) as pre- and postsmoother on all multigrid levels? I was just
>
> This is the default. However, you can use any combination of KSP/PC  
> on any
> given level with options. For instance,
>
>  -mg_level_ksp_type richardson -mg_level_pc_type sor
>
> gives "regulation" MG. We default to GMRES because it is more robust.
>
>> wondering why and where in the literature I can read about that  
>> method? I
>> thought that a fast method would be to use MG (with Gauss-Seidel RB/ 
>> zebra
>> smothers) as a preconditioner for GMRES? I have looked at papers  
>> written by
>> Oosterlee etc.
>
> In order to prove something about GMRES/MG, you would need to prove  
> something
> about the convergence of GMRES on the operators at each level. Good  
> luck. GMRES
> is the enemy of all convergence proofs. See paper by Greenbaum,  
> Strakos, & Ptak.
> If SOR works, great and it is much faster. However, GMRES/ILU(0) tends
> to be more
> robust.
>
>   Matt
>
>> Kind Regards
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>


From knepley at gmail.com  Wed Feb 27 14:29:44 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 27 Feb 2008 14:29:44 -0600
Subject: MG question
In-Reply-To: <CA703655D571CF49A2E6DAED3FF5A4930105FD0D@EXCHG-VS1.risoe.dk>
References: <32E0E8CE-D7AF-4B90-89A9-D43EFD17556B@mcs.anl.gov>
	 <CA703655D571CF49A2E6DAED3FF5A4930105FD0D@EXCHG-VS1.risoe.dk>
Message-ID: <a9f269830802271229t484cabccu7e73d3688d3c7872@mail.gmail.com>

On Wed, Feb 27, 2008 at 2:22 PM,  <jens.madsen at risoe.dk> wrote:
> Ok
>
>  Thanks Matthew and Barry
>
>  First I solve 2d boundary value problems of size 512^2 - 2048^2.
>
>  Typically either kind of problem(solve for phi)
>
>  I) poisson type equation:
>
>  \nabla^2 \phi(x,y) = f(x,y)
>
>  II)
>
>  \nabla \cdot (g(x,y) \nabla\phi(x,y))  = f(x,y)
>
>  Successively with new f and g functions
>
>
>  Do you know where to read about the smoothing properties of GMRES and
>  CG? All refs that I find are only describing smoothing with GS-RB etc.
>
>  My vague idea on how a fast solver is to use a (preconditioned ILU?)
>  krylov (CG for spd ie. problem I, GMRES for II)) method with additional
>  MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)?
>
>  As my problems are not that big I fear that I will get no MG speedup if
>  I use krylov methods as smoothers?

Well, you might need to prove things, but I would not worry about that first.
It is so easy to code up, just run everything and see what actually works.
Then sit down and try to show it.

  Matt

>  Kind Regards Jens
>
>
>  -----Original Message-----
>  From: owner-petsc-users at mcs.anl.gov
>  [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
>  Sent: Wednesday, February 27, 2008 8:49 PM
>  To: petsc-users at mcs.anl.gov
>  Subject: Re: MG question
>
>
>    The reason we default to these "very strong" (gmres + ILU(0))
>  smoothers is robustness, we'd rather have
>  the solver "just work" for our users and be a little bit slower than
>  have it often fail but be optimal
>  for special cases.
>
>     Most of the MG community has a mental block about using Krylov
>  methods, this is
>  why you find few papers that discuss their use with multigrid. Note
>  also that using several iterations
>  of GMRES (with or without ILU(0)) is still order n work so you still
>  get the optimal convergence of
>  mutligrid methods (when they work, of course).
>
>     Barry
>
>
>  On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote:
>
>  > On Wed, Feb 27, 2008 at 1:31 PM,  <jens.madsen at risoe.dk> wrote:
>  >> Hi
>  >>
>  >> I hope that this question is not outside the scope of this
>  >> mailinglist.
>  >>
>  >> As far as I understand PETSc uses preconditioned GMRES(or another KSP
>  >> method) as pre- and postsmoother on all multigrid levels? I was just
>  >
>  > This is the default. However, you can use any combination of KSP/PC
>  > on any
>  > given level with options. For instance,
>  >
>  >  -mg_level_ksp_type richardson -mg_level_pc_type sor
>  >
>  > gives "regulation" MG. We default to GMRES because it is more robust.
>  >
>  >> wondering why and where in the literature I can read about that
>  >> method? I
>  >> thought that a fast method would be to use MG (with Gauss-Seidel RB/
>  >> zebra
>  >> smothers) as a preconditioner for GMRES? I have looked at papers
>  >> written by
>  >> Oosterlee etc.
>  >
>  > In order to prove something about GMRES/MG, you would need to prove
>  > something
>  > about the convergence of GMRES on the operators at each level. Good
>  > luck. GMRES
>  > is the enemy of all convergence proofs. See paper by Greenbaum,
>  > Strakos, & Ptak.
>  > If SOR works, great and it is much faster. However, GMRES/ILU(0) tends
>  > to be more
>  > robust.
>  >
>  >   Matt
>  >
>  >> Kind Regards
>  > --
>  > What most experimenters take for granted before they begin their
>  > experiments is infinitely more interesting than any results to which
>  > their experiments lead.
>  > -- Norbert Wiener
>  >
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From balay at mcs.anl.gov  Wed Feb 27 14:40:41 2008
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 27 Feb 2008 14:40:41 -0600 (CST)
Subject: few questions
In-Reply-To: <37604ab40802270005y22b27007o247d0bae0c8817fd@mail.gmail.com>
References: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com> <37604ab40802270005y22b27007o247d0bae0c8817fd@mail.gmail.com>
Message-ID: <alpine.LFD.1.00.0802271332410.13304@asterix>

Just a note on terminology. The difference between shared & dynamic is
a bit confusing [esp across windows/linux/mac etc..]. I like to use
'shared-libraries' name instead of 'dynamic-libraries', as thats the
primary feature of .so/.dylib/.dll etc.

PETSc configure supports the following options

--with-shared=0/1 --with-dynamic=0/1

The dynamic option refers to the using dlopen() to look for function
in a sharedlibrary [instead of resolving these functions at link-time]

If petsc is built with dynamic usage- then PETSC_USE_DYNAMIC_LIBRARIES
flag is set in petscconf.h. Shared libs can be identified by looking
at the library names.

Satish

On Wed, 27 Feb 2008, Aron Ahmadia wrote:

> On Wed, Feb 27, 2008 at 2:11 AM, amjad ali <amjad11 at gmail.com> wrote:
> > Hello all,
> >
> > Please answer the following,
> >
> > 1) What is the difference between static and dynamic versions of petsc?
> >
> 
> Start here: http://en.wikipedia.org/wiki/Library_(computer_science)#Static_libraries
> 
> In PETSc the primary differences end up being the size and link-time.
> Statically-linked executables need all the possible code that they
> could contain in the actual file, so they can be up to several MB in
> size.  Dynamically-linked executables are much leaner for the small
> price of a little extra load time.
> 
> Now if you're talking about Dynamically-Loaded code, that's a bit hairier...
> 
> > 2) How to check that which version (static or dynamic) is installed on a
> > system?
> >
> 
> The fastest way is probably to look in $PETSC_DIR/$PETSC_ARCH/
> 
> If you see .a files, you've got static libraries, if you see .so or
> .dylib files you've got dynamic libraries.
> 
> > 3) Plz comment on if there is any effect of static/dynamic version while
> > using/calling petsc from some external package?
> >
> 
> I'm not sure what you're asking here.  If you mean "Is there a
> difference between calling dynamically compiled PETSc from statically
> compiled PETSc" the answer is no.  There are differences in how you
> compile and link the two version but your actual code would look the
> same.
> 
> Again, if we're talking about dynamically loaded code (using something
> like dl_open), then your code will look different.
> 
> > 4) how to update an already installed petsc version with newerer/latest
> > version of petsc?
> >
> 
> Doing this in place is more trouble than it's worth if you're not
> using a development copy .  I just grab the latest copy of PETSc from
> their webpage, then re-build and re-install.
> 
> ~A
> 
> 


From bsmith at mcs.anl.gov  Wed Feb 27 14:45:07 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 27 Feb 2008 14:45:07 -0600
Subject: MG question
In-Reply-To: <CA703655D571CF49A2E6DAED3FF5A4930105FD0D@EXCHG-VS1.risoe.dk>
References: <CA703655D571CF49A2E6DAED3FF5A4930105FD0D@EXCHG-VS1.risoe.dk>
Message-ID: <2EF58CAE-5270-45A8-8EB9-1FC9D519BCC2@mcs.anl.gov>


On Feb 27, 2008, at 2:22 PM, jens.madsen at risoe.dk wrote:

> Ok
>
> Thanks Matthew and Barry
>
> First I solve 2d boundary value problems of size 512^2 - 2048^2.
>
> Typically either kind of problem(solve for phi)
>
> I) poisson type equation:
>
> \nabla^2 \phi(x,y) = f(x,y)

There is no reason to use GMRES here, use

-ksp_type richardson -mg_levels_pc_type sor -mg_levels_ksp_type  
richardson
should require about 5-10 outter iterations to get reasonable  
convergence
on the norm of the residual.

>
>
> II)
>
> \nabla \cdot (g(x,y) \nabla\phi(x,y))  = f(x,y)
>
     If g(x,y) is smooth and not highly varying again you should not  
need GMRES.
If it is a crazy function than the whole kitchen sink will likely give  
better convergence.

    I do not understand your questions. If you don't need GMRES/CG  
then don't use
it and if you think you might need it just try it and see if it helps.

    Barry

> Successively with new f and g functions
>
>
> Do you know where to read about the smoothing properties of GMRES and
> CG? All refs that I find are only describing smoothing with GS-RB etc.
>
> My vague idea on how a fast solver is to use a (preconditioned ILU?)
> krylov (CG for spd ie. problem I, GMRES for II)) method with  
> additional
> MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)?
>
> As my problems are not that big I fear that I will get no MG speedup  
> if
> I use krylov methods as smoothers?
>
> Kind Regards Jens
>
>
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov
> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
> Sent: Wednesday, February 27, 2008 8:49 PM
> To: petsc-users at mcs.anl.gov
> Subject: Re: MG question
>
>
>   The reason we default to these "very strong" (gmres + ILU(0))
> smoothers is robustness, we'd rather have
> the solver "just work" for our users and be a little bit slower than
> have it often fail but be optimal
> for special cases.
>
>    Most of the MG community has a mental block about using Krylov
> methods, this is
> why you find few papers that discuss their use with multigrid. Note
> also that using several iterations
> of GMRES (with or without ILU(0)) is still order n work so you still
> get the optimal convergence of
> mutligrid methods (when they work, of course).
>
>    Barry
>
>
> On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote:
>
>> On Wed, Feb 27, 2008 at 1:31 PM,  <jens.madsen at risoe.dk> wrote:
>>> Hi
>>>
>>> I hope that this question is not outside the scope of this
>>> mailinglist.
>>>
>>> As far as I understand PETSc uses preconditioned GMRES(or another  
>>> KSP
>>> method) as pre- and postsmoother on all multigrid levels? I was just
>>
>> This is the default. However, you can use any combination of KSP/PC
>> on any
>> given level with options. For instance,
>>
>> -mg_level_ksp_type richardson -mg_level_pc_type sor
>>
>> gives "regulation" MG. We default to GMRES because it is more robust.
>>
>>> wondering why and where in the literature I can read about that
>>> method? I
>>> thought that a fast method would be to use MG (with Gauss-Seidel RB/
>>> zebra
>>> smothers) as a preconditioner for GMRES? I have looked at papers
>>> written by
>>> Oosterlee etc.
>>
>> In order to prove something about GMRES/MG, you would need to prove
>> something
>> about the convergence of GMRES on the operators at each level. Good
>> luck. GMRES
>> is the enemy of all convergence proofs. See paper by Greenbaum,
>> Strakos, & Ptak.
>> If SOR works, great and it is much faster. However, GMRES/ILU(0)  
>> tends
>> to be more
>> robust.
>>
>>  Matt
>>
>>> Kind Regards
>> -- 
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
>>
>
>


From aja2111 at columbia.edu  Wed Feb 27 14:48:34 2008
From: aja2111 at columbia.edu (Aron Ahmadia)
Date: Wed, 27 Feb 2008 15:48:34 -0500
Subject: few questions
In-Reply-To: <alpine.LFD.1.00.0802271332410.13304@asterix>
References: <428810f20802262311t5f502823k74615457eeff8f53@mail.gmail.com>
	 <37604ab40802270005y22b27007o247d0bae0c8817fd@mail.gmail.com>
	 <alpine.LFD.1.00.0802271332410.13304@asterix>
Message-ID: <37604ab40802271248s4bfe85f9p71f1a9959e8da18@mail.gmail.com>

Thanks Satish,

That's definitely an easier way to think about it...

~A

On Wed, Feb 27, 2008 at 3:40 PM, Satish Balay <balay at mcs.anl.gov> wrote:
> Just a note on terminology. The difference between shared & dynamic is
>  a bit confusing [esp across windows/linux/mac etc..]. I like to use
>  'shared-libraries' name instead of 'dynamic-libraries', as thats the
>  primary feature of .so/.dylib/.dll etc.
>
>  PETSc configure supports the following options
>
>  --with-shared=0/1 --with-dynamic=0/1
>
>  The dynamic option refers to the using dlopen() to look for function
>  in a sharedlibrary [instead of resolving these functions at link-time]
>
>  If petsc is built with dynamic usage- then PETSC_USE_DYNAMIC_LIBRARIES
>  flag is set in petscconf.h. Shared libs can be identified by looking
>  at the library names.
>
>  Satish
>
>
>
>  On Wed, 27 Feb 2008, Aron Ahmadia wrote:
>
>  > On Wed, Feb 27, 2008 at 2:11 AM, amjad ali <amjad11 at gmail.com> wrote:
>  > > Hello all,
>  > >
>  > > Please answer the following,
>  > >
>  > > 1) What is the difference between static and dynamic versions of petsc?
>  > >
>  >
>  > Start here: http://en.wikipedia.org/wiki/Library_(computer_science)#Static_libraries
>  >
>  > In PETSc the primary differences end up being the size and link-time.
>  > Statically-linked executables need all the possible code that they
>  > could contain in the actual file, so they can be up to several MB in
>  > size.  Dynamically-linked executables are much leaner for the small
>  > price of a little extra load time.
>  >
>  > Now if you're talking about Dynamically-Loaded code, that's a bit hairier...
>  >
>  > > 2) How to check that which version (static or dynamic) is installed on a
>  > > system?
>  > >
>  >
>  > The fastest way is probably to look in $PETSC_DIR/$PETSC_ARCH/
>  >
>  > If you see .a files, you've got static libraries, if you see .so or
>  > .dylib files you've got dynamic libraries.
>  >
>  > > 3) Plz comment on if there is any effect of static/dynamic version while
>  > > using/calling petsc from some external package?
>  > >
>  >
>  > I'm not sure what you're asking here.  If you mean "Is there a
>  > difference between calling dynamically compiled PETSc from statically
>  > compiled PETSc" the answer is no.  There are differences in how you
>  > compile and link the two version but your actual code would look the
>  > same.
>  >
>  > Again, if we're talking about dynamically loaded code (using something
>  > like dl_open), then your code will look different.
>  >
>  > > 4) how to update an already installed petsc version with newerer/latest
>  > > version of petsc?
>  > >
>  >
>  > Doing this in place is more trouble than it's worth if you're not
>  > using a development copy .  I just grab the latest copy of PETSc from
>  > their webpage, then re-build and re-install.
>  >
>  > ~A
>  >
>  >
>
>


From jens.madsen at risoe.dk  Wed Feb 27 15:21:49 2008
From: jens.madsen at risoe.dk (jens.madsen at risoe.dk)
Date: Wed, 27 Feb 2008 22:21:49 +0100
Subject: MG question
In-Reply-To: <2EF58CAE-5270-45A8-8EB9-1FC9D519BCC2@mcs.anl.gov>
Message-ID: <CA703655D571CF49A2E6DAED3FF5A4930105FD0E@EXCHG-VS1.risoe.dk>

Thanks again :-) 

The reason why I ask is that my code is actually much faster without
GMRES.. I thought that MG accelerated Krylov methods were always the
fastest methods.... I am no expert, so I was just wondering why the
default in DMMG is GMRES/ILU. 

In the articles I have been able to find, PCG/MG(GS-RB/zebra)(SPD) and
GMRES/ MG(GS-RB/zebra) on the problems I) and II) respectively, seems to
be faster than (one level) preconditioned Krylov methods and MG. 

I am new in this field and find it very difficult even to choose which
methods to test and compare(there are so many possibilities). :-D

I will keep on testing :-) 

Thanks you very much for your answers. 

Jens


-----Original Message-----
From: owner-petsc-users at mcs.anl.gov
[mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
Sent: Wednesday, February 27, 2008 9:45 PM
To: petsc-users at mcs.anl.gov
Subject: Re: MG question


On Feb 27, 2008, at 2:22 PM, jens.madsen at risoe.dk wrote:

> Ok
>
> Thanks Matthew and Barry
>
> First I solve 2d boundary value problems of size 512^2 - 2048^2.
>
> Typically either kind of problem(solve for phi)
>
> I) poisson type equation:
>
> \nabla^2 \phi(x,y) = f(x,y)

There is no reason to use GMRES here, use

-ksp_type richardson -mg_levels_pc_type sor -mg_levels_ksp_type  
richardson
should require about 5-10 outter iterations to get reasonable  
convergence
on the norm of the residual.

>
>
> II)
>
> \nabla \cdot (g(x,y) \nabla\phi(x,y))  = f(x,y)
>
     If g(x,y) is smooth and not highly varying again you should not  
need GMRES.
If it is a crazy function than the whole kitchen sink will likely give  
better convergence.

    I do not understand your questions. If you don't need GMRES/CG  
then don't use
it and if you think you might need it just try it and see if it helps.

    Barry

> Successively with new f and g functions
>
>
> Do you know where to read about the smoothing properties of GMRES and
> CG? All refs that I find are only describing smoothing with GS-RB etc.
>
> My vague idea on how a fast solver is to use a (preconditioned ILU?)
> krylov (CG for spd ie. problem I, GMRES for II)) method with  
> additional
> MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)?
>
> As my problems are not that big I fear that I will get no MG speedup  
> if
> I use krylov methods as smoothers?
>
> Kind Regards Jens
>
>
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov
> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
> Sent: Wednesday, February 27, 2008 8:49 PM
> To: petsc-users at mcs.anl.gov
> Subject: Re: MG question
>
>
>   The reason we default to these "very strong" (gmres + ILU(0))
> smoothers is robustness, we'd rather have
> the solver "just work" for our users and be a little bit slower than
> have it often fail but be optimal
> for special cases.
>
>    Most of the MG community has a mental block about using Krylov
> methods, this is
> why you find few papers that discuss their use with multigrid. Note
> also that using several iterations
> of GMRES (with or without ILU(0)) is still order n work so you still
> get the optimal convergence of
> mutligrid methods (when they work, of course).
>
>    Barry
>
>
> On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote:
>
>> On Wed, Feb 27, 2008 at 1:31 PM,  <jens.madsen at risoe.dk> wrote:
>>> Hi
>>>
>>> I hope that this question is not outside the scope of this
>>> mailinglist.
>>>
>>> As far as I understand PETSc uses preconditioned GMRES(or another  
>>> KSP
>>> method) as pre- and postsmoother on all multigrid levels? I was just
>>
>> This is the default. However, you can use any combination of KSP/PC
>> on any
>> given level with options. For instance,
>>
>> -mg_level_ksp_type richardson -mg_level_pc_type sor
>>
>> gives "regulation" MG. We default to GMRES because it is more robust.
>>
>>> wondering why and where in the literature I can read about that
>>> method? I
>>> thought that a fast method would be to use MG (with Gauss-Seidel RB/
>>> zebra
>>> smothers) as a preconditioner for GMRES? I have looked at papers
>>> written by
>>> Oosterlee etc.
>>
>> In order to prove something about GMRES/MG, you would need to prove
>> something
>> about the convergence of GMRES on the operators at each level. Good
>> luck. GMRES
>> is the enemy of all convergence proofs. See paper by Greenbaum,
>> Strakos, & Ptak.
>> If SOR works, great and it is much faster. However, GMRES/ILU(0)  
>> tends
>> to be more
>> robust.
>>
>>  Matt
>>
>>> Kind Regards
>> -- 
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
>>
>
>


From bsmith at mcs.anl.gov  Wed Feb 27 15:39:36 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 27 Feb 2008 15:39:36 -0600
Subject: MG question
In-Reply-To: <CA703655D571CF49A2E6DAED3FF5A4930105FD0E@EXCHG-VS1.risoe.dk>
References: <CA703655D571CF49A2E6DAED3FF5A4930105FD0E@EXCHG-VS1.risoe.dk>
Message-ID: <B4DE40F0-E210-4BB0-8C7F-263C0A901725@mcs.anl.gov>


On Feb 27, 2008, at 3:21 PM, jens.madsen at risoe.dk wrote:

> Thanks again :-)
>
> The reason why I ask is that my code is actually much faster without
> GMRES.. I thought that MG accelerated Krylov methods were always the
> fastest methods.... I am no expert, so I was just wondering why the
> default in DMMG is GMRES/ILU.

   It is just for robustness, not for speed.

>
>
> In the articles I have been able to find, PCG/MG(GS-RB/zebra)(SPD) and
> GMRES/ MG(GS-RB/zebra) on the problems I) and II) respectively,  
> seems to
> be faster than (one level) preconditioned Krylov methods and MG.
>
> I am new in this field and find it very difficult even to choose which
> methods to test and compare(there are so many possibilities). :-D
>
> I will keep on testing :-)
>
> Thanks you very much for your answers.
>
> Jens
>
>
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov
> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
> Sent: Wednesday, February 27, 2008 9:45 PM
> To: petsc-users at mcs.anl.gov
> Subject: Re: MG question
>
>
> On Feb 27, 2008, at 2:22 PM, jens.madsen at risoe.dk wrote:
>
>> Ok
>>
>> Thanks Matthew and Barry
>>
>> First I solve 2d boundary value problems of size 512^2 - 2048^2.
>>
>> Typically either kind of problem(solve for phi)
>>
>> I) poisson type equation:
>>
>> \nabla^2 \phi(x,y) = f(x,y)
>
> There is no reason to use GMRES here, use
>
> -ksp_type richardson -mg_levels_pc_type sor -mg_levels_ksp_type
> richardson
> should require about 5-10 outter iterations to get reasonable
> convergence
> on the norm of the residual.
>
>>
>>
>> II)
>>
>> \nabla \cdot (g(x,y) \nabla\phi(x,y))  = f(x,y)
>>
>     If g(x,y) is smooth and not highly varying again you should not
> need GMRES.
> If it is a crazy function than the whole kitchen sink will likely give
> better convergence.
>
>    I do not understand your questions. If you don't need GMRES/CG
> then don't use
> it and if you think you might need it just try it and see if it helps.
>
>    Barry
>
>> Successively with new f and g functions
>>
>>
>> Do you know where to read about the smoothing properties of GMRES and
>> CG? All refs that I find are only describing smoothing with GS-RB  
>> etc.
>>
>> My vague idea on how a fast solver is to use a (preconditioned ILU?)
>> krylov (CG for spd ie. problem I, GMRES for II)) method with
>> additional
>> MG preconditioning(GS-RB smoother, Krylov solver on coarsest level)?
>>
>> As my problems are not that big I fear that I will get no MG speedup
>> if
>> I use krylov methods as smoothers?
>>
>> Kind Regards Jens
>>
>>
>> -----Original Message-----
>> From: owner-petsc-users at mcs.anl.gov
>> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith
>> Sent: Wednesday, February 27, 2008 8:49 PM
>> To: petsc-users at mcs.anl.gov
>> Subject: Re: MG question
>>
>>
>>  The reason we default to these "very strong" (gmres + ILU(0))
>> smoothers is robustness, we'd rather have
>> the solver "just work" for our users and be a little bit slower than
>> have it often fail but be optimal
>> for special cases.
>>
>>   Most of the MG community has a mental block about using Krylov
>> methods, this is
>> why you find few papers that discuss their use with multigrid. Note
>> also that using several iterations
>> of GMRES (with or without ILU(0)) is still order n work so you still
>> get the optimal convergence of
>> mutligrid methods (when they work, of course).
>>
>>   Barry
>>
>>
>> On Feb 27, 2008, at 1:40 PM, Matthew Knepley wrote:
>>
>>> On Wed, Feb 27, 2008 at 1:31 PM,  <jens.madsen at risoe.dk> wrote:
>>>> Hi
>>>>
>>>> I hope that this question is not outside the scope of this
>>>> mailinglist.
>>>>
>>>> As far as I understand PETSc uses preconditioned GMRES(or another
>>>> KSP
>>>> method) as pre- and postsmoother on all multigrid levels? I was  
>>>> just
>>>
>>> This is the default. However, you can use any combination of KSP/PC
>>> on any
>>> given level with options. For instance,
>>>
>>> -mg_level_ksp_type richardson -mg_level_pc_type sor
>>>
>>> gives "regulation" MG. We default to GMRES because it is more  
>>> robust.
>>>
>>>> wondering why and where in the literature I can read about that
>>>> method? I
>>>> thought that a fast method would be to use MG (with Gauss-Seidel  
>>>> RB/
>>>> zebra
>>>> smothers) as a preconditioner for GMRES? I have looked at papers
>>>> written by
>>>> Oosterlee etc.
>>>
>>> In order to prove something about GMRES/MG, you would need to prove
>>> something
>>> about the convergence of GMRES on the operators at each level. Good
>>> luck. GMRES
>>> is the enemy of all convergence proofs. See paper by Greenbaum,
>>> Strakos, & Ptak.
>>> If SOR works, great and it is much faster. However, GMRES/ILU(0)
>>> tends
>>> to be more
>>> robust.
>>>
>>> Matt
>>>
>>>> Kind Regards
>>> -- 
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which
>>> their experiments lead.
>>> -- Norbert Wiener
>>>
>>
>>
>
>


From recrusader at gmail.com  Wed Feb 27 16:47:46 2008
From: recrusader at gmail.com (Yujie)
Date: Wed, 27 Feb 2008 14:47:46 -0800
Subject: any good ideas for me from you? thanks a lot.Fwd: any examples to demonstrate how to Spooles package?
In-Reply-To: <a9f269830802271433v6fd8e5f1q39cbd97a5ecc9cfa@mail.gmail.com>
References: <7ff0ee010802271344s497ce960g89e99d4ff00ad3dd@mail.gmail.com>
	 <a9f269830802271433v6fd8e5f1q39cbd97a5ecc9cfa@mail.gmail.com>
Message-ID: <7ff0ee010802271447y6d755a4dn7dee083042b5de65@mail.gmail.com>

What do you mean about 3)? I am considering to use MatSolve_MPIAIJSpooles
with setting b
to
1  0 0 . . .0
0  1 0      0
0  0 1      0
.   .  .
.   .  .
.   .  .       1
for solving Ax=b. After finishing all, I will rearrange X=[x1,x2,x3,x4],
which is the inversion of A. whether is it similar with 1) you mentioned?
Practically, If I use such method, I may use some iterative methods to solve
it, not direct inversion method. What is the time difference or time
complexity regarding using spooles (direct inversion method) or other
iterative methods?
thanks a lot.

Regards,
Yujie
On Wed, Feb 27, 2008 at 2:33 PM, Matthew Knepley <knepley at gmail.com> wrote:

> This seems like it would involve significant programming time. Therefore,
> I suggest
>
>  1) Solving each vector in a loop
>
>  2) Taking a look at MatMatSolve_SeqAIJ() and
> MatSolve_MPIAIJSpooles() and trying to implement it yourself for
> Spooles
>
>  3) Reformulating your problem so as not use an inverse, but rather just
> solves
>
>  Thanks,
>
>     Matt
>
> On Wed, Feb 27, 2008 at 3:44 PM, Yujie <recrusader at gmail.com> wrote:
> >
> >
> >
> > ---------- Forwarded message ----------
> > From: Yujie <recrusader at gmail.com>
> > Date: Wed, Feb 27, 2008 at 11:40 AM
> >  Subject: Re: any examples to demonstrate how to Spooles package?
> > To: petsc-users at mcs.anl.gov
> >
> >
> >
> > This is why I have recompiled PETSc with spooles. spooles can deal with
> > AX=Y(Y is a matrix). However, PETSc only provide the following:
> >
> > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
> > I don't set b to a matrix even if I use
> > 178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat A,MatFactorInfo
> > *info,Mat *F) for LU factorization
> >
> > Could you give me some advice or examples? thanks a lot.
> >
> > Regards,
> > Yujie
> >
> >
> >
> >
> >
> > On Wed, Feb 27, 2008 at 11:32 AM, Matthew Knepley <knepley at gmail.com>
> wrote:
> >
> > >
> > > On Wed, Feb 27, 2008 at 1:21 PM, Yujie <recrusader at gmail.com> wrote:
> > > > Dear Matt:
> > > >
> > > > I checked the codes about MatMatSolve(). However, currently, PETSc
> > didn't
> > > > realize its parallel version. Is it right? I want to inverse the
> matrix
> > > > parallelly. could you give me some examples about it? thanks a lot.
> > >
> > > Thats right. The parallel version is not implemented. It looks like
> this
> > would
> > > take significant work.
> > >
> > >
> > >
> > >
> > >  Matt
> > >
> > > > Regards,
> > > > Yujie
> > > >
> > > >
> > > >
> > > > On 2/27/08, Matthew Knepley <knepley at gmail.com> wrote:
> > > > > On Wed, Feb 27, 2008 at 11:05 AM, Yujie <recrusader at gmail.com>
> wrote:
> > > > > > Dear Sanjay:
> > > > > >
> > > > > > Thank you for your reply. I don't understand what you said. Now,
> I
> > want
> > > > to
> > > > > > use spooles package to inverse a sparse SPD matrix. I have
> further
> > > > checked
> > > > > > the inferface about spooles in PETSc. I find although spooles
> can
> > deal
> > > > with
> > > > > > AX=B (B may be a dense matrix) with parallel LU factorization.
> > > > > >  However, PETSc only provide the following:
> > > > > > 51: PetscErrorCode MatSolve_MPISpooles(Mat A,Vec b,Vec x)
> > > > > >  I don't set b to a matrix even if I use
> > > > > >  178: PetscErrorCode MatFactorNumeric_MPISpooles(Mat
> A,MatFactorInfo
> > > > > > *info,Mat *F) for LU factorization.
> > > > > >
> > > > > > Could you have any suggestions about this? thanks a lot.
> > > > >
> > > > >
> > > > > MatMatSolve()
> > > > >
> > > > >
> > > >
> >
> http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatMatSolve.html
> > > > >
> > > > >    Matt
> > > > >
> > > > >
> > > > > > Regards,
> > > > > > Yujie
> > > > > >
> > > > > > On 2/26/08, Sanjay Govindjee <sanjay at ce.berkeley.edu> wrote:
> > > > > > > from my make file
> > > > > > >
> > > > > > > -@${MPIRUN} -s all -np $(NPROC) $(PROGNAME) -ksp_type preonly
> > > > > > > -ksp_monitor -pc_type cholesky -mat_type mpisbaijspooles
> > -log_summary
> > > > > > > -on_error_attach_debugger -mat_spooles_symmetryflag 0
> > -options_left
> > > > > > >
> > > > > > >
> > > > > > > -sg
> > > > > > >
> > > > > > >
> > > > > > > Yujie wrote:
> > > > > > > > Hi, everyone
> > > > > > > >
> > > > > > > > I have compiled PETSc with spooles. However, I try to find
> how
> > to
> > > > use
> > > > > > > > this package in PETSc directory. I can't find any examples
> for
> > it.
> > > > > > > > Could you give me some advice? I want to use spooles to
> inverse
> > a
> > > > > > > > sparse matrix. thanks a lot.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Yujie
> > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > What most experimenters take for granted before they begin their
> > > > > experiments is infinitely more interesting than any results to
> which
> > > > > their experiments lead.
> > > > > -- Norbert Wiener
> > > > >
> > > > >
> > > >
> > > >
> > >
> > >
> > >
> > > --
> > >
> > >
> > >
> > > What most experimenters take for granted before they begin their
> > > experiments is infinitely more interesting than any results to which
> > > their experiments lead.
> > > -- Norbert Wiener
> > >
> > >
> >
> >
> >
>
>
>
> --
>  What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080227/6090d56d/attachment.htm>

From Amit.Itagi at seagate.com  Thu Feb 28 13:07:36 2008
From: Amit.Itagi at seagate.com (Amit.Itagi at seagate.com)
Date: Thu, 28 Feb 2008 14:07:36 -0500
Subject: Direct LU solver 
In-Reply-To: <200802281846.m1SIkPA31406@mcs.anl.gov>
Message-ID: <OFC5D62013.CED52B17-ON852573FD.00673DB3-852573FD.0069C055@seagate.com>

Hi,

I need to do direct LU solves (repeatedly, with the same matrix) in one of
my MPI applications. I am having trouble implementing the solver. To
identify the problem, I wrote a short toy  code to run with 2 processes. I
can run it with either the spooles parallel matrix or the superlu_dist
matrix. I am using C++ and complex matrices.

Here is the code listing:

#include <iostream>
#include <complex>
#include <stdlib.h>
#include "petsc.h"
#include "petscmat.h"
#include "petscvec.h"
#include "petscksp.h"

using namespace std;

int main( int argc, char *argv[] ) {

  int rank, size;
  Mat A;
  PetscErrorCode ierr;
  PetscInt loc;
  PetscScalar val;
  Vec x, y;
  KSP solver;
  PC prec;
  MPI_Comm comm;

  // Number of non-zeros in each row
  int d_nnz=1, o_nnz=1;

  ierr=PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);
  // Initialization (including MPI)

  comm=PETSC_COMM_WORLD;

  ierr=MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr);
  ierr=MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr);

  // Assemble matrix A

  if(rank==0) {
    ierr=MatCreateMPIAIJ(comm,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); CHKERRQ(ierr);
    val=complex<double>(1.0,0.0);
    ierr=MatSetValue(A,0,0,val,INSERT_VALUES);CHKERRQ(ierr);
    val=complex<double>(0.0,1.0);
    ierr=MatSetValue(A,0,1,val,INSERT_VALUES);CHKERRQ(ierr);
  }
  else {
    ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A);
CHKERRQ(ierr);
    val=complex<double>(1.0,1.0);
    ierr=MatSetValue(A,1,1,val,INSERT_VALUES);CHKERRQ(ierr);
    val=complex<double>(0.0,-1.0);
    ierr=MatSetValue(A,1,0,val,INSERT_VALUES);CHKERRQ(ierr);
  }

  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

  cout << "============  Mat A  ==================" << endl;
  ierr=MatView(A,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
  cout << "======================================" << endl;

  // For spooles
  //ierr=MatConvert(A,MATMPIAIJSPOOLES,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr);

  // For superlu_dist
  ierr=MatConvert(A,MATSUPERLU_DIST,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr);

  // Direct LU solver
  ierr=KSPCreate(comm,&solver); CHKERRQ(ierr);
  ierr=KSPSetType(solver,KSPPREONLY); CHKERRQ(ierr);
  ierr=KSPSetOperators(solver,A,A,SAME_NONZERO_PATTERN); CHKERRQ(ierr);
  ierr=KSPGetPC(solver,&prec); CHKERRQ(ierr);
  ierr=PCSetType(prec,PCLU); CHKERRQ(ierr);
  ierr=KSPSetFromOptions(solver); CHKERRQ(ierr);

  //============  Vector assembly ========================

  if(rank==0) {
    ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr);
    val=complex<double>(1.0,0.0);
    loc=0;
    ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr);
  }
  else {
    ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr);
    val=complex<double>(-1.0,0.0);
    loc=1;
    ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr);
  }

  ierr=VecAssemblyBegin(x); CHKERRQ(ierr);
  ierr=VecAssemblyEnd(x); CHKERRQ(ierr);

  cout << "============== Vec x ==================" << endl;
  ierr=VecView(x,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
  cout << "======================================" << endl;

  VecDuplicate(x,&y);  // Duplicate the matrix storage

  // Solve the matrix equation
  ierr=KSPSolve(solver,x,y); CHKERRQ(ierr);

  cout << "============== Vec y =================" << endl;
  ierr=VecView(y,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
  cout << "======================================" << endl;


  // Destructors
  ierr=KSPDestroy(solver); CHKERRQ(ierr);
  ierr=VecDestroy(x); CHKERRQ(ierr);
  ierr=VecDestroy(y); CHKERRQ(ierr);
  ierr=MatDestroy(A); CHKERRQ(ierr);

 // Finalize
  ierr=PetscFinalize(); CHKERRQ(ierr);


  return 0;

}


When I run the program with spooles, I get the following output.


============  Mat A  ==================
============  Mat A  ==================
======================================
row 0: (0, 1)  (1, 0 + 1 i)
row 1: (0, 0 - 1 i) (1, 1 + 1 i)
======================================
============== Vec x ==================
============== Vec x ==================
Process [0]
1
======================================
Process [1]
-1
======================================

 fatal error in InpMtx_MPI_split()
 firsttag = 0, tagbound = -1

 fatal error in InpMtx_MPI_split()
 firsttag = 0, tagbound = -1
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 22881 failed on node n0 (127.0.0.1) with exit status 255.
-----------------------------------------------------------------------------
mpirun failed with exit status 255


When run in the debugger, there is no stack trace. The error is

Program exited with code 0377


With superlu_dist, the output is

============  Mat A  ==================
============  Mat A  ==================
row 0: (0, 1)  (1, 0 + 1 i)
row 1: (0, 0 - 1 i) (1, 1 + 1 i)
======================================
======================================
============== Vec x ==================
============== Vec x ==================
Process [0]
1
Process [1]
-1
======================================
======================================
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
probably memory access out of range
[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
[0]PETSC ERROR: or see
http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC
 ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to
find memory corruption errors
[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
run
[0]PETSC ERROR: to get more information on the crash.
[0]PETSC ERROR: --------------------- Error Message
------------------------------------
[0]PETSC ERROR: Signal received!
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40
CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: lu on a linux-gnu named tabla by amit Thu Feb 28 13:48:54
2008
[0]PETSC ERROR: Libraries linked from
/home/amit/programs/ParEM/petsc-2.3.3-p8/lib/linux-gnu-c-debug
[0]PETSC ERROR: Configure run at Thu Feb 28 12:19:39 2008
[0]PETSC ERROR: Configure options --with-scalar-type=complex
--with-debugging=no --with-fortran-kernels=generic --with-clanguage=cxx
--with-metis=1 --download-metis=1 --with-parmetis=1 --download-parmetis=1
--with-superlu_dist=1 --download-superlu_dist=1 --with-spooles=1
--with-spooles-dir=/home/amit/programs/ParEM/spooles-2.2 COPTFLAGS="-O3
-march=p4 -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe
-fomit-frame-pointer -finline-functions -msse2" CXXOPTFLAGS="-O3 -march=p4
-mtune=p4 -ffast-math -malign-double -funroll-loops -pipe
-fomit-frame-pointer -finline-functions -msse2" FOPTS="-O3 -qarch=p4
-qtune=p4" --with-shared=0
[0]PETSC ERROR:
------------------------------------------------------------------------
[0]PETSC ERROR: User provided function() line 0 in unknown directory
unknown file
-----------------------------------------------------------------------------
One of the processes started by mpirun has exited with a nonzero exit
code.  This typically indicates that the process finished in error.
If your process did not finish in error, be sure to include a "return
0" or "exit(0)" in your C code before exiting the application.

PID 22998 failed on node n0 (127.0.0.1) with exit status 1.
-----------------------------------------------------------------------------
mpirun failed with exit status 1


The debugger tracks the segmentation violation to


main -> KSPSolve -> KSPSetUp -> PCSetUp -> PCSetUp_LU -> MatLUFactorNumeric
-> MatLUFactorNumeric_SUperLU_DIST -> pzgssvx


Could  someone kindly point out what I am missing ?


Thanks

Rgds,
Amit


From knepley at gmail.com  Thu Feb 28 14:45:48 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 28 Feb 2008 14:45:48 -0600
Subject: Direct LU solver
In-Reply-To: <OFC5D62013.CED52B17-ON852573FD.00673DB3-852573FD.0069C055@seagate.com>
References: <200802281846.m1SIkPA31406@mcs.anl.gov>
	 <OFC5D62013.CED52B17-ON852573FD.00673DB3-852573FD.0069C055@seagate.com>
Message-ID: <a9f269830802281245i779cd509j5d071582a3bfa3f2@mail.gmail.com>

I would recommend using KSP ex10 and then customizing with options.
That way we know it should work. For SuperLU_dist,

  -ksp_type preonly -pc_type lu -mat_type superlu_dist

   Matt

On Thu, Feb 28, 2008 at 1:07 PM,  <Amit.Itagi at seagate.com> wrote:
> Hi,
>
>  I need to do direct LU solves (repeatedly, with the same matrix) in one of
>  my MPI applications. I am having trouble implementing the solver. To
>  identify the problem, I wrote a short toy  code to run with 2 processes. I
>  can run it with either the spooles parallel matrix or the superlu_dist
>  matrix. I am using C++ and complex matrices.
>
>  Here is the code listing:
>
>  #include <iostream>
>  #include <complex>
>  #include <stdlib.h>
>  #include "petsc.h"
>  #include "petscmat.h"
>  #include "petscvec.h"
>  #include "petscksp.h"
>
>  using namespace std;
>
>  int main( int argc, char *argv[] ) {
>
>   int rank, size;
>   Mat A;
>   PetscErrorCode ierr;
>   PetscInt loc;
>   PetscScalar val;
>   Vec x, y;
>   KSP solver;
>   PC prec;
>   MPI_Comm comm;
>
>   // Number of non-zeros in each row
>   int d_nnz=1, o_nnz=1;
>
>   ierr=PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);
>   // Initialization (including MPI)
>
>   comm=PETSC_COMM_WORLD;
>
>   ierr=MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr);
>   ierr=MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr);
>
>   // Assemble matrix A
>
>   if(rank==0) {
>     ierr=MatCreateMPIAIJ(comm,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); CHKERRQ(ierr);
>     val=complex<double>(1.0,0.0);
>     ierr=MatSetValue(A,0,0,val,INSERT_VALUES);CHKERRQ(ierr);
>     val=complex<double>(0.0,1.0);
>     ierr=MatSetValue(A,0,1,val,INSERT_VALUES);CHKERRQ(ierr);
>   }
>   else {
>     ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A);
>  CHKERRQ(ierr);
>     val=complex<double>(1.0,1.0);
>     ierr=MatSetValue(A,1,1,val,INSERT_VALUES);CHKERRQ(ierr);
>     val=complex<double>(0.0,-1.0);
>     ierr=MatSetValue(A,1,0,val,INSERT_VALUES);CHKERRQ(ierr);
>   }
>
>   ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>   ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>
>   cout << "============  Mat A  ==================" << endl;
>   ierr=MatView(A,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
>   cout << "======================================" << endl;
>
>   // For spooles
>   //ierr=MatConvert(A,MATMPIAIJSPOOLES,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr);
>
>   // For superlu_dist
>   ierr=MatConvert(A,MATSUPERLU_DIST,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr);
>
>   // Direct LU solver
>   ierr=KSPCreate(comm,&solver); CHKERRQ(ierr);
>   ierr=KSPSetType(solver,KSPPREONLY); CHKERRQ(ierr);
>   ierr=KSPSetOperators(solver,A,A,SAME_NONZERO_PATTERN); CHKERRQ(ierr);
>   ierr=KSPGetPC(solver,&prec); CHKERRQ(ierr);
>   ierr=PCSetType(prec,PCLU); CHKERRQ(ierr);
>   ierr=KSPSetFromOptions(solver); CHKERRQ(ierr);
>
>   //============  Vector assembly ========================
>
>   if(rank==0) {
>     ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr);
>     val=complex<double>(1.0,0.0);
>     loc=0;
>     ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr);
>   }
>   else {
>     ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr);
>     val=complex<double>(-1.0,0.0);
>     loc=1;
>     ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr);
>   }
>
>   ierr=VecAssemblyBegin(x); CHKERRQ(ierr);
>   ierr=VecAssemblyEnd(x); CHKERRQ(ierr);
>
>   cout << "============== Vec x ==================" << endl;
>   ierr=VecView(x,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
>   cout << "======================================" << endl;
>
>   VecDuplicate(x,&y);  // Duplicate the matrix storage
>
>   // Solve the matrix equation
>   ierr=KSPSolve(solver,x,y); CHKERRQ(ierr);
>
>   cout << "============== Vec y =================" << endl;
>   ierr=VecView(y,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
>   cout << "======================================" << endl;
>
>
>   // Destructors
>   ierr=KSPDestroy(solver); CHKERRQ(ierr);
>   ierr=VecDestroy(x); CHKERRQ(ierr);
>   ierr=VecDestroy(y); CHKERRQ(ierr);
>   ierr=MatDestroy(A); CHKERRQ(ierr);
>
>   // Finalize
>   ierr=PetscFinalize(); CHKERRQ(ierr);
>
>
>   return 0;
>
>  }
>
>
>  When I run the program with spooles, I get the following output.
>
>
>  ============  Mat A  ==================
>  ============  Mat A  ==================
>  ======================================
>  row 0: (0, 1)  (1, 0 + 1 i)
>  row 1: (0, 0 - 1 i) (1, 1 + 1 i)
>  ======================================
>  ============== Vec x ==================
>  ============== Vec x ==================
>  Process [0]
>  1
>  ======================================
>  Process [1]
>  -1
>  ======================================
>
>   fatal error in InpMtx_MPI_split()
>   firsttag = 0, tagbound = -1
>
>   fatal error in InpMtx_MPI_split()
>   firsttag = 0, tagbound = -1
>  -----------------------------------------------------------------------------
>  One of the processes started by mpirun has exited with a nonzero exit
>  code.  This typically indicates that the process finished in error.
>  If your process did not finish in error, be sure to include a "return
>  0" or "exit(0)" in your C code before exiting the application.
>
>  PID 22881 failed on node n0 (127.0.0.1) with exit status 255.
>  -----------------------------------------------------------------------------
>  mpirun failed with exit status 255
>
>
>  When run in the debugger, there is no stack trace. The error is
>
>  Program exited with code 0377
>
>
>  With superlu_dist, the output is
>
>  ============  Mat A  ==================
>  ============  Mat A  ==================
>  row 0: (0, 1)  (1, 0 + 1 i)
>  row 1: (0, 0 - 1 i) (1, 1 + 1 i)
>  ======================================
>  ======================================
>  ============== Vec x ==================
>  ============== Vec x ==================
>  Process [0]
>  1
>  Process [1]
>  -1
>  ======================================
>  ======================================
>  [0]PETSC ERROR:
>  ------------------------------------------------------------------------
>  [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>  probably memory access out of range
>  [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>  [0]PETSC ERROR: or see
>  http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC
>   ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to
>  find memory corruption errors
>  [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
>  run
>  [0]PETSC ERROR: to get more information on the crash.
>  [0]PETSC ERROR: --------------------- Error Message
>  ------------------------------------
>  [0]PETSC ERROR: Signal received!
>  [0]PETSC ERROR:
>  ------------------------------------------------------------------------
>  [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40
>  CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
>  [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>  [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>  [0]PETSC ERROR: See docs/index.html for manual pages.
>  [0]PETSC ERROR:
>  ------------------------------------------------------------------------
>  [0]PETSC ERROR: lu on a linux-gnu named tabla by amit Thu Feb 28 13:48:54
>  2008
>  [0]PETSC ERROR: Libraries linked from
>  /home/amit/programs/ParEM/petsc-2.3.3-p8/lib/linux-gnu-c-debug
>  [0]PETSC ERROR: Configure run at Thu Feb 28 12:19:39 2008
>  [0]PETSC ERROR: Configure options --with-scalar-type=complex
>  --with-debugging=no --with-fortran-kernels=generic --with-clanguage=cxx
>  --with-metis=1 --download-metis=1 --with-parmetis=1 --download-parmetis=1
>  --with-superlu_dist=1 --download-superlu_dist=1 --with-spooles=1
>  --with-spooles-dir=/home/amit/programs/ParEM/spooles-2.2 COPTFLAGS="-O3
>  -march=p4 -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe
>  -fomit-frame-pointer -finline-functions -msse2" CXXOPTFLAGS="-O3 -march=p4
>  -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe
>  -fomit-frame-pointer -finline-functions -msse2" FOPTS="-O3 -qarch=p4
>  -qtune=p4" --with-shared=0
>  [0]PETSC ERROR:
>  ------------------------------------------------------------------------
>  [0]PETSC ERROR: User provided function() line 0 in unknown directory
>  unknown file
>  -----------------------------------------------------------------------------
>  One of the processes started by mpirun has exited with a nonzero exit
>  code.  This typically indicates that the process finished in error.
>  If your process did not finish in error, be sure to include a "return
>  0" or "exit(0)" in your C code before exiting the application.
>
>  PID 22998 failed on node n0 (127.0.0.1) with exit status 1.
>  -----------------------------------------------------------------------------
>  mpirun failed with exit status 1
>
>
>  The debugger tracks the segmentation violation to
>
>
>  main -> KSPSolve -> KSPSetUp -> PCSetUp -> PCSetUp_LU -> MatLUFactorNumeric
>  -> MatLUFactorNumeric_SUperLU_DIST -> pzgssvx
>
>
>  Could  someone kindly point out what I am missing ?
>
>
>  Thanks
>
>  Rgds,
>  Amit
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From hzhang at mcs.anl.gov  Thu Feb 28 14:57:24 2008
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Thu, 28 Feb 2008 14:57:24 -0600 (CST)
Subject: Direct LU solver
In-Reply-To: <a9f269830802281245i779cd509j5d071582a3bfa3f2@mail.gmail.com>
References: <200802281846.m1SIkPA31406@mcs.anl.gov> 
 <OFC5D62013.CED52B17-ON852573FD.00673DB3-852573FD.0069C055@seagate.com>
 <a9f269830802281245i779cd509j5d071582a3bfa3f2@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0802281455120.5104@terra.mcs.anl.gov>


or ~petsc/src/ksp/ksp/examples/tutorials/ex5.c to avoid using
matrix data file, e.g.
mpiexec -np 2 ./ex5 -ksp_type preonly -pc_type lu -mat_type superlu_dist

Hong

On Thu, 28 Feb 2008, Matthew Knepley wrote:

> I would recommend using KSP ex10 and then customizing with options.
> That way we know it should work. For SuperLU_dist,
>
>  -ksp_type preonly -pc_type lu -mat_type superlu_dist
>
>   Matt
>
> On Thu, Feb 28, 2008 at 1:07 PM,  <Amit.Itagi at seagate.com> wrote:
>> Hi,
>>
>>  I need to do direct LU solves (repeatedly, with the same matrix) in one of
>>  my MPI applications. I am having trouble implementing the solver. To
>>  identify the problem, I wrote a short toy  code to run with 2 processes. I
>>  can run it with either the spooles parallel matrix or the superlu_dist
>>  matrix. I am using C++ and complex matrices.
>>
>>  Here is the code listing:
>>
>>  #include <iostream>
>>  #include <complex>
>>  #include <stdlib.h>
>>  #include "petsc.h"
>>  #include "petscmat.h"
>>  #include "petscvec.h"
>>  #include "petscksp.h"
>>
>>  using namespace std;
>>
>>  int main( int argc, char *argv[] ) {
>>
>>   int rank, size;
>>   Mat A;
>>   PetscErrorCode ierr;
>>   PetscInt loc;
>>   PetscScalar val;
>>   Vec x, y;
>>   KSP solver;
>>   PC prec;
>>   MPI_Comm comm;
>>
>>   // Number of non-zeros in each row
>>   int d_nnz=1, o_nnz=1;
>>
>>   ierr=PetscInitialize(&argc,&argv,PETSC_NULL,PETSC_NULL); CHKERRQ(ierr);
>>   // Initialization (including MPI)
>>
>>   comm=PETSC_COMM_WORLD;
>>
>>   ierr=MPI_Comm_size(PETSC_COMM_WORLD,&size); CHKERRQ(ierr);
>>   ierr=MPI_Comm_rank(PETSC_COMM_WORLD,&rank); CHKERRQ(ierr);
>>
>>   // Assemble matrix A
>>
>>   if(rank==0) {
>>     ierr=MatCreateMPIAIJ(comm,1,1,2,2,0,&d_nnz,0,&o_nnz,&A); CHKERRQ(ierr);
>>     val=complex<double>(1.0,0.0);
>>     ierr=MatSetValue(A,0,0,val,INSERT_VALUES);CHKERRQ(ierr);
>>     val=complex<double>(0.0,1.0);
>>     ierr=MatSetValue(A,0,1,val,INSERT_VALUES);CHKERRQ(ierr);
>>   }
>>   else {
>>     ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A);
>>  CHKERRQ(ierr);
>>     val=complex<double>(1.0,1.0);
>>     ierr=MatSetValue(A,1,1,val,INSERT_VALUES);CHKERRQ(ierr);
>>     val=complex<double>(0.0,-1.0);
>>     ierr=MatSetValue(A,1,0,val,INSERT_VALUES);CHKERRQ(ierr);
>>   }
>>
>>   ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>>   ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
>>
>>   cout << "============  Mat A  ==================" << endl;
>>   ierr=MatView(A,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
>>   cout << "======================================" << endl;
>>
>>   // For spooles
>>   //ierr=MatConvert(A,MATMPIAIJSPOOLES,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr);
>>
>>   // For superlu_dist
>>   ierr=MatConvert(A,MATSUPERLU_DIST,MAT_REUSE_MATRIX,&A); CHKERRQ(ierr);
>>
>>   // Direct LU solver
>>   ierr=KSPCreate(comm,&solver); CHKERRQ(ierr);
>>   ierr=KSPSetType(solver,KSPPREONLY); CHKERRQ(ierr);
>>   ierr=KSPSetOperators(solver,A,A,SAME_NONZERO_PATTERN); CHKERRQ(ierr);
>>   ierr=KSPGetPC(solver,&prec); CHKERRQ(ierr);
>>   ierr=PCSetType(prec,PCLU); CHKERRQ(ierr);
>>   ierr=KSPSetFromOptions(solver); CHKERRQ(ierr);
>>
>>   //============  Vector assembly ========================
>>
>>   if(rank==0) {
>>     ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr);
>>     val=complex<double>(1.0,0.0);
>>     loc=0;
>>     ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr);
>>   }
>>   else {
>>     ierr=VecCreateMPI(PETSC_COMM_WORLD,1,2,&x); CHKERRQ(ierr);
>>     val=complex<double>(-1.0,0.0);
>>     loc=1;
>>     ierr=VecSetValues(x,1,&loc,&val,ADD_VALUES);CHKERRQ(ierr);
>>   }
>>
>>   ierr=VecAssemblyBegin(x); CHKERRQ(ierr);
>>   ierr=VecAssemblyEnd(x); CHKERRQ(ierr);
>>
>>   cout << "============== Vec x ==================" << endl;
>>   ierr=VecView(x,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
>>   cout << "======================================" << endl;
>>
>>   VecDuplicate(x,&y);  // Duplicate the matrix storage
>>
>>   // Solve the matrix equation
>>   ierr=KSPSolve(solver,x,y); CHKERRQ(ierr);
>>
>>   cout << "============== Vec y =================" << endl;
>>   ierr=VecView(y,PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr);
>>   cout << "======================================" << endl;
>>
>>
>>   // Destructors
>>   ierr=KSPDestroy(solver); CHKERRQ(ierr);
>>   ierr=VecDestroy(x); CHKERRQ(ierr);
>>   ierr=VecDestroy(y); CHKERRQ(ierr);
>>   ierr=MatDestroy(A); CHKERRQ(ierr);
>>
>>   // Finalize
>>   ierr=PetscFinalize(); CHKERRQ(ierr);
>>
>>
>>   return 0;
>>
>>  }
>>
>>
>>  When I run the program with spooles, I get the following output.
>>
>>
>>  ============  Mat A  ==================
>>  ============  Mat A  ==================
>>  ======================================
>>  row 0: (0, 1)  (1, 0 + 1 i)
>>  row 1: (0, 0 - 1 i) (1, 1 + 1 i)
>>  ======================================
>>  ============== Vec x ==================
>>  ============== Vec x ==================
>>  Process [0]
>>  1
>>  ======================================
>>  Process [1]
>>  -1
>>  ======================================
>>
>>   fatal error in InpMtx_MPI_split()
>>   firsttag = 0, tagbound = -1
>>
>>   fatal error in InpMtx_MPI_split()
>>   firsttag = 0, tagbound = -1
>>  -----------------------------------------------------------------------------
>>  One of the processes started by mpirun has exited with a nonzero exit
>>  code.  This typically indicates that the process finished in error.
>>  If your process did not finish in error, be sure to include a "return
>>  0" or "exit(0)" in your C code before exiting the application.
>>
>>  PID 22881 failed on node n0 (127.0.0.1) with exit status 255.
>>  -----------------------------------------------------------------------------
>>  mpirun failed with exit status 255
>>
>>
>>  When run in the debugger, there is no stack trace. The error is
>>
>>  Program exited with code 0377
>>
>>
>>  With superlu_dist, the output is
>>
>>  ============  Mat A  ==================
>>  ============  Mat A  ==================
>>  row 0: (0, 1)  (1, 0 + 1 i)
>>  row 1: (0, 0 - 1 i) (1, 1 + 1 i)
>>  ======================================
>>  ======================================
>>  ============== Vec x ==================
>>  ============== Vec x ==================
>>  Process [0]
>>  1
>>  Process [1]
>>  -1
>>  ======================================
>>  ======================================
>>  [0]PETSC ERROR:
>>  ------------------------------------------------------------------------
>>  [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>>  probably memory access out of range
>>  [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>  [0]PETSC ERROR: or see
>>  http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal[0]PETSC
>>   ERROR: or try http://valgrind.org on linux or man libgmalloc on Apple to
>>  find memory corruption errors
>>  [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and
>>  run
>>  [0]PETSC ERROR: to get more information on the crash.
>>  [0]PETSC ERROR: --------------------- Error Message
>>  ------------------------------------
>>  [0]PETSC ERROR: Signal received!
>>  [0]PETSC ERROR:
>>  ------------------------------------------------------------------------
>>  [0]PETSC ERROR: Petsc Release Version 2.3.3, Patch 8, Fri Nov 16 17:03:40
>>  CST 2007 HG revision: 414581156e67e55c761739b0deb119f7590d0f4b
>>  [0]PETSC ERROR: See docs/changes/index.html for recent updates.
>>  [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
>>  [0]PETSC ERROR: See docs/index.html for manual pages.
>>  [0]PETSC ERROR:
>>  ------------------------------------------------------------------------
>>  [0]PETSC ERROR: lu on a linux-gnu named tabla by amit Thu Feb 28 13:48:54
>>  2008
>>  [0]PETSC ERROR: Libraries linked from
>>  /home/amit/programs/ParEM/petsc-2.3.3-p8/lib/linux-gnu-c-debug
>>  [0]PETSC ERROR: Configure run at Thu Feb 28 12:19:39 2008
>>  [0]PETSC ERROR: Configure options --with-scalar-type=complex
>>  --with-debugging=no --with-fortran-kernels=generic --with-clanguage=cxx
>>  --with-metis=1 --download-metis=1 --with-parmetis=1 --download-parmetis=1
>>  --with-superlu_dist=1 --download-superlu_dist=1 --with-spooles=1
>>  --with-spooles-dir=/home/amit/programs/ParEM/spooles-2.2 COPTFLAGS="-O3
>>  -march=p4 -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe
>>  -fomit-frame-pointer -finline-functions -msse2" CXXOPTFLAGS="-O3 -march=p4
>>  -mtune=p4 -ffast-math -malign-double -funroll-loops -pipe
>>  -fomit-frame-pointer -finline-functions -msse2" FOPTS="-O3 -qarch=p4
>>  -qtune=p4" --with-shared=0
>>  [0]PETSC ERROR:
>>  ------------------------------------------------------------------------
>>  [0]PETSC ERROR: User provided function() line 0 in unknown directory
>>  unknown file
>>  -----------------------------------------------------------------------------
>>  One of the processes started by mpirun has exited with a nonzero exit
>>  code.  This typically indicates that the process finished in error.
>>  If your process did not finish in error, be sure to include a "return
>>  0" or "exit(0)" in your C code before exiting the application.
>>
>>  PID 22998 failed on node n0 (127.0.0.1) with exit status 1.
>>  -----------------------------------------------------------------------------
>>  mpirun failed with exit status 1
>>
>>
>>  The debugger tracks the segmentation violation to
>>
>>
>>  main -> KSPSolve -> KSPSetUp -> PCSetUp -> PCSetUp_LU -> MatLUFactorNumeric
>>  -> MatLUFactorNumeric_SUperLU_DIST -> pzgssvx
>>
>>
>>  Could  someone kindly point out what I am missing ?
>>
>>
>>  Thanks
>>
>>  Rgds,
>>  Amit
>>
>>
>
>
>
> -- 
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
> -- Norbert Wiener
>
>


From Amit.Itagi at seagate.com  Thu Feb 28 16:32:13 2008
From: Amit.Itagi at seagate.com (Amit.Itagi at seagate.com)
Date: Thu, 28 Feb 2008 17:32:13 -0500
Subject: Direct LU solver
In-Reply-To: <Pine.LNX.4.64.0802281455120.5104@terra.mcs.anl.gov>
Message-ID: <OF218FFB5D.280CF026-ON852573FD.007B9708-852573FD.007C7C58@seagate.com>

Matt and Hong,

I will try to customize the example. However, since my application involves
multiple ksp solvers (using different algorithms), I would really like to
set the options with in the code, instead of on the command line. Is there
a way of doing this ?

Thanks

Rgds,
Amit


From balay at mcs.anl.gov  Thu Feb 28 16:58:16 2008
From: balay at mcs.anl.gov (Satish Balay)
Date: Thu, 28 Feb 2008 16:58:16 -0600 (CST)
Subject: Direct LU solver
In-Reply-To: <OF218FFB5D.280CF026-ON852573FD.007B9708-852573FD.007C7C58@seagate.com>
References: <OF218FFB5D.280CF026-ON852573FD.007B9708-852573FD.007C7C58@seagate.com>
Message-ID: <alpine.LFD.1.00.0802281649290.15049@asterix>

On Thu, 28 Feb 2008, Amit.Itagi at seagate.com wrote:

> Matt and Hong,
> 
> I will try to customize the example. However, since my application involves
> multiple ksp solvers (using different algorithms), I would really like to
> set the options with in the code, instead of on the command line. Is there
> a way of doing this ?

-ksp_type preonly -pc_type lu -mat_type superlu_dist

There are 2 ways of doing this.

One is within the code:

    MatCreate(&mat);
    MatSetType(mat,MATSUPERLU_DIST);
    MatSetFromOptions(mat);

    KSPSetType(ksp,KSPPREONLY);
    KSPGetPC(ksp,&pc);
    PCSetType(pc,PCLU);
    KSPSetFromOptions(ksp);

etc..

Another way is to give each object a prefix [if you have multiple
objects of the same type. You can use this prefix with the command
line options.

For eg:

    KSPCreate(&ksp1)
    KSPCreate(&ksp2)

    KSPSetOptionsPrefix(ksp1,"a_")
    KSPSetOptionsPrefix(ksp2,"b_")

Now you can use -a_ksp_type gmres -b_ksp_type cg

etc..

Satish


From Amit.Itagi at seagate.com  Fri Feb 29 08:23:22 2008
From: Amit.Itagi at seagate.com (Amit.Itagi at seagate.com)
Date: Fri, 29 Feb 2008 09:23:22 -0500
Subject: Direct LU solver
In-Reply-To: <alpine.LFD.1.00.0802281649290.15049@asterix>
Message-ID: <OF5C0832E6.9B958E5B-ON852573FE.004EBF3E-852573FE.004FBABF@seagate.com>

Matt/Hong/Satish,

My toy-problem would run with the command line options. However, the
in-code options were still giving a problem. I also found that I had a
Petsc version compiled with the debugging flag off. On recompiling Petsc by
turning the debugging flag on, the in-code options worked. I am wondering
about the cause for this behavior.

Thanks for your help. I will now fiddle around with the actual application.

Rgds,
Amit


             Satish Balay                                                  
             <balay at mcs.anl.go                                             
             v>                                                         To 
             Sent by:                  petsc-users at mcs.anl.gov             
             owner-petsc-users                                          cc 
             @mcs.anl.gov                                                  
             No Phone Info                                         Subject 
             Available                 Re: Direct LU solver                
                                                                           
                                                                           
             02/28/2008 05:58                                              
             PM                                                            
                                                                           
                                                                           
             Please respond to                                             
             petsc-users at mcs.a                                             
                  nl.gov                                                   
                                                                           
                                                                           
On Thu, 28 Feb 2008, Amit.Itagi at seagate.com wrote:

> Matt and Hong,
>
> I will try to customize the example. However, since my application
involves
> multiple ksp solvers (using different algorithms), I would really like to
> set the options with in the code, instead of on the command line. Is
there
> a way of doing this ?

-ksp_type preonly -pc_type lu -mat_type superlu_dist

There are 2 ways of doing this.

One is within the code:

    MatCreate(&mat);
    MatSetType(mat,MATSUPERLU_DIST);
    MatSetFromOptions(mat);

    KSPSetType(ksp,KSPPREONLY);
    KSPGetPC(ksp,&pc);
    PCSetType(pc,PCLU);
    KSPSetFromOptions(ksp);

etc..

Another way is to give each object a prefix [if you have multiple
objects of the same type. You can use this prefix with the command
line options.

For eg:

    KSPCreate(&ksp1)
    KSPCreate(&ksp2)

    KSPSetOptionsPrefix(ksp1,"a_")
    KSPSetOptionsPrefix(ksp2,"b_")

Now you can use -a_ksp_type gmres -b_ksp_type cg

etc..

Satish


From balay at mcs.anl.gov  Fri Feb 29 09:09:17 2008
From: balay at mcs.anl.gov (Satish Balay)
Date: Fri, 29 Feb 2008 09:09:17 -0600 (CST)
Subject: Direct LU solver
In-Reply-To: <OF5C0832E6.9B958E5B-ON852573FE.004EBF3E-852573FE.004FBABF@seagate.com>
References: <OF5C0832E6.9B958E5B-ON852573FE.004EBF3E-852573FE.004FBABF@seagate.com>
Message-ID: <alpine.LFD.1.00.0802290906450.3753@asterix>

On Fri, 29 Feb 2008, Amit.Itagi at seagate.com wrote:

> Matt/Hong/Satish,
> 
> My toy-problem would run with the command line options. However, the
> in-code options were still giving a problem. I also found that I had a
> Petsc version compiled with the debugging flag off. On recompiling Petsc by
> turning the debugging flag on, the in-code options worked. I am wondering
> about the cause for this behavior.
> 
> Thanks for your help. I will now fiddle around with the actual application.

Hmm - there should be some example usages in
src/ksp/ksp/examples/tutorials [like ex2.c, ex30.c etc..]. You can
verify if these work fine for you without debuging, and then see if
your usage is same as these examples.

Satish


From knepley at gmail.com  Fri Feb 29 09:14:20 2008
From: knepley at gmail.com (Matthew Knepley)
Date: Fri, 29 Feb 2008 09:14:20 -0600
Subject: Direct LU solver
In-Reply-To: <OF5C0832E6.9B958E5B-ON852573FE.004EBF3E-852573FE.004FBABF@seagate.com>
References: <alpine.LFD.1.00.0802281649290.15049@asterix>
	 <OF5C0832E6.9B958E5B-ON852573FE.004EBF3E-852573FE.004FBABF@seagate.com>
Message-ID: <a9f269830802290714p210a9267pf40386a20f8eb5e0@mail.gmail.com>

On Fri, Feb 29, 2008 at 8:23 AM,  <Amit.Itagi at seagate.com> wrote:
> Matt/Hong/Satish,
>
>  My toy-problem would run with the command line options. However, the
>  in-code options were still giving a problem. I also found that I had a
>  Petsc version compiled with the debugging flag off. On recompiling Petsc by
>  turning the debugging flag on, the in-code options worked. I am wondering
>  about the cause for this behavior.

I am sure this is a misinterpretation. The code just does not work that way.
Something you have not notices changed between those versions of your code.
When you say "giving a problem", I assume you mean the option does not take
effect. The most common cause is a misunderstanding of the mechanism. If you
call a function to set something, but subsequently call
SetFromOptions(), it will
be overridden by command line arguments

   Matt

>  Thanks for your help. I will now fiddle around with the actual application.
>
>  Rgds,
>  Amit
-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener


From Amit.Itagi at seagate.com  Fri Feb 29 12:22:00 2008
From: Amit.Itagi at seagate.com (Amit.Itagi at seagate.com)
Date: Fri, 29 Feb 2008 13:22:00 -0500
Subject: Direct LU solver
In-Reply-To: <a9f269830802290714p210a9267pf40386a20f8eb5e0@mail.gmail.com>
Message-ID: <OF43DF573D.F82FB150-ON852573FE.00642BCB-852573FE.0065940D@seagate.com>


owner-petsc-users at mcs.anl.gov wrote on 02/29/2008 10:14:20 AM:

> On Fri, Feb 29, 2008 at 8:23 AM,  <Amit.Itagi at seagate.com> wrote:
> > Matt/Hong/Satish,
> >
> >  My toy-problem would run with the command line options. However, the
> >  in-code options were still giving a problem. I also found that I had a
> >  Petsc version compiled with the debugging flag off. On recompiling
Petsc by
> >  turning the debugging flag on, the in-code options worked. I am
wondering
> >  about the cause for this behavior.
>
> I am sure this is a misinterpretation. The code just does not work that
way.
> Something you have not notices changed between those versions of your
code.
> When you say "giving a problem", I assume you mean the option does not
take
> effect. The most common cause is a misunderstanding of the mechanism. If
you
> call a function to set something, but subsequently call
> SetFromOptions(), it will
> be overridden by command line arguments
>
>    Matt
>

Hi,

My woes continue.  Based on the earlier discussions, I implemented the
matrix as

//=========================================================================

 //   Option 1
  ierr=MatCreate(PETSC_COMM_WORLD,&A); CHKERRQ(ierr);
  ierr=MatSetSizes(A,1,1,2,2); CHKERRQ(ierr);


  /*    Option 2
  PetscInt d_nnz=1, o_nnz=1;
  ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A);
CHKERRQ(ierr);
  */

  /*   Option 3

ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,PETSC_NULL,0,PETSC_NULL,&A);
 CHKERRQ(ierr);
  */

  ierr=MatSetType(A,MATSUPERLU_DIST); CHKERRQ(ierr);
  ierr=MatSetFromOptions(A); CHKERRQ(ierr);

  // (After this, I set the values and do the assembly). I then use the
direct LU solver.

//============================================================================

Note: I have a simple 2 by 2 matrix (with non-zero values in all 4 places).
If I use "option 1" (based on Satish's email), the program executes
successfully. If instead of "option 1", I use "option 2" or "option 3", I
get a crash.
If I am not mistaken, options 1 and 3 are the same. Option 2, additionally,
does a pre-allocation. Am I correct ?


Thanks

Rgds,
Amit


From balay at mcs.anl.gov  Fri Feb 29 13:06:06 2008
From: balay at mcs.anl.gov (Satish Balay)
Date: Fri, 29 Feb 2008 13:06:06 -0600 (CST)
Subject: Direct LU solver
In-Reply-To: <OF43DF573D.F82FB150-ON852573FE.00642BCB-852573FE.0065940D@seagate.com>
References: <OF43DF573D.F82FB150-ON852573FE.00642BCB-852573FE.0065940D@seagate.com>
Message-ID: <alpine.LFD.1.00.0802291257550.2870@asterix>

On Fri, 29 Feb 2008, Amit.Itagi at seagate.com wrote:

> 
> My woes continue.  Based on the earlier discussions, I implemented the
> matrix as
> 
> //=========================================================================
> 
>  //   Option 1
>   ierr=MatCreate(PETSC_COMM_WORLD,&A); CHKERRQ(ierr);
>   ierr=MatSetSizes(A,1,1,2,2); CHKERRQ(ierr);
> 
> 
>   /*    Option 2
>   PetscInt d_nnz=1, o_nnz=1;
>   ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,&d_nnz,0,&o_nnz,&A);
> CHKERRQ(ierr);
>   */
> 
>   /*   Option 3
> 
> ierr=MatCreateMPIAIJ(PETSC_COMM_WORLD,1,1,2,2,0,PETSC_NULL,0,PETSC_NULL,&A);
>  CHKERRQ(ierr);
>   */
> 
>   ierr=MatSetType(A,MATSUPERLU_DIST); CHKERRQ(ierr);
>   ierr=MatSetFromOptions(A); CHKERRQ(ierr);
> 
>   // (After this, I set the values and do the assembly). I then use the
> direct LU solver.
> 
> //============================================================================
> 
> Note: I have a simple 2 by 2 matrix (with non-zero values in all 4 places).
> If I use "option 1" (based on Satish's email), the program executes
> successfully. If instead of "option 1", I use "option 2" or "option 3", I
> get a crash.
> If I am not mistaken, options 1 and 3 are the same. Option 2, additionally,
> does a pre-allocation. Am I correct ?

Nope - Option 3 is same as:

MatCreate()
MatSetType(MPIAIJ)
MatMPIAIJSetPreallocation()
MatSetType(MATSUPERLU_DIST)

[i.e first you are setting type as MPIAIJ, and then changing to
MATSUPERLU_DIST]

What you want is:

MatCreate()
MatSetType(MATSUPERLU_DIST)
MatMPIAIJSetPreallocation()

[Ideally you need MatSuerLU_DistSetPreallocation() - but that would be
same as MatMPIAIJSetPreallocation()]

Satish


From recrusader at gmail.com  Fri Feb 29 16:45:44 2008
From: recrusader at gmail.com (Yujie)
Date: Fri, 29 Feb 2008 14:45:44 -0800
Subject: --with-clanguage: c and c++
Message-ID: <7ff0ee010802291445g544b30f7xf89493105635eb84@mail.gmail.com>

Hi, everyone

When PETSc is compiled with "--with-clanguage=C" or "--with-clanguage=C++",
what is the difference between them, the parameters in the functions are
adjusted or else?

thanks,
Yujie
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080229/9b83d158/attachment.htm>

From balay at mcs.anl.gov  Fri Feb 29 16:51:11 2008
From: balay at mcs.anl.gov (Satish Balay)
Date: Fri, 29 Feb 2008 16:51:11 -0600 (CST)
Subject: --with-clanguage: c and c++
In-Reply-To: <7ff0ee010802291445g544b30f7xf89493105635eb84@mail.gmail.com>
References: <7ff0ee010802291445g544b30f7xf89493105635eb84@mail.gmail.com>
Message-ID: <alpine.LFD.1.00.0802291649390.2870@asterix>

On Fri, 29 Feb 2008, Yujie wrote:

> Hi, everyone
> 
> When PETSc is compiled with "--with-clanguage=C" or "--with-clanguage=C++",
> what is the difference between them, the parameters in the functions are
> adjusted or else?

Primary difference is that the default compiler used to compile the
sources is c vs c++.

[So, if the user is developing with c++, its easiest to build PETSc
with c++, and use the default makefiles to compile user code aswell]

There are some components of PETSc [sieve] that are coded in c++, and
can be built only in the c++ mode.

Satish


From recrusader at gmail.com  Fri Feb 29 18:46:27 2008
From: recrusader at gmail.com (Yujie)
Date: Fri, 29 Feb 2008 16:46:27 -0800
Subject: how to add a parallel MatMatSolve() function?
Message-ID: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com>

Hi, everyone

I am considering to add a parallel MatMatSolve() into PETSc based on
SuperLu_DIST or Spooles. If I want to use it like current sequential
MatMatSolve() in the application codes, how to do it? Could you give me some
examples about how to add a new function?
thanks a lot.

Regards,
Yujie
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080229/72adf92b/attachment.htm>

From bsmith at mcs.anl.gov  Fri Feb 29 19:46:19 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 29 Feb 2008 19:46:19 -0600
Subject: how to add a parallel MatMatSolve() function?
In-Reply-To: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com>
References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com>
Message-ID: <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>


   Please edit the file src/mat/interface/matrix.c and remove the  
function MatMatSolve(). Replace it with the following two functions,
then run "make lib shared" in that directory. Please let us know at petsc-maint at mcs.anl.gov 
  if it crashes or produces incorrect
results.

    Barry


#undef __FUNCT__
#define __FUNCT__ "MatMatSolve_Basic"
PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X)
{
   PetscErrorCode ierr;
   Vec            b,x;
   PetscInt       m,N,i;
   PetscScalar    *bb,*xx;

   PetscFunctionBegin;
   ierr = MatGetArray(B,&bb);CHKERRQ(ierr);
   ierr = MatGetArray(X,&xx);CHKERRQ(ierr);
   ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* number  
local rows */
   ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* total  
columns in dense matrix */
   ierr = VecCreateMPIWithArray(A- 
 >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
   ierr = VecCreateMPIWithArray(A- 
 >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
   for (i=0; i<N; i++) {
     ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);
     ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);
     ierr = MatSolve(A,b,x);CHKERRQ(ierr);
   }
   ierr = VecDestroy(b);CHKERRQ(ierr);
   ierr = VecDestroy(x);CHKERRQ(ierr);
   PetscFunctionReturn(0);
}

#undef __FUNCT__
#define __FUNCT__ "MatMatSolve"
/*@
    MatMatSolve - Solves A X = B, given a factored matrix.

    Collective on Mat

    Input Parameters:
+  mat - the factored matrix
-  B - the right-hand-side matrix  (dense matrix)

    Output Parameter:
.  B - the result matrix (dense matrix)

    Notes:
    The matrices b and x cannot be the same.  I.e., one cannot
    call MatMatSolve(A,x,x).

    Notes:
    Most users should usually employ the simplified KSP interface for  
linear solvers
    instead of working directly with matrix algebra routines such as  
this.
    See, e.g., KSPCreate(). However KSP can only solve for one vector  
(column of X)
    at a time.

    Level: developer

    Concepts: matrices^triangular solves

.seealso: MatMatSolveAdd(), MatMatSolveTranspose(),  
MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor()
@*/
PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X)
{
   PetscErrorCode ierr;

   PetscFunctionBegin;
   PetscValidHeaderSpecific(A,MAT_COOKIE,1);
   PetscValidType(A,1);
   PetscValidHeaderSpecific(B,MAT_COOKIE,2);
   PetscValidHeaderSpecific(X,MAT_COOKIE,3);
   PetscCheckSameComm(A,1,B,2);
   PetscCheckSameComm(A,1,X,3);
   if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different  
matrices");
   if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored  
matrix");
   if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat  
X: global dim %D %D",A->cmap.N,X->rmap.N);
   if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat  
B: global dim %D %D",A->rmap.N,B->rmap.N);
   if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat  
B: local dim %D %D",A->rmap.n,B->rmap.n);
   if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0);
   ierr = MatPreallocated(A);CHKERRQ(ierr);

   ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
   if (!A->ops->matsolve) {
     ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve", 
((PetscObject)A)->type_name);CHKERRQ(ierr);
     ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr);
   } else {
     ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr);
   }
   ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
   ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr);
   PetscFunctionReturn(0);
}

On Feb 29, 2008, at 6:46 PM, Yujie wrote:

> Hi, everyone
>
> I am considering to add a parallel MatMatSolve() into PETSc based on  
> SuperLu_DIST or Spooles. If I want to use it like current sequential  
> MatMatSolve() in the application codes, how to do it? Could you give  
> me some examples about how to add a new function?
> thanks a lot.
>
> Regards,
> Yujie


From recrusader at gmail.com  Fri Feb 29 20:01:59 2008
From: recrusader at gmail.com (Yujie)
Date: Fri, 29 Feb 2008 18:01:59 -0800
Subject: how to add a parallel MatMatSolve() function?
In-Reply-To: <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>
References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com>
	 <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>
Message-ID: <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com>

Dear Barry:

Thank you for your help. I check the codes
roughly, the method in the codes is to use MatSolve() to solve AX=B in
a loop. I also consider such a method.
However, I am wondering whether it is slower than the method that directly
solves AX=B? thanks again.

Regards,
Yujie

On 2/29/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
>    Please edit the file src/mat/interface/matrix.c and remove the
> function MatMatSolve(). Replace it with the following two functions,
> then run "make lib shared" in that directory. Please let us know at
> petsc-maint at mcs.anl.gov
>   if it crashes or produces incorrect
> results.
>
>     Barry
>
>
>
> #undef __FUNCT__
> #define __FUNCT__ "MatMatSolve_Basic"
> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X)
> {
>    PetscErrorCode ierr;
>    Vec            b,x;
>    PetscInt       m,N,i;
>    PetscScalar    *bb,*xx;
>
>    PetscFunctionBegin;
>    ierr = MatGetArray(B,&bb);CHKERRQ(ierr);
>    ierr = MatGetArray(X,&xx);CHKERRQ(ierr);
>    ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* number
> local rows */
>    ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* total
> columns in dense matrix */
>    ierr = VecCreateMPIWithArray(A-
>   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
>    ierr = VecCreateMPIWithArray(A-
>   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
>    for (i=0; i<N; i++) {
>      ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);
>      ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);
>      ierr = MatSolve(A,b,x);CHKERRQ(ierr);
>    }
>    ierr = VecDestroy(b);CHKERRQ(ierr);
>    ierr = VecDestroy(x);CHKERRQ(ierr);
>    PetscFunctionReturn(0);
> }
>
> #undef __FUNCT__
> #define __FUNCT__ "MatMatSolve"
> /*@
>     MatMatSolve - Solves A X = B, given a factored matrix.
>
>     Collective on Mat
>
>     Input Parameters:
> +  mat - the factored matrix
> -  B - the right-hand-side matrix  (dense matrix)
>
>     Output Parameter:
> .  B - the result matrix (dense matrix)
>
>     Notes:
>     The matrices b and x cannot be the same.  I.e., one cannot
>     call MatMatSolve(A,x,x).
>
>     Notes:
>     Most users should usually employ the simplified KSP interface for
> linear solvers
>     instead of working directly with matrix algebra routines such as
> this.
>     See, e.g., KSPCreate(). However KSP can only solve for one vector
> (column of X)
>     at a time.
>
>     Level: developer
>
>     Concepts: matrices^triangular solves
>
> .seealso: MatMatSolveAdd(), MatMatSolveTranspose(),
> MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor()
> @*/
> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X)
> {
>    PetscErrorCode ierr;
>
>    PetscFunctionBegin;
>    PetscValidHeaderSpecific(A,MAT_COOKIE,1);
>    PetscValidType(A,1);
>    PetscValidHeaderSpecific(B,MAT_COOKIE,2);
>    PetscValidHeaderSpecific(X,MAT_COOKIE,3);
>    PetscCheckSameComm(A,1,B,2);
>    PetscCheckSameComm(A,1,X,3);
>    if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different
> matrices");
>    if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored
> matrix");
>    if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> X: global dim %D %D",A->cmap.N,X->rmap.N);
>    if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> B: global dim %D %D",A->rmap.N,B->rmap.N);
>    if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> B: local dim %D %D",A->rmap.n,B->rmap.n);
>    if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0);
>    ierr = MatPreallocated(A);CHKERRQ(ierr);
>
>    ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
>    if (!A->ops->matsolve) {
>      ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve",
> ((PetscObject)A)->type_name);CHKERRQ(ierr);
>      ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr);
>    } else {
>      ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr);
>    }
>    ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
>    ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr);
>    PetscFunctionReturn(0);
>
> }
>
> On Feb 29, 2008, at 6:46 PM, Yujie wrote:
>
> > Hi, everyone
> >
> > I am considering to add a parallel MatMatSolve() into PETSc based on
> > SuperLu_DIST or Spooles. If I want to use it like current sequential
> > MatMatSolve() in the application codes, how to do it? Could you give
> > me some examples about how to add a new function?
> > thanks a lot.
> >
> > Regards,
> > Yujie
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080229/e0c3c991/attachment.htm>

From recrusader at gmail.com  Fri Feb 29 20:15:16 2008
From: recrusader at gmail.com (Yujie)
Date: Fri, 29 Feb 2008 18:15:16 -0800
Subject: how to add a parallel MatMatSolve() function?
In-Reply-To: <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>
References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com>
	 <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>
Message-ID: <7ff0ee010802291815i42daaf82kbb3d97500542f80c@mail.gmail.com>

Dear Barry:

the following is the compiled errors:

/home/yujie/mpich127/bin/mpicxx -o matrix.o -c -Wall -Wwrite-strings -g
-fPIC -I/home/yujie/codes/petsc-2.3.3-p8
-I/home/yujie/codes/petsc-2.3.3-p8/bmake/linux
-I/home/yujie/codes/petsc-2.3.3-p8/include -I/home/yujie/codes/petsc-
2.3.3-p8/externalpackages/spooles-2.2/linux/ -I/home/yujie/mpich127/include
-I/usr/X11R6/include -D__SDIR__='"src/mat/interface/"' matrix.c
matrix.c: In function `PetscErrorCode MatMatSolve_Basic(_p_Mat*, _p_Mat*,
_p_Mat*)':
matrix.c:2531: `struct _p_Mat' has no member named `hdr'
matrix.c:2532: `struct _p_Mat' has no member named `hdr'
matrix.c:2588:41: warning: multi-line string literals are deprecated
matrix.c:2590:52: warning: multi-line string literals are deprecated
matrix.c:2592:58: warning: multi-line string literals are deprecated
matrix.c:2594:58: warning: multi-line string literals are deprecated
matrix.c:2596:58: warning: multi-line string literals are deprecated
make[1]: [/home/yujie/codes/petsc-2.3.3-p8/lib/linux/libpetscmat.a(matrix.o)]
Error 1 (ignored)
/usr/bin/ar cr /home/yujie/codes/petsc-2.3.3-p8/lib/linux/libpetscmat.a
matrix.o
/usr/bin/ar: matrix.o: No such file or directory
make[1]: [/home/yujie/codes/petsc-2.3.3-p8/lib/linux/libpetscmat.a(matrix.o)]
Error 1 (ignored)
if test -n ""; then /usr/bin/ar cr matrix.lo; fi
/bin/rm -f matrix.o matrix.lo
making shared libraries in /home/yujie/codes/petsc-2.3.3-p8/lib/linux
building libpetscmat.so

thanks,
Yujie

On 2/29/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
>    Please edit the file src/mat/interface/matrix.c and remove the
> function MatMatSolve(). Replace it with the following two functions,
> then run "make lib shared" in that directory. Please let us know at
> petsc-maint at mcs.anl.gov
>   if it crashes or produces incorrect
> results.
>
>     Barry
>
>
>
> #undef __FUNCT__
> #define __FUNCT__ "MatMatSolve_Basic"
> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X)
> {
>    PetscErrorCode ierr;
>    Vec            b,x;
>    PetscInt       m,N,i;
>    PetscScalar    *bb,*xx;
>
>    PetscFunctionBegin;
>    ierr = MatGetArray(B,&bb);CHKERRQ(ierr);
>    ierr = MatGetArray(X,&xx);CHKERRQ(ierr);
>    ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* number
> local rows */
>    ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* total
> columns in dense matrix */
>    ierr = VecCreateMPIWithArray(A-
>   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
>    ierr = VecCreateMPIWithArray(A-
>   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
>    for (i=0; i<N; i++) {
>      ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);
>      ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);
>      ierr = MatSolve(A,b,x);CHKERRQ(ierr);
>    }
>    ierr = VecDestroy(b);CHKERRQ(ierr);
>    ierr = VecDestroy(x);CHKERRQ(ierr);
>    PetscFunctionReturn(0);
> }
>
> #undef __FUNCT__
> #define __FUNCT__ "MatMatSolve"
> /*@
>     MatMatSolve - Solves A X = B, given a factored matrix.
>
>     Collective on Mat
>
>     Input Parameters:
> +  mat - the factored matrix
> -  B - the right-hand-side matrix  (dense matrix)
>
>     Output Parameter:
> .  B - the result matrix (dense matrix)
>
>     Notes:
>     The matrices b and x cannot be the same.  I.e., one cannot
>     call MatMatSolve(A,x,x).
>
>     Notes:
>     Most users should usually employ the simplified KSP interface for
> linear solvers
>     instead of working directly with matrix algebra routines such as
> this.
>     See, e.g., KSPCreate(). However KSP can only solve for one vector
> (column of X)
>     at a time.
>
>     Level: developer
>
>     Concepts: matrices^triangular solves
>
> .seealso: MatMatSolveAdd(), MatMatSolveTranspose(),
> MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor()
> @*/
> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X)
> {
>    PetscErrorCode ierr;
>
>    PetscFunctionBegin;
>    PetscValidHeaderSpecific(A,MAT_COOKIE,1);
>    PetscValidType(A,1);
>    PetscValidHeaderSpecific(B,MAT_COOKIE,2);
>    PetscValidHeaderSpecific(X,MAT_COOKIE,3);
>    PetscCheckSameComm(A,1,B,2);
>    PetscCheckSameComm(A,1,X,3);
>    if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different
> matrices");
>    if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored
> matrix");
>    if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> X: global dim %D %D",A->cmap.N,X->rmap.N);
>    if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> B: global dim %D %D",A->rmap.N,B->rmap.N);
>    if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> B: local dim %D %D",A->rmap.n,B->rmap.n);
>    if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0);
>    ierr = MatPreallocated(A);CHKERRQ(ierr);
>
>    ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
>    if (!A->ops->matsolve) {
>      ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve",
> ((PetscObject)A)->type_name);CHKERRQ(ierr);
>      ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr);
>    } else {
>      ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr);
>    }
>    ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
>    ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr);
>    PetscFunctionReturn(0);
>
> }
>
> On Feb 29, 2008, at 6:46 PM, Yujie wrote:
>
> > Hi, everyone
> >
> > I am considering to add a parallel MatMatSolve() into PETSc based on
> > SuperLu_DIST or Spooles. If I want to use it like current sequential
> > MatMatSolve() in the application codes, how to do it? Could you give
> > me some examples about how to add a new function?
> > thanks a lot.
> >
> > Regards,
> > Yujie
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080229/4b2b233f/attachment.htm>

From bsmith at mcs.anl.gov  Fri Feb 29 20:50:51 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 29 Feb 2008 20:50:51 -0600
Subject: how to add a parallel MatMatSolve() function?
In-Reply-To: <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com>
References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com>
Message-ID: <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov>


   Some direct solver packages have support for solving directly with  
several right hand sides at the same time.
They could be a bit faster than solving one at a time; maybe 30%  
faster at most, not 10 times faster. What is more
important solving the problem you want to solve in a reasonable time  
or solving the problem a bit faster
after spending several weeks writing the much more complicated code?

   Barry

On Feb 29, 2008, at 8:01 PM, Yujie wrote:

> Dear Barry:
>
> Thank you for your help. I check the codes roughly, the method in  
> the codes is to use MatSolve() to solve AX=B in a loop. I also  
> consider such a method.
> However, I am wondering whether it is slower than the method that  
> directly solves AX=B? thanks again.
>
> Regards,
> Yujie
>
> On 2/29/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>    Please edit the file src/mat/interface/matrix.c and remove the
> function MatMatSolve(). Replace it with the following two functions,
> then run "make lib shared" in that directory. Please let us know at petsc-maint at mcs.anl.gov
>   if it crashes or produces incorrect
> results.
>
>     Barry
>
>
>
> #undef __FUNCT__
> #define __FUNCT__ "MatMatSolve_Basic"
> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X)
> {
>    PetscErrorCode ierr;
>    Vec            b,x;
>    PetscInt       m,N,i;
>    PetscScalar    *bb,*xx;
>
>    PetscFunctionBegin;
>    ierr = MatGetArray(B,&bb);CHKERRQ(ierr);
>    ierr = MatGetArray(X,&xx);CHKERRQ(ierr);
>    ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* number
> local rows */
>    ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* total
> columns in dense matrix */
>    ierr = VecCreateMPIWithArray(A-
>   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
>    ierr = VecCreateMPIWithArray(A-
>   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
>    for (i=0; i<N; i++) {
>      ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);
>      ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);
>      ierr = MatSolve(A,b,x);CHKERRQ(ierr);
>    }
>    ierr = VecDestroy(b);CHKERRQ(ierr);
>    ierr = VecDestroy(x);CHKERRQ(ierr);
>    PetscFunctionReturn(0);
> }
>
> #undef __FUNCT__
> #define __FUNCT__ "MatMatSolve"
> /*@
>     MatMatSolve - Solves A X = B, given a factored matrix.
>
>     Collective on Mat
>
>     Input Parameters:
> +  mat - the factored matrix
> -  B - the right-hand-side matrix  (dense matrix)
>
>     Output Parameter:
> .  B - the result matrix (dense matrix)
>
>     Notes:
>     The matrices b and x cannot be the same.  I.e., one cannot
>     call MatMatSolve(A,x,x).
>
>     Notes:
>     Most users should usually employ the simplified KSP interface for
> linear solvers
>     instead of working directly with matrix algebra routines such as
> this.
>     See, e.g., KSPCreate(). However KSP can only solve for one vector
> (column of X)
>     at a time.
>
>     Level: developer
>
>     Concepts: matrices^triangular solves
>
> .seealso: MatMatSolveAdd(), MatMatSolveTranspose(),
> MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor()
> @*/
> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X)
> {
>    PetscErrorCode ierr;
>
>    PetscFunctionBegin;
>    PetscValidHeaderSpecific(A,MAT_COOKIE,1);
>    PetscValidType(A,1);
>    PetscValidHeaderSpecific(B,MAT_COOKIE,2);
>    PetscValidHeaderSpecific(X,MAT_COOKIE,3);
>    PetscCheckSameComm(A,1,B,2);
>    PetscCheckSameComm(A,1,X,3);
>    if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different
> matrices");
>    if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored
> matrix");
>    if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> X: global dim %D %D",A->cmap.N,X->rmap.N);
>    if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> B: global dim %D %D",A->rmap.N,B->rmap.N);
>    if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> B: local dim %D %D",A->rmap.n,B->rmap.n);
>    if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0);
>    ierr = MatPreallocated(A);CHKERRQ(ierr);
>
>    ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
>    if (!A->ops->matsolve) {
>      ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve",
> ((PetscObject)A)->type_name);CHKERRQ(ierr);
>      ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr);
>    } else {
>      ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr);
>    }
>    ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
>    ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr);
>    PetscFunctionReturn(0);
>
> }
>
> On Feb 29, 2008, at 6:46 PM, Yujie wrote:
>
> > Hi, everyone
> >
> > I am considering to add a parallel MatMatSolve() into PETSc based on
> > SuperLu_DIST or Spooles. If I want to use it like current sequential
> > MatMatSolve() in the application codes, how to do it? Could you give
> > me some examples about how to add a new function?
> > thanks a lot.
> >
> > Regards,
> > Yujie
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080229/118818f9/attachment.htm>

From recrusader at gmail.com  Fri Feb 29 21:35:46 2008
From: recrusader at gmail.com (Yujie)
Date: Sat, 1 Mar 2008 11:35:46 +0800
Subject: how to add a parallel MatMatSolve() function?
In-Reply-To: <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov>
References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com>
	 <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>
	 <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com>
	 <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov>
Message-ID: <7ff0ee010802291935o65002d30o4a8189c730f829d6@mail.gmail.com>

Dear Barry:

I have checked SuperLU_Dist codes. It looks like relative easy to write
codes for AX=B based on MatSolve(). This is why I ask you the above problem.

how about your advice?
thanks a lot.

Regards,
Yujie

On 3/1/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
>   Some direct solver packages have support for solving directly with
> several right hand sides at the same time.They could be a bit faster than
> solving one at a time; maybe 30% faster at most, not 10 times faster. What
> is more
> important solving the problem you want to solve in a reasonable time or
> solving the problem a bit faster
> after spending several weeks writing the much more complicated code?
>
>   Barry
>
> On Feb 29, 2008, at 8:01 PM, Yujie wrote:
>
> Dear Barry:
>
> Thank you for your help. I check the codes
> roughly, the method in the codes is to use MatSolve() to solve AX=B in a loop. I also consider such a method.
> However, I am wondering whether it is slower than the method that directly
> solves AX=B? thanks again.
>
> Regards,
> Yujie
>
> On 2/29/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >
> >    Please edit the file src/mat/interface/matrix.c and remove the
> > function MatMatSolve(). Replace it with the following two functions,
> > then run "make lib shared" in that directory. Please let us know at
> > petsc-maint at mcs.anl.gov
> >   if it crashes or produces incorrect
> > results.
> >
> >     Barry
> >
> >
> >
> > #undef __FUNCT__
> > #define __FUNCT__ "MatMatSolve_Basic"
> > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X)
> > {
> >    PetscErrorCode ierr;
> >    Vec            b,x;
> >    PetscInt       m,N,i;
> >    PetscScalar    *bb,*xx;
> >
> >    PetscFunctionBegin;
> >    ierr = MatGetArray(B,&bb);CHKERRQ(ierr);
> >    ierr = MatGetArray(X,&xx);CHKERRQ(ierr);
> >    ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* number
> > local rows */
> >    ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* total
> > columns in dense matrix */
> >    ierr = VecCreateMPIWithArray(A-
> >   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
> >    ierr = VecCreateMPIWithArray(A-
> >   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
> >    for (i=0; i<N; i++) {
> >      ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);
> >      ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);
> >      ierr = MatSolve(A,b,x);CHKERRQ(ierr);
> >    }
> >    ierr = VecDestroy(b);CHKERRQ(ierr);
> >    ierr = VecDestroy(x);CHKERRQ(ierr);
> >    PetscFunctionReturn(0);
> > }
> >
> > #undef __FUNCT__
> > #define __FUNCT__ "MatMatSolve"
> > /*@
> >     MatMatSolve - Solves A X = B, given a factored matrix.
> >
> >     Collective on Mat
> >
> >     Input Parameters:
> > +  mat - the factored matrix
> > -  B - the right-hand-side matrix  (dense matrix)
> >
> >     Output Parameter:
> > .  B - the result matrix (dense matrix)
> >
> >     Notes:
> >     The matrices b and x cannot be the same.  I.e., one cannot
> >     call MatMatSolve(A,x,x).
> >
> >     Notes:
> >     Most users should usually employ the simplified KSP interface for
> > linear solvers
> >     instead of working directly with matrix algebra routines such as
> > this.
> >     See, e.g., KSPCreate(). However KSP can only solve for one vector
> > (column of X)
> >     at a time.
> >
> >     Level: developer
> >
> >     Concepts: matrices^triangular solves
> >
> > .seealso: MatMatSolveAdd(), MatMatSolveTranspose(),
> > MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor()
> > @*/
> > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X)
> > {
> >    PetscErrorCode ierr;
> >
> >    PetscFunctionBegin;
> >    PetscValidHeaderSpecific(A,MAT_COOKIE,1);
> >    PetscValidType(A,1);
> >    PetscValidHeaderSpecific(B,MAT_COOKIE,2);
> >    PetscValidHeaderSpecific(X,MAT_COOKIE,3);
> >    PetscCheckSameComm(A,1,B,2);
> >    PetscCheckSameComm(A,1,X,3);
> >    if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different
> > matrices");
> >    if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored
> > matrix");
> >    if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> > X: global dim %D %D",A->cmap.N,X->rmap.N);
> >    if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> > B: global dim %D %D",A->rmap.N,B->rmap.N);
> >    if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> > B: local dim %D %D",A->rmap.n,B->rmap.n);
> >    if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0);
> >    ierr = MatPreallocated(A);CHKERRQ(ierr);
> >
> >    ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
> >    if (!A->ops->matsolve) {
> >      ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve",
> > ((PetscObject)A)->type_name);CHKERRQ(ierr);
> >      ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr);
> >    } else {
> >      ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr);
> >    }
> >    ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
> >    ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr);
> >    PetscFunctionReturn(0);
> >
> > }
> >
> > On Feb 29, 2008, at 6:46 PM, Yujie wrote:
> >
> > > Hi, everyone
> > >
> > > I am considering to add a parallel MatMatSolve() into PETSc based on
> > > SuperLu_DIST or Spooles. If I want to use it like current sequential
> > > MatMatSolve() in the application codes, how to do it? Could you give
> > > me some examples about how to add a new function?
> > > thanks a lot.
> > >
> > > Regards,
> > > Yujie
> >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080301/be30bb20/attachment.htm>

From bsmith at mcs.anl.gov  Fri Feb 29 21:48:33 2008
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Fri, 29 Feb 2008 21:48:33 -0600
Subject: how to add a parallel MatMatSolve() function?
In-Reply-To: <7ff0ee010802291935o65002d30o4a8189c730f829d6@mail.gmail.com>
References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com> <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov> <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com> <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov> <7ff0ee010802291935o65002d30o4a8189c730f829d6@mail.gmail.com>
Message-ID: <61DD008D-868D-4B76-BBDE-734E08E0E6E7@mcs.anl.gov>


   The code I provided works for any LU solver; the way the PETSc code  
is written you can customize
a routine for any specific matrix format, like the PETSc SuperLU_Dist  
format. You are certainly
free to try to write a custom one for SuperLU_dist() (see  
MatMatSolve_SeqAIJ() for how to do this).
It is your time, not mine. Personally I'd rather have the computer run  
a few more minutes then spend
my time looking at code :-)

   Barry

On Feb 29, 2008, at 9:35 PM, Yujie wrote:

> Dear Barry:
>
> I have checked SuperLU_Dist codes. It looks like relative easy to  
> write codes for AX=B based on MatSolve(). This is why I ask you the  
> above problem.
> how about your advice?
> thanks a lot.
>
> Regards,
> Yujie
>
> On 3/1/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>   Some direct solver packages have support for solving directly with  
> several right hand sides at the same time.
> They could be a bit faster than solving one at a time; maybe 30%  
> faster at most, not 10 times faster. What is more
> important solving the problem you want to solve in a reasonable time  
> or solving the problem a bit faster
> after spending several weeks writing the much more complicated code?
>
>   Barry
>
> On Feb 29, 2008, at 8:01 PM, Yujie wrote:
>
>> Dear Barry:
>>
>> Thank you for your help. I check the codes roughly, the method in  
>> the codes is to use MatSolve() to solve AX=B in a loop. I also  
>> consider such a method.
>> However, I am wondering whether it is slower than the method that  
>> directly solves AX=B? thanks again.
>>
>> Regards,
>> Yujie
>>
>> On 2/29/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>>
>>    Please edit the file src/mat/interface/matrix.c and remove the
>> function MatMatSolve(). Replace it with the following two functions,
>> then run "make lib shared" in that directory. Please let us know at petsc-maint at mcs.anl.gov
>>   if it crashes or produces incorrect
>> results.
>>
>>     Barry
>>
>>
>>
>> #undef __FUNCT__
>> #define __FUNCT__ "MatMatSolve_Basic"
>> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat  
>> X)
>> {
>>    PetscErrorCode ierr;
>>    Vec            b,x;
>>    PetscInt       m,N,i;
>>    PetscScalar    *bb,*xx;
>>
>>    PetscFunctionBegin;
>>    ierr = MatGetArray(B,&bb);CHKERRQ(ierr);
>>    ierr = MatGetArray(X,&xx);CHKERRQ(ierr);
>>    ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* number
>> local rows */
>>    ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* total
>> columns in dense matrix */
>>    ierr = VecCreateMPIWithArray(A-
>>   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
>>    ierr = VecCreateMPIWithArray(A-
>>   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
>>    for (i=0; i<N; i++) {
>>      ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);
>>      ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);
>>      ierr = MatSolve(A,b,x);CHKERRQ(ierr);
>>    }
>>    ierr = VecDestroy(b);CHKERRQ(ierr);
>>    ierr = VecDestroy(x);CHKERRQ(ierr);
>>    PetscFunctionReturn(0);
>> }
>>
>> #undef __FUNCT__
>> #define __FUNCT__ "MatMatSolve"
>> /*@
>>     MatMatSolve - Solves A X = B, given a factored matrix.
>>
>>     Collective on Mat
>>
>>     Input Parameters:
>> +  mat - the factored matrix
>> -  B - the right-hand-side matrix  (dense matrix)
>>
>>     Output Parameter:
>> .  B - the result matrix (dense matrix)
>>
>>     Notes:
>>     The matrices b and x cannot be the same.  I.e., one cannot
>>     call MatMatSolve(A,x,x).
>>
>>     Notes:
>>     Most users should usually employ the simplified KSP interface for
>> linear solvers
>>     instead of working directly with matrix algebra routines such as
>> this.
>>     See, e.g., KSPCreate(). However KSP can only solve for one vector
>> (column of X)
>>     at a time.
>>
>>     Level: developer
>>
>>     Concepts: matrices^triangular solves
>>
>> .seealso: MatMatSolveAdd(), MatMatSolveTranspose(),
>> MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor()
>> @*/
>> PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X)
>> {
>>    PetscErrorCode ierr;
>>
>>    PetscFunctionBegin;
>>    PetscValidHeaderSpecific(A,MAT_COOKIE,1);
>>    PetscValidType(A,1);
>>    PetscValidHeaderSpecific(B,MAT_COOKIE,2);
>>    PetscValidHeaderSpecific(X,MAT_COOKIE,3);
>>    PetscCheckSameComm(A,1,B,2);
>>    PetscCheckSameComm(A,1,X,3);
>>    if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different
>> matrices");
>>    if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored
>> matrix");
>>    if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
>> X: global dim %D %D",A->cmap.N,X->rmap.N);
>>    if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
>> B: global dim %D %D",A->rmap.N,B->rmap.N);
>>    if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
>> B: local dim %D %D",A->rmap.n,B->rmap.n);
>>    if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0);
>>    ierr = MatPreallocated(A);CHKERRQ(ierr);
>>
>>    ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
>>    if (!A->ops->matsolve) {
>>      ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve",
>> ((PetscObject)A)->type_name);CHKERRQ(ierr);
>>      ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr);
>>    } else {
>>      ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr);
>>    }
>>    ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
>>    ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr);
>>    PetscFunctionReturn(0);
>>
>> }
>>
>> On Feb 29, 2008, at 6:46 PM, Yujie wrote:
>>
>> > Hi, everyone
>> >
>> > I am considering to add a parallel MatMatSolve() into PETSc based  
>> on
>> > SuperLu_DIST or Spooles. If I want to use it like current  
>> sequential
>> > MatMatSolve() in the application codes, how to do it? Could you  
>> give
>> > me some examples about how to add a new function?
>> > thanks a lot.
>> >
>> > Regards,
>> > Yujie
>>
>>
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080229/6a3cb797/attachment.htm>

From recrusader at gmail.com  Fri Feb 29 21:59:48 2008
From: recrusader at gmail.com (Yujie)
Date: Sat, 1 Mar 2008 11:59:48 +0800
Subject: how to add a parallel MatMatSolve() function?
In-Reply-To: <61DD008D-868D-4B76-BBDE-734E08E0E6E7@mcs.anl.gov>
References: <7ff0ee010802291646n508b1c6bu9d6acd5ba891e19c@mail.gmail.com>
	 <363E050C-90D5-48FC-A362-85A74D377EC6@mcs.anl.gov>
	 <7ff0ee010802291801ia1897cau272afa8c19125522@mail.gmail.com>
	 <39836C37-D183-4869-B70D-B7187762C233@mcs.anl.gov>
	 <7ff0ee010802291935o65002d30o4a8189c730f829d6@mail.gmail.com>
	 <61DD008D-868D-4B76-BBDE-734E08E0E6E7@mcs.anl.gov>
Message-ID: <7ff0ee010802291959i35e2aa0fyb38af45644115121@mail.gmail.com>

Thanks a lot:).

Regards,
Yujie

On 3/1/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>
>   The code I provided works for any LU solver; the way the PETSc code is
> written you can customizea routine for any specific matrix format, like
> the PETSc SuperLU_Dist format. You are certainly
> free to try to write a custom one for SuperLU_dist() (see
> MatMatSolve_SeqAIJ() for how to do this).
> It is your time, not mine. Personally I'd rather have the computer run a
> few more minutes then spend
> my time looking at code :-)
>
>   Barry
>
> On Feb 29, 2008, at 9:35 PM, Yujie wrote:
>
> Dear Barry:
>
> I have checked SuperLU_Dist codes. It looks like relative easy to write
> codes for AX=B based on MatSolve(). This is why I ask you the above problem.
>
> how about your advice?
> thanks a lot.
>
> Regards,
> Yujie
>
> On 3/1/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >
> >   Some direct solver packages have support for solving directly with
> > several right hand sides at the same time.They could be a bit faster
> > than solving one at a time; maybe 30% faster at most, not 10 times faster.
> > What is more
> > important solving the problem you want to solve in a reasonable time or
> > solving the problem a bit faster
> > after spending several weeks writing the much more complicated code?
> >
> >   Barry
> >
> > On Feb 29, 2008, at 8:01 PM, Yujie wrote:
> >
> > Dear Barry:
> >
> > Thank you for your help. I check the codes
> > roughly, the method in the codes is to use MatSolve() to solve AX=B in a loop. I also consider such a method.
> > However, I am wondering whether it is slower than the method that
> > directly solves AX=B? thanks again.
> >
> > Regards,
> > Yujie
> >
> > On 2/29/08, Barry Smith <bsmith at mcs.anl.gov> wrote:
> > >
> > >
> > >    Please edit the file src/mat/interface/matrix.c and remove the
> > > function MatMatSolve(). Replace it with the following two functions,
> > > then run "make lib shared" in that directory. Please let us know at
> > > petsc-maint at mcs.anl.gov
> > >   if it crashes or produces incorrect
> > > results.
> > >
> > >     Barry
> > >
> > >
> > >
> > > #undef __FUNCT__
> > > #define __FUNCT__ "MatMatSolve_Basic"
> > > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve_Basic(Mat A,Mat B,Mat X)
> > > {
> > >    PetscErrorCode ierr;
> > >    Vec            b,x;
> > >    PetscInt       m,N,i;
> > >    PetscScalar    *bb,*xx;
> > >
> > >    PetscFunctionBegin;
> > >    ierr = MatGetArray(B,&bb);CHKERRQ(ierr);
> > >    ierr = MatGetArray(X,&xx);CHKERRQ(ierr);
> > >    ierr = MatGetLocalSize(B,&m,PETSC_NULL);CHKERRQ(ierr);  /* number
> > > local rows */
> > >    ierr = MatGetSize(B,PETSC_NULL,&N);CHKERRQ(ierr);       /* total
> > > columns in dense matrix */
> > >    ierr = VecCreateMPIWithArray(A-
> > >   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&b);CHKERRQ(ierr);
> > >    ierr = VecCreateMPIWithArray(A-
> > >   >hdr.comm,m,PETSC_DETERMINE,PETSC_NULL,&x);CHKERRQ(ierr);
> > >    for (i=0; i<N; i++) {
> > >      ierr = VecPlaceArray(b,bb + i*m);CHKERRQ(ierr);
> > >      ierr = VecPlaceArray(x,xx + i*m);CHKERRQ(ierr);
> > >      ierr = MatSolve(A,b,x);CHKERRQ(ierr);
> > >    }
> > >    ierr = VecDestroy(b);CHKERRQ(ierr);
> > >    ierr = VecDestroy(x);CHKERRQ(ierr);
> > >    PetscFunctionReturn(0);
> > > }
> > >
> > > #undef __FUNCT__
> > > #define __FUNCT__ "MatMatSolve"
> > > /*@
> > >     MatMatSolve - Solves A X = B, given a factored matrix.
> > >
> > >     Collective on Mat
> > >
> > >     Input Parameters:
> > > +  mat - the factored matrix
> > > -  B - the right-hand-side matrix  (dense matrix)
> > >
> > >     Output Parameter:
> > > .  B - the result matrix (dense matrix)
> > >
> > >     Notes:
> > >     The matrices b and x cannot be the same.  I.e., one cannot
> > >     call MatMatSolve(A,x,x).
> > >
> > >     Notes:
> > >     Most users should usually employ the simplified KSP interface for
> > > linear solvers
> > >     instead of working directly with matrix algebra routines such as
> > > this.
> > >     See, e.g., KSPCreate(). However KSP can only solve for one vector
> > > (column of X)
> > >     at a time.
> > >
> > >     Level: developer
> > >
> > >     Concepts: matrices^triangular solves
> > >
> > > .seealso: MatMatSolveAdd(), MatMatSolveTranspose(),
> > > MatMatSolveTransposeAdd(), MatLUFactor(), MatCholeskyFactor()
> > > @*/
> > > PetscErrorCode PETSCMAT_DLLEXPORT MatMatSolve(Mat A,Mat B,Mat X)
> > > {
> > >    PetscErrorCode ierr;
> > >
> > >    PetscFunctionBegin;
> > >    PetscValidHeaderSpecific(A,MAT_COOKIE,1);
> > >    PetscValidType(A,1);
> > >    PetscValidHeaderSpecific(B,MAT_COOKIE,2);
> > >    PetscValidHeaderSpecific(X,MAT_COOKIE,3);
> > >    PetscCheckSameComm(A,1,B,2);
> > >    PetscCheckSameComm(A,1,X,3);
> > >    if (X == B) SETERRQ(PETSC_ERR_ARG_IDN,"X and B must be different
> > > matrices");
> > >    if (!A->factor) SETERRQ(PETSC_ERR_ARG_WRONGSTATE,"Unfactored
> > > matrix");
> > >    if (A->cmap.N != X->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> > > X: global dim %D %D",A->cmap.N,X->rmap.N);
> > >    if (A->rmap.N != B->rmap.N) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> > > B: global dim %D %D",A->rmap.N,B->rmap.N);
> > >    if (A->rmap.n != B->rmap.n) SETERRQ2(PETSC_ERR_ARG_SIZ,"Mat A,Mat
> > > B: local dim %D %D",A->rmap.n,B->rmap.n);
> > >    if (!A->rmap.N && !A->cmap.N) PetscFunctionReturn(0);
> > >    ierr = MatPreallocated(A);CHKERRQ(ierr);
> > >
> > >    ierr = PetscLogEventBegin(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
> > >    if (!A->ops->matsolve) {
> > >      ierr = PetscInfo1(A,"Mat type %s using basic MatMatSolve",
> > > ((PetscObject)A)->type_name);CHKERRQ(ierr);
> > >      ierr = MatMatSolve_Basic(A,B,X);CHKERRQ(ierr);
> > >    } else {
> > >      ierr = (*A->ops->matsolve)(A,B,X);CHKERRQ(ierr);
> > >    }
> > >    ierr = PetscLogEventEnd(MAT_MatSolve,A,B,X,0);CHKERRQ(ierr);
> > >    ierr = PetscObjectStateIncrease((PetscObject)X);CHKERRQ(ierr);
> > >    PetscFunctionReturn(0);
> > >
> > > }
> > >
> > > On Feb 29, 2008, at 6:46 PM, Yujie wrote:
> > >
> > > > Hi, everyone
> > > >
> > > > I am considering to add a parallel MatMatSolve() into PETSc based on
> > > > SuperLu_DIST or Spooles. If I want to use it like current sequential
> > > > MatMatSolve() in the application codes, how to do it? Could you give
> > > > me some examples about how to add a new function?
> > > > thanks a lot.
> > > >
> > > > Regards,
> > > > Yujie
> > >
> > >
> >
> >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20080301/036a1c86/attachment.htm>