From jwicks at cs.brown.edu Sun Dec 2 08:01:59 2007 From: jwicks at cs.brown.edu (John R. Wicks) Date: Sun, 2 Dec 2007 09:01:59 -0500 Subject: PCGetFactoredMatrix In-Reply-To: Message-ID: <000201c834eb$e89ecfa0$0201a8c0@jwickslptp> I am specifically interested in knowing if one can expect the residual matrix (A - LU) to be significantly more sparse than the original matrix, A. Does anyone know if this is the case for sparse A? > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith > Sent: Thursday, November 29, 2007 3:43 PM > To: petsc-users at mcs.anl.gov > Subject: Re: PCGetFactoredMatrix > > > > John, > > There is no immediate way to do this. > For the SeqAIJ format, we store both L and U in a single CSR > format, with, for each row, first the part of L (below the > diagonal), then 1/D_i, then the part of U for that row. You can > see how the triangular solves are done by looking at > src/mat/impls/aij/seq/aijfact.c the routine > MatSolve_SeqAIJ(). > Note that it is actually more complicated due to the row and column > permutations > (the factored matrix is stored in the ordering of the > permutations). For a BAIJ matrix the storage is similar except > it is stored by block > instead of point > and the inverse of the block diagonal is stored. > > One could take the MatSolve_SeqAIJ() routine and modify it to do the > matrix > vector product without too much difficulty. > > If you decide to do this we would gladly include it in our > distribution. > > Barry > > One can ask why we don't provide this functionality in PETSc since > computing > A - LU is a reasonable thing to do if one wants to understand the > convergence > of the method. The answer is two-fold: 1) time and energy, and > 2) though > we > like everyone to use PETSc, we are driven more by people who are not > interested > in the solution algorithms etc. but only in getting the answer easily > and relatively > efficiently. 
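John's question can be probed outside PETSc with a small self-contained sketch (plain Python, no PETSc; the 4x4 matrix and the ilu0 helper below are made up for illustration). For ILU(0) the residual R = A - LU is, by construction, zero on the sparsity pattern of A, so its nonzeros appear only at the fill positions the factorization dropped:

```python
def ilu0(A):
    """IKJ-ordered ILU(0) of a small dense-stored matrix: entries are
    updated only where A was originally nonzero (no fill is kept).
    Returns (L, U) with unit diagonal on L."""
    n = len(A)
    pattern = [[A[i][j] != 0 for j in range(n)] for i in range(n)]
    F = [row[:] for row in A]          # work on a copy of A
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue
            F[i][k] /= F[k][k]         # multiplier, becomes L[i][k]
            for j in range(k + 1, n):
                if pattern[i][j]:      # drop updates outside A's pattern
                    F[i][j] -= F[i][k] * F[k][j]
    L = [[F[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[F[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U

def matmul(L, U):
    n = len(L)
    return [[sum(L[i][k] * U[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Made-up symmetric test matrix whose pattern generates fill at (1,3), (3,1).
A = [[4.0, 1.0, 0.0, 1.0],
     [1.0, 4.0, 1.0, 0.0],
     [0.0, 1.0, 4.0, 1.0],
     [1.0, 0.0, 1.0, 4.0]]

L, U = ilu0(A)
LU = matmul(L, U)
R = [[A[i][j] - LU[i][j] for j in range(4)] for i in range(4)]
print("nonzeros of R:", [(i, j) for i in range(4) for j in range(4)
                         if abs(R[i][j]) > 1e-12])   # -> [(1, 3), (3, 1)]
```

For this matrix R has exactly two nonzeros, at the two dropped fill positions, and is zero everywhere A is nonzero; whether A - LU is sparser than A in general therefore depends entirely on how much fill A's pattern would generate.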
> > > On Nov 29, 2007, at 12:07 PM, John R. Wicks wrote: > > > I would like to compute the residual A - LU, where LU is the ILU > > factorization of A. What is the most convenient way of doing so? > > > >> -----Original Message----- > >> From: owner-petsc-users at mcs.anl.gov > >> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley > >> Sent: Thursday, November 29, 2007 12:04 PM > >> To: petsc-users at mcs.anl.gov > >> Subject: Re: PCGetFactoredMatrix > >> > >> > >> It depends on the package, but the petsc stuff stores L and U in one > >> matrix. > >> > >> Matt > >> > >> On Nov 29, 2007 9:03 AM, John R. Wicks wrote: > >>> The documentation for PCGetFactoredMatrix is not clear. What does > >>> this return for ILU(0), for example? Does it return the > >> product LU or > >>> the in-place factorization? > >>> > >> > >> > >> > >> -- > >> What most experimenters take for granted before they begin > >> their experiments is infinitely more interesting than any > >> results to which their experiments lead. > >> -- Norbert Wiener > >> > > > From timothy.stitt at ichec.ie Mon Dec 3 11:49:07 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Mon, 03 Dec 2007 17:49:07 +0000 Subject: Global to Local Vector Mapping Message-ID: <47544193.6050501@ichec.ie> Hi all, Is there a quick way to map a global index for a parallel vector to a local mapping tuple (p,i) where 'p' represents the process containing the value and 'i' is the local index number on that process? As always, thanks in advance for any information provided. Tim. -- Dr. 
Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Mon Dec 3 12:03:50 2007 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Dec 2007 12:03:50 -0600 Subject: Global to Local Vector Mapping In-Reply-To: <47544193.6050501@ichec.ie> References: <47544193.6050501@ichec.ie> Message-ID: On Dec 3, 2007 11:49 AM, Tim Stitt wrote: > Hi all, > > Is there a quick way to map a global index for a parallel vector to a > local mapping tuple (p,i) where 'p' represents the process containing the > value and 'i' is the local index number on that process? PetscMapGetGlobalRange(&v->map,&range); for(p = 0; p < numProcs; ++p) if (range[p+1] > globalInd) break; localInd = globalInd - range[p]; Matt > As always, thanks in advance for any information provided. > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From timothy.stitt at ichec.ie Mon Dec 3 12:23:36 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Mon, 03 Dec 2007 18:23:36 +0000 Subject: Global to Local Vector Mapping In-Reply-To: References: <47544193.6050501@ichec.ie> Message-ID: <475449A8.7000307@ichec.ie> I have a problem, Matthew... this is a Fortran code, which I don't think this routine is compatible with. Is there any other way? 
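The ownership-range lookup Matt sketches in C can also be written as a short plain-Python sketch (the ranges array below is a made-up three-process layout, standing in for what PetscMapGetGlobalRange() would return); a binary search with bisect gives the (p, i) tuple directly:

```python
import bisect

# ranges has length size+1: process p owns global indices
# ranges[p] .. ranges[p+1]-1 (made-up layout for three processes).
ranges = [0, 5, 9, 14]

def global_to_local(gidx, ranges):
    """Map a global index to the (process, local index) tuple."""
    # Largest p with ranges[p] <= gidx, i.e. gidx < ranges[p+1].
    p = bisect.bisect_right(ranges, gidx) - 1
    return p, gidx - ranges[p]

print(global_to_local(6, ranges))   # -> (1, 1): on process 1, local index 1
```

Unlike the linear scan, the binary search stays cheap when the number of processes is large; both assume the contiguous first-n1/next-n2 layout of PETSc vectors.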
Matthew Knepley wrote: > On Dec 3, 2007 11:49 AM, Tim Stitt wrote: > >> Hi all, >> >> Is there a quick way to map a global index for a parallel vector to a >> local mapping tuple (p,i) where 'p' represents the process containing the >> value and 'i' is the local index number on that process? >> > > PetscMapGetGlobalRange(&v->map,&range); > for(p = 0; p < numProcs; ++p) if (range[p+1] > globalInd) break; > localInd = globalInd - range[p]; > > Matt > > >> As always, thanks in advance for any information provided. >> >> Tim. >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From gdiso at ustc.edu Mon Dec 3 17:59:31 2007 From: gdiso at ustc.edu (Gong Ding) Date: Tue, 4 Dec 2007 07:59:31 +0800 Subject: Small bug about line search Message-ID: <7C7393CDF2354390AEC3A1E27815999F@nintatmel> Hi, After a SNESLineSearchPostCheck call, the functions SNESLineSearchCubic and SNESLineSearchQuadratic should recompute the residual norm ||g|| and the search length norm ||y||. But the code is src/snes/impls/ls/ls.c 676: VecNormBegin(g,NORM_2,gnorm); 677: if (*gnorm != *gnorm) SETERRQ(PETSC_ERR_FP,"User provided compute function generated a Not-a-Number"); 678: VecNormBegin(w,NORM_2,ynorm); 679: VecNormEnd(g,NORM_2,gnorm); 680: VecNormEnd(w,NORM_2,ynorm); and 850: VecNormBegin(g,NORM_2,gnorm); 851: VecNormBegin(w,NORM_2,ynorm); 852: VecNormEnd(g,NORM_2,gnorm); 853: VecNormEnd(w,NORM_2,ynorm); it sets ynorm to ||w||, which I think should be ||y||. Yours Gong Ding From bsmith at mcs.anl.gov Tue Dec 4 15:39:07 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Dec 2007 15:39:07 -0600 Subject: Global to 
Local Vector Mapping In-Reply-To: <475449A8.7000307@ichec.ie> References: <47544193.6050501@ichec.ie> <475449A8.7000307@ichec.ie> Message-ID: Tim, Sorry for the delay. You will need to call VecGetOwnershipRanges(). Unfortunately it does not exist in either C or Fortran. I have put it into petsc-dev. You can add the following line to include/petscvec.h EXTERN PetscErrorCode PETSCVEC_DLLEXPORT VecGetOwnershipRanges(Vec,const PetscInt *[]); add the following lines in src/vec/vec/interface/vector.c #undef __FUNCT__ #define __FUNCT__ "VecGetOwnershipRanges" /*@C VecGetOwnershipRanges - Returns the range of indices owned by EACH processor, assuming that the vectors are laid out with the first n1 elements on the first processor, next n2 elements on the second, etc. For certain parallel layouts this range may not be well defined. Not Collective Input Parameter: . x - the vector Output Parameters: . range - array of length size+1 with the start and end+1 for each process Note: The high argument is one more than the last element stored locally. 
Fortran: You must PASS in an array of length size+1 Level: beginner Concepts: ownership^of vectors Concepts: vector^ownership of elements .seealso: MatGetOwnershipRange(), MatGetOwnershipRanges(), VecGetOwnershipRange() @*/ PetscErrorCode PETSCVEC_DLLEXPORT VecGetOwnershipRanges(Vec x,const PetscInt *ranges[]) { PetscErrorCode ierr; PetscFunctionBegin; PetscValidHeaderSpecific(x,VEC_COOKIE,1); PetscValidType(x,1); ierr = PetscMapGetGlobalRange(&x->map,ranges);CHKERRQ(ierr); PetscFunctionReturn(0); } Run make in that directory, then add to src/vec/vec/interface/ftn-custom/zvectorf.c #if defined(PETSC_HAVE_FORTRAN_CAPS) #define vecgetownershipranges_ VECGETOWNERSHIPRANGES #elif !defined(PETSC_HAVE_FORTRAN_UNDERSCORE) #define vecgetownershipranges_ vecgetownershipranges #endif void PETSC_STDCALL vecgetownershipranges_(Vec *x,PetscInt *range,PetscErrorCode *ierr) { PetscMPIInt size; const PetscInt *r; *ierr = MPI_Comm_size((*x)->map.comm,&size);if (*ierr) return; *ierr = VecGetOwnershipRanges(*x,&r);if (*ierr) return; *ierr = PetscMemcpy(range,r,(size+1)*sizeof(PetscInt)); } and again run make in that directory. Let us know if any problems come up, Barry On Dec 3, 2007, at 12:23 PM, Tim Stitt wrote: > I have a problem Matthew...this is a Fortran code..which I don't > think this routine is compatible with. Is there any other way? > > Matthew Knepley wrote: >> On Dec 3, 2007 11:49 AM, Tim Stitt wrote: >> >>> Hi all, >>> >>> Is there a quick way to map a global index for a parallel vector >>> to a >>> local mapping tuple (p,i) where 'p' represents the process >>> containing the >>> value and 'i' is the local index number on that process? >>> >> >> PetscMapGetGlobalRange(&v->map,&range); >> for(p = 0; p < numProcs; ++p) if (range[p+1] > globalInd) break; >> localInd = globalInd - range[p]; >> >> Matt >> >> >>> As always, thanks in advance for any information provided. >>> >>> Tim. >>> >>> -- >>> Dr. 
Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >> > > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From zonexo at gmail.com Wed Dec 5 04:40:08 2007 From: zonexo at gmail.com (Ben Tay) Date: Wed, 05 Dec 2007 18:40:08 +0800 Subject: Estimating PETSc performance using SuperPI's results Message-ID: <47568008.20506@gmail.com> Hi, I'm thinking of ways to estimate and compare the performance of PETSc on different CPUs. I think it will also enable one to make the wise choice of whether to upgrade or not. Of course, the best way is to run your own code on the new machine to see how much increase there is. However, most of the time this option is not available. I have found many forums whereby users post the time required to run programs such as SuperPI or other benchmarking software. I wonder if such software can be used to estimate the performance of PETSc too? In other words, if CPU A runs 4 times faster than CPU B at SuperPi, is it safe to assume that the ratio will be roughly the same running PETSc? Btw, SuperPi is a single-threaded program. Thanks From keita at cray.com Wed Dec 5 15:31:47 2007 From: keita at cray.com (Keita Teranishi) Date: Wed, 5 Dec 2007 15:31:47 -0600 Subject: Usage of fun3d (flow) Message-ID: <925346A443D4E340BEB20248BAFCDBDF034D13D9@CFEVS1-IP.americas.cray.com> Hi, I am trying to run the fun3d bundled with the petsc distribution. The main program says it needs to access a petsc.opt file, but it is not provided with the petsc package. Can you tell me the format of the file, or give me any sample files? Thank you, ================================ Keita Teranishi Math Software Group Cray, Inc. 
keita at cray.com ================================ -------------- next part -------------- An HTML attachment was scrubbed... URL: From keita at cray.com Wed Dec 5 15:45:27 2007 From: keita at cray.com (Keita Teranishi) Date: Wed, 5 Dec 2007 15:45:27 -0600 Subject: Usage of fun3d (flow) In-Reply-To: <925346A443D4E340BEB20248BAFCDBDF034D13D9@CFEVS1-IP.americas.cray.com> References: <925346A443D4E340BEB20248BAFCDBDF034D13D9@CFEVS1-IP.americas.cray.com> Message-ID: <925346A443D4E340BEB20248BAFCDBDF034D140D@CFEVS1-IP.americas.cray.com> Hi, I also found fun3d requires many input files. I'd like to know the format and sample of these files. Thanks, ================================ Keita Teranishi Math Software Group Cray, Inc. keita at cray.com ================================ ________________________________ From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Keita Teranishi Sent: Wednesday, December 05, 2007 3:32 PM To: petsc-users at mcs.anl.gov Subject: Usage of fun3d (flow) Hi, I am trying to run fun3d bundled with petsc distribution. The main program says it needs to access petsc.opt file, but it is not provided with the petsc package. Can you tell me the format of the file, or give me any sample files? Thank you, ================================ Keita Teranishi Math Software Group Cray, Inc. keita at cray.com ================================ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From timothy.stitt at ichec.ie Thu Dec 6 06:09:49 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Thu, 06 Dec 2007 12:09:49 +0000 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> Message-ID: <4757E68D.3030208@ichec.ie> Barry, I will be using these routines from Fortran..so I am assuming that Fortran interfaces are available for each routine? Also, how do I know how many sub ksp's there will be? I am assuming I need to dynamically allocate the subksp array in Fortran but do I know the size in advance? Is this related to the value 'n' ? If so, how do I calculate 'n'. What is the significance of subksp[0]? Is it just the sub ksp at this position I should be interested in? Finally, which of the PCFactorSetxxxxxx routines should I be using? Sorry for the twenty questions (well nearly) but I am just a bit confused with this approach. Thanks, Tim. Barry Smith wrote: > > KSP *subksp; > > KSPGetPC(ksp,pc) > PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) > KSPGetPC(subksp[0],&subpc); > PCFactorSetxxxxxx(subpc, .... > > Barry > > > On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: > >> I should also add that the code executes without this error when >> using 1 processor...but then displays the error when running in >> parallel with more than one process. >> >> Tim Stitt wrote: >>> Hi all, >>> >>> Can anyone suggest ways of overcoming the following pivot error I >>> keep receiving in my PETSc code during a KSPSolve(). >>> >>> [1]PETSC ERROR: Detected zero pivot in LU factorization >>> see >>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! >>> >>> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance >>> 0.00165189 * rowsum 1.65189e+09! 
>>> >>> From checking the documentation....the error is in row 1801, which >>> means it is most likely not a matrix assembly issue? >>> >>> I tried the following prior to the solve with no luck either..... >>> >>> call KSPGetPC(ksp,pc,error) >>> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) >>> >>> Is there anything else I can try? >>> >>> Thanks, >>> >>> Tim. >>> >> >> >> --Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From bsmith at mcs.anl.gov Thu Dec 6 11:26:29 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 6 Dec 2007 11:26:29 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <4757E68D.3030208@ichec.ie> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> <4757E68D.3030208@ichec.ie> Message-ID: On Dec 6, 2007, at 6:09 AM, Tim Stitt wrote: > Barry, > > I will be using these routines from Fortran, so I am assuming that > Fortran interfaces are available for each routine? > > Also, how do I know how many sub ksp's there will be? I am assuming > I need to dynamically allocate the subksp array in Fortran but do I > know the size in advance? Is this related to the value 'n' ? If so, > how do I calculate 'n'. There will always be one sub ksp by default. There will only be more than one if you use PCBJacobiSetLocalBlocks() or PCBJacobiSetTotalBlocks() or -pc_bjacobi_blocks. In general we recommend keeping it at one. This means you do not need to allocate any KSP, just pass in a KSP variable. > > > What is the significance of subksp[0]? Is it just the sub ksp at > this position I should be interested in? 
This is just the first one. If you have multiple ones then you must loop over them, but I recommend having just one. > > > Finally, which of the PCFactorSetxxxxxx routines should I be using? PCFactorSetZeroPivot() or PCFactorSetShiftNonzero() or PCFactorSetShiftPd() depending on what you want to have happen. Barry > > > Sorry for the twenty questions (well nearly) but I am just a bit > confused with this approach. > > Thanks, > > Tim. > > Barry Smith wrote: >> >> KSP *subksp; >> >> KSPGetPC(ksp,pc) >> PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) >> KSPGetPC(subksp[0],&subpc); >> PCFactorSetxxxxxx(subpc, .... >> >> Barry >> >> >> On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: >> >>> I should also add that the code executes without this error when >>> using 1 processor...but then displays the error when running in >>> parallel with more than one process. >>> >>> Tim Stitt wrote: >>>> Hi all, >>>> >>>> Can anyone suggest ways of overcoming the following pivot error I >>>> keep receiving in my PETSc code during a KSPSolve(). >>>> >>>> [1]PETSC ERROR: Detected zero pivot in LU factorization >>>> see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot >>>> ! >>>> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance >>>> 0.00165189 * rowsum 1.65189e+09! >>>> >>>> From checking the documentation....the error is in row 1801, >>>> which means it is most likely not a matrix assembly issue? >>>> >>>> I tried the following prior to the solve with no luck either..... >>>> >>>> call KSPGetPC(ksp,pc,error) >>>> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) >>>> >>>> Is there anything else I can try? >>>> >>>> Thanks, >>>> >>>> Tim. >>>> >>> >>> >>> --Dr. Timothy Stitt >>> HPC Application Consultant - ICHEC (www.ichec.ie) >>> >>> Dublin Institute for Advanced Studies >>> 5 Merrion Square - Dublin 2 - Ireland >>> >>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>> >> > > > -- > Dr. 
Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > From amjad11 at gmail.com Thu Dec 6 23:44:37 2007 From: amjad11 at gmail.com (amjad ali) Date: Fri, 7 Dec 2007 10:44:37 +0500 Subject: Selecting specific board Message-ID: <428810f20712062144q2a7226a4wee543b216bb7fae6@mail.gmail.com> Hello all, I want to build a Beowulf cluster of 16+1 nodes with each node having one Intel Core2Duo (2.66 GHz, FSB 1333MHz, 4MB L2) processor and GigE as the interconnect. On this cluster, I would run my PETSc-based CFD/FEM codes (REQUIRING VERY FAST MEMORY/high memory bandwidth). Please help me out to select any one of the following boards: 1) Intel Server board S3200SH, System Bus 1333MHz, supporting 240-pin DDR2 800 MHz RAM 2) Intel Desktop board DX38BT, System Bus 1333MHz, supporting 240-pin DDR3 1333 MHz RAM See that RAM speed difference. Given that keeping the cluster running all the time and logging in of many users simultaneously is not the concern. The cluster may be dedicated to be used by one user whenever required. But it may be the case that running a code for several days will be required. Would the desktop board DX38BT be suitable to run the cluster for several hours/days? Which board do you recommend for this scenario? Regards, Amjad Ali. From dalcinl at gmail.com Fri Dec 7 06:43:44 2007 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Fri, 7 Dec 2007 09:43:44 -0300 Subject: Selecting specific board In-Reply-To: <428810f20712062144q2a7226a4wee543b216bb7fae6@mail.gmail.com> References: <428810f20712062144q2a7226a4wee543b216bb7fae6@mail.gmail.com> Message-ID: I'm not a hardware expert, but if your processors have FSB 1333MHz, then you should select the matching board, that is, the DX38BT one. 
Additionally, you have to carefully select your switch. However, this is not an easy task. If you have any chance of getting a switch from your provider for a trial, then try to do some testing based on the MPI_Alltoall() routine. I believe many (almost all?) switches have some performance drop in this scenario. On 12/7/07, amjad ali wrote: > Hello all, > > I want to build a Beowulf cluster of 16+1 nodes with each node having one > Intel Core2Duo (2.66 GHz, FSB 1333MHz, 4MB L2) processor and GigE as the > interconnect. On this cluster, I would run my PETSc-based CFD/FEM codes > (REQUIRING VERY FAST MEMORY/high memory bandwidth). Please help me out to > select any one of the following boards: > > 1) Intel Server board S3200SH, System Bus 1333MHz, supporting 240-pin DDR2 > 800 MHz RAM > 2) Intel Desktop board DX38BT, System Bus 1333MHz, supporting 240-pin DDR3 > 1333 MHz RAM > > See that RAM speed difference. Given that keeping the cluster running all > the time and logging in of many users simultaneously is not the concern. The > cluster may be dedicated to be used by one user whenever required. But it > may be the case that running a code for several days will be required. > > Would the desktop board DX38BT be suitable to run the cluster for several > hours/days? > Which board do you recommend for this scenario? > > Regards, > Amjad Ali. 
-- Lisandro Dalcín --------------- Centro Internacional de Métodos Computacionales en Ingeniería (CIMEC) Instituto de Desarrollo Tecnológico para la Industria Química (INTEC) Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) PTLC - Güemes 3450, (3000) Santa Fe, Argentina Tel/Fax: +54-(0)342-451.1594 From timothy.stitt at ichec.ie Fri Dec 7 09:58:30 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Fri, 07 Dec 2007 15:58:30 +0000 Subject: Zero Pivot Row in LU Factorization In-Reply-To: References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> <4757E68D.3030208@ichec.ie> Message-ID: <47596DA6.1030809@ichec.ie> Barry, I added the following lines to my Fortran code: call KSPGetPC(ksp,pc,error) call KSPSetUp(ksp,error) oneInt=1 call PCBJacobiGetSubKSP(pc,oneInt,PETSC_NULL,kspSub,error) call KSPGetPC(kspSub,pcSub,error) call PCFactorSetShiftNonzero(pcSub,PETSC_DECIDE,error) Now the parallel code goes beyond the zero pivot problem I was getting in the KSPSolve()...but only process 0 seems to complete the KSPSolve() and Process 1 and higher never makes it out of the KSPSolve() i.e. process 0 moves on and performs post-KSPSolve work (just some print statements) while the other processes never get out of KSPSolve(). My job only terminates once the requested wallclock expires. Again when running with only 1 process everything terminates successfully. Any ideas? Have I done something stupid with the instructions above? Thanks, Tim. Barry Smith wrote: > > On Dec 6, 2007, at 6:09 AM, Tim Stitt wrote: > >> Barry, >> >> I will be using these routines from Fortran..so I am assuming that >> Fortran interfaces are available for each routine? >> >> Also, how do I know how many sub ksp's there will be? I am assuming I >> need to dynamically allocate the subksp array in Fortran but do I >> know the size in advance? Is this related to the value 'n' ? If so, >> how do I calculate 'n'. 
> > There will always be one sub ksp be default. There will only be > more than one if you use > PCBJacobiSetLocalBlocks() or PCBJacobiSetTotalBlocks() or > -pc_bjacobi_blocks. > In general we recommend keeping it one. This means you do not need to > allocate > any KSP, just pass in a KSP variable > >> >> >> What is the significance of subksp[0]? Is it just the sub ksp at this >> position I should be interested in? > > This is just the first one. If you have multiply ones then you must > loop over them, but I > recommend having just one. >> >> >> Finally, which of the PCFactorSetxxxxxx routines should I be using? > > PCFactorSetZeroPivot() or PCFactorSetShiftNonzero() or > PCFactorSetShiftPd() depending > on what you want to have happen. > > Barry > >> >> >> Sorry for the twenty questions (well nearly) but I am just a bit >> confused with this approach. >> >> Thanks, >> >> Tim. >> >> Barry Smith wrote: >>> >>> KSP *subksp; >>> >>> KSPGetPC(ksp,pc) >>> PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) >>> KSPGetPC(subksp[0],&subpc); >>> PCFactorSetxxxxxx(subpc, .... >>> >>> Barry >>> >>> >>> On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: >>> >>>> I should also add that the code executes without this error when >>>> using 1 processor...but then displays the error when running in >>>> parallel with more than one process. >>>> >>>> Tim Stitt wrote: >>>>> Hi all, >>>>> >>>>> Can anyone suggest ways of overcoming the following pivot error I >>>>> keep receiving in my PETSc code during a KSPSolve(). >>>>> >>>>> [1]PETSC ERROR: Detected zero pivot in LU factorization >>>>> see >>>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! >>>>> >>>>> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance >>>>> 0.00165189 * rowsum 1.65189e+09! >>>>> >>>>> From checking the documentation....the error is in row 1801, which >>>>> means it is most likely not a matrix assembly issue? 
>>>>> >>>>> I tried the following prior to the solve with no luck either..... >>>>> >>>>> call KSPGetPC(ksp,pc,error) >>>>> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) >>>>> >>>>> Is there anything else I can try? >>>>> >>>>> Thanks, >>>>> >>>>> Tim. >>>>> >>>> >>>> >>>> --Dr. Timothy Stitt >>>> HPC Application Consultant - ICHEC (www.ichec.ie) >>>> >>>> Dublin Institute for Advanced Studies >>>> 5 Merrion Square - Dublin 2 - Ireland >>>> >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) >>>> >>> >> >> >> --Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Fri Dec 7 11:44:57 2007 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 7 Dec 2007 11:44:57 -0600 Subject: Zero Pivot Row in LU Factorization In-Reply-To: <47596DA6.1030809@ichec.ie> References: <4749C4FE.7070602@ichec.ie> <4749C894.5090602@ichec.ie> <982DFD3C-8D76-4045-A8B4-F046675DBC08@mcs.anl.gov> <4757E68D.3030208@ichec.ie> <47596DA6.1030809@ichec.ie> Message-ID: On Dec 7, 2007 9:58 AM, Tim Stitt wrote: > Barry, > > I added the following lines to my Fortran code: > > call KSPGetPC(ksp,pc,error) > call KSPSetUp(ksp,error) > oneInt=1 > call PCBJacobiGetSubKSP(pc,oneInt,PETSC_NULL,kspSub,error) > call KSPGetPC(kspSub,pcSub,error) > call PCFactorSetShiftNonzero(pcSub,PETSC_DECIDE,error) > > Now the parallel code goes beyond the zero pivot problem I was getting > in the KSPSolve()...but only process 0 seems to complete the KSPSolve() > and Process 1 and higher never makes it out of the KSPSolve() i.e. 
> process 0 moves on and performs post-KSPSolve work (just some print > statements) while the other processes never get out of KSPSolve(). My > job only terminates once the requested wallclock expires. Again when > running with only 1 process everything terminates successfully. This does not seem possible. BJacobi synchronizes at each step for a residual evaluation. Are you sure you did not call KSPSolve() on the inner KSP? Matt > Any ideas? Have I done something stupid with the instructions above? > > Thanks, > > Tim. > > Barry Smith wrote: > > > > On Dec 6, 2007, at 6:09 AM, Tim Stitt wrote: > > > >> Barry, > >> > >> I will be using these routines from Fortran..so I am assuming that > >> Fortran interfaces are available for each routine? > >> > >> Also, how do I know how many sub ksp's there will be? I am assuming I > >> need to dynamically allocate the subksp array in Fortran but do I > >> know the size in advance? Is this related to the value 'n' ? If so, > >> how do I calculate 'n'. > > > > There will always be one sub ksp be default. There will only be > > more than one if you use > > PCBJacobiSetLocalBlocks() or PCBJacobiSetTotalBlocks() or > > -pc_bjacobi_blocks. > > In general we recommend keeping it one. This means you do not need to > > allocate > > any KSP, just pass in a KSP variable > > > >> > >> > >> What is the significance of subksp[0]? Is it just the sub ksp at this > >> position I should be interested in? > > > > This is just the first one. If you have multiply ones then you must > > loop over them, but I > > recommend having just one. > >> > >> > >> Finally, which of the PCFactorSetxxxxxx routines should I be using? > > > > PCFactorSetZeroPivot() or PCFactorSetShiftNonzero() or > > PCFactorSetShiftPd() depending > > on what you want to have happen. > > > > Barry > > > >> > >> > >> Sorry for the twenty questions (well nearly) but I am just a bit > >> confused with this approach. > >> > >> Thanks, > >> > >> Tim. 
> >> > >> Barry Smith wrote: > >>> > >>> KSP *subksp; > >>> > >>> KSPGetPC(ksp,pc) > >>> PCBJacobiGetSubKSP(pc,&n,PETSC_NULL,&subksp) > >>> KSPGetPC(subksp[0],&subpc); > >>> PCFactorSetxxxxxx(subpc, .... > >>> > >>> Barry > >>> > >>> > >>> On Nov 25, 2007, at 1:10 PM, Tim Stitt wrote: > >>> > >>>> I should also add that the code executes without this error when > >>>> using 1 processor...but then displays the error when running in > >>>> parallel with more than one process. > >>>> > >>>> Tim Stitt wrote: > >>>>> Hi all, > >>>>> > >>>>> Can anyone suggest ways of overcoming the following pivot error I > >>>>> keep receiving in my PETSc code during a KSPSolve(). > >>>>> > >>>>> [1]PETSC ERROR: Detected zero pivot in LU factorization > >>>>> see > >>>>> http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#ZeroPivot! > >>>>> > >>>>> [1]PETSC ERROR: Zero pivot row 1801 value 0.00102826 tolerance > >>>>> 0.00165189 * rowsum 1.65189e+09! > >>>>> > >>>>> From checking the documentation....the error is in row 1801, which > >>>>> means it is most likely not a matrix assembly issue? > >>>>> > >>>>> I tried the following prior to the solve with no luck either..... > >>>>> > >>>>> call KSPGetPC(ksp,pc,error) > >>>>> call PCFactorSetShiftNonzero(pc,PETSC_DECIDE,error) > >>>>> > >>>>> Is there anything else I can try? > >>>>> > >>>>> Thanks, > >>>>> > >>>>> Tim. > >>>>> > >>>> > >>>> > >>>> --Dr. Timothy Stitt > >>>> HPC Application Consultant - ICHEC (www.ichec.ie) > >>>> > >>>> Dublin Institute for Advanced Studies > >>>> 5 Merrion Square - Dublin 2 - Ireland > >>>> > >>>> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >>>> > >>> > >> > >> > >> --Dr. Timothy Stitt > >> HPC Application Consultant - ICHEC (www.ichec.ie) > >> > >> Dublin Institute for Advanced Studies > >> 5 Merrion Square - Dublin 2 - Ireland > >> > >> +353-1-6621333 (tel) / +353-1-6621477 (fax) > >> > > > > > -- > Dr. 
Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From ondrej at certik.cz Sat Dec 8 16:33:24 2007 From: ondrej at certik.cz (Ondrej Certik) Date: Sat, 8 Dec 2007 23:33:24 +0100 Subject: which MPI can we use In-Reply-To: References: <428810f20711272224k116d77a3o12d60c9532052045@mail.gmail.com> Message-ID: <85b5c3130712081433i163b9a2fx24dc57cc62878675@mail.gmail.com> On Nov 28, 2007 4:49 PM, Lisandro Dalcin wrote: > On 11/28/07, amjad ali wrote: > > Please name the MPI libraries (other than MPICH2) which can be used > > efficiently with PETSc? > > On Linux/GNU, surely Open-MPI. You also have Intel-MPI (actually, it > is based on MPICH2). Yep, openmpi works nicely. You can also use petsc4py with that. If you use Debian, just install python-petsc4py and you'll get everything installed with openmpi. Ondrej From timothy.stitt at ichec.ie Sun Dec 9 11:37:07 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 09 Dec 2007 17:37:07 +0000 Subject: MatView() and Multiple Processes Message-ID: <475C27C3.7030001@ichec.ie> Hi all, I was just wondering if someone can tell me how to configure my dual-core laptop to allow graphical X11 output of my PETSc matrices. MatView() works fine on my parallel code with 1 process but I get the following errors on each process when I use more than one: [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Error in external library! [0]PETSC ERROR: Unable to open display on localhost.localdomain:0.0 . 
Make sure your COMPUTE NODES are authorized to connect to this X server and either your DISPLAY variable is set or you use the -display name option Thanks in advance as always, Tim. -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From knepley at gmail.com Sun Dec 9 11:41:56 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 9 Dec 2007 11:41:56 -0600 Subject: MatView() and Multiple Processes In-Reply-To: <475C27C3.7030001@ichec.ie> References: <475C27C3.7030001@ichec.ie> Message-ID: I would try -display :0.0 Matt On Dec 9, 2007 11:37 AM, Tim Stitt wrote: > Hi all, > > I was just wondering if someone can tell me how to configure my > dual-core laptop to allow graphical X11 output of my PETSc matrices. > MatView() works fine on my parallel code with 1 process but I get the > following errors on each process when I use more than one: > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Error in external library! > [0]PETSC ERROR: Unable to open display on localhost.localdomain:0.0 > . Make sure your COMPUTE NODES are authorized to connect > to this X server and either your DISPLAY variable > is set or you use the -display name option > > Thanks in advance as always, > > Tim. > > -- > Dr. Timothy Stitt > HPC Application Consultant - ICHEC (www.ichec.ie) > > Dublin Institute for Advanced Studies > 5 Merrion Square - Dublin 2 - Ireland > > +353-1-6621333 (tel) / +353-1-6621477 (fax) > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From timothy.stitt at ichec.ie Sun Dec 9 11:54:42 2007 From: timothy.stitt at ichec.ie (Tim Stitt) Date: Sun, 09 Dec 2007 17:54:42 +0000 Subject: MatView() and Multiple Processes In-Reply-To: References: <475C27C3.7030001@ichec.ie> Message-ID: <475C2BE2.8030302@ichec.ie> Perfect, Matthew, thanks...I was trying -display with localhost:0.0 and all possible permutations, but it seems your incantation is the correct one. Cheers, Tim. Matthew Knepley wrote: > I would try -display :0.0 > > Matt > > On Dec 9, 2007 11:37 AM, Tim Stitt wrote: > >> Hi all, >> >> I was just wondering if someone can tell me how to configure my >> dual-core laptop to allow graphical X11 output of my PETSc matrices. >> MatView() works fine on my parallel code with 1 process but I get the >> following errors on each process when I use more than one: >> >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Error in external library! >> [0]PETSC ERROR: Unable to open display on localhost.localdomain:0.0 >> . Make sure your COMPUTE NODES are authorized to connect >> to this X server and either your DISPLAY variable >> is set or you use the -display name option >> >> Thanks in advance as always, >> >> Tim. >> >> -- >> Dr. Timothy Stitt >> HPC Application Consultant - ICHEC (www.ichec.ie) >> >> Dublin Institute for Advanced Studies >> 5 Merrion Square - Dublin 2 - Ireland >> >> +353-1-6621333 (tel) / +353-1-6621477 (fax) >> >> >> > > > > -- Dr. Timothy Stitt HPC Application Consultant - ICHEC (www.ichec.ie) Dublin Institute for Advanced Studies 5 Merrion Square - Dublin 2 - Ireland +353-1-6621333 (tel) / +353-1-6621477 (fax) From amjad11 at gmail.com Tue Dec 11 09:33:03 2007 From: amjad11 at gmail.com (amjad ali) Date: Tue, 11 Dec 2007 20:33:03 +0500 Subject: PETSc on ROCKS Message-ID: <428810f20712110733l26211b1j2634fde417cc47ff@mail.gmail.com> Hi all, Can we install PETSc on a ROCKS-based cluster?
Or do we have to find some kind of PETSc-Roll? Have any of you experienced PETSc on ROCKS? regards, Amjad. -------------- next part -------------- An HTML attachment was scrubbed... URL: From randy at geosystem.us Tue Dec 11 09:35:29 2007 From: randy at geosystem.us (Randall Mackie) Date: Tue, 11 Dec 2007 07:35:29 -0800 Subject: PETSc on ROCKS In-Reply-To: <428810f20712110733l26211b1j2634fde417cc47ff@mail.gmail.com> References: <428810f20712110733l26211b1j2634fde417cc47ff@mail.gmail.com> Message-ID: <475EAE41.6040806@geosystem.us> Yes, we use PETSc on a ROCKS-based cluster and they work quite well together. Randy M. amjad ali wrote: > Hi all, > CAn we install PETSc on ROCKS-based-cluster? or we have find some kind > of PETSc-Roll? > > Have any of you experienced PETSc on ROCKS? > > regards, > Amjad. > From jwicks at cs.brown.edu Thu Dec 13 14:42:14 2007 From: jwicks at cs.brown.edu (John R. Wicks) Date: Thu, 13 Dec 2007 15:42:14 -0500 Subject: Norm computation In-Reply-To: <000201c834eb$e89ecfa0$0201a8c0@jwickslptp> Message-ID: <000b01c83dc8$a57215d0$0201a8c0@jwickslptp> I recently solved a linear system of very high dimension distributed over 32 Mac XServes. I was rather surprised by the performance statistics it reported, given below. In particular, how can VecNorm be so much more expensive than VecDot, since VecNorm should simply involve taking a single square root of a dot product?
--- Event Stage 2: LinearSolve

MatMult 19 1.0 1.8057e+01 1.7 1.19e+08 1.7 1.9e+04 5.5e+04 0.0e+00 2 17 49 49 0 16 17 50 50 0 2214
MatMultTranspose 19 1.0 1.6234e+01 2.1 1.73e+08 2.1 1.9e+04 5.5e+04 0.0e+00 2 18 49 49 0 11 18 50 50 0 2601
MatSolve 20 1.0 1.2656e+01 1.5 1.45e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3200
MatSolveTranspos 20 1.0 1.3608e+01 1.5 1.40e+08 1.5 0.0e+00 0.0e+00 0.0e+00 2 18 0 0 0 11 18 0 0 0 2976
MatLUFactorNum 1 1.0 1.9609e+01 6.5 2.71e+08 9.3 0.0e+00 0.0e+00 0.0e+00 1 14 0 0 0 10 14 0 0 0 1635
MatILUFactorSym 1 1.0 5.3393e+00 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 4 0 0 0 2 0
MatGetRowIJ 1 1.0 1.7881e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetOrdering 1 1.0 2.4659e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 3 0 0 0 0 3 0
VecDot 38 1.0 1.2653e+01 1.9 4.24e+07 2.2 0.0e+00 0.0e+00 3.8e+01 1 4 0 0 49 10 4 0 0 62 710
VecNorm 20 1.0 3.5348e+01 4.0 2.05e+07 6.0 0.0e+00 0.0e+00 2.0e+01 4 2 0 0 26 25 2 0 0 33 134
VecCopy 4 1.0 2.7451e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 79 1.0 8.9448e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecAXPY 57 1.0 3.2704e+00 2.2 2.33e+08 1.5 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 2 6 0 0 0 4118
VecAYPX 36 1.0 1.5667e+00 1.7 2.28e+08 1.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 1 4 0 0 0 5430
VecScatterBegin 38 1.0 1.1066e+00 2.1 0.00e+00 0.0 3.8e+04 5.5e+04 0.0e+00 0 0 97 98 0 1 0100100 0 0
VecScatterEnd 38 1.0 1.5381e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 12 0 0 0 0 0
KSPSetup 2 1.0 8.2719e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
KSPSolve 20 1.0 1.2881e+01 1.5 1.42e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3144
PCSetUp 2 1.0 2.5356e+01 5.5 1.96e+08 8.7 0.0e+00 0.0e+00 3.0e+00 2 14 0 0 4 13 14 0 0 5 1265
PCApply 40 1.0 4.7115e+01 2.1 1.59e+08 2.4 0.0e+00 0.0e+00 3.0e+00 5 49 0 0 4 35 49 0 0 5 2400

From bsmith at mcs.anl.gov Thu Dec 13 15:58:10 2007 From: 
bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 13 Dec 2007 15:58:10 -0600 Subject: Norm computation In-Reply-To: <000b01c83dc8$a57215d0$0201a8c0@jwickslptp> References: <000b01c83dc8$a57215d0$0201a8c0@jwickslptp> Message-ID: <0DC6210C-F590-4CC0-95E0-E590F5F6F9D4@mcs.anl.gov> The time for VecNorm and VecDot reflects two factors: 1) the time to perform the local floating point operations and 2) the time a process waits until all the other processes are ready to exchange data. 2) depends on whatever calculations are being done BEFORE the norm or dot and is largely related to the load balancing of the work there. If you look at the 4th column of numbers below, it is a measure of the load balance up to that point: for the VecDot it is 1.9, which means the fastest process was in the routine (mostly waiting) 1/1.9 times as long as the slowest process was in the routine. For VecNorm it is 4! Meaning some processes are waiting in VecNorm for a long time before the slowest gets to that routine and does its communications. Barry On Dec 13, 2007, at 2:42 PM, John R. Wicks wrote: > I recently peformed solved a linear system of very high dimension > distributed over 32 Mac XServe's. I was rather surprised by the > performance > statistics it reported, given below. In particular, how can VecNorm > be so > much more expensive than VecDot, since VecNorm should simply involve > taking > a single square root of a dot product. 
> 
> --- Event Stage 2: LinearSolve
> 
> MatMult 19 1.0 1.8057e+01 1.7 1.19e+08 1.7 1.9e+04 5.5e+04 0.0e+00 2 17 49 49 0 16 17 50 50 0 2214
> MatMultTranspose 19 1.0 1.6234e+01 2.1 1.73e+08 2.1 1.9e+04 5.5e+04 0.0e+00 2 18 49 49 0 11 18 50 50 0 2601
> MatSolve 20 1.0 1.2656e+01 1.5 1.45e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3200
> MatSolveTranspos 20 1.0 1.3608e+01 1.5 1.40e+08 1.5 0.0e+00 0.0e+00 0.0e+00 2 18 0 0 0 11 18 0 0 0 2976
> MatLUFactorNum 1 1.0 1.9609e+01 6.5 2.71e+08 9.3 0.0e+00 0.0e+00 0.0e+00 1 14 0 0 0 10 14 0 0 0 1635
> MatILUFactorSym 1 1.0 5.3393e+00 3.9 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+00 1 0 0 0 1 4 0 0 0 2 0
> MatGetRowIJ 1 1.0 1.7881e-05 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> MatGetOrdering 1 1.0 2.4659e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 3 0 0 0 0 3 0
> VecDot 38 1.0 1.2653e+01 1.9 4.24e+07 2.2 0.0e+00 0.0e+00 3.8e+01 1 4 0 0 49 10 4 0 0 62 710
> VecNorm 20 1.0 3.5348e+01 4.0 2.05e+07 6.0 0.0e+00 0.0e+00 2.0e+01 4 2 0 0 26 25 2 0 0 33 134
> VecCopy 4 1.0 2.7451e-01 4.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
> VecSet 79 1.0 8.9448e-01 1.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> VecAXPY 57 1.0 3.2704e+00 2.2 2.33e+08 1.5 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 2 6 0 0 0 4118
> VecAYPX 36 1.0 1.5667e+00 1.7 2.28e+08 1.3 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 1 4 0 0 0 5430
> VecScatterBegin 38 1.0 1.1066e+00 2.1 0.00e+00 0.0 3.8e+04 5.5e+04 0.0e+00 0 0 97 98 0 1 0100100 0 0
> VecScatterEnd 38 1.0 1.5381e+01 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 12 0 0 0 0 0
> KSPSetup 2 1.0 8.2719e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
> KSPSolve 20 1.0 1.2881e+01 1.5 1.42e+08 1.5 0.0e+00 0.0e+00 0.0e+00 1 18 0 0 0 10 18 0 0 0 3144
> PCSetUp 2 1.0 2.5356e+01 5.5 1.96e+08 8.7 0.0e+00 0.0e+00 3.0e+00 2 14 0 0 4 13 14 0 0 5 1265
> PCApply 40 1.0 4.7115e+01 2.1 1.59e+08 2.4 0.0e+00 0.0e+00 3.0e+00 5 49 0 0 4 35 49 0 0 5 2400
> 
From randy at geosystem.us Thu Dec 13 17:32:00 2007 From: randy at geosystem.us (Randall Mackie) Date: Thu, 13 Dec 2007 15:32:00 -0800 Subject: Question on Index Sets and VecScatters Message-ID: <4761C0F0.1070700@geosystem.us> I have a situation where I've put a model vector m(i,j,k) into a parallel PETSc vector for use in my modeling code. However, I'm now adding a bit of code where I want to do some calculations based on the 1D average of the model. In other words, for each k, I want to average m(i,j), and so produce a new model vector m_avg(k). So, to do this, it would seem that I need to create a VecScatter that will, for each layer, scatter all the m(i,j) into a 2D vector, then I can take the average. It would seem that I need to create an Index Set to do this, but I'm a bit confused as to how to go about it actually, since I've never used Index Sets. Can someone outline the basic steps given my description above? Thanks, Randy -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From knepley at gmail.com Thu Dec 13 17:46:39 2007 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Dec 2007 17:46:39 -0600 Subject: Question on Index Sets and VecScatters In-Reply-To: <4761C0F0.1070700@geosystem.us> References: <4761C0F0.1070700@geosystem.us> Message-ID: You could do it like that, but it seems pretty wasteful, especially in parallel where you might be sending a considerable amount of data. Why not do something like this: 1) Average all slabs into a local vector, indexed by the given k value, meaning you have a map {k_0, k_1, ..., k_m} --> {0,1, ...,m}. 2) Now construct a scatter that maps each local vector into a parallel vector of all the ks. 
The IndexSet for the from (local) vector will be {0, 1, ..., m} and the IndexSet for the to (global) vector will be {k_0, k_1, ... , k_m} on each process. 3) When you scatter use ADD_VALUES. Then you will have the sum, and just scale the vector by the slab size. Does this make sense to you? Thanks, Matt On Dec 13, 2007 5:32 PM, Randall Mackie wrote: > I have a situation where I've put a model vector m(i,j,k) into a parallel > PETSc vector for use in my modeling code. However, I'm now adding a bit of code > where I want to do some calculations based on the 1D average of the model. > In other words, for each k, I want to average m(i,j), and so produce a new > model vector m_avg(k). > > So, to do this, it would seem that I need to create a VecScatter that will, > for each layer, scatter all the m(i,j) into a 2D vector, then I can take > the average. It would seem that I need to create an Index Set to do this, > but I'm a bit confused as to how to go about it actually, since I've never > used Index Sets. > > Can someone outline the basic steps given my description above? > > Thanks, Randy > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From randy at geosystem.us Thu Dec 13 18:16:31 2007 From: randy at geosystem.us (Randall Mackie) Date: Thu, 13 Dec 2007 16:16:31 -0800 Subject: Question on Index Sets and VecScatters In-Reply-To: References: <4761C0F0.1070700@geosystem.us> Message-ID: <4761CB5F.3080609@geosystem.us> Hi Matt, Yes, I see what you're saying, and it makes sense. I'll give it a try. 
Randy Matthew Knepley wrote: > You could do it like that, but it seems pretty wasteful, especially in parallel > where you might be sending a considerable amount of data. Why not do > something like this: > > 1) Average all slabs into a local vector, indexed by the given k value, > meaning you have a map {k_0, k_1, ..., k_m} --> {0,1, ...,m}. > > 2) Now construct a scatter that maps each local vector into a parallel > vector of all the ks. The IndexSet for the from (local) vector will be > {0, 1, ..., m} and the IndexSet for the to (global) vector will be > {k_0, k_1, ... , k_m} on each process. > > 3) When you scatter use ADD_VALUES. Then you will have the sum, and > just scale the vector by the slab size. > > Does this makes sense to you? > > Thanks, > > Matt > > On Dec 13, 2007 5:32 PM, Randall Mackie wrote: >> I have a situation where I've put a model vector m(i,j,k) into a parallel >> PETSc vector for use in my modeling code. However, I'm now adding a bit of code >> where I want to do some calculations based on the 1D average of the model. >> In other words, for each k, I want to average m(i,j), and so produce a new >> model vector m_avg(k). >> >> So, to do this, it would seem that I need to create a VecScatter that will, >> for each layer, scatter all the m(i,j) into a 2D vector, then I can take >> the average. It would seem that I need to create an Index Set to do this, >> but I'm a bit confused as to how to go about it actually, since I've never >> used Index Sets. >> >> Can someone outline the basic steps given my description above? >> >> Thanks, Randy >> >> -- >> Randall Mackie >> GSY-USA, Inc. >> PMB# 643 >> 2261 Market St., >> San Francisco, CA 94114-1600 >> Tel (415) 469-8649 >> Fax (415) 469-5044 >> >> California Registered Geophysicist >> License No. GP 1034 >> >> > > > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. 
GP 1034 From thomas.fabry at uz.kuleuven.ac.be Tue Dec 18 06:22:17 2007 From: thomas.fabry at uz.kuleuven.ac.be (Thomas Fabry) Date: Tue, 18 Dec 2007 13:22:17 +0100 Subject: Petsc + Matlab Compute Engine Message-ID: I have a problem using the Matlab Compute Engine via Petsc. The line

ierr = PetscMatlabEngineCreate(PETSC_COMM_WORLD,PETSC_NULL,e); CHKERRQ(ierr);

and using this makefile:

CFLAGS = -c -I/usr/local/matlab14.3/extern/include -I/usr/local/matlab14.3/simulink/include
FFLAGS = -I${PETSC_DIR}/include/finclude
CPPFLAGS =
FPPFLAGS =

include ${PETSC_DIR}/bmake/common/base

secondPETScTest: secondPETScTest.o
        -${CLINKER} -o secondPETScTest secondPETScTest.o ${PETSC_KSP_LIB}
        ${RM} secondPETScTest.o

secondPETScTestm: secondPETScTest.o chkopts
        -${CLINKER} -O -pthread -shared -m32 -Wl,--version-script,/usr/local/matlab14.3/extern/lib/glnx86/mexFunction.map -o secondPETScTest secondPETScTest.o -Wl,-rpath-link,/usr/local/matlab14.3/bin/glnx86 -L/usr/local/matlab14.3/bin/glnx86 -lmx -lmex -lmat -lm -lstdc++ ${PETSC_KSP_LIB}
        ${RM} secondPETScTest.o

gives "/PETSc impl/secondPETScTest.c:38: undefined reference to `PetscMatlabEngineCreate'" when trying make secondPETScTest, and when I compile with make secondPETScTestm, compilation works, but running the program gives a segmentation fault. I hope someone can help me. Kind regards, Thomas Fabry From knepley at gmail.com Tue Dec 18 08:02:53 2007 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Dec 2007 08:02:53 -0600 Subject: Petsc + Matlab Compute Engine In-Reply-To: References: Message-ID: If you want to use the Matlab engine, you must configure PETSc to use Matlab: --with-matlab-dir= --with-matlab-engine. Thanks, Matt On Dec 18, 2007 6:22 AM, Thomas Fabry wrote: > I have a problem using the Matlab Compute Engine via Petsc. 
> The line > > ierr = PetscMatlabEngineCreate(PETSC_COM_WORLD,PETSC_NULL,e); > CHKERRQ(ierr); > > and using this makefile: > > CFLAGS = -c -I/usr/local/matlab14.3/extern/include > -I/usr/local/matlab14.3/simulink/include > FFLAGS = -I${PETSC_DIR}/include/finclude > CPPFLAGS = > FPPFLAGS = > > include ${PETSC_DIR}/bmake/common/base > > secondPETScTest: secondPETScTest.o > -${CLINKER} -o secondPETScTest secondPETScTest.o > ${PETSC_KSP_LIB} > ${RM} secondPETScTest.o > > secondPETScTestm: secondPETScTest.o chkopts > -${CLINKER} -O -pthread -shared -m32 > -Wl,--version-script,/usr/local/matlab14.3/extern/lib/glnx86/mexFunction > .map -o secondPETScTest secondPETScTest.o > -Wl,-rpath-link,/usr/local/matlab14.3/bin/glnx86 > -L/usr/local/matlab14.3/bin/glnx86 -lmx -lmex -lmat -lm -lstdc++ > ${PETSC_KSP_LIB} > ${RM} secondPETScTest.o > > gives "/PETSc impl/secondPETScTest.c:38: undefined reference to > `PetscMatlabEngineCreate'" when trying make secondPETScTest, and when I > compile with make secondPETScTestm, compilation works, but running the > program gives a segmentation fault. > > > I hope someone can help me > > Kind regards > > Thomas Fabry > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From randy at geosystem.us Tue Dec 18 13:43:25 2007 From: randy at geosystem.us (Randall Mackie) Date: Tue, 18 Dec 2007 11:43:25 -0800 Subject: Question on Index Sets and VecScatters In-Reply-To: References: <4761C0F0.1070700@geosystem.us> Message-ID: <476822DD.7070603@geosystem.us> Matt, Just a quick follow up question. The local vectors created in (1) are SEQ vectors on PETSC_COMM_SELF. To create the index sets, it seems like I should just use ISCreateStride, using 0, m for start and length of the index set. My question is should the communicator be PETSC_COMM_SELF or PETSC_COMM_WORLD? 
Similarly, the index set for the global vector should also be created with ISCreateStride, using k_0 and m for start and lengths. Same question about the communicator. Thanks, Randy Matthew Knepley wrote: > You could do it like that, but it seems pretty wasteful, especially in parallel > where you might be sending a considerable amount of data. Why not do > something like this: > > 1) Average all slabs into a local vector, indexed by the given k value, > meaning you have a map {k_0, k_1, ..., k_m} --> {0,1, ...,m}. > > 2) Now construct a scatter that maps each local vector into a parallel > vector of all the ks. The IndexSet for the from (local) vector will be > {0, 1, ..., m} and the IndexSet for the to (global) vector will be > {k_0, k_1, ... , k_m} on each process. > > 3) When you scatter use ADD_VALUES. Then you will have the sum, and > just scale the vector by the slab size. > > Does this makes sense to you? > > Thanks, > > Matt > > On Dec 13, 2007 5:32 PM, Randall Mackie wrote: >> I have a situation where I've put a model vector m(i,j,k) into a parallel >> PETSc vector for use in my modeling code. However, I'm now adding a bit of code >> where I want to do some calculations based on the 1D average of the model. >> In other words, for each k, I want to average m(i,j), and so produce a new >> model vector m_avg(k). >> >> So, to do this, it would seem that I need to create a VecScatter that will, >> for each layer, scatter all the m(i,j) into a 2D vector, then I can take >> the average. It would seem that I need to create an Index Set to do this, >> but I'm a bit confused as to how to go about it actually, since I've never >> used Index Sets. >> >> Can someone outline the basic steps given my description above? >> >> Thanks, Randy >> >> -- >> Randall Mackie >> GSY-USA, Inc. >> PMB# 643 >> 2261 Market St., >> San Francisco, CA 94114-1600 >> Tel (415) 469-8649 >> Fax (415) 469-5044 >> >> California Registered Geophysicist >> License No. 
GP 1034 >> >> > > > -- Randall Mackie GSY-USA, Inc. PMB# 643 2261 Market St., San Francisco, CA 94114-1600 Tel (415) 469-8649 Fax (415) 469-5044 California Registered Geophysicist License No. GP 1034 From knepley at gmail.com Tue Dec 18 16:01:34 2007 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Dec 2007 16:01:34 -0600 Subject: Question on Index Sets and VecScatters In-Reply-To: <476822DD.7070603@geosystem.us> References: <4761C0F0.1070700@geosystem.us> <476822DD.7070603@geosystem.us> Message-ID: On Dec 18, 2007 1:43 PM, Randall Mackie wrote: > Matt, > > Just a quick follow up question. The local vectors created in (1) > are SEQ vectors on PETSC_COMM_SELF. To create the index sets, it > seems like I should just use ISCreateStride, using 0, m for start > and length of the index set. My question is should the communicator > be PETSC_COMM_SELF or PETSC_COMM_WORLD? SELF. > Similarly, the index set for the global vector should also be > created with ISCreateStride, using k_0 and m for start and lengths. > Same question about the communicator. Comms on IndexSets do not actually matter. Thanks, Matt > Thanks, Randy > > > Matthew Knepley wrote: > > You could do it like that, but it seems pretty wasteful, especially in parallel > > where you might be sending a considerable amount of data. Why not do > > something like this: > > > > 1) Average all slabs into a local vector, indexed by the given k value, > > meaning you have a map {k_0, k_1, ..., k_m} --> {0,1, ...,m}. > > > > 2) Now construct a scatter that maps each local vector into a parallel > > vector of all the ks. The IndexSet for the from (local) vector will be > > {0, 1, ..., m} and the IndexSet for the to (global) vector will be > > {k_0, k_1, ... , k_m} on each process. > > > > 3) When you scatter use ADD_VALUES. Then you will have the sum, and > > just scale the vector by the slab size. > > > > Does this makes sense to you? 
> > > > Thanks, > > > > Matt > > > > On Dec 13, 2007 5:32 PM, Randall Mackie wrote: > >> I have a situation where I've put a model vector m(i,j,k) into a parallel > >> PETSc vector for use in my modeling code. However, I'm now adding a bit of code > >> where I want to do some calculations based on the 1D average of the model. > >> In other words, for each k, I want to average m(i,j), and so produce a new > >> model vector m_avg(k). > >> > >> So, to do this, it would seem that I need to create a VecScatter that will, > >> for each layer, scatter all the m(i,j) into a 2D vector, then I can take > >> the average. It would seem that I need to create an Index Set to do this, > >> but I'm a bit confused as to how to go about it actually, since I've never > >> used Index Sets. > >> > >> Can someone outline the basic steps given my description above? > >> > >> Thanks, Randy > >> > >> -- > >> Randall Mackie > >> GSY-USA, Inc. > >> PMB# 643 > >> 2261 Market St., > >> San Francisco, CA 94114-1600 > >> Tel (415) 469-8649 > >> Fax (415) 469-5044 > >> > >> California Registered Geophysicist > >> License No. GP 1034 > >> > >> > > > > > > > > -- > Randall Mackie > GSY-USA, Inc. > PMB# 643 > 2261 Market St., > San Francisco, CA 94114-1600 > Tel (415) 469-8649 > Fax (415) 469-5044 > > California Registered Geophysicist > License No. GP 1034 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From recrusader at gmail.com Fri Dec 21 01:16:29 2007 From: recrusader at gmail.com (Yujie) Date: Thu, 20 Dec 2007 23:16:29 -0800 Subject: how to visit the variable "bs" in pmat of preconditioner Message-ID: <7ff0ee010712202316i57a8927ehfaaa933bc7c17304@mail.gmail.com> hi, everyone now, I want to use the Hypre package via PETSc inside a third-party package. I need to access the variable "bs" in the Mat struct. In hypre.c, this variable lets BoomerAMG know the block size of the Mat. 
The code is as follows:

127: /* special case for BoomerAMG */
128: if (jac->setup == HYPRE_BoomerAMGSetup) {
129:   MatGetBlockSize(pc->pmat,&bs);
130:   if (bs > 1) {
131:     HYPRE_BoomerAMGSetNumFunctions(jac->hsolver,bs);
132:   }
133: };

However, I can't access this variable. I have obtained the pointer to the PC I use, but I can't access the variable pmat from my code, and I can't find any function in the PETSc manual that provides this. Could you give me some advice on how to do it? Merry X'mas! Regards, Yujie -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Dec 21 07:44:12 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 21 Dec 2007 07:44:12 -0600 Subject: how to visit the variable "bs" in pmat of preconditioner In-Reply-To: <7ff0ee010712202316i57a8927ehfaaa933bc7c17304@mail.gmail.com> References: <7ff0ee010712202316i57a8927ehfaaa933bc7c17304@mail.gmail.com> Message-ID: <219D5EF9-DFFB-409A-B2E3-825C64F1E54E@mcs.anl.gov> pmat is the matrix you set with KSPSetOperators() so you just need to set the block size of that matrix. On Dec 21, 2007, at 1:16 AM, Yujie wrote: > hi, everyone > > now, I want to use Hypre package via PETSc in third package. I need > to visit the variable "bs" in Mat struct. In hypre.c, this variable > may let BoomerAMG know the block size of Mat. The code is as follows: > > 127: /* special case for BoomerAMG */ > 128: if (jac->setup == HYPRE_BoomerAMGSetup) { > 129: MatGetBlockSize(pc->pmat,&bs); > 130: if (bs > 1) { > 131: HYPRE_BoomerAMGSetNumFunctions(jac->hsolver,bs); > 132: } > 133: }; > > However, I can't visit this variable. Now, I have get the pointer of > PC I use. I can't visit the variable pmat in my code. I can't find > any function to realize this function from PETSc manual. > Could you give me some advice about how to do? > > Merry X'mas! 
> > Regards, > Yujie > From billy at dem.uminho.pt Sat Dec 29 17:56:20 2007 From: billy at dem.uminho.pt (Billy Araújo) Date: Sat, 29 Dec 2007 23:56:20 -0000 Subject: Maintaining accuracy while increasing number of processors Message-ID: <1200D8BEDB3DD54DBA528E210F372BF3D94467@BEFUNCIONARIOS.uminho.pt> Hi, I need to know more about the PETSc parallel GMRES solver. Does the solver maintain the same accuracy independent of the number of processors? For example, if I subdivide a mesh with 1000 unknowns across 10, 100, or 1000 processors, should I expect to always get the same result? If no, why not? Are there any studies on this? Thank you, Billy. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Dec 29 19:00:56 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 29 Dec 2007 19:00:56 -0600 Subject: Maintaining accuracy while increasing number of processors In-Reply-To: <1200D8BEDB3DD54DBA528E210F372BF3D94467@BEFUNCIONARIOS.uminho.pt> References: <1200D8BEDB3DD54DBA528E210F372BF3D94467@BEFUNCIONARIOS.uminho.pt> Message-ID: <65DF14FD-8FFC-4F20-A30A-63EC203FCA52@mcs.anl.gov> Billy, By default GMRES and most of the other KSP solvers stop after a reduction in the 2-norm of the PRECONDITIONED residual by a factor of 10^-5. See the manual page for KSPDefaultConverged() http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/KSP/KSPDefaultConverged.html There are a couple of things to consider: 1) even with the exact same preconditioner (for example Jacobi) the convergence history will be slightly different since the computations are done in a different order and so the floating point results will be slightly different. The converged SOLUTIONS for a different number of processes are ALL correct, even though they have different values since the calculations are done in floating point. 
As you decrease the tolerance factors, you will see the SOLUTIONS for different numbers of processes all converge to the same answer (i.e. the solutions will share more and more significant digits.) 2) Most parallel preconditioners (even in exact precision) are different for a different number of processes, for example block Jacobi and the additive Schwarz method. So you get all the issues of 1) plus the fact that the convergence histories with different numbers of processes will be different. Again IF the solver is converging then the answers from any number of processes are equally correct. Also as you decrease the convergence tolerances you will see more and more common significant digits in the different solutions. Sometimes with a larger number of processes the preconditioner may stop working and you do not get convergence of GMRES and then, of course, the "answer" is garbage. You should always call KSPGetConvergedReason() to make sure the solver has converged. Barry On Dec 29, 2007, at 5:56 PM, Billy Araújo wrote: > > Hi, > > I need to know more about the PETSc parallel GMRES solver. Does the > solver maintain the same accuracy independent of the number of > processors. For example, if I subdivide a mesh with 1000 unkowns > into 10, 100, 1000 processors should I expect to get always the same > result? If no, why not? Are there any studies on this? > > Thank you, > > Billy. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From vijay.m at gmail.com Sat Dec 29 20:07:11 2007 From: vijay.m at gmail.com (Vijay M) Date: Sat, 29 Dec 2007 20:07:11 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <65DF14FD-8FFC-4F20-A30A-63EC203FCA52@mcs.anl.gov> Message-ID: <000601c84a88$b1f54cb0$203010ac@neutrino> Hi all, I was trying to compile and run the ex20.c example code in the tutorial section of SNES. 
Although it does not explicitly specify that -snes_mf option can be used, my understanding is that as long as a nonlinear residual function is written correctly, PETSc will calculate via finite difference the action of the Jacobian on a given vector. Is that correct ? Now if that is the case, then please observe the discrepancy in the number of linear iterations taken with an analytical Jacobian and matrix-free option. What puzzles me is that the SNES function norms are quite close for both methods but the linear iterations differ by a factor of 3. Why exactly is this ? Here's the output to make this clearer. vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor 0 SNES Function norm 2.271442542876e-01 1 SNES Function norm 6.881516100891e-02 2 SNES Function norm 1.813939751552e-02 3 SNES Function norm 2.354176462207e-03 4 SNES Function norm 3.063728077362e-05 5 SNES Function norm 3.106106268946e-08 6 SNES Function norm 5.344742712545e-12 0 SNES Function norm 2.271442542876e-01 1 SNES Function norm 6.881516100891e-02 2 SNES Function norm 1.813939751552e-02 3 SNES Function norm 2.354176462207e-03 4 SNES Function norm 3.063728077362e-05 5 SNES Function norm 3.106106268946e-08 6 SNES Function norm 5.344742712545e-12 Number of Newton iterations = 6 Number of Linear iterations = 18 Average Linear its / Newton = 3.000000e+00 vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf 0 SNES Function norm 2.271442542876e-01 1 SNES Function norm 6.870629867542e-02 2 SNES Function norm 1.804335379848e-02 3 SNES Function norm 2.290074339682e-03 4 SNES Function norm 3.082384186373e-05 5 SNES Function norm 3.926396277038e-09 6 SNES Function norm 3.754922566585e-16 0 SNES Function norm 2.271442542876e-01 1 SNES Function norm 6.870629867542e-02 2 SNES Function norm 1.804335379848e-02 3 SNES Function norm 2.290074339682e-03 4 SNES Function norm 3.082384186373e-05 5 SNES Function norm 3.926396277038e-09 6 SNES Function norm 3.754922566585e-16 Number of Newton iterations = 6
Number of Linear iterations = 54 Average Linear its / Newton = 9.000000e+00 Thanks, Vijay -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Dec 29 21:05:26 2007 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 29 Dec 2007 21:05:26 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <000601c84a88$b1f54cb0$203010ac@neutrino> References: <65DF14FD-8FFC-4F20-A30A-63EC203FCA52@mcs.anl.gov> <000601c84a88$b1f54cb0$203010ac@neutrino> Message-ID: On Dec 29, 2007 8:07 PM, Vijay M wrote: > Hi all, > > I was trying to compile and run the ex20.c example code in the tutorial > section of SNES. Although it does not explicitly specify that ?snes_mf > option can be used, my understanding is that as long as a nonlinear residual > function is written correctly, PETSc will calculate via finite difference > the action of the Jacobian on a given vector. Is that correct ? Yes. > Now if that is the case, then please observe the discrepancy in the number > of linear iterations taken with an analytical Jacobian and matrix-free > option. What puzzles me is that the SNES function norm are quite close for > both the methods but the linear iterations differ by a factor of 3. Why > exactly is this ? There is no PC when using -snes_mf whereas the default is ILU for the analytic Jacobian. Matt > Here's the output to make this clearer. 
> > vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.881516100891e-02 > > 2 SNES Function norm 1.813939751552e-02 > > 3 SNES Function norm 2.354176462207e-03 > > 4 SNES Function norm 3.063728077362e-05 > > 5 SNES Function norm 3.106106268946e-08 > > 6 SNES Function norm 5.344742712545e-12 > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.881516100891e-02 > > 2 SNES Function norm 1.813939751552e-02 > > 3 SNES Function norm 2.354176462207e-03 > > 4 SNES Function norm 3.063728077362e-05 > > 5 SNES Function norm 3.106106268946e-08 > > 6 SNES Function norm 5.344742712545e-12 > > Number of Newton iterations = 6 > > Number of Linear iterations = 18 > > Average Linear its / Newton = 3.000000e+00 > > > > vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.870629867542e-02 > > 2 SNES Function norm 1.804335379848e-02 > > 3 SNES Function norm 2.290074339682e-03 > > 4 SNES Function norm 3.082384186373e-05 > > 5 SNES Function norm 3.926396277038e-09 > > 6 SNES Function norm 3.754922566585e-16 > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.870629867542e-02 > > 2 SNES Function norm 1.804335379848e-02 > > 3 SNES Function norm 2.290074339682e-03 > > 4 SNES Function norm 3.082384186373e-05 > > 5 SNES Function norm 3.926396277038e-09 > > 6 SNES Function norm 3.754922566585e-16 > > Number of Newton iterations = 6 > > Number of Linear iterations = 54 > > Average Linear its / Newton = 9.000000e+00 > > > > Thanks, > > Vijay > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From vijay.m at gmail.com Sun Dec 30 12:44:24 2007 From: vijay.m at gmail.com (Vijay M) Date: Sun, 30 Dec 2007 12:44:24 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: Message-ID: <000001c84b14$00ed25f0$6c00a8c0@neutrino> Matt, Thanks for the reply. What you suggested makes sense and so to start from a common ground, I used no preconditioner at all in both the J-free and analytical Jacobian cases. But now, interestingly, the analytical Jacobian takes around twice the number of linear iterations. mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -pc_type none Number of Newton iterations = 6 Number of Linear iterations = 112 Average Linear its / Newton = 1.866667e+01 mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf -pc_type none Number of Newton iterations = 6 Number of Linear iterations = 54 Average Linear its / Newton = 9.000000e+00 I understand that both the methods will not give me the same number of total linear iterations but a factor of 2 seems a little odd to me. This leads to another question whether the user can actually change the epsilon used for computing the perturbation in J-free scheme or is this fixed in PETSc ? If not, then what do you think is the reason for this ? Do let me know your comments when you get some time. Thanks. Vijay -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Matthew Knepley Sent: Saturday, December 29, 2007 9:05 PM To: petsc-users at mcs.anl.gov Subject: Re: Matrix free example snes/ex20.c On Dec 29, 2007 8:07 PM, Vijay M wrote: > Hi all, > > I was trying to compile and run the ex20.c example code in the tutorial > section of SNES. Although it does not explicitly specify that -snes_mf > option can be used, my understanding is that as long as a nonlinear residual > function is written correctly, PETSc will calculate via finite difference > the action of the Jacobian on a given vector. Is that correct ? Yes. 
> Now if that is the case, then please observe the discrepancy in the number > of linear iterations taken with an analytical Jacobian and matrix-free > option. What puzzles me is that the SNES function norm are quite close for > both the methods but the linear iterations differ by a factor of 3. Why > exactly is this ? There is no PC when using -snes_mf whereas the default is ILU for the analytic Jacobian. Matt > Here's the output to make this clearer. > > vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.881516100891e-02 > > 2 SNES Function norm 1.813939751552e-02 > > 3 SNES Function norm 2.354176462207e-03 > > 4 SNES Function norm 3.063728077362e-05 > > 5 SNES Function norm 3.106106268946e-08 > > 6 SNES Function norm 5.344742712545e-12 > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.881516100891e-02 > > 2 SNES Function norm 1.813939751552e-02 > > 3 SNES Function norm 2.354176462207e-03 > > 4 SNES Function norm 3.063728077362e-05 > > 5 SNES Function norm 3.106106268946e-08 > > 6 SNES Function norm 5.344742712545e-12 > > Number of Newton iterations = 6 > > Number of Linear iterations = 18 > > Average Linear its / Newton = 3.000000e+00 > > > > vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.870629867542e-02 > > 2 SNES Function norm 1.804335379848e-02 > > 3 SNES Function norm 2.290074339682e-03 > > 4 SNES Function norm 3.082384186373e-05 > > 5 SNES Function norm 3.926396277038e-09 > > 6 SNES Function norm 3.754922566585e-16 > > 0 SNES Function norm 2.271442542876e-01 > > 1 SNES Function norm 6.870629867542e-02 > > 2 SNES Function norm 1.804335379848e-02 > > 3 SNES Function norm 2.290074339682e-03 > > 4 SNES Function norm 3.082384186373e-05 > > 5 SNES Function norm 3.926396277038e-09 > > 6 SNES Function norm 3.754922566585e-16 > > Number of Newton iterations = 6 > > Number of Linear 
iterations = 54 > > Average Linear its / Newton = 9.000000e+00 > > > > Thanks, > > Vijay > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener From bsmith at mcs.anl.gov Sun Dec 30 13:46:14 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 30 Dec 2007 13:46:14 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <000001c84b14$00ed25f0$6c00a8c0@neutrino> References: <000001c84b14$00ed25f0$6c00a8c0@neutrino> Message-ID: <81FBC8B4-77DE-453B-A226-CDFB8D10C1B6@mcs.anl.gov> On Dec 30, 2007, at 12:44 PM, Vijay M wrote: > Matt, > > Thanks for the reply. What you suggested makes sense and so to start > from a > common ground, I used no preconditioner at all in both the J-free and > analytical Jacobian cases. But now, interestingly, the analytical > Jacobian > takes around twice the number of linear iterations. > > mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -pc_type none > Number of Newton iterations = 6 > Number of Linear iterations = 112 > Average Linear its / Newton = 1.866667e+01 > > mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf -pc_type none > Number of Newton iterations = 6 > Number of Linear iterations = 54 > Average Linear its / Newton = 9.000000e+00 > > I understand that both the methods will not give me the same number > of total > linear iterations but a factor of 2 seems a little odd to me. Yes, this is surprising. Run with -ksp_monitor; how is the linear convergence different? > This leads to > another question whether the user can actually change the epsilon > used for > computing the perturbation in J-free scheme or is this fixed in > PETSc ? Yes, see the manual page for MatMFFDSetFromOptions() and related manual pages. > > > If not, then what do you think is the reason for this ? Bug in your analytic Jacobian? Run with -snes_monitor and -ksp_monitor and send all output.
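Conceptually, what -snes_mf does is one extra residual evaluation per Krylov vector: J(u)v is approximated as (F(u + h*v) - F(u))/h, so the Jacobian is never formed. A toy sketch in plain Python -- the residual F and the fixed step h below are invented for illustration; PETSc's matrix-free machinery chooses the differencing parameter adaptively (see MatMFFDSetFromOptions()):

```python
def F(u):
    """Toy nonlinear residual (a stand-in for the user's SNES function)."""
    return [u[0] ** 2 + u[1] - 3.0, u[0] + u[1] ** 3 - 5.0]

def jacobian_action(F, u, v, h=1e-7):
    """Approximate J(u)*v by finite-differencing the residual:
    J(u)v ~= (F(u + h*v) - F(u)) / h, without ever forming J."""
    Fu = F(u)
    Fp = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(fp - f0) / h for fp, f0 in zip(Fp, Fu)]

u = [1.0, 2.0]
v = [1.0, 0.0]
approx = jacobian_action(F, u, v)
# The exact Jacobian here is [[2*u0, 1], [1, 3*u1**2]], so J(u)*v = [2, 1].
print(approx)
```

The choice of h trades truncation error against rounding error, which is exactly the "epsilon" Vijay asks about; in PETSc it is tunable rather than fixed.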
Barry > Do let me know your > comments when you get some time. Thanks. > > Vijay > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] > On Behalf Of Matthew Knepley > Sent: Saturday, December 29, 2007 9:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Matrix free example snes/ex20.c > > On Dec 29, 2007 8:07 PM, Vijay M wrote: >> Hi all, >> >> I was trying to compile and run the ex20.c example code in the >> tutorial >> section of SNES. Although it does not explicitly specify that - >> snes_mf >> option can be used, my understanding is that as long as a nonlinear > residual >> function is written correctly, PETSc will calculate via finite >> difference >> the action of the Jacobian on a given vector. Is that correct ? > > Yes. > >> Now if that is the case, then please observe the discrepancy in the >> number >> of linear iterations taken with an analytical Jacobian and matrix- >> free >> option. What puzzles me is that the SNES function norm are quite >> close for >> both the methods but the linear iterations differ by a factor of 3. >> Why >> exactly is this ? > > There is no PC when using -snes_mf whereas the default is ILU for the > analytic > Jacobian. > > Matt > >> Here's the output to make this clearer. 
>> >> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.881516100891e-02 >> >> 2 SNES Function norm 1.813939751552e-02 >> >> 3 SNES Function norm 2.354176462207e-03 >> >> 4 SNES Function norm 3.063728077362e-05 >> >> 5 SNES Function norm 3.106106268946e-08 >> >> 6 SNES Function norm 5.344742712545e-12 >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.881516100891e-02 >> >> 2 SNES Function norm 1.813939751552e-02 >> >> 3 SNES Function norm 2.354176462207e-03 >> >> 4 SNES Function norm 3.063728077362e-05 >> >> 5 SNES Function norm 3.106106268946e-08 >> >> 6 SNES Function norm 5.344742712545e-12 >> >> Number of Newton iterations = 6 >> >> Number of Linear iterations = 18 >> >> Average Linear its / Newton = 3.000000e+00 >> >> >> >> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.870629867542e-02 >> >> 2 SNES Function norm 1.804335379848e-02 >> >> 3 SNES Function norm 2.290074339682e-03 >> >> 4 SNES Function norm 3.082384186373e-05 >> >> 5 SNES Function norm 3.926396277038e-09 >> >> 6 SNES Function norm 3.754922566585e-16 >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.870629867542e-02 >> >> 2 SNES Function norm 1.804335379848e-02 >> >> 3 SNES Function norm 2.290074339682e-03 >> >> 4 SNES Function norm 3.082384186373e-05 >> >> 5 SNES Function norm 3.926396277038e-09 >> >> 6 SNES Function norm 3.754922566585e-16 >> >> Number of Newton iterations = 6 >> >> Number of Linear iterations = 54 >> >> Average Linear its / Newton = 9.000000e+00 >> >> >> >> Thanks, >> >> Vijay >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > From vijay.m at gmail.com Sun Dec 30 14:19:32 2007 From: vijay.m at gmail.com (Vijay M) Date: Sun, 30 Dec 2007 14:19:32 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <81FBC8B4-77DE-453B-A226-CDFB8D10C1B6@mcs.anl.gov> Message-ID: <000001c84b21$4b6b8240$163010ac@neutrino> I ran both the cases with -ksp_monitor on and have attached the output in two different files. 1.txt is the Jfree case and 2.txt is the analytical case. Barry, the example problem is ex20 from the snes tutorial directory. The petsc version is 2.3.3-p7 if that helps to clear things a little. Now I haven't yet completely checked for a bug in the analytical Jacobian but I would imagine that if it were incorrect, wouldn't that affect only how the nonlinear iteration converges and not the linear iteration since the matrix sparsity structure is still the same (well assuming the condition number is not very different from the exact Jacobian !). Just my 2 cents. Anyway, I will look into the code for ex20 and then see if something is messed up. Let me know if you find out the problem from the output. Thanks, Vijay > I understand that both the methods will not give me the same number > of total > linear iterations but a factor of 2 seems a little odd to me. Yes, this is surprising. Run with -ksp_monitor how are the linear convergence different? > This leads to > another question whether the user can actually change the epsilon > used for > computing the perturbation in J-free scheme or is this fixed in > PETSc ? Yes, see the manual page for MatMFFDSetFromOptions() and related manual pages. > > > If not, then what do you think is the reason for this ? Bug in your analytic Jacobian? Run with -snes_monitor and - ksp_monitor and send all output. Barry > Do let me know your > comments when you get some time. Thanks. 
> > Vijay > > -----Original Message----- > From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov > ] > On Behalf Of Matthew Knepley > Sent: Saturday, December 29, 2007 9:05 PM > To: petsc-users at mcs.anl.gov > Subject: Re: Matrix free example snes/ex20.c > > On Dec 29, 2007 8:07 PM, Vijay M wrote: >> Hi all, >> >> I was trying to compile and run the ex20.c example code in the >> tutorial >> section of SNES. Although it does not explicitly specify that - >> snes_mf >> option can be used, my understanding is that as long as a nonlinear > residual >> function is written correctly, PETSc will calculate via finite >> difference >> the action of the Jacobian on a given vector. Is that correct ? > > Yes. > >> Now if that is the case, then please observe the discrepancy in the >> number >> of linear iterations taken with an analytical Jacobian and matrix- >> free >> option. What puzzles me is that the SNES function norm are quite >> close for >> both the methods but the linear iterations differ by a factor of 3. >> Why >> exactly is this ? > > There is no PC when using -snes_mf whereas the default is ILU for the > analytic > Jacobian. > > Matt > >> Here's the output to make this clearer. 
>> >> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.881516100891e-02 >> >> 2 SNES Function norm 1.813939751552e-02 >> >> 3 SNES Function norm 2.354176462207e-03 >> >> 4 SNES Function norm 3.063728077362e-05 >> >> 5 SNES Function norm 3.106106268946e-08 >> >> 6 SNES Function norm 5.344742712545e-12 >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.881516100891e-02 >> >> 2 SNES Function norm 1.813939751552e-02 >> >> 3 SNES Function norm 2.354176462207e-03 >> >> 4 SNES Function norm 3.063728077362e-05 >> >> 5 SNES Function norm 3.106106268946e-08 >> >> 6 SNES Function norm 5.344742712545e-12 >> >> Number of Newton iterations = 6 >> >> Number of Linear iterations = 18 >> >> Average Linear its / Newton = 3.000000e+00 >> >> >> >> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.870629867542e-02 >> >> 2 SNES Function norm 1.804335379848e-02 >> >> 3 SNES Function norm 2.290074339682e-03 >> >> 4 SNES Function norm 3.082384186373e-05 >> >> 5 SNES Function norm 3.926396277038e-09 >> >> 6 SNES Function norm 3.754922566585e-16 >> >> 0 SNES Function norm 2.271442542876e-01 >> >> 1 SNES Function norm 6.870629867542e-02 >> >> 2 SNES Function norm 1.804335379848e-02 >> >> 3 SNES Function norm 2.290074339682e-03 >> >> 4 SNES Function norm 3.082384186373e-05 >> >> 5 SNES Function norm 3.926396277038e-09 >> >> 6 SNES Function norm 3.754922566585e-16 >> >> Number of Newton iterations = 6 >> >> Number of Linear iterations = 54 >> >> Average Linear its / Newton = 9.000000e+00 >> >> >> >> Thanks, >> >> Vijay >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. 
> -- Norbert Wiener > -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 1.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: 2.txt URL: From bsmith at mcs.anl.gov Sun Dec 30 16:51:20 2007 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 30 Dec 2007 16:51:20 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: <000001c84b21$4b6b8240$163010ac@neutrino> References: <000001c84b21$4b6b8240$163010ac@neutrino> Message-ID: Vijay, This is a very cool problem. Because of the exact symmetry of the domain the EXACT Jacobian at each step has exactly 9 different eigenvalues. This means the GMRES will take exactly 9 iterations (and "completely" converge in the ninth iteration) if the "exact" Jacobian is used. You can run with -pc_type none -snes_mf -ksp_monitor_singular_value -ksp_plot_eigenvalues -display :0.0 -draw_pause -1 to see the 9 eigenvalues. Now run without the -snes_mf option. You will see the first Newton iteration's eigenvalues still look like 9; but starting at the second Newton iteration the "identical" eigenvalues are now not all identically placed so GMRES needs more iterations. The question then becomes how come the matrix-free application of the Jacobian is more accurate than actually computing it as a sparse matrix then applying it? Here is my non-rigorous answer; the multiplication of the sparse matrix values (even if very accurate) against the vector introduces some rounding error that screws up the eigenvalues slightly. For some reason for this problem the matrix-free application is accurate enough not to perturb the eigenvalues. Barry On Dec 30, 2007, at 2:19 PM, Vijay M wrote: > I ran both the cases with -ksp_monitor on and have attached the > output in > two different files. 1.txt is the Jfree case and 2.txt is the > analytical > case. > > Barry, the example problem is ex20 from the snes tutorial directory.
> The > petsc version is 2.3.3-p7 if that helps to clear things a little. > Now I > haven't yet completely checked for a bug in the analytical Jacobian > but I > would imagine that if it were incorrect, wouldn't that affect only > how the > nonlinear iteration converges and not the linear iteration since the > matrix > sparsity structure is still the same (well assuming the condition > number is > not very different from the exact Jacobian !). Just my 2 cents. > > Anyway, I will look into the code for ex20 and then see if something > is > messed up. Let me know if you find out the problem from the output. > > Thanks, > Vijay > >> I understand that both the methods will not give me the same number >> of total >> linear iterations but a factor of 2 seems a little odd to me. > > Yes, this is surprising. > > Run with -ksp_monitor how are the linear convergence different? > >> This leads to >> another question whether the user can actually change the epsilon >> used for >> computing the perturbation in J-free scheme or is this fixed in >> PETSc ? > > Yes, see the manual page for MatMFFDSetFromOptions() and related > manual > pages. > >> >> >> If not, then what do you think is the reason for this ? > > Bug in your analytic Jacobian? Run with -snes_monitor and - > ksp_monitor and > send all output. > > Barry > >> Do let me know your >> comments when you get some time. Thanks. >> >> Vijay >> >> -----Original Message----- >> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov >> ] >> On Behalf Of Matthew Knepley >> Sent: Saturday, December 29, 2007 9:05 PM >> To: petsc-users at mcs.anl.gov >> Subject: Re: Matrix free example snes/ex20.c >> >> On Dec 29, 2007 8:07 PM, Vijay M wrote: >>> Hi all, >>> >>> I was trying to compile and run the ex20.c example code in the >>> tutorial >>> section of SNES. 
Although it does not explicitly specify that - >>> snes_mf >>> option can be used, my understanding is that as long as a nonlinear >> residual >>> function is written correctly, PETSc will calculate via finite >>> difference >>> the action of the Jacobian on a given vector. Is that correct ? >> >> Yes. >> >>> Now if that is the case, then please observe the discrepancy in the >>> number >>> of linear iterations taken with an analytical Jacobian and matrix- >>> free >>> option. What puzzles me is that the SNES function norm are quite >>> close for >>> both the methods but the linear iterations differ by a factor of 3. >>> Why >>> exactly is this ? >> >> There is no PC when using -snes_mf whereas the default is ILU for the >> analytic >> Jacobian. >> >> Matt >> >>> Here's the output to make this clearer. >>> >>> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor >>> >>> 0 SNES Function norm 2.271442542876e-01 >>> >>> 1 SNES Function norm 6.881516100891e-02 >>> >>> 2 SNES Function norm 1.813939751552e-02 >>> >>> 3 SNES Function norm 2.354176462207e-03 >>> >>> 4 SNES Function norm 3.063728077362e-05 >>> >>> 5 SNES Function norm 3.106106268946e-08 >>> >>> 6 SNES Function norm 5.344742712545e-12 >>> >>> 0 SNES Function norm 2.271442542876e-01 >>> >>> 1 SNES Function norm 6.881516100891e-02 >>> >>> 2 SNES Function norm 1.813939751552e-02 >>> >>> 3 SNES Function norm 2.354176462207e-03 >>> >>> 4 SNES Function norm 3.063728077362e-05 >>> >>> 5 SNES Function norm 3.106106268946e-08 >>> >>> 6 SNES Function norm 5.344742712545e-12 >>> >>> Number of Newton iterations = 6 >>> >>> Number of Linear iterations = 18 >>> >>> Average Linear its / Newton = 3.000000e+00 >>> >>> >>> >>> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf >>> >>> 0 SNES Function norm 2.271442542876e-01 >>> >>> 1 SNES Function norm 6.870629867542e-02 >>> >>> 2 SNES Function norm 1.804335379848e-02 >>> >>> 3 SNES Function norm 2.290074339682e-03 >>> >>> 4 SNES Function norm 
3.082384186373e-05 >>> >>> 5 SNES Function norm 3.926396277038e-09 >>> >>> 6 SNES Function norm 3.754922566585e-16 >>> >>> 0 SNES Function norm 2.271442542876e-01 >>> >>> 1 SNES Function norm 6.870629867542e-02 >>> >>> 2 SNES Function norm 1.804335379848e-02 >>> >>> 3 SNES Function norm 2.290074339682e-03 >>> >>> 4 SNES Function norm 3.082384186373e-05 >>> >>> 5 SNES Function norm 3.926396277038e-09 >>> >>> 6 SNES Function norm 3.754922566585e-16 >>> >>> Number of Newton iterations = 6 >>> >>> Number of Linear iterations = 54 >>> >>> Average Linear its / Newton = 9.000000e+00 >>> >>> >>> >>> Thanks, >>> >>> Vijay >>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> > <1.txt><2.txt> From vijay.m at gmail.com Mon Dec 31 18:06:17 2007 From: vijay.m at gmail.com (Vijay M) Date: Mon, 31 Dec 2007 18:06:17 -0600 Subject: Matrix free example snes/ex20.c In-Reply-To: Message-ID: <000001c84c0a$222dccf0$6e00a8c0@neutrino> Barry, Thanks for the detailed explanation. That sure is a tricky and interesting problem. I did run the problem with the options you suggested and see what you mean. I just have one another question though that is not quite related to the example: Say when you do J-free Newton-Krylov iteration, then is it correct to say that the F.D calculation of the action of Jacobian on a vector is more accurate than using a numerical Jacobian (not analytical) found at the start of a Newton iteration ? Because even though in both cases, the Jacobian is technically found by perturbation about the last Newton iteration, it seems to me that there is some gain in this convergence respect with J-free immaterial of the problem being solved. Now is that confusing or am I making sense ? I'll be glad to explain more on that and awaiting to hear your comments. Well, happy new year to you Barry and all the PETSc team !! 
Cheers, Vijay -----Original Message----- From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Barry Smith Sent: Sunday, December 30, 2007 4:51 PM To: petsc-users at mcs.anl.gov Subject: Re: Matrix free example snes/ex20.c Vijay, This is a very cool problem. Because of the exact symmetry of the domain the EXACT Jacobian at each step has exactly 9 different eigenvalues. This means the GMRES will take exactly 9 iterations (and "completely" converge in the ninth iteration) if the "exact" Jacobian is used. You can run with -pc_type none -snes_mf -ksp_monitor_singular_value - ksp_plot_eigenvalues -display :0.0 -draw_pause -1 to see the 9 eigenvalues. Now run without the -snes_mf option. You will see the first Newton iteration's eigenvalues still look like 9; but starting at the second Newton iteration the "identical" eigenvalues are now not all identically placed so GMRES needs more iterations. The question then becomes how come the matrix-free application of the Jacobian is more accurate than actually computing it as a sparse matrix then applying it? Here is my non-rigorous answer; the multiplication of the sparse matrix values (even if very accurate) against the vector introduces some rounding error that screws up the eigenvalues slightly. For some reason for this problem the matrix-free application is accurate enough not to perturb the eigenvalues. Barry On Dec 30, 2007, at 2:19 PM, Vijay M wrote: > I ran both the cases with -ksp_monitor on and have attached the > output in > two different files. 1.txt is the Jfree case and 2.txt is the > analytical > case. > > Barry, the example problem is ex20 from the snes tutorial directory. > The > petsc version is 2.3.3-p7 if that helps to clear things a little. 
> Now I haven't yet completely checked for a bug in the analytical
> Jacobian, but I would imagine that if it were incorrect, it would only
> affect how the nonlinear iteration converges and not the linear
> iteration, since the matrix sparsity structure is still the same (well,
> assuming the condition number is not very different from that of the
> exact Jacobian!). Just my 2 cents.
>
> Anyway, I will look into the code for ex20 and see if something is
> messed up. Let me know if you find out the problem from the output.
>
> Thanks,
> Vijay
>
>> I understand that the two methods will not give me the same total
>> number of linear iterations, but a factor of 2 seems a little odd to me.
>
> Yes, this is surprising.
>
> Run with -ksp_monitor; how does the linear convergence differ?
>
>> This leads to another question: can the user actually change the
>> epsilon used for computing the perturbation in the J-free scheme, or is
>> this fixed in PETSc?
>
> Yes, see the manual page for MatMFFDSetFromOptions() and related manual
> pages.
>
>> If not, then what do you think is the reason for this?
>
> Bug in your analytic Jacobian? Run with -snes_monitor and -ksp_monitor
> and send all output.
>
> Barry
>
>> Do let me know your comments when you get some time. Thanks.
>>
>> Vijay
>>
>> -----Original Message-----
>> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov]
>> On Behalf Of Matthew Knepley
>> Sent: Saturday, December 29, 2007 9:05 PM
>> To: petsc-users at mcs.anl.gov
>> Subject: Re: Matrix free example snes/ex20.c
>>
>> On Dec 29, 2007 8:07 PM, Vijay M wrote:
>>> Hi all,
>>>
>>> I was trying to compile and run the ex20.c example code in the
>>> tutorial section of SNES. Although it does not explicitly state that
>>> the -snes_mf option can be used, my understanding is that as long as
>>> the nonlinear residual function is written correctly, PETSc will
>>> compute the action of the Jacobian on a given vector via finite
>>> differences. Is that correct?
>>
>> Yes.
>>
>>> Now if that is the case, then please observe the discrepancy in the
>>> number of linear iterations taken with the analytical Jacobian and the
>>> matrix-free option. What puzzles me is that the SNES function norms
>>> are quite close for both methods, but the linear iterations differ by
>>> a factor of 3. Why exactly is this?
>>
>> There is no PC when using -snes_mf, whereas the default is ILU for the
>> analytic Jacobian.
>>
>> Matt
>>
>>> Here's the output to make this clearer.
>>>
>>> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor
>>>
>>> 0 SNES Function norm 2.271442542876e-01
>>> 1 SNES Function norm 6.881516100891e-02
>>> 2 SNES Function norm 1.813939751552e-02
>>> 3 SNES Function norm 2.354176462207e-03
>>> 4 SNES Function norm 3.063728077362e-05
>>> 5 SNES Function norm 3.106106268946e-08
>>> 6 SNES Function norm 5.344742712545e-12
>>> 0 SNES Function norm 2.271442542876e-01
>>> 1 SNES Function norm 6.881516100891e-02
>>> 2 SNES Function norm 1.813939751552e-02
>>> 3 SNES Function norm 2.354176462207e-03
>>> 4 SNES Function norm 3.063728077362e-05
>>> 5 SNES Function norm 3.106106268946e-08
>>> 6 SNES Function norm 5.344742712545e-12
>>>
>>> Number of Newton iterations = 6
>>> Number of Linear iterations = 18
>>> Average Linear its / Newton = 3.000000e+00
>>>
>>> vijay :mpirun -np 1 ex20 -ksp_type gmres -snes_monitor -snes_mf
>>>
>>> 0 SNES Function norm 2.271442542876e-01
>>> 1 SNES Function norm 6.870629867542e-02
>>> 2 SNES Function norm 1.804335379848e-02
>>> 3 SNES Function norm 2.290074339682e-03
>>> 4 SNES Function norm 3.082384186373e-05
>>> 5 SNES Function norm 3.926396277038e-09
>>> 6 SNES Function norm 3.754922566585e-16
>>> 0 SNES Function norm 2.271442542876e-01
>>> 1 SNES Function norm 6.870629867542e-02
>>> 2 SNES Function norm 1.804335379848e-02
>>> 3 SNES Function norm 2.290074339682e-03
>>> 4 SNES Function norm 3.082384186373e-05
>>> 5 SNES Function norm 3.926396277038e-09
>>> 6 SNES Function norm 3.754922566585e-16
>>>
>>> Number of Newton iterations = 6
>>> Number of Linear iterations = 54
>>> Average Linear its / Newton = 9.000000e+00
>>>
>>> Thanks,
>>>
>>> Vijay
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
>
> <1.txt><2.txt>

From bsmith at mcs.anl.gov Mon Dec 31 18:08:29 2007
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 31 Dec 2007 18:08:29 -0600
Subject: Matrix free example snes/ex20.c
In-Reply-To: <000001c84c0a$222dccf0$6e00a8c0@neutrino>
References: <000001c84c0a$222dccf0$6e00a8c0@neutrino>
Message-ID: <69A26E5B-749B-4055-BFBF-8703B2704F03@mcs.anl.gov>

On Dec 31, 2007, at 6:06 PM, Vijay M wrote:

> Barry,
>
> Thanks for the detailed explanation. That sure is a tricky and
> interesting problem. I did run the problem with the options you
> suggested and see what you mean.
>
> I have one more question, though, that is not quite related to the
> example: when you do a J-free Newton-Krylov iteration, is it correct to
> say that the F.D. calculation of the action of the Jacobian on a vector
> is more accurate than using a numerical Jacobian (not analytical) found
> at the start of a Newton iteration?
> Because even though in both cases the Jacobian is technically found by
> perturbation about the last Newton iterate, it seems to me that there is
> some gain in convergence with J-free, regardless of the problem being
> solved.

I would say no; this is just a fluke thing.

Barry

> Now is that confusing, or am I making sense? I'll be glad to explain
> more on that, and I look forward to your comments.
>
> Well, happy new year to you, Barry, and all the PETSc team!!
>
> Cheers,
> Vijay
>
> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov [mailto:owner-petsc-users at mcs.anl.gov]
> On Behalf Of Barry Smith
> Sent: Sunday, December 30, 2007 4:51 PM
> To: petsc-users at mcs.anl.gov
> Subject: Re: Matrix free example snes/ex20.c
>
> Vijay,
>
> This is a very cool problem.
>
> Because of the exact symmetry of the domain, the EXACT Jacobian at each
> step has exactly 9 different eigenvalues. This means GMRES will take
> exactly 9 iterations (and "completely" converge in the ninth iteration)
> if the "exact" Jacobian is used. You can run with
> -pc_type none -snes_mf -ksp_monitor_singular_value -ksp_plot_eigenvalues -display :0.0 -draw_pause -1
> to see the 9 eigenvalues.
>
> Now run without the -snes_mf option. You will see that the first Newton
> iteration's eigenvalues still look like 9, but starting at the second
> Newton iteration the "identical" eigenvalues are no longer all
> identically placed, so GMRES needs more iterations.
>
> The question then becomes: how come the matrix-free application of the
> Jacobian is more accurate than actually computing it as a sparse matrix
> and then applying it? Here is my non-rigorous answer: the multiplication
> of the sparse matrix values (even if very accurate) against the vector
> introduces some rounding error that screws up the eigenvalues slightly.
> For some reason, for this problem, the matrix-free application is
> accurate enough not to perturb the eigenvalues.
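The eigenvalue-count argument above rests on a standard Krylov-method fact: if a diagonalizable matrix has k distinct eigenvalues, its minimal polynomial has degree k, so the Krylov subspace span{b, Ab, A^2 b, ...} stops growing after k vectors and unpreconditioned GMRES converges in at most k iterations. A minimal NumPy sketch of that fact (a toy 20x20 matrix with made-up eigenvalues 1, 2, and 5; illustrative only, not PETSc code):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20

# Build a symmetric 20x20 matrix with only 3 distinct eigenvalues: 1, 2, 5.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.array([1.0] * 8 + [2.0] * 8 + [5.0] * 4)
A = Q @ np.diag(eigs) @ Q.T

b = rng.standard_normal(n)

# With k = 3 distinct eigenvalues the minimal polynomial is
# (x-1)(x-2)(x-5) = x^3 - 8x^2 + 17x - 10, so A^3 b is an exact linear
# combination of {b, Ab, A^2 b}: the Krylov space stops growing after
# 3 vectors, and unpreconditioned GMRES terminates in 3 iterations.
v1 = A @ b
v2 = A @ v1
v3 = A @ v2
residual = v3 - (8 * v2 - 17 * v1 + 10 * b)
print(np.linalg.norm(residual) / np.linalg.norm(v3))  # tiny, at rounding level
```

With k = 9, as for the exact Jacobian of ex20, the same argument gives the nine GMRES iterations described above; perturbing the matrix entries splits the repeated eigenvalues and costs extra iterations.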
> Barry
>
> On Dec 30, 2007, at 2:19 PM, Vijay M wrote:
>
>> I ran both the cases with -ksp_monitor on and have attached the output
>> in two different files. 1.txt is the J-free case and 2.txt is the
>> analytical case.
>>
>> Barry, the example problem is ex20 from the snes tutorial directory.
>> The petsc version is 2.3.3-p7, if that helps to clear things up a
>> little. Now I haven't yet completely checked for a bug in the
>> analytical Jacobian, but I would imagine that if it were incorrect, it
>> would only affect how the nonlinear iteration converges and not the
>> linear iteration, since the matrix sparsity structure is still the same
>> (well, assuming the condition number is not very different from that of
>> the exact Jacobian!). Just my 2 cents.
>>
>> Anyway, I will look into the code for ex20 and see if something is
>> messed up. Let me know if you find out the problem from the output.
>>
>> Thanks,
>> Vijay
>>
>> [...]
>>
> <1.txt><2.txt>
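Throughout this thread, -snes_mf stands for approximating the action of the Jacobian by a one-sided difference of the residual, J(u)v = (F(u + hv) - F(u)) / h. The sketch below is a toy illustration in plain Python, not PETSc source: the residual F, its analytic Jacobian, and the step-size heuristic are all made up for the example (the heuristic is only similar in spirit to PETSc's MatMFFD defaults, which MatMFFDSetFromOptions() controls):

```python
import numpy as np

def F(u):
    """Toy nonlinear residual (made up): F_i(u) = u_i^2 + sum(u) - 1."""
    return u ** 2 + np.sum(u) - 1.0

def analytic_jacobian(u):
    """Exact Jacobian of the toy F: diag(2u) plus an all-ones matrix."""
    n = u.size
    return np.diag(2.0 * u) + np.ones((n, n))

def jacobian_vector_fd(F, u, v):
    """Matrix-free J(u) @ v via a one-sided finite difference.

    The step h ~ sqrt(machine eps) * (1 + ||u||) / ||v|| is a common
    heuristic that balances truncation against rounding error."""
    nv = np.linalg.norm(v)
    if nv == 0.0:
        return np.zeros_like(u)
    h = np.sqrt(np.finfo(float).eps) * (1.0 + np.linalg.norm(u)) / nv
    return (F(u + h * v) - F(u)) / h

u = np.array([0.3, -0.2, 0.7])
v = np.array([1.0, 2.0, -1.0])

exact = analytic_jacobian(u) @ v
approx = jacobian_vector_fd(F, u, v)
# Relative error is small, on the order of sqrt(machine eps).
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```

Shrinking h reduces truncation error but amplifies rounding error in the subtraction F(u + hv) - F(u); the sqrt(machine eps) scaling roughly balances the two, which is why the differenced product can be accurate enough to compete with an explicitly assembled Jacobian-vector multiply, as the thread observes.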