question on DA's and performance

Randall Mackie randy at geosystem.us
Thu Apr 27 15:20:22 CDT 2006


Actually, I have 3 degrees of freedom (Hx, Hy, Hz) per cell. I'm not using
DAGetMatrix because, as of last year, it returns the full coupling
and a lot of structure that I don't use, and I couldn't figure out
how to use DASetBlockFills() to get the right amount of coupling I needed,
so I just coded it up using MatCreateMPIAIJ, and it works.
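For reference, a minimal sketch of that kind of preallocation (the
function name and the per-row counts below are only illustrative, not
taken from my actual code; the counts come from the coupling listed
further down):

  #include "petscmat.h"

  /* Sketch only: 3 interlaced unknowns (Hx, Hy, Hz) per cell, and the
     curl-curl stencil gives roughly 13 nonzeros per row.  nlocal and
     nglobal are the local and global numbers of rows (3 * number of
     cells); the matrix is square, so columns match rows. */
  PetscErrorCode CreateCurlCurlMatrix(MPI_Comm comm, PetscInt nlocal,
                                      PetscInt nglobal, Mat *A)
  {
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = MatCreateMPIAIJ(comm,
                           nlocal,  nlocal,   /* local rows, local columns     */
                           nglobal, nglobal,  /* global rows, global columns   */
                           13, PETSC_NULL,    /* per-row nonzeros, on-process  */
                           4,  PETSC_NULL,    /* per-row nonzeros, off-process (a guess) */
                           A); CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }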

I suggested last year that the actual coupling could be based on the
values actually entered, and you said you'd put that on the todo list.

For example, in my problem I'm solving the curl-curl equations, so
the coupling looks like this:

 >>>
 >>>	Hx(i,j,k) is coupled to Hx(i,j,k-1), Hx(i,j,k+1), Hx(i,j+1,k),
 >>>               Hx(i,j-1,k), Hy(i+1,j,k), Hy(i,j,k), Hy(i+1,j-1,k), Hy(i,j-1,k),
 >>>               Hz(i,j,k-1), Hz(i+1,j,k-1), Hz(i,j,k), Hz(i+1,j,k)
 >>>
 >>>
 >>>	Similarly for Hy(i,j,k) and Hz(i,j,k)

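In case it helps to be concrete, here is a rough sketch of what I
understood DASetBlockFills() to want, assuming it takes dof-by-dof
arrays of 0/1 flags: one pattern (dfill) for the coupling among the
three fields within a cell, and one pattern (ofill) shared by all the
neighbor blocks. With the stencil above, every field couples to every
other field in some neighbor, so both patterns come out dense, which is
exactly the extra structure I was trying to avoid:

  #include "petscda.h"
  #include "petscmat.h"

  /* Sketch only; the function name is made up and the fill arrays are
     just the dense 3x3 patterns implied by the curl-curl stencil. */
  PetscErrorCode SetCurlCurlFills(DA da, Mat *A)
  {
    PetscInt       dfill[9] = {1,1,1,   /* Hx row: Hx, Hy, Hz in the same cell */
                               1,1,1,   /* Hy row */
                               1,1,1};  /* Hz row */
    PetscInt       ofill[9] = {1,1,1,   /* neighbor blocks: same dense pattern */
                               1,1,1,
                               1,1,1};
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = DASetBlockFills(da, dfill, ofill); CHKERRQ(ierr);
    ierr = DAGetMatrix(da, MATMPIAIJ, A); CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }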

Randy


Barry Smith wrote:
> 
>   In your case it is 2, since you have 2 degrees of freedom per cell.
> But note: Aren't you using DAGetMatrix() to get your matrix instead of
> MatCreate....() directly? You would pass MATBAIJ into DAGetMatrix() to
> get the BAIJ matrix.
> 
>     Barry
> 
> 
> On Thu, 27 Apr 2006, Randall Mackie wrote:
> 
>> Barry,
>>
>> Can you give some advice, or do you have any examples, on how to set
>> the block size in MatCreateMPIBAIJ?
>>
>> Randy
>>
>>
>> Barry Smith wrote:
>>>
>>>   Randy,
>>>
>>>     1) I'd first change the scaling of the boundary-condition rows
>>> (the ones you described as being set
>>>> with coefficient matrix values set to 1.0),
>>> to match the other diagonal entries in the matrix. For example, if the
>>> diagonal entries are usually 10,000 then multiply the "boundary
>>> condition" rows by 10,000. (Note the diagonal entries may be
>>> proportional to h, or 1/h, etc., so make sure you get the same scaling
>>> in your boundary condition rows. For example, if the diagonal entries
>>> double when you refine the grid, make sure your "boundary condition"
>>> equations scale the same way.)
>>>
>>>   2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better
>>> in your "new" code because it uses point block ILU(k) instead of 
>>> point ILU(k).
>>>
>>>   3) Are you using -pc_type bjacobi or -pc_type asm (and what did you 
>>> do in the previous version?).
>>>
>>>   4) If you use LU on the blocks does it work better than ILU(0) in 
>>> terms of iterations? A little, a lot? If just a little I suggest 
>>> trying -pc_type asm
>>>
>>>   Let us know how things go; you should be able to get similar
>>> convergence using the DAs.
>>>
>>>    Barry
>>>
>>>
>>>
>>>
>>> On Wed, 26 Apr 2006, Randall Mackie wrote:
>>>
>>>> I've been using Petsc for a few years with reasonably good success. 
>>>> My application is
>>>> 3D EM forward modeling and inversion.
>>>>
>>>> What has been working well is basically an adaptation of what I did 
>>>> in serial mode,
>>>> by solving the following system of equations:
>>>>
>>>> |Mxx Mxy Mxz| |Hx|   |bx|
>>>> |Myx Myy Myz| |Hy| = |by|
>>>> |Mzx Mzy Mzz| |Hz|   |bz|
>>>>
>>>> Because this system is very stiff and ill-conditioned, the 
>>>> preconditioner that has
>>>> been successfully used is the ILU(k) of the diagonal sub-blocks of 
>>>> the coefficient
>>>> matrix only, not the ILU(k) of the entire matrix.
>>>>
>>>>
>>>> I've tried, with less success, converting to distributed arrays, 
>>>> because in reality
>>>> my program requires alternating between solving the system above, 
>>>> and solving
>>>> another system based on the divergences of the magnetic field, and 
>>>> so this requires
>>>> a lot of message passing between the nodes. I think that using DA's 
>>>> would require
>>>> passing only the ghost values instead of all the values as I am now 
>>>> doing.
>>>>
>>>> So I coded up DA's, and it works, but only okay, and not as well as 
>>>> the case above.
>>>> The main problem is that it seems to work best for ILU(0), whereas 
>>>> the case above
>>>> works better and better with the more fill-in specified. I think 
>>>> I've tried every
>>>> possible option, but I can't get any improvement over just using 
>>>> ILU(0), but the problem
>>>> is that with ILU(0), it takes too many iterations, and so the total 
>>>> time is more
>>>> than when I use my original method with, say, ILU(8).
>>>>
>>>> The differences between using DA's and the approach above is that 
>>>> the fields are
>>>> interlaced in DA's, and the boundary values are included as unknowns 
>>>> (with coefficient
>>>> matrix values set to 1.0), whereas my original way the boundary 
>>>> values are
>>>> incorporated in the right-hand side. Maybe that hurts the condition 
>>>> of my system.
>>>>
>>>> I would really like to use DA's to improve on the efficiency of the 
>>>> code, but I
>>>> can't figure out how to do that.
>>>>
>>>> It was suggested to me last year to use PCFIELDSPLIT, but there are 
>>>> no examples,
>>>> and I'm not a c programmer so it's hard for me to look at the source 
>>>> code and
>>>> know what to do. (I'm only just now able to get back to this).
>>>>
>>>> Does anyone do any prescaling of their systems? If so, does anyone 
>>>> have examples
>>>> of how this can be done?
>>>>
>>>> Any advice is greatly appreciated.
>>>>
>>>> Thanks, Randy
>>>>
>>>>
>>>
>>
>>
> 

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034



