question on DA's and performance
Barry Smith
bsmith at mcs.anl.gov
Thu Apr 27 15:35:00 CDT 2006
Randy,
If the codes are in a form that I could build and run, could you just
send me the two codes (petsc-maint at mcs.anl.gov) so I can play around with the
convergence of both? Also send sample input data if there is any.
Barry
On Thu, 27 Apr 2006, Randall Mackie wrote:
> Actually, I have 3 degrees of freedom (Hx, Hy, Hz) per cell. I'm not using
> DAGetMatrix because, as of last year, that returns the full coupling
> and a lot of structure that I don't use, and I couldn't figure out
> how to use DASetBlockFills() to get the right amount of coupling I needed,
> so I just coded it up using MatCreateMPIAIJ, and it works.
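For anyone reading along: with MatCreateMPIAIJ() the main thing to get
right is the preallocation. An untested sketch, assuming a variable nloc
holding the number of locally owned rows (3 per local cell) and using 13
as a rough upper bound on nonzeros per row (the 12 couplings listed
below plus the diagonal); these are PETSc 2.x-era names, newer releases
spell some of them differently:

   Mat A;
   /* nloc = local rows; 13 = upper bound on nonzeros per row.
      Exact per-row counts via the d_nnz/o_nnz arrays are better
      still: overestimates only waste memory, but underestimates
      trigger expensive mallocs during assembly. */
   MatCreateMPIAIJ(PETSC_COMM_WORLD, nloc, nloc,
                   PETSC_DETERMINE, PETSC_DETERMINE,
                   13, PETSC_NULL, 13, PETSC_NULL, &A);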
>
> I suggested last year that the actual coupling could be based on the
> values actually entered, and you said you'd put that on the todo list.
>
> For example, in my problem, I'm solving the curl curl equations, so
> the coupling is like
>
>>>>
>>>> Hx(i,j,k) is coupled to Hx(i,j,k-1), Hx(i,j,k+1), Hx(i,j+1,k),
>>>> Hx(i,j-1,k), Hy(i+1,j,k), Hy(i,j,k), Hy(i+1,j-1,k), Hy(i,j-1,k),
>>>> Hz(i,j,k-1), Hz(i+1,j,k-1), Hz(i,j,k), Hz(i+1,j,k)
>>>>
>>>> Similarly for Hy(i,j,k) and Hz(i,j,k)
>
>
> Randy
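For the record, DASetBlockFills() takes two bs-by-bs 0/1 arrays: dfill
says which component pairs couple at the grid point itself, ofill which
pairs couple at neighboring points, and the ofill pattern is applied
uniformly at every stencil location. An untested sketch for the 3-dof
case (da is the DA from DACreate3d()):

   /* rows/columns ordered (Hx, Hy, Hz) */
   PetscInt dfill[9] = {1,1,1,
                        1,1,1,
                        1,1,1};
   PetscInt ofill[9] = {1,1,1,
                        1,1,1,
                        1,1,1};
   DASetBlockFills(da, dfill, ofill);
   DAGetMatrix(da, MATMPIAIJ, &A);

For this curl-curl stencil every component pair couples at some
neighbor, so both arrays end up all ones, and DASetBlockFills() cannot
express the location-dependent sparsity above -- which may be exactly
why it did not give the right amount of coupling.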
>
>
> Barry Smith wrote:
>>
>> In your case it is 2, since you have 2 degrees of freedom per cell.
>> But note: aren't you using DAGetMatrix() to get your matrix instead of
>> MatCreate....() directly? You would pass MATBAIJ into DAGetMatrix() to
>> get the BAIJ matrix.
>>
>> Barry
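Concretely, something like (untested; da is your DA, and DAGetMatrix()
also does the preallocation and sets the nonzero pattern for you):

   Mat A;
   DAGetMatrix(da, MATBAIJ, &A);  /* block size = the dof passed
                                     to DACreate3d() */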
>>
>>
>> On Thu, 27 Apr 2006, Randall Mackie wrote:
>>
>>> Barry,
>>>
>>> Can you give some advice, or do you have any examples, on how to set
>>> the block size in MatCreateMPIBAIJ?
>>>
>>> Randy
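For completeness, the block size is the first integer argument after
the communicator; the local row count must then be a multiple of it. An
untested sketch with bs = 3 for the interlaced (Hx, Hy, Hz) unknowns,
where nloc is a placeholder for the number of locally owned rows:

   Mat A;
   /* 15 = a guessed upper bound on block nonzeros per block row;
      exact counts via d_nnz/o_nnz are better */
   MatCreateMPIBAIJ(PETSC_COMM_WORLD, 3, nloc, nloc,
                    PETSC_DETERMINE, PETSC_DETERMINE,
                    15, PETSC_NULL, 15, PETSC_NULL, &A);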
>>>
>>>
>>> Barry Smith wrote:
>>>>
>>>> Randy,
>>>>
>>>> 1) I'd first change the scaling of the "boundary condition" rows (the
>>>> ones you described as "with coefficient matrix values set to 1.0") to
>>>> match the other diagonal entries in the matrix. For example, if the
>>>> diagonal entries are usually 10,000 then multiply the "boundary
>>>> condition" rows by 10,000. (Note the diagonal entries may be
>>>> proportional to h, or 1/h, etc., so make sure you get the same scaling
>>>> in your boundary condition rows. For example, if the diagonal entries
>>>> double when you refine the grid, make sure your "boundary condition"
>>>> equations scale the same way.) An untested sketch follows point 4 below.
>>>>
>>>> 2) Are you using AIJ or BAIJ matrices? BAIJ is likely much better
>>>> in your "new" code because it uses point block ILU(k) instead of point
>>>> ILU(k).
>>>>
>>>> 3) Are you using -pc_type bjacobi or -pc_type asm (and what did you do
>>>> in the previous version?).
>>>>
>>>> 4) If you use LU on the blocks, does it work better than ILU(0) in
>>>> terms of iterations? A little, a lot? If just a little, I suggest
>>>> trying -pc_type asm.
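For point 1, an untested sketch of what a scaled boundary row could
look like during assembly; row, bcvalue, and diagscale are placeholders
(diagscale should track however your interior diagonal entries scale
with h):

   PetscScalar diagscale = 10000.0;  /* match the interior rows */
   /* instead of A(row,row) = 1.0 and b(row) = bcvalue: */
   MatSetValue(A, row, row, diagscale, INSERT_VALUES);
   VecSetValue(b, row, diagscale*bcvalue, INSERT_VALUES);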
>>>>
>>>> Let us know how things go; you should be able to get similar
>>>> convergence using the DAs.
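To experiment with 2) through 4) without recompiling, the runtime
options look roughly like the following; the exact option names depend
on the PETSc version, so run with -help to see what yours accepts:

   # block Jacobi with ILU(2) on each block
   -pc_type bjacobi -sub_pc_type ilu -sub_pc_factor_levels 2
   # overlapping additive Schwarz instead of block Jacobi
   -pc_type asm -pc_asm_overlap 1 -sub_pc_type ilu -sub_pc_factor_levels 2
   # exact LU on the blocks, to see how much ILU is giving up
   -pc_type bjacobi -sub_pc_type lu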
>>>>
>>>> Barry
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, 26 Apr 2006, Randall Mackie wrote:
>>>>
>>>>> I've been using PETSc for a few years with reasonably good success.
>>>>> My application is 3D EM forward modeling and inversion.
>>>>>
>>>>> What has been working well is basically an adaptation of what I did in
>>>>> serial mode,
>>>>> by solving the following system of equations:
>>>>>
>>>>> |Mxx Mxy Mxz| |Hx| |bx|
>>>>> |Myx Myy Myz| |Hy| = |by|
>>>>> |Mzx Mzy Mzz| |Hz| |bz|
>>>>>
>>>>> Because this system is very stiff and ill-conditioned, the
>>>>> preconditioner that has
>>>>> been successfully used is the ILU(k) of the diagonal sub-blocks of the
>>>>> coefficient
>>>>> matrix only, not the ILU(k) of the entire matrix.
>>>>>
>>>>>
>>>>> I've tried, with less success, converting to distributed arrays.
>>>>> In reality my program requires alternating between solving the system
>>>>> above and solving another system based on the divergences of the
>>>>> magnetic field, and this currently requires a lot of message passing
>>>>> between the nodes. I think that using DA's would require passing only
>>>>> the ghost values instead of all the values as I am now doing.
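That is what the DA global-to-local scatter does. An untested sketch,
where da comes from DACreate3d() with dof = 3, x is the global vector,
and xlocal is its ghosted local counterpart:

   Vec xlocal;
   DACreateLocalVector(da, &xlocal);
   /* communicates only the ghost values, not the whole vector */
   DAGlobalToLocalBegin(da, x, INSERT_VALUES, xlocal);
   DAGlobalToLocalEnd(da, x, INSERT_VALUES, xlocal);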
>>>>>
>>>>> So I coded up DA's, and it works, but only okay, and not as well as
>>>>> the case above. The main problem is that it seems to work best with
>>>>> ILU(0), whereas the case above works better and better the more
>>>>> fill-in is specified. I think I've tried every possible option, but I
>>>>> can't get any improvement over just using ILU(0), and with ILU(0) it
>>>>> takes too many iterations, so the total time is more than when I use
>>>>> my original method with, say, ILU(8).
>>>>>
>>>>> The differences between using DA's and the approach above are that
>>>>> the fields are interlaced in the DA's, and that the boundary values
>>>>> are included as unknowns (with coefficient matrix values set to 1.0),
>>>>> whereas in my original way the boundary values are incorporated in
>>>>> the right-hand side. Maybe that hurts the conditioning of my system.
>>>>>
>>>>> I would really like to use DA's to improve on the efficiency of the
>>>>> code, but I
>>>>> can't figure out how to do that.
>>>>>
>>>>> It was suggested to me last year to use PCFIELDSPLIT, but there are
>>>>> no examples, and I'm not a C programmer, so it's hard for me to look
>>>>> at the source code and know what to do. (I'm only just now able to
>>>>> get back to this.)
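Since PCFIELDSPLIT keeps coming up: on an interlaced 3-dof vector it
can reproduce the "ILU(k) of the diagonal sub-blocks" preconditioner
from the command line, with no C code. Something like the following
(option names taken from the current documentation; older versions may
differ, so check -help):

   -pc_type fieldsplit -pc_fieldsplit_block_size 3
   -pc_fieldsplit_type additive
   -fieldsplit_0_pc_type ilu -fieldsplit_0_pc_factor_levels 8
   -fieldsplit_1_pc_type ilu -fieldsplit_1_pc_factor_levels 8
   -fieldsplit_2_pc_type ilu -fieldsplit_2_pc_factor_levels 8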
>>>>>
>>>>> Does anyone do any prescaling of their systems? If so, are there any
>>>>> examples of how this can be done?
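One common choice is symmetric Jacobi (diagonal) scaling, which only
takes a few calls. An untested sketch using current function names
(older releases spell some of these differently):

   Vec d;
   VecDuplicate(b, &d);
   MatGetDiagonal(A, d);
   VecSqrtAbs(d);              /* d_i = sqrt(|A_ii|)        */
   VecReciprocal(d);           /* d_i = 1/sqrt(|A_ii|)      */
   MatDiagonalScale(A, d, d);  /* A <- diag(d) A diag(d)    */
   VecPointwiseMult(b, d, b);  /* scale the right-hand side */
   /* after the solve, recover the solution: x = diag(d) y */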
>>>>>
>>>>> Any advice is greatly appreciated.
>>>>>
>>>>> Thanks, Randy
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
>