[petsc-users] a question about DAE Function

Tue Jun 4 17:43:17 CDT 2013

Sorry about the confusion. Please allow me to illustrate the problem in a better way below:

I have a set of DAE equations, which can be defined as 4*n f[] equations:

  scatterMyVec(X, &x); CHKERRQ(ierr); // Scatter x to each processor               // (0)
  ierr = DMDAVecGetArray(da, Xdot, &xdot); CHKERRQ(ierr);
  ierr = DMDAVecGetArray(da, F, &f); CHKERRQ(ierr);

  // Get local grid boundaries
  ierr = DMDAGetCorners(da, &xstart, NULL, NULL, &xlen, NULL, NULL); CHKERRQ(ierr);

  // Set up DAE equations
  // Compute function over the locally owned part of the grid
  for (i = 0; i < n; i++) {
      if (2*i >= xstart && 2*i < xstart+xlen) {
          f[2*i] = xdot[2*i] -  x[2*i+1];                                                                                         // (1)
      }
      if (2*i+1 >= xstart && 2*i+1 < xstart+xlen) {
          f[2*i+1] = xdot[2*i+1] - x[2*n+2*i+1] - x[2*i+1]);                                                // (2)
      }
  }
  for (i = 0; i < n; i++) {
      if (2*n+2*i >= xstart && 2*n+2*i < xstart+xlen) {
          f[2*n+2*i] = sin(x[2*i])*x[2*n+2*i] + cos(x[2*i])*x[2*n+2*i+1];                  
          for (k = 0; k < n; k++) {
              f[2*n+2*i] += -(cos(x[2*k])*a-sin(x[2*k])*b);                                                     // (3)
          }
      }
      if (2*n+2*i+1 >= xstart && 2*n+2*i+1 < xstart+xlen) {
          f[2*n+2*i+1] = -cos(x[2*i])*x[2*n+2*i] + sin(x[2*i])*x[2*n+2*i+1];
          for (k = 0; k < n; k++) {
              f[2*n+2*i+1] += -(cos(x[2*k])*c+(sin(x[2*k])*d);                                              // (4)
         }
      }
  }

As you may see, all the f functions are dependent on some x values. However, the x is a distributed array of size 4*n.

In order to use the access x[] from different processor, I have to scatter x before I use them to define f[], the scatterMyVec function is shown below.

// Scatter the context of v to each processor to get a global array (vArray) values
static PetscErrorCode scatterMyVec(Vec vin, PetscScalar **voutArray)
{
  PetscErrorCode ierr;
  VecScatter ctx;
  Vec vout;

  ierr = VecScatterCreateToAll(vin, &ctx, &vout); CHKERRQ(ierr);
  ierr = VecScatterBegin(ctx, vin, vout, INSERT_VALUES, SCATTER_FORWARD); CHKERRQ(ierr);
  ierr = VecScatterEnd(ctx, vin, vout, INSERT_VALUES, SCATTER_FORWARD); CHKERRQ(ierr);
  ierr = VecGetArray(vout, voutArray); CHKERRQ(ierr);

  ierr = VecScatterDestroy(&ctx); CHKERRQ(ierr);
  ierr = VecRestoreArray(vout, PETSC_NULL);

  PetscFunctionReturn(0);
}

As I mentioned in the previous email, I think this is quite inefficient because the scattering of x during each iteration "requires a good chunk of communication". And as Barry pointed out, " the Jacobian will be dense and the linear solves with the dense Jacobian will dominate the time anyways. "

So my question now is:
1. Am I right to use the PETSc TS parallel paradigm in this way to solve my DAE equations? My concern is although I can see the solution be computed across multiple processors, the overhead of scattering x from time to time and the linear solving with dense Jacobian will eat up the performance gains of parallel computation.

2. If the answer of the above question is "NO", what's the correct parallel tool I should look into in PETSc to solve my DAE equations? MatElemental as most of you suggested?

Thanks a lot for your help!

Regards,
Shuangshuang

-----Original Message-----
From: Jed Brown [mailto:five9a2 at gmail.com] On Behalf Of Jed Brown
Sent: Tuesday, June 04, 2013 1:42 PM
To: Jin, Shuangshuang; Matthew Knepley
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] a question about DAE Function

Your message quoting is very peculiar.  Please use normal quoting in the future.

"Jin, Shuangshuang" <Shuangshuang.Jin at pnnl.gov> writes:

> From: Matthew Knepley [mailto:knepley at gmail.com]
> Sent: Tuesday, June 04, 2013 12:32 PM
> To: Jin, Shuangshuang
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] a question about DAE Function
>
> On Tue, Jun 4, 2013 at 3:13 PM, Jin, Shuangshuang <Shuangshuang.Jin at pnnl.gov<mailto:Shuangshuang.Jin at pnnl.gov>> wrote:
> Hello, I have a question regarding how to set up the IFunction for TSSetIFunction. Please see the detailed illustration below:
>
> Inside my IFunction, I have one of the f equations defined as:
>
> if (i >= xstart && i < xstart+xlen) {
>       f[i+2] = sin(x[i])*x[i+2] + cos(x[i])*x[i+3];
>
>
> 1)      You should not be setting f[i+2]. You always set the residual local to your process.
> [Jin, Shuangshuang]  Sorry, I typed it wrong, the condition is "if (i+2 >= xstart && i+2 < xstart+xlen) {" instead.
>
>       for (k = 0; k < n; k++) {
>           f[i+2] += -(cos(x[k])*a[k])*b) - (sin(x[k])*c[k])*d);               (1)
>       }
> }

Showing something of the outer loop would have made the usage more clear.

>
> 2) This in effect says that your Jacobian is completely dense, or that your problem is fully coupled. I see two
>     choices:
> [Jin, Shuangshuang] I have four sets of f functions: f[i], f[i+1],
>     f[i+2], and f[i+3]. (i=1..n), the Jacobian matrix for the whole
>     DAE problem is sparse,

Your code above couples f[i+2] to x[k] for k=1..n, which is dense.

> but the problem is fully coupled.

"fully coupled" means "dense", unless you intended something different.
What exactly is sparse?

>   a) Use scalable dense linear algebra, like Elemental, and formulate 
> your operations as linear algebra [Jin, Shuangshuang] Do you have a sample source code on the web for this?

There are a couple tutorials using MatElemental, but we don't seem to agree about whether the problem formulation is correct, so don't do that yet.

>   b) If this is a boundary integral equation, consider using the FMM 
> technology [Jin, Shuangshuang] It's not.
>
> Either way, this is not a PDE, and so using that PETSc parallel paradigm is wrong.
> [Jin, Shuangshuang] This is a DAE, I set up the section 6.1.2 Solving Differential Algebraic Equations of the Manual, form IFUNCITON, IJacobain, and set up initial values for x. If I scatter x every time inside IFUNCTION, the problem is solved in parallel when I use multiple threads. I just felt that my way of scattering x is inefficient. I don't understand why using the PETSc parallel paradigm to solve the problem is wrong?

If your function is dense, then you need to gather all of x to compute a residual.  That is what your code above implies, but you also said it was sparse, so which is it?