[petsc-users] Local columns of A10 do not equal local rows of A00

Alexander Lindsay alexlindsay239 at gmail.com
Wed Nov 9 12:45:38 CST 2022


Ok, I've figured out that we are definitely messing something up in our
split index set generation. For process 1 our Jac/PMat local size is 14307,
but split 0 IS local size is 4129 and split 1 IS local size is 10170, so
that leaves us 8 dofs short.

I know now where I need to dig in our field decomposition. Thanks Matt for
helping me process through this stuff!



On Tue, Nov 8, 2022 at 4:53 PM Alexander Lindsay <alexlindsay239 at gmail.com>
wrote:

> This is from our DMCreateFieldDecomposition_Moose routine. The IS size on
> process 1 (which is the process from which I took the error in the original
> post) is reported as 4129 which is consistent with the row size of A00.
>
> Split '0' has local size 4129 on processor 1
> Split '0' has local size 4484 on processor 6
> Split '0' has local size 4471 on processor 12
> Split '0' has local size 4040 on processor 14
> Split '0' has local size 3594 on processor 20
> Split '0' has local size 4423 on processor 22
> Split '0' has local size 2791 on processor 27
> Split '0' has local size 3014 on processor 29
> Split '0' has local size 3183 on processor 30
> Split '0' has local size 3328 on processor 3
> Split '0' has local size 4689 on processor 4
> Split '0' has local size 8016 on processor 8
> Split '0' has local size 6367 on processor 10
> Split '0' has local size 5973 on processor 17
> Split '0' has local size 4431 on processor 18
> Split '0' has local size 7564 on processor 25
> Split '0' has local size 12504 on processor 9
> Split '0' has local size 10081 on processor 11
> Split '0' has local size 13808 on processor 24
> Split '0' has local size 14049 on processor 31
> Split '0' has local size 15324 on processor 7
> Split '0' has local size 15337 on processor 15
> Split '0' has local size 14849 on processor 19
> Split '0' has local size 15660 on processor 23
> Split '0' has local size 14728 on processor 26
> Split '0' has local size 15724 on processor 28
> Split '0' has local size 17249 on processor 5
> Split '0' has local size 15519 on processor 13
> Split '0' has local size 16511 on processor 16
> Split '0' has local size 16496 on processor 21
> Split '0' has local size 18291 on processor 2
> Split '0' has local size 18042 on processor 0
>
> On Mon, Nov 7, 2022 at 6:04 PM Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Mon, Nov 7, 2022 at 5:48 PM Alexander Lindsay <
>> alexlindsay239 at gmail.com> wrote:
>>
>>> My understanding looking at PCFieldSplitSetDefaults is that our
>>> implementation of `createfielddecomposition` should get called, we'll set
>>> `fields` and then (ignoring possible user setting of
>>> -pc_fieldsplit_%D_fields flag) PCFieldSplitSetIS will get called with
>>> whatever we did to `fields`. So yea I guess that just looking over that I
>>> would assume we're not supplying two different index sets for rows and
>>> columns, or put more precisely we (MOOSE) are not really afforded the
>>> opportunity to. But my interpretation could very well be wrong.
>>>
>>
>> Oh wait. I read the error message again. It does not say that the whole
>> selection is rectangular. It says
>>
>>   Local columns of A10 4137 do not equal local rows of A00 4129
>>
>> So this is a parallel partitioning thing. Since A00 has 4129 local rows,
>> it should have this many columns as well.
>> However A10 has 4137 local columns. How big is IS_0, on each process,
>> that you pass in to PCFIELDSPLIT?
>>
>>   Thanks,
>>
>>      Matt
>>
>>
>>> On Mon, Nov 7, 2022 at 12:33 PM Matthew Knepley <knepley at gmail.com>
>>> wrote:
>>>
>>>> On Mon, Nov 7, 2022 at 2:09 PM Alexander Lindsay <
>>>> alexlindsay239 at gmail.com> wrote:
>>>>
>>>>> The libMesh/MOOSE specific code that identifies dof indices for
>>>>> ISCreateGeneral is in DMooseGetEmbedding_Private. I can share that function
>>>>> (it's quite long) or more details if that could be helpful.
>>>>>
>>>>
>>>> Sorry, I should have written more. The puzzling thing for me is that
>>>> somehow it looks like the row and column index sets are not the same. I did
>>>> not think
>>>> PCFIELDSPLIT could do that. The PCFieldSplitSetIS() interface does not
>>>> allow it. I was wondering how you were setting the ISes.
>>>>
>>>>   Thanks,
>>>>
>>>>      Matt
>>>>
>>>>
>>>>> On Mon, Nov 7, 2022 at 10:55 AM Alexander Lindsay <
>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>
>>>>>> I'm not sure exactly what you mean, but I'll try to give more
>>>>>> details. We have our own DM class (DM_Moose) and we set our own field and
>>>>>> domain decomposition routines:
>>>>>>
>>>>>>   dm->ops->createfielddecomposition =
>>>>>> DMCreateFieldDecomposition_Moose;
>>>>>>
>>>>>>   dm->ops->createdomaindecomposition =
>>>>>> DMCreateDomainDecomposition_Moose;
>>>>>>
>>>>>>
>>>>>> The field and domain decomposition routines are as follows (can see
>>>>>> also at
>>>>>> https://github.com/idaholab/moose/blob/next/framework/src/utils/PetscDMMoose.C
>>>>>> ):
>>>>>>
>>>>>> static PetscErrorCode
>>>>>> DMCreateFieldDecomposition_Moose(
>>>>>>     DM dm, PetscInt * len, char *** namelist, IS ** islist, DM **
>>>>>> dmlist)
>>>>>> {
>>>>>>   PetscErrorCode ierr;
>>>>>>   DM_Moose * dmm = (DM_Moose *)(dm->data);
>>>>>>
>>>>>>   PetscFunctionBegin;
>>>>>>   /* Only called after DMSetUp(). */
>>>>>>   if (!dmm->_splitlocs)
>>>>>>     PetscFunctionReturn(0);
>>>>>>   *len = dmm->_splitlocs->size();
>>>>>>   if (namelist)
>>>>>>   {
>>>>>>     ierr = PetscMalloc(*len * sizeof(char *), namelist);
>>>>>>     CHKERRQ(ierr);
>>>>>>   }
>>>>>>   if (islist)
>>>>>>   {
>>>>>>     ierr = PetscMalloc(*len * sizeof(IS), islist);
>>>>>>     CHKERRQ(ierr);
>>>>>>   }
>>>>>>   if (dmlist)
>>>>>>   {
>>>>>>     ierr = PetscMalloc(*len * sizeof(DM), dmlist);
>>>>>>     CHKERRQ(ierr);
>>>>>>   }
>>>>>>   for (const auto & dit : *(dmm->_splitlocs))
>>>>>>   {
>>>>>>     unsigned int d = dit.second;
>>>>>>     std::string dname = dit.first;
>>>>>>     DM_Moose::SplitInfo & dinfo = (*dmm->_splits)[dname];
>>>>>>     if (!dinfo._dm)
>>>>>>     {
>>>>>>       ierr = DMCreateMoose(((PetscObject)dm)->comm, *dmm->_nl,
>>>>>> &dinfo._dm);
>>>>>>       CHKERRQ(ierr);
>>>>>>       ierr = PetscObjectSetOptionsPrefix((PetscObject)dinfo._dm,
>>>>>> ((PetscObject)dm)->prefix);
>>>>>>       CHKERRQ(ierr);
>>>>>>       std::string suffix = std::string("fieldsplit_") + dname + "_";
>>>>>>       ierr = PetscObjectAppendOptionsPrefix((PetscObject)dinfo._dm,
>>>>>> suffix.c_str());
>>>>>>       CHKERRQ(ierr);
>>>>>>     }
>>>>>>     ierr = DMSetFromOptions(dinfo._dm);
>>>>>>     CHKERRQ(ierr);
>>>>>>     ierr = DMSetUp(dinfo._dm);
>>>>>>     CHKERRQ(ierr);
>>>>>>     if (namelist)
>>>>>>     {
>>>>>>       ierr = PetscStrallocpy(dname.c_str(), (*namelist) + d);
>>>>>>       CHKERRQ(ierr);
>>>>>>     }
>>>>>>     if (islist)
>>>>>>     {
>>>>>>       if (!dinfo._rembedding)
>>>>>>       {
>>>>>>         IS dembedding, lembedding;
>>>>>>         ierr = DMMooseGetEmbedding_Private(dinfo._dm, &dembedding);
>>>>>>         CHKERRQ(ierr);
>>>>>>         if (dmm->_embedding)
>>>>>>         {
>>>>>>           // Create a relative embedding into the parent's index
>>>>>> space.
>>>>>>           ierr = ISEmbed(dembedding, dmm->_embedding, PETSC_TRUE,
>>>>>> &lembedding);
>>>>>>           CHKERRQ(ierr);
>>>>>>           const PetscInt * lindices;
>>>>>>           PetscInt len, dlen, llen, *rindices, off, i;
>>>>>>           ierr = ISGetLocalSize(dembedding, &dlen);
>>>>>>           CHKERRQ(ierr);
>>>>>>           ierr = ISGetLocalSize(lembedding, &llen);
>>>>>>           CHKERRQ(ierr);
>>>>>>           if (llen != dlen)
>>>>>>             SETERRQ1(((PetscObject)dm)->comm, PETSC_ERR_PLIB, "Failed
>>>>>> to embed split %D", d);
>>>>>>           ierr = ISDestroy(&dembedding);
>>>>>>           CHKERRQ(ierr);
>>>>>>           // Convert local embedding to global (but still relative)
>>>>>> embedding
>>>>>>           ierr = PetscMalloc(llen * sizeof(PetscInt), &rindices);
>>>>>>           CHKERRQ(ierr);
>>>>>>           ierr = ISGetIndices(lembedding, &lindices);
>>>>>>           CHKERRQ(ierr);
>>>>>>           ierr = PetscMemcpy(rindices, lindices, llen *
>>>>>> sizeof(PetscInt));
>>>>>>           CHKERRQ(ierr);
>>>>>>           ierr = ISDestroy(&lembedding);
>>>>>>           CHKERRQ(ierr);
>>>>>>           // We could get the index offset from a corresponding
>>>>>> global vector, but subDMs don't yet
>>>>>>           // have global vectors
>>>>>>           ierr = ISGetLocalSize(dmm->_embedding, &len);
>>>>>>           CHKERRQ(ierr);
>>>>>>
>>>>>>           ierr = MPI_Scan(&len,
>>>>>>                           &off,
>>>>>>                           1,
>>>>>> #ifdef PETSC_USE_64BIT_INDICES
>>>>>>                           MPI_LONG_LONG_INT,
>>>>>> #else
>>>>>>                           MPI_INT,
>>>>>> #endif
>>>>>>                           MPI_SUM,
>>>>>>                           ((PetscObject)dm)->comm);
>>>>>>           CHKERRQ(ierr);
>>>>>>
>>>>>>           off -= len;
>>>>>>           for (i = 0; i < llen; ++i)
>>>>>>             rindices[i] += off;
>>>>>>           ierr = ISCreateGeneral(
>>>>>>               ((PetscObject)dm)->comm, llen, rindices,
>>>>>> PETSC_OWN_POINTER, &(dinfo._rembedding));
>>>>>>           CHKERRQ(ierr);
>>>>>>         }
>>>>>>         else
>>>>>>         {
>>>>>>           dinfo._rembedding = dembedding;
>>>>>>         }
>>>>>>       }
>>>>>>       ierr = PetscObjectReference((PetscObject)(dinfo._rembedding));
>>>>>>       CHKERRQ(ierr);
>>>>>>       (*islist)[d] = dinfo._rembedding;
>>>>>>     }
>>>>>>     if (dmlist)
>>>>>>     {
>>>>>>       ierr = PetscObjectReference((PetscObject)dinfo._dm);
>>>>>>       CHKERRQ(ierr);
>>>>>>       (*dmlist)[d] = dinfo._dm;
>>>>>>     }
>>>>>>   }
>>>>>>   PetscFunctionReturn(0);
>>>>>> }
>>>>>>
>>>>>> static PetscErrorCode
>>>>>> DMCreateDomainDecomposition_Moose(
>>>>>>     DM dm, PetscInt * len, char *** namelist, IS ** innerislist, IS
>>>>>> ** outerislist, DM ** dmlist)
>>>>>> {
>>>>>>   PetscErrorCode ierr;
>>>>>>
>>>>>>   PetscFunctionBegin;
>>>>>>   /* Use DMCreateFieldDecomposition_Moose() to obtain everything but
>>>>>> outerislist, which is currently
>>>>>>    * PETSC_NULL. */
>>>>>>   if (outerislist)
>>>>>>     *outerislist = PETSC_NULL; /* FIX: allow mesh-based overlap. */
>>>>>>   ierr = DMCreateFieldDecomposition_Moose(dm, len, namelist,
>>>>>> innerislist, dmlist);
>>>>>>   CHKERRQ(ierr);
>>>>>>   PetscFunctionReturn(0);
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 3, 2022 at 5:19 PM Matthew Knepley <knepley at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Thu, Nov 3, 2022 at 7:52 PM Alexander Lindsay <
>>>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>>>
>>>>>>>> I have errors on quite a few (but not all) processes of the like
>>>>>>>>
>>>>>>>> [1]PETSC ERROR: --------------------- Error Message
>>>>>>>> --------------------------------------------------------------
>>>>>>>> [1]PETSC ERROR: Nonconforming object sizes
>>>>>>>> [1]PETSC ERROR: Local columns of A10 4137 do not equal local rows
>>>>>>>> of A00 4129
>>>>>>>>
>>>>>>>> when performing field splits. We (MOOSE) have some code for
>>>>>>>> identifying the index sets for each split. However, the code was written by
>>>>>>>> some authors who are no longer with us. Normally I would chase this down in
>>>>>>>> a debugger, but this error only seems to crop up for pretty complex and
>>>>>>>> large meshes. If anyone has an idea for what we might be doing wrong, that
>>>>>>>> might help me chase this down faster. I guess intuitively I'm pretty
>>>>>>>> perplexed that we could get ourselves into this pickle as it almost appears
>>>>>>>> that we have two different local dof index counts for a given block (0 in
>>>>>>>> this case). More background, if helpful, can be found in
>>>>>>>> https://github.com/idaholab/moose/issues/22359 as well as
>>>>>>>> https://github.com/idaholab/moose/discussions/22468.
>>>>>>>>
>>>>>>>
>>>>>>> How are you specifying the blocks? I would not have thought this was
>>>>>>> possible.
>>>>>>>
>>>>>>>   Thanks,
>>>>>>>
>>>>>>>      Matt
>>>>>>>
>>>>>>>
>>>>>>>> I should note that we are currently running with 3.16.6 as our
>>>>>>>> PETSc submodule hash (we are talking about updating to 3.18 soon).
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> What most experimenters take for granted before they begin their
>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>> experiments lead.
>>>>>>> -- Norbert Wiener
>>>>>>>
>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which their
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20221109/1213ec61/attachment-0001.html>


More information about the petsc-users mailing list