[petsc-users] Local columns of A10 do not equal local rows of A00

Alexander Lindsay alexlindsay239 at gmail.com
Wed Nov 9 17:18:11 CST 2022


I was able to get it worked out, once I knew the issue, by doing a detailed
read-through of our split IS generation. It's working great (at least on this
test problem) now!

On Wed, Nov 9, 2022 at 12:45 PM Matthew Knepley <knepley at gmail.com> wrote:

> On Wed, Nov 9, 2022 at 1:45 PM Alexander Lindsay <alexlindsay239 at gmail.com>
> wrote:
>
>> OK, I've figured out that we are definitely messing something up in our
>> split index set generation. For process 1 our Jac/PMat local size is 14307,
>> but the split 0 IS local size is 4129 and the split 1 IS local size is 10170
>> (4129 + 10170 = 14299), so that leaves us 8 dofs short.
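>>
>> (A per-rank sanity check along these lines would have caught it right
>> away -- just a sketch, with J, nsplits, and islist standing in for our
>> actual Jacobian and split list:)
>>
>>   PetscErrorCode ierr;
>>   PetscMPIInt    rank;
>>   PetscInt       m, n, s, sz, sum = 0;
>>
>>   ierr = MPI_Comm_rank(PetscObjectComm((PetscObject)J), &rank);
>>   CHKERRQ(ierr);
>>   /* local rows/cols of the Jac/PMat on this rank */
>>   ierr = MatGetLocalSize(J, &m, &n);
>>   CHKERRQ(ierr);
>>   for (s = 0; s < nsplits; ++s)
>>   {
>>     ierr = ISGetLocalSize(islist[s], &sz);
>>     CHKERRQ(ierr);
>>     sum += sz;
>>   }
>>   if (sum != m)
>>     SETERRQ3(PETSC_COMM_SELF, PETSC_ERR_PLIB,
>>              "Rank %d: split IS local sizes sum to %D but the matrix has %D local rows", rank, sum, m);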
>>
>> I know now where I need to dig in our field decomposition. Thanks Matt
>> for helping me process through this stuff!
>>
>
> Cool. The one piece of advice I have is to make the problem ruthlessly
> small. Even if it seems hard, it is worth the time
> to get it down to a size you can print to the screen.
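>
> As a sketch of what I mean by printing to the screen (islist here is just
> a stand-in for however you store the split ISes):
>
>   ierr = ISView(islist[0], PETSC_VIEWER_STDOUT_WORLD);
>   CHKERRQ(ierr);
>   ierr = ISView(islist[1], PETSC_VIEWER_STDOUT_WORLD);
>   CHKERRQ(ierr);
>
> On a tiny problem that output is small enough to check the index sets by eye.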
>
>   Thanks,
>
>      Matt
>
>
>> On Tue, Nov 8, 2022 at 4:53 PM Alexander Lindsay <
>> alexlindsay239 at gmail.com> wrote:
>>
>>> This is from our DMCreateFieldDecomposition_Moose routine. The IS size
>>> on process 1 (which is the process from which I took the error in the
>>> original post) is reported as 4129, which is consistent with the local row
>>> size of A00.
>>>
>>> Split '0' has local size 4129 on processor 1
>>> Split '0' has local size 4484 on processor 6
>>> Split '0' has local size 4471 on processor 12
>>> Split '0' has local size 4040 on processor 14
>>> Split '0' has local size 3594 on processor 20
>>> Split '0' has local size 4423 on processor 22
>>> Split '0' has local size 2791 on processor 27
>>> Split '0' has local size 3014 on processor 29
>>> Split '0' has local size 3183 on processor 30
>>> Split '0' has local size 3328 on processor 3
>>> Split '0' has local size 4689 on processor 4
>>> Split '0' has local size 8016 on processor 8
>>> Split '0' has local size 6367 on processor 10
>>> Split '0' has local size 5973 on processor 17
>>> Split '0' has local size 4431 on processor 18
>>> Split '0' has local size 7564 on processor 25
>>> Split '0' has local size 12504 on processor 9
>>> Split '0' has local size 10081 on processor 11
>>> Split '0' has local size 13808 on processor 24
>>> Split '0' has local size 14049 on processor 31
>>> Split '0' has local size 15324 on processor 7
>>> Split '0' has local size 15337 on processor 15
>>> Split '0' has local size 14849 on processor 19
>>> Split '0' has local size 15660 on processor 23
>>> Split '0' has local size 14728 on processor 26
>>> Split '0' has local size 15724 on processor 28
>>> Split '0' has local size 17249 on processor 5
>>> Split '0' has local size 15519 on processor 13
>>> Split '0' has local size 16511 on processor 16
>>> Split '0' has local size 16496 on processor 21
>>> Split '0' has local size 18291 on processor 2
>>> Split '0' has local size 18042 on processor 0
>>>
>>> On Mon, Nov 7, 2022 at 6:04 PM Matthew Knepley <knepley at gmail.com>
>>> wrote:
>>>
>>>> On Mon, Nov 7, 2022 at 5:48 PM Alexander Lindsay <
>>>> alexlindsay239 at gmail.com> wrote:
>>>>
>>>>> My understanding from looking at PCFieldSplitSetDefaults is that our
>>>>> implementation of `createfielddecomposition` should get called, we'll set
>>>>> `fields`, and then (ignoring a possible user setting of the
>>>>> -pc_fieldsplit_%D_fields flag) PCFieldSplitSetIS will get called with
>>>>> whatever we did to `fields`. So yeah, just looking over that, I would
>>>>> assume we're not supplying two different index sets for rows and
>>>>> columns; or, put more precisely, we (MOOSE) are not really afforded the
>>>>> opportunity to. But my interpretation could very well be wrong.
>>>>>
>>>>
>>>> Oh wait. I read the error message again. It does not say that the whole
>>>> selection is rectangular. It says
>>>>
>>>>   Local columns of A10 4137 do not equal local rows of A00 4129
>>>>
>>>> So this is a parallel partitioning thing. Since A00 has 4129 local
>>>> rows, it should have this many columns as well.
>>>> However A10 has 4137 local columns. How big is IS_0, on each process,
>>>> that you pass in to PCFIELDSPLIT?
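>>>>
>>>> One quick way to report that per process (a sketch -- is0 stands for
>>>> whatever IS you hand to PCFIELDSPLIT for split 0):
>>>>
>>>>   PetscErrorCode ierr;
>>>>   PetscMPIInt    rank;
>>>>   PetscInt       nlocal;
>>>>
>>>>   ierr = MPI_Comm_rank(PetscObjectComm((PetscObject)is0), &rank);
>>>>   CHKERRQ(ierr);
>>>>   ierr = ISGetLocalSize(is0, &nlocal);
>>>>   CHKERRQ(ierr);
>>>>   ierr = PetscSynchronizedPrintf(PetscObjectComm((PetscObject)is0),
>>>>                                  "Split '0' has local size %D on processor %d\n", nlocal, rank);
>>>>   CHKERRQ(ierr);
>>>>   ierr = PetscSynchronizedFlush(PetscObjectComm((PetscObject)is0), PETSC_STDOUT);
>>>>   CHKERRQ(ierr);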
>>>>
>>>>   Thanks,
>>>>
>>>>      Matt
>>>>
>>>>
>>>>> On Mon, Nov 7, 2022 at 12:33 PM Matthew Knepley <knepley at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> On Mon, Nov 7, 2022 at 2:09 PM Alexander Lindsay <
>>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>>
>>>>>>> The libMesh/MOOSE-specific code that identifies dof indices for
>>>>>>> ISCreateGeneral is in DMMooseGetEmbedding_Private. I can share that function
>>>>>>> (it's quite long) or more details if that could be helpful.
>>>>>>>
>>>>>>
>>>>>> Sorry, I should have written more. The puzzling thing for me is that
>>>>>> somehow it looks like the row and column index sets are not the same. I did
>>>>>> not think
>>>>>> PCFIELDSPLIT could do that. The PCFieldSplitSetIS() interface does
>>>>>> not allow it. I was wondering how you were setting the ISes.
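>>>>>>
>>>>>> For reference, the interface takes a single IS per named split and uses
>>>>>> it for both the rows and the columns of that block, roughly (pc, is0,
>>>>>> and is1 being placeholders):
>>>>>>
>>>>>>   /* the same IS selects rows and columns, so A00 = A(is0, is0) and
>>>>>>      A10 = A(is1, is0) */
>>>>>>   ierr = PCFieldSplitSetIS(pc, "0", is0);
>>>>>>   CHKERRQ(ierr);
>>>>>>   ierr = PCFieldSplitSetIS(pc, "1", is1);
>>>>>>   CHKERRQ(ierr);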
>>>>>>
>>>>>>   Thanks,
>>>>>>
>>>>>>      Matt
>>>>>>
>>>>>>
>>>>>>> On Mon, Nov 7, 2022 at 10:55 AM Alexander Lindsay <
>>>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>>>
>>>>>>>> I'm not sure exactly what you mean, but I'll try to give more
>>>>>>>> details. We have our own DM class (DM_Moose) and we set our own field and
>>>>>>>> domain decomposition routines:
>>>>>>>>
>>>>>>>>   dm->ops->createfielddecomposition = DMCreateFieldDecomposition_Moose;
>>>>>>>>
>>>>>>>>   dm->ops->createdomaindecomposition = DMCreateDomainDecomposition_Moose;
>>>>>>>>
>>>>>>>>
>>>>>>>> The field and domain decomposition routines are as follows (see also
>>>>>>>> https://github.com/idaholab/moose/blob/next/framework/src/utils/PetscDMMoose.C):
>>>>>>>>
>>>>>>>> static PetscErrorCode
>>>>>>>> DMCreateFieldDecomposition_Moose(DM dm, PetscInt * len, char *** namelist, IS ** islist, DM ** dmlist)
>>>>>>>> {
>>>>>>>>   PetscErrorCode ierr;
>>>>>>>>   DM_Moose * dmm = (DM_Moose *)(dm->data);
>>>>>>>>
>>>>>>>>   PetscFunctionBegin;
>>>>>>>>   /* Only called after DMSetUp(). */
>>>>>>>>   if (!dmm->_splitlocs)
>>>>>>>>     PetscFunctionReturn(0);
>>>>>>>>   *len = dmm->_splitlocs->size();
>>>>>>>>   if (namelist)
>>>>>>>>   {
>>>>>>>>     ierr = PetscMalloc(*len * sizeof(char *), namelist);
>>>>>>>>     CHKERRQ(ierr);
>>>>>>>>   }
>>>>>>>>   if (islist)
>>>>>>>>   {
>>>>>>>>     ierr = PetscMalloc(*len * sizeof(IS), islist);
>>>>>>>>     CHKERRQ(ierr);
>>>>>>>>   }
>>>>>>>>   if (dmlist)
>>>>>>>>   {
>>>>>>>>     ierr = PetscMalloc(*len * sizeof(DM), dmlist);
>>>>>>>>     CHKERRQ(ierr);
>>>>>>>>   }
>>>>>>>>   for (const auto & dit : *(dmm->_splitlocs))
>>>>>>>>   {
>>>>>>>>     unsigned int d = dit.second;
>>>>>>>>     std::string dname = dit.first;
>>>>>>>>     DM_Moose::SplitInfo & dinfo = (*dmm->_splits)[dname];
>>>>>>>>     if (!dinfo._dm)
>>>>>>>>     {
>>>>>>>>       ierr = DMCreateMoose(((PetscObject)dm)->comm, *dmm->_nl, &dinfo._dm);
>>>>>>>>       CHKERRQ(ierr);
>>>>>>>>       ierr = PetscObjectSetOptionsPrefix((PetscObject)dinfo._dm, ((PetscObject)dm)->prefix);
>>>>>>>>       CHKERRQ(ierr);
>>>>>>>>       std::string suffix = std::string("fieldsplit_") + dname + "_";
>>>>>>>>       ierr = PetscObjectAppendOptionsPrefix((PetscObject)dinfo._dm, suffix.c_str());
>>>>>>>>       CHKERRQ(ierr);
>>>>>>>>     }
>>>>>>>>     ierr = DMSetFromOptions(dinfo._dm);
>>>>>>>>     CHKERRQ(ierr);
>>>>>>>>     ierr = DMSetUp(dinfo._dm);
>>>>>>>>     CHKERRQ(ierr);
>>>>>>>>     if (namelist)
>>>>>>>>     {
>>>>>>>>       ierr = PetscStrallocpy(dname.c_str(), (*namelist) + d);
>>>>>>>>       CHKERRQ(ierr);
>>>>>>>>     }
>>>>>>>>     if (islist)
>>>>>>>>     {
>>>>>>>>       if (!dinfo._rembedding)
>>>>>>>>       {
>>>>>>>>         IS dembedding, lembedding;
>>>>>>>>         ierr = DMMooseGetEmbedding_Private(dinfo._dm, &dembedding);
>>>>>>>>         CHKERRQ(ierr);
>>>>>>>>         if (dmm->_embedding)
>>>>>>>>         {
>>>>>>>>           // Create a relative embedding into the parent's index space.
>>>>>>>>           ierr = ISEmbed(dembedding, dmm->_embedding, PETSC_TRUE, &lembedding);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>           const PetscInt * lindices;
>>>>>>>>           PetscInt len, dlen, llen, *rindices, off, i;
>>>>>>>>           ierr = ISGetLocalSize(dembedding, &dlen);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>           ierr = ISGetLocalSize(lembedding, &llen);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>           if (llen != dlen)
>>>>>>>>             SETERRQ1(((PetscObject)dm)->comm, PETSC_ERR_PLIB, "Failed to embed split %D", d);
>>>>>>>>           ierr = ISDestroy(&dembedding);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>           // Convert local embedding to global (but still relative) embedding
>>>>>>>>           ierr = PetscMalloc(llen * sizeof(PetscInt), &rindices);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>           ierr = ISGetIndices(lembedding, &lindices);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>           ierr = PetscMemcpy(rindices, lindices, llen * sizeof(PetscInt));
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>           ierr = ISDestroy(&lembedding);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>           // We could get the index offset from a corresponding global vector, but subDMs don't yet
>>>>>>>>           // have global vectors
>>>>>>>>           ierr = ISGetLocalSize(dmm->_embedding, &len);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>
>>>>>>>>           ierr = MPI_Scan(&len,
>>>>>>>>                           &off,
>>>>>>>>                           1,
>>>>>>>> #ifdef PETSC_USE_64BIT_INDICES
>>>>>>>>                           MPI_LONG_LONG_INT,
>>>>>>>> #else
>>>>>>>>                           MPI_INT,
>>>>>>>> #endif
>>>>>>>>                           MPI_SUM,
>>>>>>>>                           ((PetscObject)dm)->comm);
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>
>>>>>>>>           off -= len;
>>>>>>>>           for (i = 0; i < llen; ++i)
>>>>>>>>             rindices[i] += off;
>>>>>>>>           ierr = ISCreateGeneral(((PetscObject)dm)->comm, llen, rindices, PETSC_OWN_POINTER, &(dinfo._rembedding));
>>>>>>>>           CHKERRQ(ierr);
>>>>>>>>         }
>>>>>>>>         else
>>>>>>>>         {
>>>>>>>>           dinfo._rembedding = dembedding;
>>>>>>>>         }
>>>>>>>>       }
>>>>>>>>       ierr = PetscObjectReference((PetscObject)(dinfo._rembedding));
>>>>>>>>       CHKERRQ(ierr);
>>>>>>>>       (*islist)[d] = dinfo._rembedding;
>>>>>>>>     }
>>>>>>>>     if (dmlist)
>>>>>>>>     {
>>>>>>>>       ierr = PetscObjectReference((PetscObject)dinfo._dm);
>>>>>>>>       CHKERRQ(ierr);
>>>>>>>>       (*dmlist)[d] = dinfo._dm;
>>>>>>>>     }
>>>>>>>>   }
>>>>>>>>   PetscFunctionReturn(0);
>>>>>>>> }
>>>>>>>>
>>>>>>>> static PetscErrorCode
>>>>>>>> DMCreateDomainDecomposition_Moose(DM dm, PetscInt * len, char *** namelist, IS ** innerislist, IS ** outerislist, DM ** dmlist)
>>>>>>>> {
>>>>>>>>   PetscErrorCode ierr;
>>>>>>>>
>>>>>>>>   PetscFunctionBegin;
>>>>>>>>   /* Use DMCreateFieldDecomposition_Moose() to obtain everything but outerislist, which is currently
>>>>>>>>    * PETSC_NULL. */
>>>>>>>>   if (outerislist)
>>>>>>>>     *outerislist = PETSC_NULL; /* FIX: allow mesh-based overlap. */
>>>>>>>>   ierr = DMCreateFieldDecomposition_Moose(dm, len, namelist, innerislist, dmlist);
>>>>>>>>   CHKERRQ(ierr);
>>>>>>>>   PetscFunctionReturn(0);
>>>>>>>> }
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Nov 3, 2022 at 5:19 PM Matthew Knepley <knepley at gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> On Thu, Nov 3, 2022 at 7:52 PM Alexander Lindsay <
>>>>>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> When performing field splits, I get errors like the following on quite a few (but not all) processes:
>>>>>>>>>>
>>>>>>>>>> [1]PETSC ERROR: --------------------- Error Message
>>>>>>>>>> --------------------------------------------------------------
>>>>>>>>>> [1]PETSC ERROR: Nonconforming object sizes
>>>>>>>>>> [1]PETSC ERROR: Local columns of A10 4137 do not equal local rows
>>>>>>>>>> of A00 4129
>>>>>>>>>>
>>>>>>>>>> We (MOOSE) have some code for
>>>>>>>>>> identifying the index sets for each split. However, the code was written by
>>>>>>>>>> some authors who are no longer with us. Normally I would chase this down in
>>>>>>>>>> a debugger, but this error only seems to crop up for pretty complex and
>>>>>>>>>> large meshes. If anyone has an idea for what we might be doing wrong, that
>>>>>>>>>> might help me chase this down faster. I guess intuitively I'm pretty
>>>>>>>>>> perplexed that we could get ourselves into this pickle as it almost appears
>>>>>>>>>> that we have two different local dof index counts for a given block (0 in
>>>>>>>>>> this case). More background, if helpful, can be found in
>>>>>>>>>> https://github.com/idaholab/moose/issues/22359 as well as
>>>>>>>>>> https://github.com/idaholab/moose/discussions/22468.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> How are you specifying the blocks? I would not have thought this
>>>>>>>>> was possible.
>>>>>>>>>
>>>>>>>>>   Thanks,
>>>>>>>>>
>>>>>>>>>      Matt
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> I should note that we are currently running with 3.16.6 as our
>>>>>>>>>> PETSc submodule hash (we are talking about updating to 3.18 soon).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> What most experimenters take for granted before they begin their
>>>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>>>> experiments lead.
>>>>>>>>> -- Norbert Wiener
>>>>>>>>>
>>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>>>