[petsc-users] Local columns of A10 do not equal local rows of A00

Matthew Knepley knepley at gmail.com
Wed Nov 9 14:45:05 CST 2022


On Wed, Nov 9, 2022 at 1:45 PM Alexander Lindsay <alexlindsay239 at gmail.com>
wrote:

> Ok, I've figured out that we are definitely messing something up in our
> split index set generation. For process 1 our Jac/PMat local size is 14307,
> but split 0 IS local size is 4129 and split 1 IS local size is 10170, so
> that leaves us 8 dofs short.
>
> I know now where I need to dig in our field decomposition. Thanks Matt for
> helping me process through this stuff!
>

Cool. The one piece of advice I have is to make the problem ruthlessly
small. Even if it seems hard, it is worth the time
to get it down to the size you can print to the screen.

  Thanks,

     Matt


> On Tue, Nov 8, 2022 at 4:53 PM Alexander Lindsay <alexlindsay239 at gmail.com>
> wrote:
>
>> This is from our DMCreateFieldDecomposition_Moose routine. The IS size on
>> process 1 (which is the process from which I took the error in the original
>> post) is reported as 4129 which is consistent with the row size of A00.
>>
>> Split '0' has local size 4129 on processor 1
>> Split '0' has local size 4484 on processor 6
>> Split '0' has local size 4471 on processor 12
>> Split '0' has local size 4040 on processor 14
>> Split '0' has local size 3594 on processor 20
>> Split '0' has local size 4423 on processor 22
>> Split '0' has local size 2791 on processor 27
>> Split '0' has local size 3014 on processor 29
>> Split '0' has local size 3183 on processor 30
>> Split '0' has local size 3328 on processor 3
>> Split '0' has local size 4689 on processor 4
>> Split '0' has local size 8016 on processor 8
>> Split '0' has local size 6367 on processor 10
>> Split '0' has local size 5973 on processor 17
>> Split '0' has local size 4431 on processor 18
>> Split '0' has local size 7564 on processor 25
>> Split '0' has local size 12504 on processor 9
>> Split '0' has local size 10081 on processor 11
>> Split '0' has local size 13808 on processor 24
>> Split '0' has local size 14049 on processor 31
>> Split '0' has local size 15324 on processor 7
>> Split '0' has local size 15337 on processor 15
>> Split '0' has local size 14849 on processor 19
>> Split '0' has local size 15660 on processor 23
>> Split '0' has local size 14728 on processor 26
>> Split '0' has local size 15724 on processor 28
>> Split '0' has local size 17249 on processor 5
>> Split '0' has local size 15519 on processor 13
>> Split '0' has local size 16511 on processor 16
>> Split '0' has local size 16496 on processor 21
>> Split '0' has local size 18291 on processor 2
>> Split '0' has local size 18042 on processor 0
>>
>> On Mon, Nov 7, 2022 at 6:04 PM Matthew Knepley <knepley at gmail.com> wrote:
>>
>>> On Mon, Nov 7, 2022 at 5:48 PM Alexander Lindsay <
>>> alexlindsay239 at gmail.com> wrote:
>>>
>>>> My understanding looking at PCFieldSplitSetDefaults is that our
>>>> implementation of `createfielddecomposition` should get called, we'll set
>>>> `fields` and then (ignoring possible user setting of
>>>> -pc_fieldsplit_%D_fields flag) PCFieldSplitSetIS will get called with
>>>> whatever we did to `fields`. So yea I guess that just looking over that I
>>>> would assume we're not supplying two different index sets for rows and
>>>> columns, or put more precisely we (MOOSE) are not really afforded the
>>>> opportunity to. But my interpretation could very well be wrong.
>>>>
>>>
>>> Oh wait. I read the error message again. It does not say that the whole
>>> selection is rectangular. It says
>>>
>>>   Local columns of A10 4137 do not equal local rows of A00 4129
>>>
>>> So this is a parallel partitioning thing. Since A00 has 4129 local rows,
>>> it should have this many columns as well.
>>> However A10 has 4137 local columns. How big is IS_0, on each process,
>>> that you pass in to PCFIELDSPLIT?
>>>
>>>   Thanks,
>>>
>>>      Matt
>>>
>>>
>>>> On Mon, Nov 7, 2022 at 12:33 PM Matthew Knepley <knepley at gmail.com>
>>>> wrote:
>>>>
>>>>> On Mon, Nov 7, 2022 at 2:09 PM Alexander Lindsay <
>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>
>>>>>> The libMesh/MOOSE specific code that identifies dof indices for
>>>>>> ISCreateGeneral is in DMooseGetEmbedding_Private. I can share that function
>>>>>> (it's quite long) or more details if that could be helpful.
>>>>>>
>>>>>
>>>>> Sorry, I should have written more. The puzzling thing for me is that
>>>>> somehow it looks like the row and column index sets are not the same. I did
>>>>> not think
>>>>> PCFIELDSPLIT could do that. The PCFieldSplitSetIS() interface does not
>>>>> allow it. I was wondering how you were setting the ISes.
>>>>>
>>>>>   Thanks,
>>>>>
>>>>>      Matt
>>>>>
>>>>>
>>>>>> On Mon, Nov 7, 2022 at 10:55 AM Alexander Lindsay <
>>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>>
>>>>>>> I'm not sure exactly what you mean, but I'll try to give more
>>>>>>> details. We have our own DM class (DM_Moose) and we set our own field and
>>>>>>> domain decomposition routines:
>>>>>>>
>>>>>>>   dm->ops->createfielddecomposition =
>>>>>>> DMCreateFieldDecomposition_Moose;
>>>>>>>
>>>>>>>   dm->ops->createdomaindecomposition =
>>>>>>> DMCreateDomainDecomposition_Moose;
>>>>>>>
>>>>>>>
>>>>>>> The field and domain decomposition routines are as follows (can see
>>>>>>> also at
>>>>>>> https://github.com/idaholab/moose/blob/next/framework/src/utils/PetscDMMoose.C
>>>>>>> ):
>>>>>>>
>>>>>>> static PetscErrorCode
>>>>>>> DMCreateFieldDecomposition_Moose(
>>>>>>>     DM dm, PetscInt * len, char *** namelist, IS ** islist, DM **
>>>>>>> dmlist)
>>>>>>> {
>>>>>>>   PetscErrorCode ierr;
>>>>>>>   DM_Moose * dmm = (DM_Moose *)(dm->data);
>>>>>>>
>>>>>>>   PetscFunctionBegin;
>>>>>>>   /* Only called after DMSetUp(). */
>>>>>>>   if (!dmm->_splitlocs)
>>>>>>>     PetscFunctionReturn(0);
>>>>>>>   *len = dmm->_splitlocs->size();
>>>>>>>   if (namelist)
>>>>>>>   {
>>>>>>>     ierr = PetscMalloc(*len * sizeof(char *), namelist);
>>>>>>>     CHKERRQ(ierr);
>>>>>>>   }
>>>>>>>   if (islist)
>>>>>>>   {
>>>>>>>     ierr = PetscMalloc(*len * sizeof(IS), islist);
>>>>>>>     CHKERRQ(ierr);
>>>>>>>   }
>>>>>>>   if (dmlist)
>>>>>>>   {
>>>>>>>     ierr = PetscMalloc(*len * sizeof(DM), dmlist);
>>>>>>>     CHKERRQ(ierr);
>>>>>>>   }
>>>>>>>   for (const auto & dit : *(dmm->_splitlocs))
>>>>>>>   {
>>>>>>>     unsigned int d = dit.second;
>>>>>>>     std::string dname = dit.first;
>>>>>>>     DM_Moose::SplitInfo & dinfo = (*dmm->_splits)[dname];
>>>>>>>     if (!dinfo._dm)
>>>>>>>     {
>>>>>>>       ierr = DMCreateMoose(((PetscObject)dm)->comm, *dmm->_nl,
>>>>>>> &dinfo._dm);
>>>>>>>       CHKERRQ(ierr);
>>>>>>>       ierr = PetscObjectSetOptionsPrefix((PetscObject)dinfo._dm,
>>>>>>> ((PetscObject)dm)->prefix);
>>>>>>>       CHKERRQ(ierr);
>>>>>>>       std::string suffix = std::string("fieldsplit_") + dname + "_";
>>>>>>>       ierr = PetscObjectAppendOptionsPrefix((PetscObject)dinfo._dm,
>>>>>>> suffix.c_str());
>>>>>>>       CHKERRQ(ierr);
>>>>>>>     }
>>>>>>>     ierr = DMSetFromOptions(dinfo._dm);
>>>>>>>     CHKERRQ(ierr);
>>>>>>>     ierr = DMSetUp(dinfo._dm);
>>>>>>>     CHKERRQ(ierr);
>>>>>>>     if (namelist)
>>>>>>>     {
>>>>>>>       ierr = PetscStrallocpy(dname.c_str(), (*namelist) + d);
>>>>>>>       CHKERRQ(ierr);
>>>>>>>     }
>>>>>>>     if (islist)
>>>>>>>     {
>>>>>>>       if (!dinfo._rembedding)
>>>>>>>       {
>>>>>>>         IS dembedding, lembedding;
>>>>>>>         ierr = DMMooseGetEmbedding_Private(dinfo._dm, &dembedding);
>>>>>>>         CHKERRQ(ierr);
>>>>>>>         if (dmm->_embedding)
>>>>>>>         {
>>>>>>>           // Create a relative embedding into the parent's index
>>>>>>> space.
>>>>>>>           ierr = ISEmbed(dembedding, dmm->_embedding, PETSC_TRUE,
>>>>>>> &lembedding);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>           const PetscInt * lindices;
>>>>>>>           PetscInt len, dlen, llen, *rindices, off, i;
>>>>>>>           ierr = ISGetLocalSize(dembedding, &dlen);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>           ierr = ISGetLocalSize(lembedding, &llen);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>           if (llen != dlen)
>>>>>>>             SETERRQ1(((PetscObject)dm)->comm, PETSC_ERR_PLIB,
>>>>>>> "Failed to embed split %D", d);
>>>>>>>           ierr = ISDestroy(&dembedding);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>           // Convert local embedding to global (but still relative)
>>>>>>> embedding
>>>>>>>           ierr = PetscMalloc(llen * sizeof(PetscInt), &rindices);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>           ierr = ISGetIndices(lembedding, &lindices);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>           ierr = PetscMemcpy(rindices, lindices, llen *
>>>>>>> sizeof(PetscInt));
>>>>>>>           CHKERRQ(ierr);
>>>>>>>           ierr = ISDestroy(&lembedding);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>           // We could get the index offset from a corresponding
>>>>>>> global vector, but subDMs don't yet
>>>>>>>           // have global vectors
>>>>>>>           ierr = ISGetLocalSize(dmm->_embedding, &len);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>
>>>>>>>           ierr = MPI_Scan(&len,
>>>>>>>                           &off,
>>>>>>>                           1,
>>>>>>> #ifdef PETSC_USE_64BIT_INDICES
>>>>>>>                           MPI_LONG_LONG_INT,
>>>>>>> #else
>>>>>>>                           MPI_INT,
>>>>>>> #endif
>>>>>>>                           MPI_SUM,
>>>>>>>                           ((PetscObject)dm)->comm);
>>>>>>>           CHKERRQ(ierr);
>>>>>>>
>>>>>>>           off -= len;
>>>>>>>           for (i = 0; i < llen; ++i)
>>>>>>>             rindices[i] += off;
>>>>>>>           ierr = ISCreateGeneral(
>>>>>>>               ((PetscObject)dm)->comm, llen, rindices,
>>>>>>> PETSC_OWN_POINTER, &(dinfo._rembedding));
>>>>>>>           CHKERRQ(ierr);
>>>>>>>         }
>>>>>>>         else
>>>>>>>         {
>>>>>>>           dinfo._rembedding = dembedding;
>>>>>>>         }
>>>>>>>       }
>>>>>>>       ierr = PetscObjectReference((PetscObject)(dinfo._rembedding));
>>>>>>>       CHKERRQ(ierr);
>>>>>>>       (*islist)[d] = dinfo._rembedding;
>>>>>>>     }
>>>>>>>     if (dmlist)
>>>>>>>     {
>>>>>>>       ierr = PetscObjectReference((PetscObject)dinfo._dm);
>>>>>>>       CHKERRQ(ierr);
>>>>>>>       (*dmlist)[d] = dinfo._dm;
>>>>>>>     }
>>>>>>>   }
>>>>>>>   PetscFunctionReturn(0);
>>>>>>> }
>>>>>>>
>>>>>>> static PetscErrorCode
>>>>>>> DMCreateDomainDecomposition_Moose(
>>>>>>>     DM dm, PetscInt * len, char *** namelist, IS ** innerislist, IS
>>>>>>> ** outerislist, DM ** dmlist)
>>>>>>> {
>>>>>>>   PetscErrorCode ierr;
>>>>>>>
>>>>>>>   PetscFunctionBegin;
>>>>>>>   /* Use DMCreateFieldDecomposition_Moose() to obtain everything but
>>>>>>> outerislist, which is currently
>>>>>>>    * PETSC_NULL. */
>>>>>>>   if (outerislist)
>>>>>>>     *outerislist = PETSC_NULL; /* FIX: allow mesh-based overlap. */
>>>>>>>   ierr = DMCreateFieldDecomposition_Moose(dm, len, namelist,
>>>>>>> innerislist, dmlist);
>>>>>>>   CHKERRQ(ierr);
>>>>>>>   PetscFunctionReturn(0);
>>>>>>> }
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Nov 3, 2022 at 5:19 PM Matthew Knepley <knepley at gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Thu, Nov 3, 2022 at 7:52 PM Alexander Lindsay <
>>>>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I have errors on quite a few (but not all) processes of the like
>>>>>>>>>
>>>>>>>>> [1]PETSC ERROR: --------------------- Error Message
>>>>>>>>> --------------------------------------------------------------
>>>>>>>>> [1]PETSC ERROR: Nonconforming object sizes
>>>>>>>>> [1]PETSC ERROR: Local columns of A10 4137 do not equal local rows
>>>>>>>>> of A00 4129
>>>>>>>>>
>>>>>>>>> when performing field splits. We (MOOSE) have some code for
>>>>>>>>> identifying the index sets for each split. However, the code was written by
>>>>>>>>> some authors who are no longer with us. Normally I would chase this down in
>>>>>>>>> a debugger, but this error only seems to crop up for pretty complex and
>>>>>>>>> large meshes. If anyone has an idea for what we might be doing wrong, that
>>>>>>>>> might help me chase this down faster. I guess intuitively I'm pretty
>>>>>>>>> perplexed that we could get ourselves into this pickle as it almost appears
>>>>>>>>> that we have two different local dof index counts for a given block (0 in
>>>>>>>>> this case). More background, if helpful, can be found in
>>>>>>>>> https://github.com/idaholab/moose/issues/22359 as well as
>>>>>>>>> https://github.com/idaholab/moose/discussions/22468.
>>>>>>>>>
>>>>>>>>
>>>>>>>> How are you specifying the blocks? I would not have thought this
>>>>>>>> was possible.
>>>>>>>>
>>>>>>>>   Thanks,
>>>>>>>>
>>>>>>>>      Matt
>>>>>>>>
>>>>>>>>
>>>>>>>>> I should note that we are currently running with 3.16.6 as our
>>>>>>>>> PETSc submodule hash (we are talking about updating to 3.18 soon).
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> What most experimenters take for granted before they begin their
>>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>>> experiments lead.
>>>>>>>> -- Norbert Wiener
>>>>>>>>
>>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> What most experimenters take for granted before they begin their
>>>>> experiments is infinitely more interesting than any results to which their
>>>>> experiments lead.
>>>>> -- Norbert Wiener
>>>>>
>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>
>>>>
>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>> <http://www.cse.buffalo.edu/~knepley/>
>>>
>>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20221109/d4c2a2cc/attachment-0001.html>


More information about the petsc-users mailing list