[petsc-users] Local columns of A10 do not equal local rows of A00
Alexander Lindsay
alexlindsay239 at gmail.com
Wed Nov 9 12:45:38 CST 2022
Ok, I've figured out that we are definitely messing something up in our
split index set generation. For process 1 our Jac/PMat local size is 14307,
but split 0 IS local size is 4129 and split 1 IS local size is 10170, so
that leaves us 8 dofs short.
I know now where I need to dig in our field decomposition. Thanks Matt for
helping me process through this stuff!
On Tue, Nov 8, 2022 at 4:53 PM Alexander Lindsay <alexlindsay239 at gmail.com>
wrote:
> This is from our DMCreateFieldDecomposition_Moose routine. The IS size on
> process 1 (which is the process from which I took the error in the original
> post) is reported as 4129 which is consistent with the row size of A00.
>
> Split '0' has local size 4129 on processor 1
> Split '0' has local size 4484 on processor 6
> Split '0' has local size 4471 on processor 12
> Split '0' has local size 4040 on processor 14
> Split '0' has local size 3594 on processor 20
> Split '0' has local size 4423 on processor 22
> Split '0' has local size 2791 on processor 27
> Split '0' has local size 3014 on processor 29
> Split '0' has local size 3183 on processor 30
> Split '0' has local size 3328 on processor 3
> Split '0' has local size 4689 on processor 4
> Split '0' has local size 8016 on processor 8
> Split '0' has local size 6367 on processor 10
> Split '0' has local size 5973 on processor 17
> Split '0' has local size 4431 on processor 18
> Split '0' has local size 7564 on processor 25
> Split '0' has local size 12504 on processor 9
> Split '0' has local size 10081 on processor 11
> Split '0' has local size 13808 on processor 24
> Split '0' has local size 14049 on processor 31
> Split '0' has local size 15324 on processor 7
> Split '0' has local size 15337 on processor 15
> Split '0' has local size 14849 on processor 19
> Split '0' has local size 15660 on processor 23
> Split '0' has local size 14728 on processor 26
> Split '0' has local size 15724 on processor 28
> Split '0' has local size 17249 on processor 5
> Split '0' has local size 15519 on processor 13
> Split '0' has local size 16511 on processor 16
> Split '0' has local size 16496 on processor 21
> Split '0' has local size 18291 on processor 2
> Split '0' has local size 18042 on processor 0
>
> On Mon, Nov 7, 2022 at 6:04 PM Matthew Knepley <knepley at gmail.com> wrote:
>
>> On Mon, Nov 7, 2022 at 5:48 PM Alexander Lindsay <
>> alexlindsay239 at gmail.com> wrote:
>>
>>> My understanding looking at PCFieldSplitSetDefaults is that our
>>> implementation of `createfielddecomposition` should get called, we'll set
>>> `fields` and then (ignoring possible user setting of
>>> -pc_fieldsplit_%D_fields flag) PCFieldSplitSetIS will get called with
>>> whatever we did to `fields`. So yea I guess that just looking over that I
>>> would assume we're not supplying two different index sets for rows and
>>> columns, or put more precisely we (MOOSE) are not really afforded the
>>> opportunity to. But my interpretation could very well be wrong.
>>>
>>
>> Oh wait. I read the error message again. It does not say that the whole
>> selection is rectangular. It says
>>
>> Local columns of A10 4137 do not equal local rows of A00 4129
>>
>> So this is a parallel partitioning thing. Since A00 has 4129 local rows,
>> it should have this many columns as well.
>> However A10 has 4137 local columns. How big is IS_0, on each process,
>> that you pass in to PCFIELDSPLIT?
>>
>> Thanks,
>>
>> Matt
>>
>>
>>> On Mon, Nov 7, 2022 at 12:33 PM Matthew Knepley <knepley at gmail.com>
>>> wrote:
>>>
>>>> On Mon, Nov 7, 2022 at 2:09 PM Alexander Lindsay <
>>>> alexlindsay239 at gmail.com> wrote:
>>>>
>>>>> The libMesh/MOOSE specific code that identifies dof indices for
>>>>> ISCreateGeneral is in DMooseGetEmbedding_Private. I can share that function
>>>>> (it's quite long) or more details if that could be helpful.
>>>>>
>>>>
>>>> Sorry, I should have written more. The puzzling thing for me is that
>>>> somehow it looks like the row and column index sets are not the same. I did
>>>> not think
>>>> PCFIELDSPLIT could do that. The PCFieldSplitSetIS() interface does not
>>>> allow it. I was wondering how you were setting the ISes.
>>>>
>>>> Thanks,
>>>>
>>>> Matt
>>>>
>>>>
>>>>> On Mon, Nov 7, 2022 at 10:55 AM Alexander Lindsay <
>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>
>>>>>> I'm not sure exactly what you mean, but I'll try to give more
>>>>>> details. We have our own DM class (DM_Moose) and we set our own field and
>>>>>> domain decomposition routines:
>>>>>>
>>>>>> dm->ops->createfielddecomposition =
>>>>>> DMCreateFieldDecomposition_Moose;
>>>>>>
>>>>>> dm->ops->createdomaindecomposition =
>>>>>> DMCreateDomainDecomposition_Moose;
>>>>>>
>>>>>>
>>>>>> The field and domain decomposition routines are as follows (can see
>>>>>> also at
>>>>>> https://github.com/idaholab/moose/blob/next/framework/src/utils/PetscDMMoose.C
>>>>>> ):
>>>>>>
>>>>>> static PetscErrorCode
>>>>>> DMCreateFieldDecomposition_Moose(
>>>>>> DM dm, PetscInt * len, char *** namelist, IS ** islist, DM **
>>>>>> dmlist)
>>>>>> {
>>>>>> PetscErrorCode ierr;
>>>>>> DM_Moose * dmm = (DM_Moose *)(dm->data);
>>>>>>
>>>>>> PetscFunctionBegin;
>>>>>> /* Only called after DMSetUp(). */
>>>>>> if (!dmm->_splitlocs)
>>>>>> PetscFunctionReturn(0);
>>>>>> *len = dmm->_splitlocs->size();
>>>>>> if (namelist)
>>>>>> {
>>>>>> ierr = PetscMalloc(*len * sizeof(char *), namelist);
>>>>>> CHKERRQ(ierr);
>>>>>> }
>>>>>> if (islist)
>>>>>> {
>>>>>> ierr = PetscMalloc(*len * sizeof(IS), islist);
>>>>>> CHKERRQ(ierr);
>>>>>> }
>>>>>> if (dmlist)
>>>>>> {
>>>>>> ierr = PetscMalloc(*len * sizeof(DM), dmlist);
>>>>>> CHKERRQ(ierr);
>>>>>> }
>>>>>> for (const auto & dit : *(dmm->_splitlocs))
>>>>>> {
>>>>>> unsigned int d = dit.second;
>>>>>> std::string dname = dit.first;
>>>>>> DM_Moose::SplitInfo & dinfo = (*dmm->_splits)[dname];
>>>>>> if (!dinfo._dm)
>>>>>> {
>>>>>> ierr = DMCreateMoose(((PetscObject)dm)->comm, *dmm->_nl,
>>>>>> &dinfo._dm);
>>>>>> CHKERRQ(ierr);
>>>>>> ierr = PetscObjectSetOptionsPrefix((PetscObject)dinfo._dm,
>>>>>> ((PetscObject)dm)->prefix);
>>>>>> CHKERRQ(ierr);
>>>>>> std::string suffix = std::string("fieldsplit_") + dname + "_";
>>>>>> ierr = PetscObjectAppendOptionsPrefix((PetscObject)dinfo._dm,
>>>>>> suffix.c_str());
>>>>>> CHKERRQ(ierr);
>>>>>> }
>>>>>> ierr = DMSetFromOptions(dinfo._dm);
>>>>>> CHKERRQ(ierr);
>>>>>> ierr = DMSetUp(dinfo._dm);
>>>>>> CHKERRQ(ierr);
>>>>>> if (namelist)
>>>>>> {
>>>>>> ierr = PetscStrallocpy(dname.c_str(), (*namelist) + d);
>>>>>> CHKERRQ(ierr);
>>>>>> }
>>>>>> if (islist)
>>>>>> {
>>>>>> if (!dinfo._rembedding)
>>>>>> {
>>>>>> IS dembedding, lembedding;
>>>>>> ierr = DMMooseGetEmbedding_Private(dinfo._dm, &dembedding);
>>>>>> CHKERRQ(ierr);
>>>>>> if (dmm->_embedding)
>>>>>> {
>>>>>> // Create a relative embedding into the parent's index
>>>>>> space.
>>>>>> ierr = ISEmbed(dembedding, dmm->_embedding, PETSC_TRUE,
>>>>>> &lembedding);
>>>>>> CHKERRQ(ierr);
>>>>>> const PetscInt * lindices;
>>>>>> PetscInt len, dlen, llen, *rindices, off, i;
>>>>>> ierr = ISGetLocalSize(dembedding, &dlen);
>>>>>> CHKERRQ(ierr);
>>>>>> ierr = ISGetLocalSize(lembedding, &llen);
>>>>>> CHKERRQ(ierr);
>>>>>> if (llen != dlen)
>>>>>> SETERRQ1(((PetscObject)dm)->comm, PETSC_ERR_PLIB, "Failed
>>>>>> to embed split %D", d);
>>>>>> ierr = ISDestroy(&dembedding);
>>>>>> CHKERRQ(ierr);
>>>>>> // Convert local embedding to global (but still relative)
>>>>>> embedding
>>>>>> ierr = PetscMalloc(llen * sizeof(PetscInt), &rindices);
>>>>>> CHKERRQ(ierr);
>>>>>> ierr = ISGetIndices(lembedding, &lindices);
>>>>>> CHKERRQ(ierr);
>>>>>> ierr = PetscMemcpy(rindices, lindices, llen *
>>>>>> sizeof(PetscInt));
>>>>>> CHKERRQ(ierr);
>>>>>> ierr = ISDestroy(&lembedding);
>>>>>> CHKERRQ(ierr);
>>>>>> // We could get the index offset from a corresponding
>>>>>> global vector, but subDMs don't yet
>>>>>> // have global vectors
>>>>>> ierr = ISGetLocalSize(dmm->_embedding, &len);
>>>>>> CHKERRQ(ierr);
>>>>>>
>>>>>> ierr = MPI_Scan(&len,
>>>>>> &off,
>>>>>> 1,
>>>>>> #ifdef PETSC_USE_64BIT_INDICES
>>>>>> MPI_LONG_LONG_INT,
>>>>>> #else
>>>>>> MPI_INT,
>>>>>> #endif
>>>>>> MPI_SUM,
>>>>>> ((PetscObject)dm)->comm);
>>>>>> CHKERRQ(ierr);
>>>>>>
>>>>>> off -= len;
>>>>>> for (i = 0; i < llen; ++i)
>>>>>> rindices[i] += off;
>>>>>> ierr = ISCreateGeneral(
>>>>>> ((PetscObject)dm)->comm, llen, rindices,
>>>>>> PETSC_OWN_POINTER, &(dinfo._rembedding));
>>>>>> CHKERRQ(ierr);
>>>>>> }
>>>>>> else
>>>>>> {
>>>>>> dinfo._rembedding = dembedding;
>>>>>> }
>>>>>> }
>>>>>> ierr = PetscObjectReference((PetscObject)(dinfo._rembedding));
>>>>>> CHKERRQ(ierr);
>>>>>> (*islist)[d] = dinfo._rembedding;
>>>>>> }
>>>>>> if (dmlist)
>>>>>> {
>>>>>> ierr = PetscObjectReference((PetscObject)dinfo._dm);
>>>>>> CHKERRQ(ierr);
>>>>>> (*dmlist)[d] = dinfo._dm;
>>>>>> }
>>>>>> }
>>>>>> PetscFunctionReturn(0);
>>>>>> }
>>>>>>
>>>>>> static PetscErrorCode
>>>>>> DMCreateDomainDecomposition_Moose(
>>>>>> DM dm, PetscInt * len, char *** namelist, IS ** innerislist, IS
>>>>>> ** outerislist, DM ** dmlist)
>>>>>> {
>>>>>> PetscErrorCode ierr;
>>>>>>
>>>>>> PetscFunctionBegin;
>>>>>> /* Use DMCreateFieldDecomposition_Moose() to obtain everything but
>>>>>> outerislist, which is currently
>>>>>> * PETSC_NULL. */
>>>>>> if (outerislist)
>>>>>> *outerislist = PETSC_NULL; /* FIX: allow mesh-based overlap. */
>>>>>> ierr = DMCreateFieldDecomposition_Moose(dm, len, namelist,
>>>>>> innerislist, dmlist);
>>>>>> CHKERRQ(ierr);
>>>>>> PetscFunctionReturn(0);
>>>>>> }
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Nov 3, 2022 at 5:19 PM Matthew Knepley <knepley at gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Thu, Nov 3, 2022 at 7:52 PM Alexander Lindsay <
>>>>>>> alexlindsay239 at gmail.com> wrote:
>>>>>>>
>>>>>>>> I have errors on quite a few (but not all) processes of the like
>>>>>>>>
>>>>>>>> [1]PETSC ERROR: --------------------- Error Message
>>>>>>>> --------------------------------------------------------------
>>>>>>>> [1]PETSC ERROR: Nonconforming object sizes
>>>>>>>> [1]PETSC ERROR: Local columns of A10 4137 do not equal local rows
>>>>>>>> of A00 4129
>>>>>>>>
>>>>>>>> when performing field splits. We (MOOSE) have some code for
>>>>>>>> identifying the index sets for each split. However, the code was written by
>>>>>>>> some authors who are no longer with us. Normally I would chase this down in
>>>>>>>> a debugger, but this error only seems to crop up for pretty complex and
>>>>>>>> large meshes. If anyone has an idea for what we might be doing wrong, that
>>>>>>>> might help me chase this down faster. I guess intuitively I'm pretty
>>>>>>>> perplexed that we could get ourselves into this pickle as it almost appears
>>>>>>>> that we have two different local dof index counts for a given block (0 in
>>>>>>>> this case). More background, if helpful, can be found in
>>>>>>>> https://github.com/idaholab/moose/issues/22359 as well as
>>>>>>>> https://github.com/idaholab/moose/discussions/22468.
>>>>>>>>
>>>>>>>
>>>>>>> How are you specifying the blocks? I would not have thought this was
>>>>>>> possible.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Matt
>>>>>>>
>>>>>>>
>>>>>>>> I should note that we are currently running with 3.16.6 as our
>>>>>>>> PETSc submodule hash (we are talking about updating to 3.18 soon).
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> What most experimenters take for granted before they begin their
>>>>>>> experiments is infinitely more interesting than any results to which their
>>>>>>> experiments lead.
>>>>>>> -- Norbert Wiener
>>>>>>>
>>>>>>> https://www.cse.buffalo.edu/~knepley/
>>>>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>>>>
>>>>>>
>>>>
>>>> --
>>>> What most experimenters take for granted before they begin their
>>>> experiments is infinitely more interesting than any results to which their
>>>> experiments lead.
>>>> -- Norbert Wiener
>>>>
>>>> https://www.cse.buffalo.edu/~knepley/
>>>> <http://www.cse.buffalo.edu/~knepley/>
>>>>
>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> <http://www.cse.buffalo.edu/~knepley/>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20221109/1213ec61/attachment-0001.html>
More information about the petsc-users
mailing list