[petsc-users] MatMPIAIJSetPreallocation: "nnz cannot be greater than row length"

David Knezevic david.knezevic at akselos.com
Mon Feb 23 10:17:10 CST 2015


OK, sounds good. Let me know if I can help with the digging.

David



On Mon, Feb 23, 2015 at 11:10 AM, Dmitry Karpeyev <karpeev at mcs.anl.gov>
wrote:

> I just tried building against petsc/master, but there needs to be more
> work on libMesh before it can work with petsc/master:
> the new VecLockPush()/Pop() stuff isn't respected by vector manipulation
> in libMesh.
> I put a hack equivalent to MatReset() into your branch (patch attached, in
> case you want to take a look at it),
> but it generates the same error in MatCreateColmap that you reported
> earlier.  It's odd that it occurs
> on the second nonlinear iteration.  I'll have to dig a bit deeper to
> see what's going on.
>
> Dmitry.
>
>
> On Mon Feb 23 2015 at 10:03:33 AM David Knezevic <
> david.knezevic at akselos.com> wrote:
>
>> Hi Dmitry,
>>
>> OK, good to hear we're seeing the same behavior for the example.
>>
>> Regarding this comment:
>>
>>
>> libMesh needs to change the way configure extracts PETSc information --
>>> configuration data were moved:
>>> conf --> lib/petsc-conf
>>> ${PETSC_ARCH}/conf --> ${PETSC_ARCH}/lib/petsc-conf
>>>
>>> At one point I started looking at m4/petsc.m4, but that got put on the
>>> back burner.  For now making the relevant symlinks by hand lets you
>>> configure and build libMesh with petsc/master.
>>>
>>
>>
>> So do you suggest that the next step here is to build libmesh against
>> petsc/master so that we can try your PETSc pull request that implements
>> MatReset() to see if that gets this example working?
>>
>> David
>>
>>
>>
>>
>>
>>> On Mon Feb 23 2015 at 9:15:44 AM David Knezevic <
>>> david.knezevic at akselos.com> wrote:
>>>
>>>> Hi Dmitry,
>>>>
>>>> Thanks very much for testing out the example.
>>>>
>>>> examples/systems_of_equations/ex8 works fine for me in serial, but it
>>>> fails for me if I run with more than 1 MPI process. Can you try it with,
>>>> say, 2 or 4 MPI processes?
>>>>
>>>> If we need access to MatReset in libMesh to get this to work, I'll be
>>>> happy to work on a libMesh pull request for that.
>>>>
>>>> David
>>>>
>>>>
>>>> --
>>>>
>>>> David J. Knezevic | CTO
>>>> Akselos | 17 Bay State Road | Boston, MA | 02215
>>>> Phone (office): +1-857-265-2238
>>>> Phone (mobile): +1-617-599-4755
>>>> Web: http://www.akselos.com
>>>>
>>>>
>>>> On Mon, Feb 23, 2015 at 10:08 AM, Dmitry Karpeyev <karpeev at mcs.anl.gov>
>>>> wrote:
>>>>
>>>>> David,
>>>>>
>>>>> What code are you running when you encounter this error?  I'm trying
>>>>> to reproduce it and
>>>>> I tried examples/systems_of_equations/ex8, but it ran for me,
>>>>> ostensibly to completion.
>>>>>
>>>>> I have a small PETSc pull request that implements MatReset(), which
>>>>> passes a small PETSc test,
>>>>> but libMesh needs some work to be able to build against petsc/master
>>>>> because of some recent
>>>>> changes to PETSc.
>>>>>
>>>>> Dmitry.
>>>>>
>>>>> On Mon Feb 23 2015 at 7:17:06 AM David Knezevic <
>>>>> david.knezevic at akselos.com> wrote:
>>>>>
>>>>>> Hi Barry, hi Dmitry,
>>>>>>
>>>>>> I set the matrix to BAIJ and back to AIJ, and the code got a bit
>>>>>> further. But I now run into the error pasted below (Note that I'm now using
>>>>>> "--with-debugging=1"):
>>>>>>
>>>>>> PETSC ERROR: --------------------- Error Message
>>>>>> --------------------------------------------------------------
>>>>>> PETSC ERROR: Petsc has generated inconsistent data
>>>>>> PETSC ERROR: MPIAIJ Matrix was assembled but is missing garray
>>>>>> PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>>>>> for trouble shooting.
>>>>>> PETSC ERROR: Petsc Release Version 3.5.2, Sep, 08, 2014
>>>>>> PETSC ERROR: ./example-dbg on a arch-linux2-c-debug named
>>>>>> david-Lenovo by dknez Mon Feb 23 08:05:44 2015
>>>>>> PETSC ERROR: Configure options --with-shared-libraries=1
>>>>>> --with-debugging=1 --download-suitesparse=1 --download-parmetis=1
>>>>>> --download-blacs=1 --download-scalapack=1 --download-mumps=1
>>>>>> --download-metis --download-superlu_dist --prefix=/home/dknez/software/libmesh_install/dbg_real/petsc
>>>>>> --download-hypre
>>>>>> PETSC ERROR: #1 MatCreateColmap_MPIAIJ_Private() line 361 in
>>>>>> /home/dknez/software/petsc-3.5.2/src/mat/impls/aij/mpi/mpiaij.c
>>>>>> PETSC ERROR: #2 MatSetValues_MPIAIJ() line 538 in
>>>>>> /home/dknez/software/petsc-3.5.2/src/mat/impls/aij/mpi/mpiaij.c
>>>>>> PETSC ERROR: #3 MatSetValues() line 1136 in
>>>>>> /home/dknez/software/petsc-3.5.2/src/mat/interface/matrix.c
>>>>>> PETSC ERROR: #4 add_matrix() line 765 in /home/dknez/software/libmesh-
>>>>>> src/src/numerics/petsc_matrix.C
>>>>>> ------------------------------------------------------------
>>>>>> --------------
>>>>>>
>>>>>> This occurs when I try to set some entries of the matrix. Do you have
>>>>>> any suggestions on how I can resolve this?
>>>>>>
>>>>>> Thanks!
>>>>>> David
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 22, 2015 at 10:22 PM, Dmitry Karpeyev <dkarpeev at gmail.com
>>>>>> > wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun Feb 22 2015 at 9:15:22 PM Barry Smith <bsmith at mcs.anl.gov>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> > On Feb 22, 2015, at 9:09 PM, David Knezevic <
>>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>>> >
>>>>>>>> > Hi Dmitry,
>>>>>>>> >
>>>>>>>> > Thanks for the suggestion. I tried MatSetType(mat,MATMPIAIJ)
>>>>>>>> followed by MatXAIJSetPreallocation(...), but unfortunately this still
>>>>>>>> gives me the same error as before: "nnz cannot be greater than row length:
>>>>>>>> local row 168 value 24 rowlength 0".
>>>>>>>> >
>>>>>>>> > I gather that the idea here is that MatSetType builds a new
>>>>>>>> matrix object, and then I should be able to pre-allocate for that new
>>>>>>>> matrix however I like, right? Was I supposed to clear the matrix object
>>>>>>>> somehow before calling MatSetType? (I didn't do any sort of clear
>>>>>>>> operation.)
>>>>>>>>
>>>>>>>>   If the type doesn't change then MatSetType() won't do anything.
>>>>>>>> You can try setting the type to BAIJ and then setting the type back to AIJ.
>>>>>>>> This may/should clear out the matrix.
>>>>>>>>
>>>>>>> Ah, yes.  If the type is the same as before it does quit early, but
>>>>>>> changing the type and then back will clear out and rebuild the matrix.   We
>>>>>>> need
>>>>>>> something like MatReset() to do the equivalent thing.
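
[Editorial note: the type-toggle workaround described above can be sketched as
follows. This is a minimal illustration, not code from the thread; the helper
name reset_mpiaij and the choice of MATMPIBAIJ as the intermediate type are
assumptions. It requires a PETSc (here, 3.5-era) build to compile.]

```c
/* Sketch of the type-toggle workaround: changing the matrix type forces
 * PETSc to destroy the old implementation data and rebuild it, which also
 * clears the stale preallocation.  Illustrative only. */
#include <petscmat.h>

PetscErrorCode reset_mpiaij(Mat mat)
{
  PetscErrorCode ierr;
  ierr = MatSetType(mat, MATMPIBAIJ);CHKERRQ(ierr); /* type changes: old AIJ data freed */
  ierr = MatSetType(mat, MATMPIAIJ);CHKERRQ(ierr);  /* back to AIJ: fresh, unpreallocated */
  /* The matrix can now be preallocated again for the new sparsity pattern,
   * e.g. with MatMPIAIJSetPreallocation() or MatXAIJSetPreallocation(). */
  return 0;
}
```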
>>>>>>>
>>>>>>>>
>>>>>>>> >
>>>>>>>> > As I said earlier, I'll make a dbg PETSc build, so hopefully that
>>>>>>>> will help shed some light on what's going wrong for me.
>>>>>>>>
>>>>>>> I think it's always a good idea to have a dbg build of PETSc when
>>>>>>> you're doing things like this.
>>>>>>>
>>>>>>> Dmitry.
>>>>>>>
>>>>>>>>
>>>>>>>>    Don't bother, what I suggested won't work.
>>>>>>>>
>>>>>>>>   Barry
>>>>>>>>
>>>>>>>>
>>>>>>>> >
>>>>>>>> > Thanks,
>>>>>>>> > David
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Sun, Feb 22, 2015 at 6:02 PM, Dmitry Karpeyev <
>>>>>>>> dkarpeev at gmail.com> wrote:
>>>>>>>> > David,
>>>>>>>> > It might be easier to just rebuild the whole matrix from scratch:
>>>>>>>> disassembling and resetting the preallocation effectively amounts to
>>>>>>>> doing all of that anyway.
>>>>>>>> > MatSetType(mat,MATMPIAIJ)
>>>>>>>> > or
>>>>>>>> > PetscObjectGetType((PetscObject)mat,&type);
>>>>>>>> > MatSetType(mat,type);
>>>>>>>> > followed by
>>>>>>>> > MatXAIJSetPreallocation(...);
>>>>>>>> > should do.
>>>>>>>> > Dmitry.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Sun Feb 22 2015 at 4:45:46 PM Barry Smith <bsmith at mcs.anl.gov>
>>>>>>>> wrote:
>>>>>>>> >
>>>>>>>> >  Do not call it for a SeqAIJ matrix. Do not call it before the first
>>>>>>>> time you have preallocated, put entries in the matrix, and done the
>>>>>>>> MatAssemblyBegin/End()
>>>>>>>> >
>>>>>>>> >   If it still crashes you'll need to try the debugger
>>>>>>>> >
>>>>>>>> >   Barry
>>>>>>>> >
>>>>>>>> > > On Feb 22, 2015, at 4:09 PM, David Knezevic <
>>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>>> > >
>>>>>>>> > > Hi Barry,
>>>>>>>> > >
>>>>>>>> > > Thanks for your help, much appreciated.
>>>>>>>> > >
>>>>>>>> > > I added a prototype for MatDisAssemble_MPIAIJ:
>>>>>>>> > > PETSC_INTERN PetscErrorCode MatDisAssemble_MPIAIJ(Mat);
>>>>>>>> > >
>>>>>>>> > > and I added a call to MatDisAssemble_MPIAIJ before
>>>>>>>> MatMPIAIJSetPreallocation. However, I get a segfault on the call to
>>>>>>>> MatDisAssemble_MPIAIJ. The segfault occurs in both serial and parallel.
>>>>>>>> > >
>>>>>>>> > > FYI, I'm using PETSc 3.5.2 with a non-debug build (though I could
>>>>>>>> rebuild PETSc in debug mode if you think that would help figure out
>>>>>>>> what's happening here).
>>>>>>>> > >
>>>>>>>> > > Thanks,
>>>>>>>> > > David
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > > On Sun, Feb 22, 2015 at 1:13 PM, Barry Smith <
>>>>>>>> bsmith at mcs.anl.gov> wrote:
>>>>>>>> > >
>>>>>>>> > >   David,
>>>>>>>> > >
>>>>>>>> > >    This is an obscure little feature of MatMPIAIJ: each time
>>>>>>>> you change the sparsity pattern, before you call
>>>>>>>> MatMPIAIJSetPreallocation() you need to call MatDisAssemble_MPIAIJ(Mat
>>>>>>>> mat).  This is a private PETSc function, so you need to provide your
>>>>>>>> own prototype for it above the function you use it in.
>>>>>>>> > >
>>>>>>>> > >   Let us know if this resolves the problem.
>>>>>>>> > >
>>>>>>>> > >    Barry
>>>>>>>> > >
>>>>>>>> > > We never really intended that people would call
>>>>>>>> MatMPIAIJSetPreallocation() AFTER they had already used the matrix.
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > > > On Feb 22, 2015, at 6:50 AM, David Knezevic <
>>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>>> > > >
>>>>>>>> > > > Hi all,
>>>>>>>> > > >
>>>>>>>> > > > I've implemented a solver for a contact problem using SNES.
>>>>>>>> The sparsity pattern of the jacobian matrix needs to change at each
>>>>>>>> nonlinear iteration (because the elements which are in contact can change),
>>>>>>>> so I tried to deal with this by calling MatSeqAIJSetPreallocation and
>>>>>>>> MatMPIAIJSetPreallocation during each iteration in order to update the
>>>>>>>> preallocation.
>>>>>>>> > > >
>>>>>>>> > > > This seems to work fine in serial, but with two or more MPI
>>>>>>>> processes I run into the error "nnz cannot be greater than row length",
>>>>>>>> e.g.:
>>>>>>>> > > > nnz cannot be greater than row length: local row 528 value 12
>>>>>>>> rowlength 0
>>>>>>>> > > >
>>>>>>>> > > > This error is from the call to
>>>>>>>> > > > MatSeqAIJSetPreallocation(b->B,o_nz,o_nnz); in
>>>>>>>> MatMPIAIJSetPreallocation_MPIAIJ.
>>>>>>>> > > >
>>>>>>>> > > > Any guidance on what the problem might be would be most
>>>>>>>> appreciated. For example, I was wondering if there is a problem with
>>>>>>>> calling SetPreallocation on a matrix that has already been preallocated?
>>>>>>>> > > >
>>>>>>>> > > > Some notes:
>>>>>>>> > > > - I'm using PETSc via libMesh
>>>>>>>> > > > - The code that triggers this issue is available as a PR on
>>>>>>>> the libMesh github repo, in case anyone is interested:
>>>>>>>> https://github.com/libMesh/libmesh/pull/460/
>>>>>>>> > > > - I can try to make a minimal pure-PETSc example that
>>>>>>>> reproduces this error, if that would be helpful.
>>>>>>>> > > >
>>>>>>>> > > > Many thanks,
>>>>>>>> > > > David
>>>>>>>> > > >
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> >
>>>>>>>> >
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>