[petsc-users] MatMPIAIJSetPreallocation: "nnz cannot be greater than row length"

Dmitry Karpeyev karpeev at mcs.anl.gov
Mon Feb 23 10:10:38 CST 2015


I just tried building against petsc/master, but there needs to be more work
on libMesh before it can work with petsc/master:
the new VecLockPush()/Pop() stuff isn't respected by vector manipulation in
libMesh.
I put a hack equivalent to MatReset() into your branch (patch attached, in
case you want to take a look at it),
but it generates the same error in MatCreateColmap that you reported
earlier.  It's odd that it occurs
on the second nonlinear iteration.   I'll have to dig a bit deeper  to see
what's going on.

Dmitry.


On Mon Feb 23 2015 at 10:03:33 AM David Knezevic <david.knezevic at akselos.com>
wrote:

> Hi Dmitry,
>
> OK, good to hear we're seeing the same behavior for the example.
>
> Regarding this comment:
>
>
> libMesh needs to change the way configure extracts PETSc information --
>> configuration data were moved:
>> conf --> lib/petsc-conf
>> ${PETSC_ARCH}/conf --> ${PETSC_ARCH}/lib/petsc-conf
>>
>> At one point I started looking at m4/petsc.m4, but that got put on the
>> back burner.  For now making the relevant symlinks by hand lets you
>> configure and build libMesh with petsc/master.
>>
>
>
> So do you suggest that the next step here is to build libmesh against
> petsc/master so that we can try your PETSc pull request that implements
> MatReset() to see if that gets this example working?
>
> David
>
>
>
>
>
>> On Mon Feb 23 2015 at 9:15:44 AM David Knezevic <
>> david.knezevic at akselos.com> wrote:
>>
>>> Hi Dmitry,
>>>
>>> Thanks very much for testing out the example.
>>>
>>> examples/systems_of_equations/ex8 works fine for me in serial, but it
>>> fails for me if I run with more than 1 MPI process. Can you try it with,
>>> say, 2 or 4 MPI processes?
>>>
>>> If we need access to MatReset in libMesh to get this to work, I'll be
>>> happy to work on a libMesh pull request for that.
>>>
>>> David
>>>
>>>
>>> --
>>>
>>> David J. Knezevic | CTO
>>> Akselos | 17 Bay State Road | Boston, MA | 02215
>>> Phone (office): +1-857-265-2238
>>> Phone (mobile): +1-617-599-4755
>>> Web: http://www.akselos.com
>>>
>>>
>>> On Mon, Feb 23, 2015 at 10:08 AM, Dmitry Karpeyev <karpeev at mcs.anl.gov>
>>> wrote:
>>>
>>>> David,
>>>>
>>>> What code are you running when you encounter this error?  I'm trying to
>>>> reproduce it and
>>>> I tried examples/systems_of_equations/ex8, but it ran for me,
>>>> ostensibly to completion.
>>>>
>>>> I have a small PETSc pull request that implements MatReset(), which
>>>> passes a small PETSc test,
>>>> but libMesh needs some work to be able to build against petsc/master
>>>> because of some recent
>>>> changes to PETSc.
>>>>
>>>> Dmitry.
>>>>
>>>> On Mon Feb 23 2015 at 7:17:06 AM David Knezevic <
>>>> david.knezevic at akselos.com> wrote:
>>>>
>>>>> Hi Barry, hi Dmitry,
>>>>>
>>>>> I set the matrix to BAIJ and back to AIJ, and the code got a bit
>>>>> further. But I now run into the error pasted below (Note that I'm now using
>>>>> "--with-debugging=1"):
>>>>>
>>>>> PETSC ERROR: --------------------- Error Message
>>>>> --------------------------------------------------------------
>>>>> PETSC ERROR: Petsc has generated inconsistent data
>>>>> PETSC ERROR: MPIAIJ Matrix was assembled but is missing garray
>>>>> PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>>>> for trouble shooting.
>>>>> PETSC ERROR: Petsc Release Version 3.5.2, Sep, 08, 2014
>>>>> PETSC ERROR: ./example-dbg on a arch-linux2-c-debug named david-Lenovo
>>>>> by dknez Mon Feb 23 08:05:44 2015
>>>>> PETSC ERROR: Configure options --with-shared-libraries=1
>>>>> --with-debugging=1 --download-suitesparse=1 --download-parmetis=1
>>>>> --download-blacs=1 --download-scalapack=1 --download-mumps=1
>>>>> --download-metis --download-superlu_dist --prefix=/home/dknez/software/libmesh_install/dbg_real/petsc
>>>>> --download-hypre
>>>>> PETSC ERROR: #1 MatCreateColmap_MPIAIJ_Private() line 361 in
>>>>> /home/dknez/software/petsc-3.5.2/src/mat/impls/aij/mpi/mpiaij.c
>>>>> PETSC ERROR: #2 MatSetValues_MPIAIJ() line 538 in
>>>>> /home/dknez/software/petsc-3.5.2/src/mat/impls/aij/mpi/mpiaij.c
>>>>> PETSC ERROR: #3 MatSetValues() line 1136 in
>>>>> /home/dknez/software/petsc-3.5.2/src/mat/interface/matrix.c
>>>>> PETSC ERROR: #4 add_matrix() line 765 in /home/dknez/software/libmesh-
>>>>> src/src/numerics/petsc_matrix.C
>>>>> ------------------------------------------------------------
>>>>> --------------
>>>>>
>>>>> This occurs when I try to set some entries of the matrix. Do you have
>>>>> any suggestions on how I can resolve this?
>>>>>
>>>>> Thanks!
>>>>> David
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Feb 22, 2015 at 10:22 PM, Dmitry Karpeyev <dkarpeev at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun Feb 22 2015 at 9:15:22 PM Barry Smith <bsmith at mcs.anl.gov>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> > On Feb 22, 2015, at 9:09 PM, David Knezevic <
>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>> >
>>>>>>> > Hi Dmitry,
>>>>>>> >
>>>>>>> > Thanks for the suggestion. I tried MatSetType(mat,MATMPIAIJ)
>>>>>>> followed by MatXAIJSetPreallocation(...), but unfortunately this still
>>>>>>> gives me the same error as before: "nnz cannot be greater than row length:
>>>>>>> local row 168 value 24 rowlength 0".
>>>>>>> >
>>>>>>> > I gather that the idea here is that MatSetType builds a new matrix
>>>>>>> object, and then I should be able to pre-allocate for that new matrix
>>>>>>> however I like, right? Was I supposed to clear the matrix object somehow
>>>>>>> before calling MatSetType? (I didn't do any sort of clear operation.)
>>>>>>>
>>>>>>>   If the type doesn't change then MatSetType() won't do anything.
>>>>>>> You can try setting the type to BAIJ and then setting the type back to AIJ.
>>>>>>> This may/should clear out the matrix.
>>>>>>>
>>>>>> Ah, yes.  If the type is the same as before it does quit early, but
>>>>>> changing the type and then back will clear out and rebuild the matrix.   We
>>>>>> need
>>>>>> something like MatReset() to do the equivalent thing.
>>>>>>
>>>>>>>
>>>>>>> >
>>>>>>> > As I said earlier, I'll make a dbg PETSc build, so hopefully that
>>>>>>> will help shed some light on what's going wrong for me.
>>>>>>>
>>>>>> I think it's always a good idea to have a dbg build of PETSc when you
>>>>>> doing things like these.
>>>>>>
>>>>>> Dmitry.
>>>>>>
>>>>>>>
>>>>>>>    Don't bother, what I suggested won't work.
>>>>>>>
>>>>>>>   Barry
>>>>>>>
>>>>>>>
>>>>>>> >
>>>>>>> > Thanks,
>>>>>>> > David
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> >
>>>>>>> > On Sun, Feb 22, 2015 at 6:02 PM, Dmitry Karpeyev <
>>>>>>> dkarpeev at gmail.com> wrote:
>>>>>>> > David,
>>>>>>> > It might be easier to just rebuild the whole matrix from scratch:
>>>>>>> you would in effect be doing all that with disassembling and resetting the
>>>>>>> preallocation.
>>>>>>> > MatSetType(mat,MATMPIAIJ)
>>>>>>> > or
>>>>>>> > PetscObjectGetType((PetscObject)mat,&type);
>>>>>>> > MatSetType(mat,type);
>>>>>>> > followed by
>>>>>>> > MatXAIJSetPreallocation(...);
>>>>>>> > should do.
>>>>>>> > Dmitry.
>>>>>>> >
>>>>>>> >
>>>>>>> > On Sun Feb 22 2015 at 4:45:46 PM Barry Smith <bsmith at mcs.anl.gov>
>>>>>>> wrote:
>>>>>>> >
>>>>>>> >  Do not call for SeqAIJ matrix. Do not call before the first time
>>>>>>> you have preallocated and put entries in the matrix and done the
>>>>>>> MatAssemblyBegin/End()
>>>>>>> >
>>>>>>> >   If it still crashes you'll need to try the debugger
>>>>>>> >
>>>>>>> >   Barry
>>>>>>> >
>>>>>>> > > On Feb 22, 2015, at 4:09 PM, David Knezevic <
>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>> > >
>>>>>>> > > Hi Barry,
>>>>>>> > >
>>>>>>> > > Thanks for your help, much appreciated.
>>>>>>> > >
>>>>>>> > > I added a prototype for MatDisAssemble_MPIAIJ:
>>>>>>> > > PETSC_INTERN PetscErrorCode MatDisAssemble_MPIAIJ(Mat);
>>>>>>> > >
>>>>>>> > > and I added a call to MatDisAssemble_MPIAIJ before
>>>>>>> MatMPIAIJSetPreallocation. However, I get a segfault on the call to
>>>>>>> MatDisAssemble_MPIAIJ. The segfault occurs in both serial and parallel.
>>>>>>> > >
>>>>>>> > > FYI, I'm using Petsc 3.5.2, and I'm not using a non-debug build
>>>>>>> (though I could rebuild PETSc in debug mode if you think that would help
>>>>>>> figure out what's happening here).
>>>>>>> > >
>>>>>>> > > Thanks,
>>>>>>> > > David
>>>>>>> > >
>>>>>>> > >
>>>>>>> > >
>>>>>>> > > On Sun, Feb 22, 2015 at 1:13 PM, Barry Smith <bsmith at mcs.anl.gov>
>>>>>>> wrote:
>>>>>>> > >
>>>>>>> > >   David,
>>>>>>> > >
>>>>>>> > >    This is an obscure little feature of MatMPIAIJ,   each time
>>>>>>> you change the sparsity pattern before you call the
>>>>>>> MatMPIAIJSetPreallocation you need to call  MatDisAssemble_MPIAIJ(Mat
>>>>>>> mat).    This is a private PETSc function so you need to provide your own
>>>>>>> prototype for it above the function you use it in.
>>>>>>> > >
>>>>>>> > >   Let us know if this resolves the problem.
>>>>>>> > >
>>>>>>> > >    Barry
>>>>>>> > >
>>>>>>> > > We never really intended that people would call
>>>>>>> MatMPIAIJSetPreallocation() AFTER they had already used the matrix.
>>>>>>> > >
>>>>>>> > >
>>>>>>> > > > On Feb 22, 2015, at 6:50 AM, David Knezevic <
>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>> > > >
>>>>>>> > > > Hi all,
>>>>>>> > > >
>>>>>>> > > > I've implemented a solver for a contact problem using SNES.
>>>>>>> The sparsity pattern of the jacobian matrix needs to change at each
>>>>>>> nonlinear iteration (because the elements which are in contact can change),
>>>>>>> so I tried to deal with this by calling MatSeqAIJSetPreallocation and
>>>>>>> MatMPIAIJSetPreallocation during each iteration in order to update the
>>>>>>> preallocation.
>>>>>>> > > >
>>>>>>> > > > This seems to work fine in serial, but with two or more MPI
>>>>>>> processes I run into the error "nnz cannot be greater than row length",
>>>>>>> e.g.:
>>>>>>> > > > nnz cannot be greater than row length: local row 528 value 12
>>>>>>> rowlength 0
>>>>>>> > > >
>>>>>>> > > > This error is from the call to
>>>>>>> > > > MatSeqAIJSetPreallocation(b->B,o_nz,o_nnz); in
>>>>>>> MatMPIAIJSetPreallocation_MPIAIJ.
>>>>>>> > > >
>>>>>>> > > > Any guidance on what the problem might be would be most
>>>>>>> appreciated. For example, I was wondering if there is a problem with
>>>>>>> calling SetPreallocation on a matrix that has already been preallocated?
>>>>>>> > > >
>>>>>>> > > > Some notes:
>>>>>>> > > > - I'm using PETSc via libMesh
>>>>>>> > > > - The code that triggers this issue is available as a PR on
>>>>>>> the libMesh github repo, in case anyone is interested:
>>>>>>> https://github.com/libMesh/libmesh/pull/460/
>>>>>>> > > > - I can try to make a minimal pure-PETSc example that
>>>>>>> reproduces this error, if that would be helpful.
>>>>>>> > > >
>>>>>>> > > > Many thanks,
>>>>>>> > > > David
>>>>>>> > > >
>>>>>>> > >
>>>>>>> > >
>>>>>>> >
>>>>>>> >
>>>>>>>
>>>>>>>
>>>>>
>>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20150223/f18e35bb/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mat-reset-hack.patch
Type: application/octet-stream
Size: 947 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20150223/f18e35bb/attachment-0001.obj>


More information about the petsc-users mailing list