[petsc-users] MatMPIAIJSetPreallocation: "nnz cannot be greater than row length"

Dmitry Karpeyev karpeev at mcs.anl.gov
Wed Feb 25 08:56:12 CST 2015


The previous patch didn't clean out the Mat data structure thoroughly
enough. Please try the attached patch. It runs for me, but KSP doesn't
seem to converge, although that's with the default GMRES+BJACOBI/ILU(0)
options, so I don't think resetting the Mat is the cause.

Let me know if this works for you. MatReset() will be in the next release
as well as, hopefully, an overhauled VI solver that should be able to
handle these contact problems.

Dmitry.

On Mon Feb 23 2015 at 10:17:19 AM David Knezevic <david.knezevic at akselos.com>
wrote:

> OK, sounds good. Let me know if I can help with the digging.
>
> David
>
>
>
> On Mon, Feb 23, 2015 at 11:10 AM, Dmitry Karpeyev <karpeev at mcs.anl.gov>
> wrote:
>
>> I just tried building against petsc/master, but there needs to be more
>> work on libMesh before it can work with petsc/master:
>> the new VecLockPush()/Pop() stuff isn't respected by vector manipulation
>> in libMesh.
>> I put a hack equivalent to MatReset() into your branch (patch attached,
>> in case you want to take a look at it),
>> but it generates the same error in MatCreateColmap that you reported
>> earlier.  It's odd that it occurs
>> on the second nonlinear iteration. I'll have to dig a bit deeper to
>> see what's going on.
>>
>> Dmitry.
>>
>>
>> On Mon Feb 23 2015 at 10:03:33 AM David Knezevic <
>> david.knezevic at akselos.com> wrote:
>>
>>> Hi Dmitry,
>>>
>>> OK, good to hear we're seeing the same behavior for the example.
>>>
>>> Regarding this comment:
>>>
>>>
>>> libMesh needs to change the way configure extracts PETSc information --
>>>> configuration data were moved:
>>>> conf --> lib/petsc-conf
>>>> ${PETSC_ARCH}/conf --> ${PETSC_ARCH}/lib/petsc-conf
>>>>
>>>> At one point I started looking at m4/petsc.m4, but that got put on the
>>>> back burner.  For now making the relevant symlinks by hand lets you
>>>> configure and build libMesh with petsc/master.
>>>>
>>>
>>>
>>> So do you suggest that the next step here is to build libmesh against
>>> petsc/master so that we can try your PETSc pull request that implements
>>> MatReset() to see if that gets this example working?
>>>
>>> David
>>>
>>>
>>>
>>>
>>>
>>>> On Mon Feb 23 2015 at 9:15:44 AM David Knezevic <
>>>> david.knezevic at akselos.com> wrote:
>>>>
>>>>> Hi Dmitry,
>>>>>
>>>>> Thanks very much for testing out the example.
>>>>>
>>>>> examples/systems_of_equations/ex8 works fine for me in serial, but it
>>>>> fails for me if I run with more than 1 MPI process. Can you try it with,
>>>>> say, 2 or 4 MPI processes?
>>>>>
>>>>> If we need access to MatReset in libMesh to get this to work, I'll be
>>>>> happy to work on a libMesh pull request for that.
>>>>>
>>>>> David
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> David J. Knezevic | CTO
>>>>> Akselos | 17 Bay State Road | Boston, MA | 02215
>>>>> Phone (office): +1-857-265-2238
>>>>> Phone (mobile): +1-617-599-4755
>>>>> Web: http://www.akselos.com
>>>>>
>>>>>
>>>>> On Mon, Feb 23, 2015 at 10:08 AM, Dmitry Karpeyev <karpeev at mcs.anl.gov
>>>>> > wrote:
>>>>>
>>>>>> David,
>>>>>>
>>>>>> What code are you running when you encounter this error?  I'm trying
>>>>>> to reproduce it and
>>>>>> I tried examples/systems_of_equations/ex8, but it ran for me,
>>>>>> ostensibly to completion.
>>>>>>
>>>>>> I have a small PETSc pull request that implements MatReset(), which
>>>>>> passes a small PETSc test,
>>>>>> but libMesh needs some work to be able to build against petsc/master
>>>>>> because of some recent
>>>>>> changes to PETSc.
>>>>>>
>>>>>> Dmitry.
>>>>>>
>>>>>> On Mon Feb 23 2015 at 7:17:06 AM David Knezevic <
>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>
>>>>>>> Hi Barry, hi Dmitry,
>>>>>>>
>>>>>>> I set the matrix to BAIJ and back to AIJ, and the code got a bit
>>>>>>> further. But I now run into the error pasted below (Note that I'm now using
>>>>>>> "--with-debugging=1"):
>>>>>>>
>>>>>>> PETSC ERROR: --------------------- Error Message
>>>>>>> --------------------------------------------------------------
>>>>>>> PETSC ERROR: Petsc has generated inconsistent data
>>>>>>> PETSC ERROR: MPIAIJ Matrix was assembled but is missing garray
>>>>>>> PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>>>>>>> for trouble shooting.
>>>>>>> PETSC ERROR: Petsc Release Version 3.5.2, Sep, 08, 2014
>>>>>>> PETSC ERROR: ./example-dbg on a arch-linux2-c-debug named
>>>>>>> david-Lenovo by dknez Mon Feb 23 08:05:44 2015
>>>>>>> PETSC ERROR: Configure options --with-shared-libraries=1
>>>>>>> --with-debugging=1 --download-suitesparse=1 --download-parmetis=1
>>>>>>> --download-blacs=1 --download-scalapack=1 --download-mumps=1
>>>>>>> --download-metis --download-superlu_dist --prefix=/home/dknez/software/libmesh_install/dbg_real/petsc
>>>>>>> --download-hypre
>>>>>>> PETSC ERROR: #1 MatCreateColmap_MPIAIJ_Private() line 361 in
>>>>>>> /home/dknez/software/petsc-3.5.2/src/mat/impls/aij/mpi/mpiaij.c
>>>>>>> PETSC ERROR: #2 MatSetValues_MPIAIJ() line 538 in
>>>>>>> /home/dknez/software/petsc-3.5.2/src/mat/impls/aij/mpi/mpiaij.c
>>>>>>> PETSC ERROR: #3 MatSetValues() line 1136 in
>>>>>>> /home/dknez/software/petsc-3.5.2/src/mat/interface/matrix.c
>>>>>>> PETSC ERROR: #4 add_matrix() line 765 in
>>>>>>> /home/dknez/software/libmesh-src/src/numerics/petsc_matrix.C
>>>>>>> ------------------------------------------------------------
>>>>>>> --------------
>>>>>>>
>>>>>>> This occurs when I try to set some entries of the matrix. Do you
>>>>>>> have any suggestions on how I can resolve this?
>>>>>>>
>>>>>>> Thanks!
>>>>>>> David
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 22, 2015 at 10:22 PM, Dmitry Karpeyev <
>>>>>>> dkarpeev at gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun Feb 22 2015 at 9:15:22 PM Barry Smith <bsmith at mcs.anl.gov>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> > On Feb 22, 2015, at 9:09 PM, David Knezevic <
>>>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>>>> >
>>>>>>>>> > Hi Dmitry,
>>>>>>>>> >
>>>>>>>>> > Thanks for the suggestion. I tried MatSetType(mat,MATMPIAIJ)
>>>>>>>>> followed by MatXAIJSetPreallocation(...), but unfortunately this still
>>>>>>>>> gives me the same error as before: "nnz cannot be greater than row length:
>>>>>>>>> local row 168 value 24 rowlength 0".
>>>>>>>>> >
>>>>>>>>> > I gather that the idea here is that MatSetType builds a new
>>>>>>>>> matrix object, and then I should be able to pre-allocate for that new
>>>>>>>>> matrix however I like, right? Was I supposed to clear the matrix object
>>>>>>>>> somehow before calling MatSetType? (I didn't do any sort of clear
>>>>>>>>> operation.)
>>>>>>>>>
>>>>>>>>>   If the type doesn't change then MatSetType() won't do anything.
>>>>>>>>> You can try setting the type to BAIJ and then setting the type back to AIJ.
>>>>>>>>> This may/should clear out the matrix.
>>>>>>>>>
>>>>>>>> Ah, yes. If the type is the same as before it does quit early, but
>>>>>>>> changing the type and then back will clear out and rebuild the
>>>>>>>> matrix. We need something like MatReset() to do the equivalent thing.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> >
>>>>>>>>> > As I said earlier, I'll make a dbg PETSc build, so hopefully
>>>>>>>>> that will help shed some light on what's going wrong for me.
>>>>>>>>>
>>>>>>>> I think it's always a good idea to have a dbg build of PETSc when
>>>>>>>> you're doing things like this.
>>>>>>>>
>>>>>>>> Dmitry.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>    Don't bother, what I suggested won't work.
>>>>>>>>>
>>>>>>>>>   Barry
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> >
>>>>>>>>> > Thanks,
>>>>>>>>> > David
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > On Sun, Feb 22, 2015 at 6:02 PM, Dmitry Karpeyev <
>>>>>>>>> dkarpeev at gmail.com> wrote:
>>>>>>>>> > David,
>>>>>>>>> > It might be easier to just rebuild the whole matrix from
>>>>>>>>> scratch: you would in effect be doing that anyway by disassembling
>>>>>>>>> and resetting the preallocation.
>>>>>>>>> > MatSetType(mat,MATMPIAIJ)
>>>>>>>>> > or
>>>>>>>>> > PetscObjectGetType((PetscObject)mat,&type);
>>>>>>>>> > MatSetType(mat,type);
>>>>>>>>> > followed by
>>>>>>>>> > MatXAIJSetPreallocation(...);
>>>>>>>>> > should do.
>>>>>>>>> > Dmitry.
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>> > On Sun Feb 22 2015 at 4:45:46 PM Barry Smith <bsmith at mcs.anl.gov>
>>>>>>>>> wrote:
>>>>>>>>> >
>>>>>>>>> >  Do not call it for a SeqAIJ matrix. Do not call it before the
>>>>>>>>> first time you have preallocated and put entries in the matrix and
>>>>>>>>> done the MatAssemblyBegin/End().
>>>>>>>>> >
>>>>>>>>> >   If it still crashes you'll need to try the debugger
>>>>>>>>> >
>>>>>>>>> >   Barry
>>>>>>>>> >
>>>>>>>>> > > On Feb 22, 2015, at 4:09 PM, David Knezevic <
>>>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>>>> > >
>>>>>>>>> > > Hi Barry,
>>>>>>>>> > >
>>>>>>>>> > > Thanks for your help, much appreciated.
>>>>>>>>> > >
>>>>>>>>> > > I added a prototype for MatDisAssemble_MPIAIJ:
>>>>>>>>> > > PETSC_INTERN PetscErrorCode MatDisAssemble_MPIAIJ(Mat);
>>>>>>>>> > >
>>>>>>>>> > > and I added a call to MatDisAssemble_MPIAIJ before
>>>>>>>>> MatMPIAIJSetPreallocation. However, I get a segfault on the call to
>>>>>>>>> MatDisAssemble_MPIAIJ. The segfault occurs in both serial and parallel.
>>>>>>>>> > >
>>>>>>>>> > > FYI, I'm using PETSc 3.5.2 with a non-debug build (though I
>>>>>>>>> could rebuild PETSc in debug mode if you think that would help
>>>>>>>>> figure out what's happening here).
>>>>>>>>> > >
>>>>>>>>> > > Thanks,
>>>>>>>>> > > David
>>>>>>>>> > >
>>>>>>>>> > >
>>>>>>>>> > >
>>>>>>>>> > > On Sun, Feb 22, 2015 at 1:13 PM, Barry Smith <
>>>>>>>>> bsmith at mcs.anl.gov> wrote:
>>>>>>>>> > >
>>>>>>>>> > >   David,
>>>>>>>>> > >
>>>>>>>>> > >    This is an obscure little feature of MatMPIAIJ: each time
>>>>>>>>> you change the sparsity pattern, before you call
>>>>>>>>> MatMPIAIJSetPreallocation you need to call MatDisAssemble_MPIAIJ(Mat
>>>>>>>>> mat). This is a private PETSc function, so you need to provide your
>>>>>>>>> own prototype for it above the function in which you use it.
>>>>>>>>> > >
>>>>>>>>> > >   Let us know if this resolves the problem.
>>>>>>>>> > >
>>>>>>>>> > >    Barry
>>>>>>>>> > >
>>>>>>>>> > > We never really intended that people would call
>>>>>>>>> MatMPIAIJSetPreallocation() AFTER they had already used the matrix.
>>>>>>>>> > >
>>>>>>>>> > >
>>>>>>>>> > > > On Feb 22, 2015, at 6:50 AM, David Knezevic <
>>>>>>>>> david.knezevic at akselos.com> wrote:
>>>>>>>>> > > >
>>>>>>>>> > > > Hi all,
>>>>>>>>> > > >
>>>>>>>>> > > > I've implemented a solver for a contact problem using SNES.
>>>>>>>>> The sparsity pattern of the Jacobian matrix needs to change at each
>>>>>>>>> nonlinear iteration (because the elements which are in contact can change),
>>>>>>>>> so I tried to deal with this by calling MatSeqAIJSetPreallocation and
>>>>>>>>> MatMPIAIJSetPreallocation during each iteration in order to update the
>>>>>>>>> preallocation.
>>>>>>>>> > > >
>>>>>>>>> > > > This seems to work fine in serial, but with two or more MPI
>>>>>>>>> processes I run into the error "nnz cannot be greater than row length",
>>>>>>>>> e.g.:
>>>>>>>>> > > > nnz cannot be greater than row length: local row 528 value
>>>>>>>>> 12 rowlength 0
>>>>>>>>> > > >
>>>>>>>>> > > > This error is from the call to
>>>>>>>>> > > > MatSeqAIJSetPreallocation(b->B,o_nz,o_nnz); in
>>>>>>>>> MatMPIAIJSetPreallocation_MPIAIJ.
>>>>>>>>> > > >
>>>>>>>>> > > > Any guidance on what the problem might be would be most
>>>>>>>>> appreciated. For example, I was wondering if there is a problem with
>>>>>>>>> calling SetPreallocation on a matrix that has already been preallocated?
>>>>>>>>> > > >
>>>>>>>>> > > > Some notes:
>>>>>>>>> > > > - I'm using PETSc via libMesh
>>>>>>>>> > > > - The code that triggers this issue is available as a PR on
>>>>>>>>> the libMesh github repo, in case anyone is interested:
>>>>>>>>> https://github.com/libMesh/libmesh/pull/460/
>>>>>>>>> > > > - I can try to make a minimal pure-PETSc example that
>>>>>>>>> reproduces this error, if that would be helpful.
>>>>>>>>> > > >
>>>>>>>>> > > > Many thanks,
>>>>>>>>> > > > David
>>>>>>>>> > > >
>>>>>>>>> > >
>>>>>>>>> > >
>>>>>>>>> >
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mat-reset-hack.patch
Type: application/octet-stream
Size: 1141 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20150225/b7fcd5cc/attachment.obj>
