[petsc-users] A number of questions about DMDA with SNES and Quasi-Newton methods

zakaryah . zakaryah at gmail.com
Thu Nov 2 21:44:57 CDT 2017


OK, thanks Barry!  I've attached the source for a minimal example that has
the same structure as my problem.
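
In case it helps to see the structure without opening the attachment, the
setup is roughly the following (a sketch only - the grid sizes, dof counts,
and variable names are placeholders rather than the exact contents of main.c,
and error checking is omitted):

#include <petscsnes.h>
#include <petscdmcomposite.h>
#include <petscdmredundant.h>
#include <petscdmda.h>

extern PetscErrorCode FormCoupleLocations(DM,Mat,PetscInt*,PetscInt*,
                                          PetscInt,PetscInt,PetscInt,PetscInt);

  DM   da,red,pack;
  SNES snes;
  Mat  J;

  /* displacement field on a small 3D grid, 3 dof per node, stencil width 1 */
  DMDACreate3d(PETSC_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,
               DM_BOUNDARY_NONE,DMDA_STENCIL_BOX,5,5,5,PETSC_DECIDE,
               PETSC_DECIDE,PETSC_DECIDE,3,1,NULL,NULL,NULL,&da);
  DMSetUp(da);
  /* one redundant scalar, owned by rank 0 */
  DMRedundantCreate(PETSC_COMM_WORLD,0,1,&red);
  DMCompositeCreate(PETSC_COMM_WORLD,&pack);
  DMCompositeAddDM(pack,red);
  DMCompositeAddDM(pack,da);
  /* register the coupling/preallocation callback discussed below */
  DMCompositeSetCoupling(pack,FormCoupleLocations);
  SNESCreate(PETSC_COMM_WORLD,&snes);
  SNESSetDM(snes,pack);
  DMCreateMatrix(pack,&J);   /* FormCoupleLocations is called from here */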

On Thu, Nov 2, 2017 at 5:36 PM, Smith, Barry F. <bsmith at mcs.anl.gov> wrote:

>
>   Please send your "simple" code that attempts to do the appropriate
> preallocation and we'll look at why it is failing.
>
>   Barry
>
>
> > On Nov 2, 2017, at 4:03 PM, zakaryah . <zakaryah at gmail.com> wrote:
> >
> > I ran MatView with the smallest possible grid size.  The elements which
> > are nonzero after the first assembly, but not when the matrix is first
> > created, are exactly the couplings, i.e. row and column 0 off the
> > diagonal - the ones I'm trying to preallocate and eventually place in the
> > second call to FormCoupleLocations.  Right now that function is only
> > being called once - I'm not sure why prealloc_only is set - maybe because
> > I have MAT_NEW_NONZERO_ALLOCATION_ERR set to false?  I'm also not sure
> > why the Jbh terms result in so few additional mallocs - are columns
> > allocated in chunks on the fly?  For example, when my DMDA size is 375,
> > adding values to Jhb results in 375 additional mallocs, but adding values
> > to Jbh adds only 25.
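> >
> > (If it helps with debugging, I suppose I could turn that option back on
> > so that any entry which was not preallocated triggers an error
> > immediately instead of a malloc - something like this, on whatever the
> > composite Jacobian is called:
> >
> > MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);
> >
> > That way the first unallocated location would show up directly in an
> > error message rather than having to be inferred from the malloc counts.)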
> >
> > I guess my FormCoupleLocations is pretty incomplete, because
> > incrementing dnz and onz doesn't seem to have any effect.  I also don't
> > know how this function should behave - as I said before, writing to
> > dnz[i] and onz[i] with i running from rstart to rstart+nrows-1 causes a
> > segfault on processes with rank > 0, and if I write those values of
> > dnz[i] and onz[i] to a file to inspect them, they are nonsense on those
> > processes.  Are dnz and onz local to the process, i.e. should my
> > iterations go from 0 to nrows-1?  What is the general idea behind setting
> > the structure from within FormCoupleLocations?  I'm currently not doing
> > that at all.
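> >
> > To make the question concrete, this is roughly what I'm trying in the
> > callback - a sketch only, since whether dnz/onz take local indices
> > (0..nrows-1) or global indices, whether [start,end) is the diagonal-block
> > column range, and whether the matrix is only passed on the second call
> > are exactly the things I'm unsure about (Nglobal is a placeholder for the
> > global matrix size):
> >
> > PetscErrorCode FormCoupleLocations(DM dm,Mat A,PetscInt *dnz,PetscInt *onz,
> >                                    PetscInt rstart,PetscInt nrows,
> >                                    PetscInt start,PetscInt end)
> > {
> >   PetscInt i,Nglobal = 0;  /* placeholder: global size, known from the DMs */
> >   PetscFunctionBeginUser;
> >   if (!A) {                            /* counting (preallocation) pass    */
> >     for (i = 0; i < nrows; i++) {      /* local row index, I hope          */
> >       if (rstart + i == 0) {           /* global row 0 = redundant dof     */
> >         dnz[i] = end - start;          /* couples to every column          */
> >         onz[i] = Nglobal - (end - start);
> >       } else {                         /* DMDA row: one extra column, 0    */
> >         if (start == 0) dnz[i]++;
> >         else            onz[i]++;
> >       }
> >     }
> >   } else {
> >     /* second pass: insert (e.g. zero) values at the coupling locations
> >        with MatSetValues() so the nonzero structure actually gets set */
> >   }
> >   PetscFunctionReturn(0);
> > }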
> >
> > Thanks for all the help!
> >
> > On Thu, Nov 2, 2017 at 12:31 PM, Smith, Barry F. <bsmith at mcs.anl.gov>
> wrote:
> >
> >
> > > On Nov 1, 2017, at 10:32 PM, zakaryah . <zakaryah at gmail.com> wrote:
> > >
> > > I worked from the assumptions in my previous email and have at least
> > > partially implemented the function that assigns the couplings.  For
> > > row 0, which is the redundant field, I set dnz[0] to end-start and
> > > onz[0] to the size of the matrix minus dnz[0].  For all other rows, I
> > > just increment the existing values of dnz[i] and onz[i], since the
> > > coupling to the redundant field adds one extra element beyond what's
> > > allocated for the DMDA stencil.
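> > >
> > > For the single-process 9010 x 9010 case in the -info output below, that
> > > works out to
> > >
> > >   dnz[0] = end - start = 9010,   onz[0] = 9010 - 9010 = 0
> > >   dnz[i] += 1 for each DMDA row i,   onz[i] unchanged
> > >
> > > (on more than one process the same extra entries would be split between
> > > dnz and onz according to the column ownership).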
> >
> >   Sounds reasonable.
> > >
> > > I see in the source that the FormCoupleLocations function is called
> once if the DM has PREALLOC_ONLY set to true, but twice otherwise.  I
> assume that the second call is for setting the nonzero structure.
> >
> >   Yes
> > > Do I need to do this?
> >
> >    You probably should.
> >
> >    I would do a MatView() on the matrix you obtain (use a small DMDA),
> > and then again after you add the numerical values.  This will show you
> > which values are not being properly allocated/put in when the matrix is
> > created by the DM.
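> >
> >    Something like this, with whatever your composite DM and Jacobian are
> > called:
> >
> > DMCreateMatrix(pack,&J);
> > MatView(J,PETSC_VIEWER_STDOUT_WORLD); /* structure as allocated by the DM */
> > /* ... fill in the numerical values, e.g. one Jacobian evaluation ...    */
> > MatView(J,PETSC_VIEWER_STDOUT_WORLD); /* structure after assembly        */
> >
> > Any location that appears only in the second view was not preallocated.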
> > >
> > > In any case, something is still not right.  Even with the extra
> elements preallocated, the first assembly of the matrix is very slow.  I
> ran a test problem on a single process with -info, and got this:
> > >
> > > 0] MatAssemblyEnd_SeqAIJ(): Matrix size: 1 X 1; storage space: 0
> unneeded,1 used
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 0
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> > >
> > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 1) < 0.6. Do not use CompressedRow routines.
> > >
> > > [0] MatSeqAIJCheckInode(): Found 1 nodes out of 1 rows. Not using
> Inode routines
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 9009 X 9009; storage space:
> 0 unneeded,629703 used
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 0
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
> > >
> > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 9009) < 0.6. Do not use CompressedRow routines.
> > >
> > > [0] MatSeqAIJCheckInode(): Found 3003 nodes of 9009. Limit used: 5.
> Using Inode routines
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 1 X 1; storage space: 0
> unneeded,1 used
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 0
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 1
> > >
> > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 1) < 0.6. Do not use CompressedRow routines.
> > >
> > > [0] MatSeqAIJCheckInode(): Found 1 nodes out of 1 rows. Not using
> Inode routines
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 9009 X 9009; storage space:
> 0 unneeded,629703 used
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 0
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
> > >
> > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 9009) < 0.6. Do not use CompressedRow routines.
> > >
> > > [0] MatSeqAIJCheckInode(): Found 3003 nodes of 9009. Limit used: 5.
> Using Inode routines
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 9010 X 9010; storage space:
> 18018 unneeded,629704 used
> >
> >   Yes, really bad.
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 0
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
> > >
> > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 9010) < 0.6. Do not use CompressedRow routines.
> > >
> > > [0] MatSeqAIJCheckInode(): Found 3004 nodes of 9010. Limit used: 5.
> Using Inode routines
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446774208 30818112
> > >
> > > [0] DMGetDMSNES(): Creating new DMSNES
> > >
> > > [0] DMGetDMKSP(): Creating new DMKSP
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > >   0 SNES Function norm 2.302381528359e+00
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] PetscCommDuplicate(): Using internal PETSc communicator
> 139995446773056 30194064
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 9010 X 9010; storage space:
> 126132 unneeded,647722 used
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 9610
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 9010
> > >
> > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 9010) < 0.6. Do not use CompressedRow routines.
> > >
> > > [0] MatSeqAIJCheckInode(): Found 3004 nodes of 9010. Limit used: 5.
> Using Inode routines
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 9010 X 9010; storage space:
> 0 unneeded,647722 used
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues()
> is 0
> > >
> > > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 9010
> > >
> > > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 9010) < 0.6. Do not use CompressedRow routines.
> > >
> > >
> > >
> > > The 9610 mallocs during MatSetValues() seem suspicious and are probably
> > > what's taking so long with larger problems.  601 of them are apparently
> > > in the call to set Jbh, and 9009 are in the call to set Jhb (b is the
> > > redundant field, h is the DMDA field).  If I run with more than one
> > > process, I get a segfault when a process with rank greater than 0 sets
> > > dnz or onz in the FormCoupleLocations call.
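> > >
> > > For reference, the coupling insertions are basically of this form (a
> > > sketch - hrow stands for the global index of one DMDA dof, and the
> > > actual values and insert mode come from the rest of the assembly):
> > >
> > > PetscInt    brow = 0, hrow = 1;  /* row/col 0 is the redundant dof b;
> > >                                     hrow is a placeholder DMDA index */
> > > PetscScalar v = 0.0;             /* value computed elsewhere         */
> > > MatSetValues(J,1,&brow,1,&hrow,&v,ADD_VALUES);   /* a Jbh entry */
> > > MatSetValues(J,1,&hrow,1,&brow,&v,ADD_VALUES);   /* a Jhb entry */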
> > >
> > >
> > > On Tue, Oct 31, 2017 at 10:40 PM, zakaryah . <zakaryah at gmail.com>
> wrote:
> > > Thanks Barry, that looks like exactly what I need.  I'm looking at
> pack.c and packm.c and I want to check my understanding of what my coupling
> function should do.  The relevant line in DMCreateMatrix_Composite_AIJ
> seems to be:
> > >
> > > (*com->FormCoupleLocations)(dm,NULL,dnz,onz,__rstart,__nrows,__start,__end);
> > >
> > > and I infer that dnz and onz are the number of nonzero elements in the
> diagonal and off-diagonal submatrices, for each row of the DMComposite
> matrix.  I suppose I can just set each of these in a for loop, but can I
> use the arguments to FormCoupleLocations as the range for the loop?  Which
> ones - __rstart to __rstart+__nrows?  How can I determine the number of
> rows on each processor from within the function that I pass?  From the
> preallocation macros it looks like __start to __end describe the range of
> the columns of the diagonal submatrix - is that right?  It looks like the
> ranges will be specific to each processor.  Do I just set the values in dnz
> and onz, or do I need to reduce them?
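> > >
> > > My current reading of the preallocation macros is that the counting
> > > inside the loop would look like this, but I'd like to confirm it (col is
> > > the global column index of an entry I intend to insert in local row i):
> > >
> > > if (col >= __start && col < __end) dnz[i]++;  /* diagonal block     */
> > > else                               onz[i]++;  /* off-diagonal block */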
> > >
> > > Thanks for all the help!  Maybe if I get things working I can carve
> out the core of the code to make an example program for
> DMRedundant/Composite.
> > >
> >
> >
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: main.c
Type: text/x-csrc
Size: 29021 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20171102/b1adc8c4/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: main.h
Type: text/x-chdr
Size: 716 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20171102/b1adc8c4/attachment-0003.bin>

