[petsc-users] Extremely slow DMNetwork Jacobian assembly

Matthew Knepley knepley at gmail.com
Wed May 8 21:58:49 CDT 2019


On Wed, May 8, 2019 at 10:53 PM Abhyankar, Shrirang G <
shrirang.abhyankar at pnnl.gov> wrote:

>
>
>
>
> *From: *Matthew Knepley <knepley at gmail.com>
> *Date: *Wednesday, May 8, 2019 at 9:29 PM
> *To: *"Abhyankar, Shrirang G" <shrirang.abhyankar at pnnl.gov>
> *Cc: *"Zhang, Hong" <hzhang at mcs.anl.gov>, Justin Chang <
> jychang48 at gmail.com>, petsc-users <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] Extremely slow DMNetwork Jacobian assembly
>
>
>
> On Wed, May 8, 2019 at 10:17 PM Abhyankar, Shrirang G via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
>
>
>
>
> *From: *petsc-users <petsc-users-bounces at mcs.anl.gov> on behalf of
> "Zhang, Hong via petsc-users" <petsc-users at mcs.anl.gov>
> *Reply-To: *"Zhang, Hong" <hzhang at mcs.anl.gov>
> *Date: *Wednesday, May 8, 2019 at 8:01 PM
> *To: *Justin Chang <jychang48 at gmail.com>
> *Cc: *petsc-users <petsc-users at mcs.anl.gov>
> *Subject: *Re: [petsc-users] Extremely slow DMNetwork Jacobian assembly
>
>
>
> Justin:
>
> Great, the issue is resolved.
>
> Why does MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE) not
> raise an error?
>
>
>
> I copy-pasted the above line from the power.c example. MatSetOption should
> use PETSC_TRUE to activate the MAT_NEW_NONZERO_ALLOCATION_ERR option.
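>
> A minimal sketch of the corrected call (J and networkdm are the names used
> in the snippet quoted further down in this thread):
>
>   /* With PETSC_TRUE, any MatSetValues() entry that falls outside the
>      preallocated nonzero pattern raises an error instead of silently
>      triggering a malloc. */
>   ierr = DMCreateMatrix(networkdm,&J);CHKERRQ(ierr);
>   ierr = MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);CHKERRQ(ierr);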
>
>
>
> Matt,
>
>
>
> We usually prevent this with a structured SetValues API. For example, DMDA
> uses MatSetValuesStencil() which cannot write
>
> outside the stencil you set. DMPlex uses MatSetValuesClosure(), which is
> guaranteed to be allocated. We should write one
>
> for DMNetwork. The allocation is just like Plex (I believe) where you
> allocate closure(star(p)), which would mean that setting
>
> values for a vertex gets the neighboring edges and their vertices, and
> setting values for an edge gets the covering vertices.
>
> Is that right for DMNetwork?
>
> Yes, DMNetwork behaves in this fashion.
>
> I cannot find MatSetValuesClosure() in petsc-master.
>
> Can you provide detailed instruction on how to implement
> MatSetValuesClosure() for DMNetwork?
>
> Note, dmnetwork is a subclass of DMPlex.
>
>
>
> DMNetwork does not do any matrix creation by itself. It calls Plex to
> create the matrix.
>
>
>
> Right. However, the only operation I put in was MatSetClosure() since that
> is appropriate for FEM. I think you would need
>
> a MatSetStar() for DMNetwork as well.
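>
> For reference, a minimal sketch of the Plex closure setter mentioned above
> (its actual name in PETSc is DMPlexMatSetClosure(); passing NULL for the two
> section arguments means "use the DM's default local/global sections"). A
> star-based DMNetwork analogue such as the MatSetStar() suggested here does
> not exist yet; presumably it would take the same shape but address the dofs
> of star(p) rather than closure(p). The names dm, J, point, and elemMat below
> are illustrative:
>
>   ierr = DMPlexMatSetClosure(dm,NULL,NULL,J,point,elemMat,ADD_VALUES);CHKERRQ(ierr);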
>
>
>
> I see that Hong has implemented some additional code in
> DMCreateMatrix_Network that does not use Plex for creating the matrix. I
> think it covers what you describe above.
>

No. That is not what I am saying. It has nothing to do with matrix
creation; it is about setting values in the matrix.

    Matt


> Hong: The DMNetwork matrix creation code is not used unless the user wants to
> set a special sparsity pattern for the blocks. Shouldn’t this code be used by
> default instead of having Plex create the matrix?
>
>
>
> Shri
>
>
>
>    Matt
>
>
>
> Hong
>
>
>
>
>
> On Wed, May 8, 2019 at 4:00 PM Dave May <dave.mayhem23 at gmail.com> wrote:
>
>
>
>
>
> On Wed, 8 May 2019 at 20:34, Justin Chang via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
>
> So here's the branch/repo to the working example I have:
>
>
>
> https://github.com/jychang48/petsc-dss/tree/single-bus-vertex
>
>
>
> Type 'make' to compile the dss; it should work with the latest petsc-dev.
>
>
>
> To test the performance, I've taken an existing IEEE 13-bus system and
> duplicated it N times to create a long, radial-like network. I have three
> sizes, N = 100, 500, and 1000. The test files are:
>
>
>
> input/test_100.m
>
> input/test_500.m
>
> input/test_1000.m
>
>
>
> I also created another set of examples where the IEEE 13-bus is fully
> balanced (but the program will crash at the solve step because I used some
> unrealistic parameters for the Y-bus matrices and probably have some zeros
> somewhere). They are listed as:
>
>
>
> input/test2_100.m
>
> input/test2_500.m
>
> input/test2_1000.m
>
>
>
> The dof counts and matrices for the test2_*.m files are slightly larger
> than those of the corresponding test_*.m files, but they have bs = 6.
>
>
>
> To run these tests, type the following:
>
>
>
> ./dpflow -input input/test_100.m
>
>
>
> I have a timer that shows how long it takes to compute the Jacobian.
> Attached are the log outputs I have for each of the six cases.
>
>
>
> Turns out that only the first call to SNESComputeJacobian() is slow;
> all the subsequent calls are fast, as I expect. This makes me think it still
> has something to do with matrix allocation.
>
>
>
> I think it is a preallocation issue.
>
> Looking at some of the output files (test_1000.out, test_100.out), under
> Mat Object I see this in the KSP view:
>
>
>
>       total number of mallocs used during MatSetValues calls =10000
>
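> A small sketch of how one might confirm this programmatically as well
> (MatGetInfo() is standard PETSc; J and ierr are assumed from the assembly
> code in this thread):
>
>   MatInfo info;
>   /* info.mallocs counts the allocations MatSetValues() had to perform
>      because the preallocated pattern was too small; it should be 0 after a
>      correctly preallocated assembly. */
>   ierr = MatGetInfo(J,MAT_LOCAL,&info);CHKERRQ(ierr);
>   ierr = PetscPrintf(PETSC_COMM_SELF,"mallocs during assembly: %g\n",(double)info.mallocs);CHKERRQ(ierr);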
>
>
> Thanks for the help everyone,
>
>
>
> Justin
>
>
>
> On Wed, May 8, 2019 at 12:36 PM Matthew Knepley <knepley at gmail.com> wrote:
>
> On Wed, May 8, 2019 at 2:30 PM Justin Chang <jychang48 at gmail.com> wrote:
>
> Hi everyone,
>
>
>
> Yes I have these lines in my code:
>
> ierr = DMCreateMatrix(networkdm,&J);CHKERRQ(ierr);
> ierr = MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr);
>
>
>
> Okay, it's not allocation. So maybe Hong is right that it's setting great
> big element matrices. We will see with the example.
>
>
>
>   Thanks,
>
>
>
>     Matt
>
>
>
> I tried -info and here's my output:
>
>
>
> [0] PetscInitialize(): PETSc successfully started: number of processors = 1
>
> [0] PetscInitialize(): Running on machine: jchang31606s.domain
>
> [0] PetscCommDuplicate(): Duplicating a communicator 4436504608
> 140550815662944 max tags = 2147483647
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608
> 140550815662944
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608
> 140550815662944
>
> Base power = 0.166667, numbus = 115000, numgen = 5000, numyl = 75000,
> numdl = 5000, numlbr = 109999, numtbr = 5000
>
>
>
> **** Power flow dist case ****
>
>
>
> Base power = 0.166667, nbus = 115000, ngen = 5000, nwye = 75000, ndelta =
> 5000, nbranch = 114999
>
> [0] PetscCommDuplicate(): Duplicating a communicator 4436505120
> 140550815683104 max tags = 2147483647
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 620000 X 620000; storage space:
> 0 unneeded,10799928 used
>
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 28
>
> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
> 0)/(num_localrows 620000) < 0.6. Do not use CompressedRow routines.
>
> [0] MatSeqAIJCheckInode(): Found 205000 nodes of 620000. Limit used: 5.
> Using Inode routines
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608
> 140550815662944
>
> [0] DMGetDMSNES(): Creating new DMSNES
>
> [0] DMGetDMKSP(): Creating new DMKSP
>
> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
> 140550815683104
>
>   0 SNES Function norm 1155.45
>
>
>
> Nothing else -info-related shows up as I iterate through the vertex
> loop.
>
>
>
> I'll have a MWE for you guys to play with shortly.
>
>
>
> Thanks,
>
> Justin
>
>
>
> On Wed, May 8, 2019 at 12:10 PM Smith, Barry F. <bsmith at mcs.anl.gov>
> wrote:
>
>
>   Justin,
>
>      Are you providing matrix entries that directly connect one vertex to
> another vertex ACROSS an edge? I don't think that is supported by the
> DMNetwork model. The assumption is that edges are only connected to
> vertices and vertices are only connected to neighboring edges.
>
>   Everyone,
>
>   I second Matt's reply.
>
>   How is the DMNetwork preallocating for the Jacobian? Does it take into
> account coupling between neighboring vertices/edges? Or does it assume no
> coupling, or full coupling? If it assumes no coupling and the user has a
> good amount of coupling, it will be very slow.
>
>   There would need to be a way for the user to provide the coupling
> information between neighboring vertices/edges if it assumes no coupling.
>
>     Barry
>
>
> > On May 8, 2019, at 7:44 AM, Matthew Knepley via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> >
> > On Wed, May 8, 2019 at 4:45 AM Justin Chang via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> > Hi guys,
> >
> > I have a fully working distribution system solver written using
> DMNetwork. The idea is that each electrical bus can have up to three phase
> nodes, and each phase node has two unknowns: voltage magnitude and angle.
> In a completely balanced system, each bus has three nodes, but in an
> unbalanced system some of the buses can be single-phase or two-phase.
> >
> > The working DMNetwork code I developed, loosely based on the SNES
> network/power.c example, essentially represents each vertex as a bus. The
> DMNetworkAddNumVariables() function adds either 2, 4, or 6 unknowns to
> each vertex. If every single bus has the same number of variables, the Mat
> block size is 2, 4, or 6, and my code is both fast and scalable. However, if
> the number of unknowns per DMNetwork vertex is not the same across the
> network, then my SNESFormJacobian function becomes extremely slow.
> Specifically, the slow part is the MatSetValues() calls whose row/column
> global indices contain an offset that points to a neighboring bus vertex.
> >
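> > A hedged sketch of the access pattern described above (routine names follow
> > my recollection of the power.c example and the DMNetwork API of that era;
> > treat them as assumptions and check the current docs). It shows why these
> > entries are the risky ones: the column offsets belong to a neighboring
> > vertex, so they must be covered by the preallocation.
> >
> >   #include <petscdmnetwork.h>
> >
> >   /* Hypothetical helper: insert a 2x2 coupling block from the "from" vertex
> >      rows into the "to" vertex columns of edge e. */
> >   static PetscErrorCode SetEdgeCouplingSketch(DM networkdm,Mat J,PetscInt e)
> >   {
> >     PetscErrorCode  ierr;
> >     const PetscInt *cone;
> >     PetscInt        offfrom,offto,row[2],col[2];
> >     PetscScalar     vals[4] = {0,0,0,0}; /* placeholder values */
> >
> >     PetscFunctionBegin;
> >     ierr = DMNetworkGetConnectedVertices(networkdm,e,&cone);CHKERRQ(ierr);
> >     ierr = DMNetworkGetVariableGlobalOffset(networkdm,cone[0],&offfrom);CHKERRQ(ierr);
> >     ierr = DMNetworkGetVariableGlobalOffset(networkdm,cone[1],&offto);CHKERRQ(ierr);
> >     row[0] = offfrom; row[1] = offfrom+1;  /* rows of the "from" bus      */
> >     col[0] = offto;   col[1] = offto+1;    /* cols of the neighboring bus */
> >     /* These cross-vertex entries trigger mallocs (and the slowdown) if the
> >        preallocated pattern does not include vertex-to-vertex coupling. */
> >     ierr = MatSetValues(J,2,row,2,col,vals,ADD_VALUES);CHKERRQ(ierr);
> >     PetscFunctionReturn(0);
> >   }
> >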
> > I have never seen MatSetValues() be slow unless it is allocating. Did
> you confirm that you are not allocating, with -info?
> >
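> > For example, with the executable from this thread, the assembly messages
> > that -info prints can be checked directly:
> >
> >   ./dpflow -input input/test_100.m -info | grep -i malloc
> >
> > A correctly preallocated run should report "Number of mallocs during
> > MatSetValues() is 0".
> >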
> >   Thanks,
> >
> >      Matt
> >
> > Why is that? Is it because I no longer have a uniform block structure
> and lose the speed/optimization benefits of iterating through an AIJ
> matrix? I see three potential workarounds:
> >
> > 1) Treat every vertex as a three-phase bus, "zero out" all the unused
> phase node dofs, and put a 1 on the diagonal. The problem I see with this is
> that I will have unnecessary degrees of freedom (i.e., non-zeros in the
> matrix). From the distribution systems I've seen, it's possible that
> anywhere from 1/3 to 1/2 of the buses will be two-phase or less, meaning I
> may have nearly twice the number of dofs necessary if I wanted to
> preserve the block size = 6 for the AU mat.
> >
> > 2) Treat every phase node as a vertex, i.e., solve it as a single-phase
> power flow problem. That way I am guaranteed a block size of 2; this is what
> Domenico's former student did in his thesis work. The problem I see with
> this is that I have a larger graph, which can take more time to set up and
> parallelize.
> >
> > 3) Create a "fieldsplit" where I essentially have three "blocks" - one
> for buses with all three phases, another for buses with only two phases, and
> one for single-phase buses. This way each block/fieldsplit will have a
> consistent block size. I am not sure if this will solve the MatSetValues()
> issue, but can anyone give pointers on how to go about achieving this?
> >
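> > A minimal sketch of what option 3 could look like on the solver side,
> > assuming the three index sets (is3ph, is2ph, is1ph) are built by the
> > application from the DMNetwork vertex offsets (all names here are
> > hypothetical):
> >
> >   #include <petscsnes.h>
> >
> >   static PetscErrorCode SetupPhaseFieldSplit(SNES snes,IS is3ph,IS is2ph,IS is1ph)
> >   {
> >     PetscErrorCode ierr;
> >     KSP            ksp;
> >     PC             pc;
> >
> >     PetscFunctionBegin;
> >     ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr);
> >     ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
> >     ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
> >     /* One split per bus category, so each block has a uniform block size */
> >     ierr = PCFieldSplitSetIS(pc,"threephase",is3ph);CHKERRQ(ierr);
> >     ierr = PCFieldSplitSetIS(pc,"twophase",is2ph);CHKERRQ(ierr);
> >     ierr = PCFieldSplitSetIS(pc,"singlephase",is1ph);CHKERRQ(ierr);
> >     PetscFunctionReturn(0);
> >   }
> >
> > Note, though, that this only changes the preconditioner layout; by itself it
> > would not change how MatSetValues() behaves during assembly.
> >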
> > Thanks,
> > Justin
> >
> >
> > --
> > What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> > -- Norbert Wiener
> >
> > https://www.cse.buffalo.edu/~knepley/
>
>
>
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>
> https://www.cse.buffalo.edu/~knepley/
>
>
>
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>
> https://www.cse.buffalo.edu/~knepley/
>
>
>
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>
> https://www.cse.buffalo.edu/~knepley/
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/

