[petsc-users] Extremely slow DMNetwork Jacobian assembly

Justin Chang jychang48 at gmail.com
Wed May 8 14:32:43 CDT 2019


So here's the branch/repo with the working example I have:

https://github.com/jychang48/petsc-dss/tree/single-bus-vertex

Type 'make' to compile the dss; it should work with the latest petsc-dev.

To test the performance, I've taken an existing IEEE 13-bus and duplicated
it N times to create a long radial-like network. I have three sizes where N
= 100, 500, and 1000. Those test files are listed as:

input/test_100.m
input/test_500.m
input/test_1000.m

I also created another set of examples where the IEEE 13-bus is fully
balanced (but the program will crash at the solve step because I used some
unrealistic parameters for the Y-bus matrices and probably have some zeros
somewhere). They are listed as:

input/test2_100.m
input/test2_500.m
input/test2_1000.m

The dof counts and matrices for the test2_*.m files are slightly larger
than those of their respective test_*.m files, but they have bs = 6.

To run these tests, type the following:

./dpflow -input input/test_100.m

I have a timer that shows how long it takes to compute the Jacobian.
Attached are the log outputs I have for each of the six cases.
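
For reference, the timer is nothing fancy; it is roughly of this form (a
sketch only, not the exact code in the repo):

  PetscLogDouble t0, t1;
  ierr = PetscTime(&t0);CHKERRQ(ierr);
  /* loop over DMNetwork vertices/edges, MatSetValues(), MatAssemblyBegin/End */
  ierr = PetscTime(&t1);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"Jacobian time: %g s\n",(double)(t1-t0));CHKERRQ(ierr);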

It turns out that only the first call to SNESComputeJacobian() is slow;
all the subsequent calls are as fast as I expect. This makes me think it
still has something to do with matrix allocation.
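
One quick way to confirm or rule that out (a sketch; it just flips the
option I currently set to PETSC_FALSE, so PETSc errors out on the first
entry that was not preallocated instead of silently mallocing):

  ierr = DMCreateMatrix(networkdm,&J);CHKERRQ(ierr);
  /* error out on any new nonzero that was not preallocated */
  ierr = MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_TRUE);CHKERRQ(ierr);

If the first SNESComputeJacobian() then aborts with a new-nonzero error, it
really is allocation; if it runs through cleanly, the slowness is coming
from somewhere else.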

Thanks for the help everyone,

Justin

On Wed, May 8, 2019 at 12:36 PM Matthew Knepley <knepley at gmail.com> wrote:

> On Wed, May 8, 2019 at 2:30 PM Justin Chang <jychang48 at gmail.com> wrote:
>
>> Hi everyone,
>>
>> Yes I have these lines in my code:
>>
>> ierr = DMCreateMatrix(networkdm,&J);CHKERRQ(ierr);
>> ierr =
>> MatSetOption(J,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);CHKERRQ(ierr);
>>
>
> Okay, it's not allocation. So maybe Hong is right that it's setting great
> big element matrices. We will see with the example.
>
>   Thanks,
>
>     Matt
>
>
>> I tried -info and here's my output:
>>
>> [0] PetscInitialize(): PETSc successfully started: number of processors =
>> 1
>> [0] PetscInitialize(): Running on machine: jchang31606s.domain
>> [0] PetscCommDuplicate(): Duplicating a communicator 4436504608
>> 140550815662944 max tags = 2147483647
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608
>> 140550815662944
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608
>> 140550815662944
>> Base power = 0.166667, numbus = 115000, numgen = 5000, numyl = 75000,
>> numdl = 5000, numlbr = 109999, numtbr = 5000
>>
>> **** Power flow dist case ****
>>
>> Base power = 0.166667, nbus = 115000, ngen = 5000, nwye = 75000, ndelta =
>> 5000, nbranch = 114999
>> [0] PetscCommDuplicate(): Duplicating a communicator 4436505120
>> 140550815683104 max tags = 2147483647
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 620000 X 620000; storage space:
>> 0 unneeded,10799928 used
>> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 28
>> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows
>> 0)/(num_localrows 620000) < 0.6. Do not use CompressedRow routines.
>> [0] MatSeqAIJCheckInode(): Found 205000 nodes of 620000. Limit used: 5.
>> Using Inode routines
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436504608
>> 140550815662944
>> [0] DMGetDMSNES(): Creating new DMSNES
>> [0] DMGetDMKSP(): Creating new DMKSP
>> [0] PetscCommDuplicate(): Using internal PETSc communicator 4436505120
>> 140550815683104
>>   0 SNES Function norm 1155.45
>>
>> nothing else -info related shows up as I'm iterating through the vertex
>> loop.
>>
>> I'll have a MWE for you guys to play with shortly.
>>
>> Thanks,
>> Justin
>>
>> On Wed, May 8, 2019 at 12:10 PM Smith, Barry F. <bsmith at mcs.anl.gov>
>> wrote:
>>
>>>
>>>   Justin,
>>>
>>>      Are you providing matrix entries that directly connect one vertex
>>> to another vertex ACROSS an edge? I don't think that is supported by the
>>> DMNetwork model. The assumption is that edges are only connected to
>>> vertices and vertices are only connected to neighboring edges.
>>>
>>>   Everyone,
>>>
>>>   I second Matt's reply.
>>>
>>>   How is the DMNetwork preallocating for the Jacobian? Does it take into
>>> account coupling between neighboring vertices/edges? Or does it assume no
>>> coupling, or full coupling? If it assumes no coupling and the user has a
>>> good amount of coupling, it will be very slow.
>>>
>>>   There would need to be a way for the user to provide the coupling
>>> information between neighboring vertices/edges if it assumes no coupling.
>>>
>>>     Barry
>>>
>>>
>>> > On May 8, 2019, at 7:44 AM, Matthew Knepley via petsc-users <
>>> petsc-users at mcs.anl.gov> wrote:
>>> >
>>> > On Wed, May 8, 2019 at 4:45 AM Justin Chang via petsc-users <
>>> petsc-users at mcs.anl.gov> wrote:
>>> > Hi guys,
>>> >
>>> > I have a fully working distribution system solver written using
>>> DMNetwork. The idea is that each electrical bus can have up to three phase
>>> nodes, and each phase node has two unknowns: voltage magnitude and angle.
>>> In a completely balanced system, each bus has three nodes, but in an
>>> unbalanced system some of the buses can be either single-phase or two-phase.
>>> >
>>> > The working DMNetwork code I developed, loosely based on the SNES
>>> network/power.c example, essentially represents each vertex as a bus. The
>>> DMNetworkAddNumVariables() function adds either 2, 4, or 6 unknowns to each
>>> vertex. If every single bus has the same number of variables, the Mat block
>>> size = 2, 4, or 6, and my code is both fast and scalable. However, if the
>>> number of unknowns per DMNetwork vertex is not the same across all vertices,
>>> then my SNESFormJacobian function becomes extremely slow. Specifically, it
>>> is the MatSetValues() calls that are slow whenever the col/row global
>>> indices contain an offset value that points to a neighboring bus vertex.
>>> >
>>> > I have never seen MatSetValues() be slow unless it is allocating. Did
>>> you confirm that you are not allocating, with -info?
>>> >
>>> >   Thanks,
>>> >
>>> >      Matt
>>> >
>>> > Why is that? Is it because I no longer have a uniform block structure
>>> and lose the speed/optimization benefits of iterating through an AIJ
>>> matrix? I see three potential workarounds:
>>> >
>>> > 1) Treat every vertex as a three-phase bus, "zero out" all the unused
>>> phase node dofs, and put a 1 on the diagonal. The problem I see with this
>>> is that I will have unnecessary degrees of freedom (i.e., extra non-zeros
>>> in the matrix). From the distribution systems I've seen, it's possible that
>>> anywhere from 1/3 to 1/2 of the buses will be two-phase or less, meaning I
>>> may have nearly twice as many dofs as necessary if I want to preserve the
>>> block size = 6 for the AIJ mat.
>>> >
>>> > 2) Treat every phase node as a vertex, i.e., solve a single-phase power
>>> flow problem. That way I am guaranteed a block size = 2; this is what
>>> Domenico's former student did in his thesis work. The problem I see with
>>> this is that I have a larger graph, which can take more time to set up and
>>> parallelize.
>>> >
>>> > 3) Create a "fieldsplit" where I essentially have three "blocks": one
>>> for buses with all three phases, another for buses with only two phases,
>>> and one for single-phase buses. This way each block/fieldsplit will have a
>>> consistent block size. I am not sure if this will solve the MatSetValues()
>>> issue, but can anyone give pointers on how to go about achieving this?
>>> >
>>> > Thanks,
>>> > Justin
>>> >
>>> >
>>> > --
>>> > What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> > -- Norbert Wiener
>>> >
>>> > https://www.cse.buffalo.edu/~knepley/
>>>
>>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> <http://www.cse.buffalo.edu/~knepley/>
>
-------------- next part --------------
Attachments (log outputs for the six cases): test_100.out, test_500.out,
test_1000.out, test2_100.out, test2_500.out, test2_1000.out

