[petsc-users] MatCreateBAIJ, SNES, Preallocation...
William Coirier
William.Coirier at kratosdefense.com
Thu May 16 16:50:40 CDT 2019
Barry:
Thanks for the quick response!
Running with -info gives nearly the same # of mallocs whether I "prealloc" or not. I'll bet I'm doing something wrong with the preallocation. I must know the matrix structure since convergence is really good with SNES.
I should have 9232128 total nonzeros, and when I run with -info -mat_view ::ascii_info I see that number in the diagnostic output, but I also see far more allocated nonzeros:
Mat Object: SNES_Jacobian 8 MPI processes
type: mpibaij
rows=453195, cols=453195, bs=3
total: nonzeros=9232128, allocated nonzeros=203660352
total number of mallocs used during MatSetValues calls =146300
block size is 3
Grepping for malloc in the output shows this at first (8 processes), and then zeros afterwards. Makes sense.
[0] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 18884
[3] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 18883
[7] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 14122
[4] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 18883
[5] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 18881
[2] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 18882
[1] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 18882
[6] MatAssemblyEnd_SeqBAIJ(): Number of mallocs during MatSetValues is 18883
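As a sanity check on the figures above, here is a minimal C sketch of the arithmetic. The function names are hypothetical helpers made up for illustration, not PETSc routines. It converts the scalar nonzero counts into 3x3 block counts, compares allocated against used storage, and sums the per-rank malloc counts reported by -info.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bs x bs blocks that a scalar
 * nonzero count corresponds to (one block holds bs*bs scalars). */
long long scalar_to_blocks(long long scalar_nonzeros, int bs)
{
    assert(bs > 0);
    return scalar_nonzeros / ((long long)bs * bs);
}

/* Hypothetical helper: ratio of allocated to actually used nonzeros. */
double overallocation_factor(long long allocated, long long used)
{
    assert(used > 0);
    return (double)allocated / (double)used;
}

/* Hypothetical helper: total of the per-rank malloc counts. */
long long sum_mallocs(const long long *per_rank, size_t nranks)
{
    long long total = 0;
    for (size_t i = 0; i < nranks; ++i) {
        total += per_rank[i];
    }
    return total;
}
```

With the numbers from the output above, 9232128 used nonzeros correspond to 1025792 blocks, the allocated storage is roughly 22 times what is used, and the eight per-rank malloc counts add up to the 146300 reported by -mat_view.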
________________________________________
From: Smith, Barry F. [bsmith at mcs.anl.gov]
Sent: Thursday, May 16, 2019 4:07 PM
To: William Coirier
Cc: petsc-users at mcs.anl.gov; Michael Robinson; Andrew Holm
Subject: Re: [petsc-users] MatCreateBAIJ, SNES, Preallocation...
> On May 16, 2019, at 3:44 PM, William Coirier via petsc-users <petsc-users at mcs.anl.gov> wrote:
>
> Folks:
>
> I'm developing an application using SNES, and overall it's working great, as have many of our other PETSc-based projects. But I'm having a problem related to (presumably) preallocation, block matrices, and SNES.
>
> Without going into details about the actual problem we are solving, here are the symptoms/characteristics/behavior.
> • For the SNES Jacobian, I'm using MatCreateBAIJ for a block size=3, and letting "PETSC_DECIDE" the partitioning. Actual call is:
> • ierr = MatCreateBAIJ(PETSC_COMM_WORLD, bs, PETSC_DECIDE, PETSC_DECIDE, (int)3 * numNodesSAM, (int)3 * numNodesSAM, PETSC_DEFAULT, NULL, PETSC_DEFAULT, NULL, &J);
> • When registering the SNES jacobian function, I set the B and J matrices to be the same.
> • ierr = SNESSetJacobian(snes, J, J, SAMformSNESJ, (void *)this); CHKERRQ(ierr);
> • I can either let PETSc figure out the allocation structure:
> • ierr = MatMPIBAIJSetPreallocation(J, bs, PETSC_DEFAULT, NULL,PETSC_DEFAULT, NULL);
> • or, do it myself, since I know the fill pattern,
> • ierr = MatMPIBAIJSetPreallocation(J, bs, d_nz_dum,&d_nnz[0],o_nz_dum,&o_nnz[0]);
> The symptoms/problems are as follows:
> • Whether I do preallocation or not, the "setup" time is pretty long. It might take 2 minutes before SNES starts doing its thing. After this setup, convergence and speed are great, but this first phase takes a long time. I'm assuming this is related to a poor preallocation setup, so it's doing tons of mallocs where they aren't needed.
You should definitely get much better performance with proper preallocation than with none (unless the default is enough for your matrix). Run with -info and grep for "malloc"; this will tell you exactly how many mallocs, if any, are taking place inside MatSetValues() due to improper preallocation.
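One thing worth checking when the malloc count stays high: per the PETSc manual page for MatMPIBAIJSetPreallocation, d_nnz and o_nnz are indexed by local block row and count nonzero blocks, not scalar entries. The following is a minimal sketch in plain C of how such counts could be built. It assumes a CSR-style node adjacency (row_ptr, col_idx) with global block-column indices and no duplicates within a row; the function name and the adjacency layout are hypothetical, not taken from the application discussed here.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: count nonzero blocks per local block row,
 * split into the "diagonal" part (block columns owned by this rank,
 * i.e. in [rstart, rend)) and the "off-diagonal" part (all others).
 *
 * n_local  number of local block rows, must equal rend - rstart
 * row_ptr  CSR offsets, length n_local + 1
 * col_idx  global block-column indices, assumed unique within a row
 * d_nnz    output, length n_local
 * o_nnz    output, length n_local
 *
 * The results would then be passed along the lines of
 *   MatMPIBAIJSetPreallocation(J, bs, 0, d_nnz, 0, o_nnz);
 * (with PetscInt arrays in a real PETSc build).
 */
void count_block_nnz(int n_local, const int *row_ptr, const int *col_idx,
                     int rstart, int rend, int *d_nnz, int *o_nnz)
{
    assert(n_local == rend - rstart);
    for (int i = 0; i < n_local; ++i) {
        d_nnz[i] = 0;
        o_nnz[i] = 0;
        for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k) {
            int c = col_idx[k];
            if (c >= rstart && c < rend) {
                d_nnz[i]++;   /* block lands in the diagonal portion */
            } else {
                o_nnz[i]++;   /* block lands in the off-diagonal portion */
            }
        }
    }
}
```

The ownership range used here is in block rows, so with PETSC_DECIDE partitioning it would need to be queried from the matrix or its layout rather than assumed. If the counts are right, the -info output should report zero mallocs during MatSetValues.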
> • If I don't call my Jacobian formulation before calling SNESSolve, I get a segmentation violation in a PETSc routine.
Not sure what you mean by Jacobian formation but I'm guessing filling up the Jacobian with numerical values?
Something is wrong, because you should not need to fill in the values before calling SNESSolve(), and regardless it should never ever crash with a segmentation violation. You can run with valgrind (https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind) to make sure that it is not a memory corruption issue. You can also run in the debugger (perhaps with the PETSc command line option -start_in_debugger) to get more details on why it is crashing.
When you have it running satisfactorily, you can send us the output from running with -log_view and we can let you know how it seems to be performing efficiency-wise.
Barry
> (If I DO call my Jacobian first, things work great, although slow for the setup phase.) Here's a snippet of the traceback:
> #0 0x00000000009649fc in MatMultAdd_SeqBAIJ_3 (A=<optimized out>,
> xx=0x3a525b0, yy=0x3a531b0, zz=0x3a531b0)
> at /home/jstutts/Downloads/petsc-3.11.1/src/mat/impls/baij/seq/baij2.c:1424
> #1 0x00000000006444cb in MatMult_MPIBAIJ (A=0x15da340, xx=0x3a542a0,
> yy=0x3a531b0)
> at /home/jstutts/Downloads/petsc-3.11.1/src/mat/impls/baij/mpi/mpibaij.c:1380
> #2 0x00000000005b2c0f in MatMult (mat=0x15da340, x=x@entry=0x3a542a0,
> y=y@entry=0x3a531b0)
> at /home/jstutts/Downloads/petsc-3.11.1/src/mat/interface/matrix.c:2396
> #3 0x0000000000c61f2e in PCApplyBAorAB (pc=0x1ce78c0, side=PC_LEFT,
> x=0x3a542a0, y=y@entry=0x3a548a0, work=0x3a531b0)
> at /home/jstutts/Downloads/petsc-3.11.1/src/ksp/pc/interface/precon.c:690
> #4 0x0000000000ccb36b in KSP_PCApplyBAorAB (w=<optimized out>, y=0x3a548a0,
> x=<optimized out>, ksp=0x1d44d50)
> at /home/jstutts/Downloads/petsc-3.11.1/include/petsc/private/kspimpl.h:309
> #5 KSPGMRESCycle (itcount=itcount@entry=0x7fffffffc02c,
> ksp=ksp@entry=0x1d44d50)
> at /home/jstutts/Downloads/petsc-3.11.1/src/ksp/ksp/impls/gmres/gmres.c:152
> #6 0x0000000000ccbf6f in KSPSolve_GMRES (ksp=0x1d44d50)
> at /home/jstutts/Downloads/petsc-3.11.1/src/ksp/ksp/impls/gmres/gmres.c:237
> #7 0x00000000007dc193 in KSPSolve (ksp=0x1d44d50, b=b@entry=0x1d41c70,
> x=x@entry=0x1cebf40)
>
>
>
> I apologize if I’ve missed something in the documentation or examples, but I can’t seem to figure this one out. The “setup” seems to take too long, and from my previous experiences with PETSc, this is due to a poor preallocation strategy.
>
> Any and all help is appreciated!
>
> -----------------------------------------------------------------------
> William J. Coirier, Ph.D.
> Director, Aerosciences and Engineering Analysis Branch
> Advanced Concepts Development and Test Division
> Kratos Defense and Rocket Support Services
> 4904 Research Drive
> Huntsville, AL 35805
> 256-327-8170
> 256-327-8120 (fax)