[petsc-users] SuperLU MPI-problem

Hong hzhang at mcs.anl.gov
Mon Aug 3 09:46:04 CDT 2015


Mahir,

Sherry found the culprit. I can reproduce it:
petsc/src/ksp/ksp/examples/tutorials
mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist
-mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact

Invalid ISPEC at line 484 in file get_perm_c.c
Invalid ISPEC at line 484 in file get_perm_c.c
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
...

The PETSc-superlu_dist interface sets matinput=DISTRIBUTED as the default when
using more than one process.
Did you either use '-mat_superlu_dist_parsymbfact' for a sequential run or
set matinput=GLOBAL for a parallel run?

I'll add an error flag for these use cases.
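
As a rough illustration of the kind of check meant here (a sketch only, not the actual PETSc interface code), the two unsupported combinations can be rejected up front. The function name and its arguments are hypothetical; the GLOBAL default for a single process is an assumption, since the thread only states the DISTRIBUTED default for parallel runs.

```python
def effective_matinput(nprocs, matinput=None, parsymbfact=False):
    """Return the matrix input mode to use, or raise ValueError for the
    option combinations reported as unsupported in this thread.

    Hypothetical helper for illustration; not part of PETSc or SuperLU_DIST.
    """
    if nprocs < 1:
        raise ValueError("nprocs must be at least 1")

    if matinput is None:
        # DISTRIBUTED for parallel runs is stated in the thread;
        # GLOBAL for a single process is an assumption of this sketch.
        matinput = "DISTRIBUTED" if nprocs > 1 else "GLOBAL"

    if matinput not in ("GLOBAL", "DISTRIBUTED"):
        raise ValueError("matinput must be GLOBAL or DISTRIBUTED")

    if parsymbfact and nprocs == 1:
        # Parallel symbolic factorization makes no sense in a sequential run.
        raise ValueError("parsymbfact requires more than one process")

    if parsymbfact and matinput != "DISTRIBUTED":
        # The global (replicated) input interface cannot be combined with
        # parallel symbolic factorization.
        raise ValueError("parsymbfact requires matinput=DISTRIBUTED")

    return matinput
```

Failing early with a clear message replaces the opaque "Invalid ISPEC" abort from inside the library.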

Hong

On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li <xsli at lbl.gov> wrote:

> I think I know the problem. Since zdistribute.c is called, I guess you
> are using the global (replicated) matrix input interface,
> pzgssvx_ABglobal(). This interface does not allow you to use parallel
> symbolic factorization (since the matrix is centralized).
>
> That's why you get the following error:
> Invalid ISPEC at line 484 in file get_perm_c.c
>
> You need to use the distributed matrix input interface pzgssvx() (without
> ABglobal).
>
> Sherry
>
>
> On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se <
> Mahir.Ulker-Kaustell at tyrens.se> wrote:
>
>> Hong and Sherry,
>>
>>
>>
>> I have rebuilt PETSc with SuperLU_DIST 4.1. Unfortunately, the problem remains:
>>
>>
>>
>> If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid
>> ISPEC at line 484 in file get_perm_c.c
>>
>> If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the
>> program crashes with:  Calloc fails for SPA dense[]. at line 438 in file
>> zdistribute.c
>>
>>
>>
>> Mahir
>>
>>
>>
>> *From:* Hong [mailto:hzhang at mcs.anl.gov]
>> *Sent:* 30 July 2015 02:58
>> *To:* Ülker-Kaustell, Mahir
>> *Cc:* Xiaoye Li; PETSc users list
>>
>> *Subject:* Fwd: [petsc-users] SuperLU MPI-problem
>>
>>
>>
>> Mahir,
>>
>>
>>
>> Sherry fixed several bugs in superlu_dist-v4.1.
>>
>> The current petsc-release interfaces with superlu_dist-v4.0.
>>
>> We do not know whether the reported issue (attached below) has been
>> resolved or not. If not, can you test it with the latest superlu_dist-v4.1?
>>
>>
>>
>> Here is how to do it:
>>
>> 1. download superlu_dist v4.1
>>
>> 2. remove existing PETSC_ARCH directory, then configure petsc with
>>
>> '--download-superlu_dist=superlu_dist_4.1.tar.gz'
>>
>> 3. build petsc
>>
>>
>>
>> Let us know if the issue remains.
>>
>>
>>
>> Hong
>>
>>
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: *Xiaoye S. Li* <xsli at lbl.gov>
>> Date: Wed, Jul 29, 2015 at 2:24 PM
>> Subject: Fwd: [petsc-users] SuperLU MPI-problem
>> To: Hong Zhang <hzhang at mcs.anl.gov>
>>
>> Hong,
>>
>> I am cleaning the mailbox, and saw this unresolved issue.  I am not sure
>> whether the new fix to parallel symbolic factorization solves the problem.
>> What bothers me is that he is getting the following error:
>>
>> Invalid ISPEC at line 484 in file get_perm_c.c
>>
>> This has nothing to do with my bug fix.
>>
>> Shall we ask him to try the new version, or try to get his matrix?
>>
>> Sherry
>>
>>
>>
>> ---------- Forwarded message ----------
>> From: Mahir.Ulker-Kaustell at tyrens.se <Mahir.Ulker-Kaustell at tyrens.se>
>> Date: Wed, Jul 22, 2015 at 1:32 PM
>> Subject: RE: [petsc-users] SuperLU MPI-problem
>> To: Hong <hzhang at mcs.anl.gov>, "Xiaoye S. Li" <xsli at lbl.gov>
>> Cc: petsc-users <petsc-users at mcs.anl.gov>
>>
>> The 1000 was just a conservative guess. The number of non-zeros per row
>> is in the tens in general, but certain constraints lead to non-diagonal
>> streaks in the sparsity pattern.
>>
>> Is it the reordering of the matrix that is killing me here? How can I set
>> options.ColPerm?
>>
>>
>>
>> If I use -mat_superlu_dist_parsymbfact the program crashes with
>>
>>
>>
>> Invalid ISPEC at line 484 in file get_perm_c.c
>>
>> -------------------------------------------------------
>>
>> Primary job  terminated normally, but 1 process returned
>>
>> a non-zero exit code.. Per user-direction, the job has been aborted.
>>
>> -------------------------------------------------------
>>
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>>
>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the
>> batch system) has told this process to end
>>
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>
>> [0]PETSC ERROR: or see
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>
>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
>> X to find memory corruption errors
>>
>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
>> and run
>>
>> [0]PETSC ERROR: to get more information on the crash.
>>
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>>
>> [0]PETSC ERROR: Signal received
>>
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>>
>> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015
>>
>> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by
>> muk Wed Jul 22 21:59:23 2015
>>
>> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0
>> PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++
>> --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1
>> --with-scalar-type=complex --download-fblaspack --download-mpich
>> --download-scalapack --download-mumps --download-metis --download-parmetis
>> --download-superlu --download-superlu_dist --download-fftw
>>
>> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>>
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>
>> [unset]: aborting job:
>>
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>>
>>
>>
>> If I use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat
>> later) with
>>
>>
>>
>> Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c
>>
>> col block 3006 -------------------------------------------------------
>>
>> Primary job  terminated normally, but 1 process returned
>>
>> a non-zero exit code.. Per user-direction, the job has been aborted.
>>
>> -------------------------------------------------------
>>
>> col block 1924 [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>>
>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the
>> batch system) has told this process to end
>>
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>
>> [0]PETSC ERROR: or see
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>
>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
>> X to find memory corruption errors
>>
>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link,
>> and run
>>
>> [0]PETSC ERROR: to get more information on the crash.
>>
>> [0]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>>
>> [0]PETSC ERROR: Signal received
>>
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
>> for trouble shooting.
>>
>> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015
>>
>> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by
>> muk Wed Jul 22 21:59:58 2015
>>
>> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0
>> PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++
>> --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1
>> --with-scalar-type=complex --download-fblaspack --download-mpich
>> --download-scalapack --download-mumps --download-metis --download-parmetis
>> --download-superlu --download-superlu_dist --download-fftw
>>
>> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>>
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>
>> [unset]: aborting job:
>>
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>>
>> [0]PETSC ERROR:
>> ------------------------------------------------------------------------
>>
>>
>>
>>
>>
>> /Mahir
>>
>>
>>
>>
>>
>> *From:* Hong [mailto:hzhang at mcs.anl.gov]
>>
>> *Sent:* 22 July 2015 21:34
>> *To:* Xiaoye S. Li
>> *Cc:* Ülker-Kaustell, Mahir; petsc-users
>>
>>
>> *Subject:* Re: [petsc-users] SuperLU MPI-problem
>>
>>
>>
>> In the PETSc/superlu_dist interface, we set the default
>>
>>
>>
>> options.ParSymbFact = NO;
>>
>>
>>
>> When the user sets the flag "-mat_superlu_dist_parsymbfact",
>>
>> we set
>>
>>
>>
>>     options.ParSymbFact = YES;
>>
>>     /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user
>>        ordering setting */
>>     options.ColPerm     = PARMETIS;
>>
>>
>>
>> We do not change anything else.
>>
>>
>>
>> Hong
>>
>>
>>
>> On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li <xsli at lbl.gov> wrote:
>>
>> I am trying to understand your problem. You said you are solving Navier's
>> equation (elastodynamics) in the frequency domain, using a finite element
>> discretization. I wonder why you have about 1000 nonzeros per row.
>> Usually, in matrices from PDE discretizations, the number of nonzeros per
>> row is in the tens (even for 3D problems), not in the thousands. So, your
>> matrix is quite a bit denser than many sparse matrices we deal with.
>>
>>
>>
>> The number of nonzeros in the L and U factors is much larger than that in
>> the original matrix A -- typically we see a 10-20x fill ratio for 2D, and it
>> can be as bad as a 50-100x fill ratio for 3D. But since your matrix starts
>> much denser (i.e., the underlying graph has many connections), it may not
>> lend itself to any good ordering strategy to preserve sparsity of L and U;
>> that is, the L and U fill ratio may be large.
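
To make the fill-ratio argument concrete, here is a back-of-the-envelope sketch of the storage needed for the numerical values of L and U. The function name and the sample sizes used with it are hypothetical and not taken from the thread. It assumes 16 bytes per value (double-precision complex, in line with the --with-scalar-type=complex build) and ignores index arrays and work space, so a real factorization needs more.

```python
def estimate_factor_memory_gib(n_rows, nnz_per_row, fill_ratio, bytes_per_value=16):
    """Rough size in GiB of the numerical values in the L and U factors.

    Hypothetical helper for illustration only:
      nnz(A)   ~ n_rows * nnz_per_row
      nnz(L+U) ~ fill_ratio * nnz(A)
    Index storage and solver work space are not counted.
    """
    nnz_a = n_rows * nnz_per_row
    nnz_lu = nnz_a * fill_ratio
    return nnz_lu * bytes_per_value / 2**30


# Illustrative inputs only: one million rows, a 50x fill ratio, and
# either tens or a thousand nonzeros per row.
sparse_case = estimate_factor_memory_gib(1_000_000, 10, 50)
dense_case = estimate_factor_memory_gib(1_000_000, 1000, 50)
print(f"10 nnz/row:   {sparse_case:.1f} GiB")
print(f"1000 nnz/row: {dense_case:.1f} GiB")
```

At a fixed fill ratio the estimate grows linearly with the nonzeros per row, which is why the difference between tens and a thousand entries per row matters for whether the factors fit in memory.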
>>
>>
>>
>> I don't understand why you get the following error when you use
>>
>> '-mat_superlu_dist_parsymbfact'.
>>
>>
>>
>> Invalid ISPEC at line 484 in file get_perm_c.c
>>
>>
>>
>> Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc.
>>
>>
>>
>> Hong -- in order to use parallel symbolic factorization, is it
>> sufficient to specify only
>>
>> '-mat_superlu_dist_parsymbfact'?
>>
>> (The default is to use sequential symbolic factorization.)
>>
>>
>>
>>
>>
>> Sherry
>>
>>
>>
>> On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se <
>> Mahir.Ulker-Kaustell at tyrens.se> wrote:
>>
>> Thank you for your reply.
>>
>> As you have probably figured out already, I am not a computational
>> scientist. I am a researcher in civil engineering (railways for high-speed
>> traffic), trying to produce some, from my perspective, fairly large
>> parametric studies based on finite element discretizations.
>>
>> I am working in a Windows-environment and have installed PETSc through
>> Cygwin.
>> Apparently, there is no support for Valgrind in this OS.
>>
>> If I have understood you correctly, the memory issues are related to
>> SuperLU and, given my background, there is not much I can do. Is this
>> correct?
>>
>>
>> Best regards,
>> Mahir
>>
>> ______________________________________________
>> Mahir Ülker-Kaustell, Competence Coordinator, Bridge Designer, PhD (Tech.),
>> Tyréns AB
>> 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se
>> ______________________________________________
>>
>>
>> -----Original Message-----
>> From: Barry Smith [mailto:bsmith at mcs.anl.gov]
>> Sent: 22 July 2015 02:57
>> To: Ülker-Kaustell, Mahir
>> Cc: Xiaoye S. Li; petsc-users
>> Subject: Re: [petsc-users] SuperLU MPI-problem
>>
>>
>>    Run the program under valgrind
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I
>> use the option -mat_superlu_dist_parsymbfact, I get many scary memory
>> problems, some involving, for example, ddist_psymbtonum
>> (pdsymbfact_distdata.c:1332).
>>
>>    Note that I consider it unacceptable for running programs to EVER use
>> uninitialized values; until these are all cleaned up I won't trust any runs
>> like this.
>>
>>   Barry
>>
>>
>>
>>
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053)
>> ==42050==    by 0x101557F60: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:285)
>> ==42050==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10155751B: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651)
>> ==42050==    by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903)
>> ==42050==    by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944)
>> ==42050==    by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107)
>> ==42050==    by 0x101557F60: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:285)
>> ==42050==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10155751B: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:96)
>> ==42050==
>> ==42049== Syscall param writev(vector[...]) points to uninitialised
>> byte(s)
>> ==42049==    at 0x102DA1C3A: writev (in
>> /usr/lib/system/libsystem_kernel.dylib)
>> ==42049==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
>> ==42049==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
>> ==42049==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
>> ==42049==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
>> ==42049==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
>> ==42049==    by 0x10277656E: MPI_Isend (isend.c:125)
>> ==42049==    by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63)
>> ==42049==    by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298)
>> ==42049==    by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553)
>> ==42049==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
>> ==42049==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
>> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42049==    by 0x101557CFC: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:241)
>> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048== Syscall param writev(vector[...]) points to uninitialised
>> byte(s)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==  Address 0x105edff70 is 1,424 bytes inside a block of size
>> 752,720 alloc'd
>> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42049==    by 0x1020EB90C: gk_malloc (memory.c:147)
>> ==42049==    by 0x1020EAA28: gk_mcoreCreate (mcore.c:28)
>> ==42048==    at 0x102DA1C3A: writev (in
>> /usr/lib/system/libsystem_kernel.dylib)
>> ==42048==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
>> ==42049==    by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23)
>> ==42049==    by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98)
>> ==42048==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
>> ==42048==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
>> ==42048==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
>> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42049==    by 0x101557CFC: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:241)
>> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
>> ==42048==    by 0x10277656E: MPI_Isend (isend.c:125)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63)
>> ==42048==    by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553)
>> ==42048==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
>> ==42048==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==  Uninitialised value was created by a heap allocation
>> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42049==    by 0x1020EB90C: gk_malloc (memory.c:147)
>> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42048==    by 0x101557CFC: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:241)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42049==    by 0x10211C50B: libmetis__imalloc (gklib.c:24)
>> ==42049==    by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519)
>> ==42049==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
>> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42049==    by 0x101557CFC: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:241)
>> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==  Address 0x10597a860 is 1,408 bytes inside a block of size
>> 752,720 alloc'd
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x1020EB90C: gk_malloc (memory.c:147)
>> ==42048==    by 0x1020EAA28: gk_mcoreCreate (mcore.c:28)
>> ==42048==    by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23)
>> ==42048==    by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98)
>> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42048==    by 0x101557CFC: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:241)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==  Uninitialised value was created by a heap allocation
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x1020EB90C: gk_malloc (memory.c:147)
>> ==42048==    by 0x10211C50B: libmetis__imalloc (gklib.c:24)
>> ==42048==    by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519)
>> ==42048==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
>> ==42048==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
>> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42048==    by 0x101557CFC: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:241)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==
>> ==42048== Syscall param write(buf) points to uninitialised byte(s)
>> ==42048==    at 0x102DA1C22: write (in
>> /usr/lib/system/libsystem_kernel.dylib)
>> ==42048==    by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525)
>> ==42048==    by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86)
>> ==42048==    by 0x102933B80: MPIDI_CH3_EagerContigShortSend
>> (ch3u_eager.c:257)
>> ==42048==    by 0x10293ADBA: MPID_Send (mpid_send.c:130)
>> ==42048==    by 0x10277A1FA: MPI_Send (send.c:127)
>> ==42048==    by 0x10155802F: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:299)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==  Address 0x104810704 is on thread 1's stack
>> ==42048==  in frame #3, created by MPIDI_CH3_EagerContigShortSend
>> (ch3u_eager.c:218)
>> ==42048==  Uninitialised value was created by a heap allocation
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42048==    by 0x101557AB9: get_perm_c_parmetis
>> (get_perm_c_parmetis.c:185)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x102744E43: MPI_Alltoallv (alltoallv.c:490)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92)
>> ==42050==    by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343)
>> ==42050==    by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380)
>> ==42050==    by 0x10274541B: MPI_Alltoallv (alltoallv.c:531)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Syscall param writev(vector[...]) points to uninitialised
>> byte(s)
>> ==42050==    at 0x102DA1C3A: writev (in
>> /usr/lib/system/libsystem_kernel.dylib)
>> ==42050==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
>> ==42050==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
>> ==42050==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
>> ==42050==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
>> ==42050==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
>> ==42050==    by 0x10277656E: MPI_Isend (isend.c:125)
>> ==42050==    by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201)
>> ==42050==    by 0x10151ECBF: pdgstrf (pdgstrf.c:1082)
>> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Address 0x1060144d0 is 1,168 bytes inside a block of size
>> 131,072 alloc'd
>> ==42050==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42050==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42050==    by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145)
>> ==42050==    by 0x10151DA7D: pdgstrf (pdgstrf.c:735)
>> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a heap allocation
>> ==42050==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42050==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42050==    by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145)
>> ==42050==    by 0x10151DA7D: pdgstrf (pdgstrf.c:735)
>> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==
>> ==42048== Conditional jump or move depends on uninitialised value(s)
>> ==42048==    at 0x10151F141: pdgstrf (pdgstrf.c:1139)
>> ==42048==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==  Uninitialised value was created by a heap allocation
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42048==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
>> ==42048==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==
>> ==42049== Conditional jump or move depends on uninitialised value(s)
>> ==42049==    at 0x10151F141: pdgstrf (pdgstrf.c:1139)
>> ==42049==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==  Uninitialised value was created by a heap allocation
>> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42049==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42049==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
>> ==42049==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==
>> ==42048== Conditional jump or move depends on uninitialised value(s)
>> ==42048==    at 0x101520054: pdgstrf (pdgstrf.c:1429)
>> ==42048==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049== Conditional jump or move depends on uninitialised value(s)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==  Uninitialised value was created by a heap allocation
>> ==42049==    at 0x101520054: pdgstrf (pdgstrf.c:1429)
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42049==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42048==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
>> ==42048==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==  Uninitialised value was created by a heap allocation
>> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==
>> ==42049==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42049==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
>> ==42049==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x10151FDE6: pdgstrf (pdgstrf.c:1382)
>> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a heap allocation
>> ==42050==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42050==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42050==    by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST
>> (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==
>>
>>
>> > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote:
>> >
>> > Ok. So I have been creating the full factorization on each process.
>> That gives me some hope!
>> >
>> > I followed your suggestion and tried to use the runtime option
>> ‘-mat_superlu_dist_parsymbfact’.
>> > However, now the program crashes with:
>> >
>> > Invalid ISPEC at line 484 in file get_perm_c.c
>> >
>> > And so on…
>> >
>> > According to the SuperLU manual, I should give the option either YES or NO;
>> however, -mat_superlu_dist_parsymbfact YES makes the program crash in the
>> same way as above.
>> > Also, I can’t find any reference to -mat_superlu_dist_parsymbfact in the
>> PETSc documentation.
>> >
>> > Mahir
>> >
>> > Mahir Ülker-Kaustell, Kompetenssamordnare, Brokonstruktör, Tekn. Dr,
>> Tyréns AB
>> > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se
>> >
>> > From: Xiaoye S. Li [mailto:xsli at lbl.gov]
>> > Sent: 20 July 2015 18:12
>> > To: Ülker-Kaustell, Mahir
>> > Cc: Hong; petsc-users
>> > Subject: Re: [petsc-users] SuperLU MPI-problem
>> >
>> > The default SuperLU_DIST setting is to use serial symbolic factorization.
>> Therefore, what matters is how much memory you have per MPI task.
>> >
>> > The code failed to malloc memory during redistribution of matrix A to
>> the {L\U} data structure (using the result of the serial symbolic factorization).
>> >
>> > You can use parallel symbolic factorization with the runtime option
>> '-mat_superlu_dist_parsymbfact'.
>> >
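
A rough way to see why memory per MPI task matters: the sketch below assumes, purely for illustration, that serial symbolic factorization leaves some amount of data replicated on every task while the numeric factors are shared across tasks. The function names and the 5 GB replicated figure are hypothetical; the 128 GB and 20 GB figures are the ones quoted further down the thread.

```python
def node_memory_gb(replicated_gb, distributed_gb, n_tasks):
    """Memory used on one node when each of n_tasks MPI tasks holds its own
    copy of the replicated data and the distributed data is split among them.
    This is a simplified model, not a description of SuperLU_DIST internals."""
    if n_tasks < 1:
        raise ValueError("n_tasks must be at least 1")
    return n_tasks * replicated_gb + distributed_gb


def fits_on_node(replicated_gb, distributed_gb, n_tasks, node_ram_gb=128.0):
    """True if the modelled footprint stays within the node's RAM."""
    return node_memory_gb(replicated_gb, distributed_gb, n_tasks) <= node_ram_gb


# Hypothetical 5 GB replicated per task, 20 GB of distributed factors.
for tasks in (4, 16, 32):
    print(tasks, node_memory_gb(5.0, 20.0, tasks), fits_on_node(5.0, 20.0, tasks))
```

Under this model the replicated part grows with the number of tasks on the node, so a problem that fits with a few tasks can run out of memory with many. Parallel symbolic factorization aims to avoid that replication.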
>> > Sherry Li
>> >
>> >
>> > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se <
>> Mahir.Ulker-Kaustell at tyrens.se> wrote:
>> > Hong:
>> >
>> > Previous experience with this equation has shown that it is very
>> difficult to solve iteratively, hence the use of a direct solver.
>> >
>> > The large test problem I am trying to solve has slightly less than 10^6
>> degrees of freedom. The matrices are derived from finite elements so they
>> are sparse.
>> > The machine I am working on has 128 GB of RAM. I have estimated the memory
>> needed at less than 20 GB, so if the solver needs twice or even three times
>> as much, it should still work well. Or have I completely misunderstood
>> something here?
>> >
>> > Mahir
>> >
>> >
>> >
>> > From: Hong [mailto:hzhang at mcs.anl.gov]
>> > Sent: 20 July 2015 17:39
>> > To: Ülker-Kaustell, Mahir
>> > Cc: petsc-users
>> > Subject: Re: [petsc-users] SuperLU MPI-problem
>> >
>> > Mahir:
>> > Direct solvers consume a large amount of memory. I suggest trying the
>> following:
>> >
>> > 1. A sparse iterative solver, if [-omega^2M + K] is not too
>> ill-conditioned. You can test this using the small matrix.
>> >
>> > 2. Incrementally increase your matrix sizes. Try different matrix
>> orderings.
>> > Do you get the memory crash in the first symbolic factorization?
>> > In your case, the matrix data structure stays the same when omega changes, so
>> you only need to do one symbolic factorization and reuse it.
>> >
>> > 3. Use a machine with more memory.
>> >
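
The point in item 2 about the matrix structure not changing with omega can be checked on a toy example. The sketch below is only an illustration under stated assumptions: it stores sparse matrices as {(row, col): value} dictionaries, and the 2x2 matrices K and M and the helper names are made up. It is not PETSc or SuperLU_DIST code.

```python
def shifted_matrix(K, M, omega):
    """Return A = K - omega**2 * M for sparse matrices stored as
    {(row, col): value} dictionaries (a toy format for illustration).
    Every position present in K or M is kept, even if the value cancels."""
    A = {}
    for key in set(K) | set(M):
        A[key] = K.get(key, 0.0) - omega**2 * M.get(key, 0.0)
    return A


def pattern(A):
    """The structural nonzero pattern: the set of stored positions."""
    return frozenset(A)


# Hypothetical complex stiffness matrix K (with damping) and real mass matrix M.
K = {(0, 0): 2 + 0.1j, (0, 1): -1 + 0j, (1, 0): -1 + 0j, (1, 1): 2 + 0.1j}
M = {(0, 0): 1.0, (1, 1): 1.0}

# Collect the pattern of A for several frequencies.
patterns = {pattern(shifted_matrix(K, M, w)) for w in (0.5, 1.0, 2.0)}
print(len(patterns))
```

All frequencies produce the same set of stored positions, which is the union of the patterns of K and M. Only the values change, and that is why one symbolic factorization can serve the whole frequency sweep.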
>> > Hong
>> >
>> > Dear Petsc-Users,
>> >
>> > I am trying to use PETSc to solve a set of linear equations arising
>> from Navier's equation (elastodynamics) in the frequency domain.
>> > The frequency dependency of the problem requires that the system
>> >
>> >                              [-omega^2M + K]u = F
>> >
>> > where M and K are constant, square, positive definite matrices (mass
>> and stiffness respectively) is solved for each frequency omega of interest.
>> > K is a complex matrix, including material damping.
>> >
>> > I have written a PETSc program which solves this problem for a small
>> (1000 degrees of freedom) test problem on one or several processors, but it
>> keeps crashing when I try it on my full-scale (on the order of 10^6 degrees
>> of freedom) problem.
>> >
>> > The program crashes at KSPSetUp() and from what I can see in the error
>> messages, it appears as if it consumes too much memory.
>> >
>> > I would guess that similar problems have come up on this mailing list, so
>> I am hoping that someone can push me in the right direction…
>> >
>> > Mahir
>>