From jroman at dsic.upv.es Sat Aug 1 02:05:53 2015
From: jroman at dsic.upv.es (Jose E. Roman)
Date: Sat, 1 Aug 2015 09:05:53 +0200
Subject: [petsc-users] SLEPc example failed ...
In-Reply-To: 
References: 
Message-ID: <42827D94-B36B-4268-9CE7-425F45B2CF34@dsic.upv.es>

There is not enough information to give an answer. Did you modify the example code? Did 'make test' work after the SLEPc installation? Use a debugger to see the exact point where the execution failed.
Jose

On 01/08/2015, at 00:29, Xujun Zhao wrote:

> Hi all,
>
> I ran the EPS example ex9 and it failed. Can anyone help me figure out the problem? Thanks. The following is the output error message:
>
> mcswl156:eps_tutorials xzhao$ ./ex9 -n 10
>
> Brusselator wave model, n=10
>
> ---> my test: VecCreateMPIWithArray is done.
> ---> my test: Shell Matrix is created.
> ---> my test: EPS is set.
> ---> my test: Start to solve the EPS ...
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Signal received
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015
> [0]PETSC ERROR: ./ex9 on a arch-darwin-c-opt named mcswl156.mcs.anl.gov by xzhao Fri Jul 31 17:26:52 2015
> [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0
> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> [unset]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
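For reference, ex9 applies the Brusselator operator through a shell matrix, so a segmentation fault inside EPSSolve() typically means the shell matrix is missing a callback (most often MATOP_MULT) or its user context points at memory that has gone out of scope. The following is only a minimal sketch of the pieces that must be in place before EPSSolve() is called; MyCtx and MatMult_MyShell are hypothetical names, not the actual ex9 code, and error checking is abbreviated.

#include <slepceps.h>

typedef struct { PetscInt n; /* ... problem data ... */ } MyCtx;

/* y = A*x for the shell matrix; EPSSolve() reaches this through MatMult() */
static PetscErrorCode MatMult_MyShell(Mat A, Vec x, Vec y)
{
  MyCtx *ctx;
  PetscFunctionBeginUser;
  MatShellGetContext(A, &ctx);
  /* ... apply the operator to x, store the result in y, using ctx ... */
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  Mat      A;
  EPS      eps;
  MyCtx    ctx;
  PetscInt n = 10;

  SlepcInitialize(&argc, &argv, NULL, NULL);
  ctx.n = n;
  /* ctx must stay alive for as long as the shell matrix is used */
  MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, &ctx, &A);
  /* without this, EPSSolve() ends up calling an operation that was never set */
  MatShellSetOperation(A, MATOP_MULT, (void (*)(void))MatMult_MyShell);

  EPSCreate(PETSC_COMM_WORLD, &eps);
  EPSSetOperators(eps, A, NULL);
  EPSSetProblemType(eps, EPS_NHEP);
  EPSSetFromOptions(eps);
  EPSSolve(eps);

  EPSDestroy(&eps);
  MatDestroy(&A);
  SlepcFinalize();
  return 0;
}

If a debugger, as suggested above, shows the crash inside MatMult or inside the user mult routine, the shell-matrix setup and its context are the first things to check.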
From fdkong.jd at gmail.com Sat Aug 1 16:01:49 2015
From: fdkong.jd at gmail.com (Fande Kong)
Date: Sat, 1 Aug 2015 15:01:49 -0600
Subject: [petsc-users] failed to compile HDF5 on vesta
Message-ID: 

Hi all,

I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.

Thanks,

Fande Kong,
-------------- next part --------------
A non-text attachment was scrubbed...
Name: configure.log
Type: application/octet-stream
Size: 2753937 bytes
Desc: not available

From bsmith at mcs.anl.gov Sat Aug 1 16:15:01 2015
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 1 Aug 2015 16:15:01 -0500
Subject: [petsc-users] failed to compile HDF5 on vesta
In-Reply-To: 
References: 
Message-ID: 

Try with --with-shared-libraries=0. The HDF5 build is having some issue with shared libraries.

Barry

> On Aug 1, 2015, at 4:01 PM, Fande Kong wrote:
>
> Hi all,
>
> I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.
>
> Thanks,
>
> Fande Kong,

From fdkong.jd at gmail.com Sat Aug 1 19:55:08 2015
From: fdkong.jd at gmail.com (Fande Kong)
Date: Sat, 1 Aug 2015 18:55:08 -0600
Subject: [petsc-users] failed to compile HDF5 on vesta
In-Reply-To: 
References: 
Message-ID: 

Hi Barry,

Thanks a lot. I could compile hdf5, but failed to compile fblaslapack. The log file is attached.

Fande Kong,

On Sat, Aug 1, 2015 at 3:15 PM, Barry Smith wrote:
>
> Try with --with-shared-libraries=0. The HDF5 build is having some issue with shared libraries.
>
> Barry
>
> > On Aug 1, 2015, at 4:01 PM, Fande Kong wrote:
> >
> > Hi all,
> >
> > I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.
> >
> > Thanks,
> >
> > Fande Kong,

-------------- next part --------------
A non-text attachment was scrubbed...
Name: configure.log
Type: application/octet-stream
Size: 5576806 bytes
Desc: not available

From bsmith at mcs.anl.gov Sun Aug 2 11:30:50 2015
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sun, 2 Aug 2015 11:30:50 -0500
Subject: [petsc-users] failed to compile HDF5 on vesta
In-Reply-To: 
References: 
Message-ID: <78D07896-DA83-4572-B0AE-D022AE673D20@mcs.anl.gov>

You shouldn't need --download-fblaslapack; almost every system has it already installed.

Barry

Looks like the Fortran compiler on this system is rejecting the "old" Fortran in the blas/lapack code.

> On Aug 1, 2015, at 7:55 PM, Fande Kong wrote:
>
> Hi Barry,
>
> Thanks a lot. I could compile hdf5, but failed to compile fblaslapack. The log file is attached.
>
> Fande Kong,
>
> On Sat, Aug 1, 2015 at 3:15 PM, Barry Smith wrote:
>
> Try with --with-shared-libraries=0. The HDF5 build is having some issue with shared libraries.
>
> Barry
>
> > On Aug 1, 2015, at 4:01 PM, Fande Kong wrote:
> >
> > Hi all,
> >
> > I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.
> >
> > Thanks,
> >
> > Fande Kong,

From fdkong.jd at gmail.com Sun Aug 2 22:22:34 2015
From: fdkong.jd at gmail.com (Fande Kong)
Date: Sun, 2 Aug 2015 22:22:34 -0500
Subject: [petsc-users] failed to compile HDF5 on vesta
In-Reply-To: <78D07896-DA83-4572-B0AE-D022AE673D20@mcs.anl.gov>
References: <78D07896-DA83-4572-B0AE-D022AE673D20@mcs.anl.gov>
Message-ID: 

Hi Barry,

Looks like they did not have fblaslapack installed. I could compile fblaslapack when I switched the compiler from XL to gcc.

Thanks,

Fande Kong,

On Sun, Aug 2, 2015 at 11:30 AM, Barry Smith wrote:
>
> You shouldn't need --download-fblaslapack; almost every system has it already installed.
>
> Barry
>
> Looks like the Fortran compiler on this system is rejecting the "old" Fortran in the blas/lapack code.
>
> > On Aug 1, 2015, at 7:55 PM, Fande Kong wrote:
> >
> > Hi Barry,
> >
> > Thanks a lot. I could compile hdf5, but failed to compile fblaslapack. The log file is attached.
> >
> > Fande Kong,
> >
> > On Sat, Aug 1, 2015 at 3:15 PM, Barry Smith wrote:
> >
> > Try with --with-shared-libraries=0. The HDF5 build is having some issue with shared libraries.
> >
> > Barry
> >
> > > On Aug 1, 2015, at 4:01 PM, Fande Kong wrote:
> > >
> > > Hi all,
> > >
> > > I want to install petsc on vesta (an IBM Blue Gene at Argonne), but it failed to compile HDF5. The configure log file is attached. Any suggestions would be greatly appreciated.
> > >
> > > Thanks,
> > >
> > > Fande Kong,

From Mahir.Ulker-Kaustell at tyrens.se Mon Aug 3 07:02:11 2015
From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se)
Date: Mon, 3 Aug 2015 12:02:11 +0000
Subject: [petsc-users] SuperLU MPI-problem
In-Reply-To: 
References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se>
Message-ID: 

Hong and Sherry,

I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:

If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c
If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c

Mahir

From: Hong [mailto:hzhang at mcs.anl.gov]
Sent: den 30 juli 2015 02:58
To: Ülker-Kaustell, Mahir
Cc: Xiaoye Li; PETSc users list
Subject: Fwd: [petsc-users] SuperLU MPI-problem

Mahir,

Sherry fixed several bugs in superlu_dist-v4.1.
The current petsc-release interfaces with superlu_dist-v4.0.
We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1?

Here is how to do it:
1. download superlu_dist v4.1
2. remove the existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz'
3. build petsc

Let us know if the issue remains.

Hong

---------- Forwarded message ----------
From: Xiaoye S. Li
Date: Wed, Jul 29, 2015 at 2:24 PM
Subject: Fwd: [petsc-users] SuperLU MPI-problem
To: Hong Zhang

Hong,

I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers me is that he is getting the following error:

Invalid ISPEC at line 484 in file get_perm_c.c

This has nothing to do with my bug fix.

Shall we ask him to try the new version, or try to get his matrix?

Sherry

---------- Forwarded message ----------
From: Mahir.Ulker-Kaustell at tyrens.se
Date: Wed, Jul 22, 2015 at 1:32 PM
Subject: RE: [petsc-users] SuperLU MPI-problem
To: Hong, "Xiaoye S. Li"
Cc: petsc-users

The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general, but certain constraints lead to non-diagonal streaks in the sparsity pattern.

Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm?
If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. 
Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) 
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 
0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve 
(snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 
0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside 
a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: 
MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. 
The matrices are derived from finite elements, so they are sparse.
> The machine I am working on has 128GB of RAM. I have estimated the memory needed at less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here?
>
> Mahir
>
> From: Hong [mailto:hzhang at mcs.anl.gov]
> Sent: den 20 juli 2015 17:39
> To: Ülker-Kaustell, Mahir
> Cc: petsc-users
> Subject: Re: [petsc-users] SuperLU MPI-problem
>
> Mahir:
> Direct solvers consume a large amount of memory. I suggest trying the following:
>
> 1. A sparse iterative solver, if [-omega^2 M + K] is not too ill-conditioned. You may test it using the small matrix.
>
> 2. Incrementally increase your matrix sizes. Try different matrix orderings.
> Do you get the memory crash in the 1st symbolic factorization?
> In your case, the matrix data structure stays the same when omega changes, so you only need to do one matrix symbolic factorization and reuse it.
>
> 3. Use a machine that gives larger memory.
>
> Hong
>
> Dear Petsc-Users,
>
> I am trying to use PETSc to solve a set of linear equations arising from Navier's equation (elastodynamics) in the frequency domain.
> The frequency dependency of the problem requires that the system
>
> [-omega^2 M + K]u = F
>
> where M and K are constant, square, positive definite matrices (mass and stiffness respectively), is solved for each frequency omega of interest.
> K is a complex matrix, including material damping.
>
> I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full-scale (on the order of 10^6 degrees of freedom) problem.
>
> The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory.
>
> I would guess that similar problems have occurred on this mailing list, so I am hoping that someone can push me in the right direction?
>
> Mahir

From nicolas.pozin at inria.fr Mon Aug 3 09:13:08 2015
From: nicolas.pozin at inria.fr (Nicolas Pozin)
Date: Mon, 3 Aug 2015 16:13:08 +0200 (CEST)
Subject: [petsc-users] problem with MatShellGetContext
In-Reply-To: <1219151237.6415983.1438609951363.JavaMail.zimbra@inria.fr>
Message-ID: <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr>

Hello everyone,

I am having trouble using MatShellGetContext. Here's the simple test I did:

typedef struct{
  PetscInt testValue;
  Mat matShell;
  KSP currentCtx;
} AppCtx;

AppCtx context1;
KSPCreate(PETSC_COMM_WORLD,&context1.currentCtx);
context1.testValue=18;
MatCreateShell(PETSC_COMM_WORLD, nl, nl, nL, nL, context1.currentCtx, &context1.matShell);

AppCtx context2;
MatShellGetContext(context1.matShell, (void*)&context2);

It happens that context2.testValue is different from 18.
Would anyone have a clue about what I am missing?

thanks a lot,
Nicolas
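MatShellGetContext() returns the pointer that was handed to MatCreateShell(); it does not copy anything into a caller-supplied struct. In the snippet above, the context stored in the shell matrix is the KSP (context1.currentCtx), so reading an AppCtx back out of it cannot give testValue = 18. A minimal sketch of the usual pattern, assuming PETSc 3.5/3.6-era signatures and with error checking omitted, is:

#include <petscksp.h>

typedef struct {
  PetscInt testValue;
  Mat      matShell;
  KSP      currentCtx;
} AppCtx;

int main(int argc, char **argv)
{
  AppCtx    context1;
  AppCtx   *retrieved = NULL;
  PetscInt  nL = 10;            /* small global size, just for the test */

  PetscInitialize(&argc, &argv, NULL, NULL);
  KSPCreate(PETSC_COMM_WORLD, &context1.currentCtx);
  context1.testValue = 18;

  /* store the ADDRESS of the whole struct as the shell context */
  MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nL, nL,
                 &context1, &context1.matShell);

  /* MatShellGetContext hands back that same address; nothing is copied */
  MatShellGetContext(context1.matShell, &retrieved);
  PetscPrintf(PETSC_COMM_WORLD, "testValue = %d\n", (int)retrieved->testValue); /* prints 18 */

  MatDestroy(&context1.matShell);
  KSPDestroy(&context1.currentCtx);
  PetscFinalize();
  return 0;
}

If the shell really is meant to carry the KSP as its context, as in the original snippet, then the retrieval has to match: declare a KSP and call MatShellGetContext(context1.matShell, &ksp). The type read out must correspond to the pointer that was put in.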
From xsli at lbl.gov Mon Aug 3 09:17:54 2015
From: xsli at lbl.gov (Xiaoye S. Li)
Date: Mon, 3 Aug 2015 07:17:54 -0700
Subject: [petsc-users] SuperLU MPI-problem
In-Reply-To: 
References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se>
Message-ID: 

I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since the matrix is centralized).

That's why you get the following error:
Invalid ISPEC at line 484 in file get_perm_c.c

You need to use the distributed matrix input interface pzgssvx() (without ABglobal).

Sherry

On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se <Mahir.Ulker-Kaustell at tyrens.se> wrote:

> Hong and Sherry,
>
> I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:
>
> If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c
> If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c
>
> Mahir
>
> From: Hong [mailto:hzhang at mcs.anl.gov]
> Sent: den 30 juli 2015 02:58
> To: Ülker-Kaustell, Mahir
> Cc: Xiaoye Li; PETSc users list
> Subject: Fwd: [petsc-users] SuperLU MPI-problem
>
> Mahir,
>
> Sherry fixed several bugs in superlu_dist-v4.1.
> The current petsc-release interfaces with superlu_dist-v4.0.
> We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1?
>
> Here is how to do it:
> 1. download superlu_dist v4.1
> 2. remove the existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz'
> 3. build petsc
>
> Let us know if the issue remains.
>
> Hong
>
> ---------- Forwarded message ----------
> From: Xiaoye S. Li
> Date: Wed, Jul 29, 2015 at 2:24 PM
> Subject: Fwd: [petsc-users] SuperLU MPI-problem
> To: Hong Zhang
>
> Hong,
>
> I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers me is that he is getting the following error:
>
> Invalid ISPEC at line 484 in file get_perm_c.c
>
> This has nothing to do with my bug fix.
>
> Shall we ask him to try the new version, or try to get his matrix?
>
> Sherry
>
> ---------- Forwarded message ----------
> From: Mahir.Ulker-Kaustell at tyrens.se <Mahir.Ulker-Kaustell at tyrens.se>
> Date: Wed, Jul 22, 2015 at 1:32 PM
> Subject: RE: [petsc-users] SuperLU MPI-problem
> To: Hong, "Xiaoye S. Li"
> Cc: petsc-users
>
> The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general, but certain constraints lead to non-diagonal streaks in the sparsity pattern.
>
> Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm?
>
> If I use -mat_superlu_dist_parsymbfact the program crashes with
>
> Invalid ISPEC at line 484 in file get_perm_c.c
>
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. 
> > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. > > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > 
==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > 
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: 
MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 
0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: 
MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: 
superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 
0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. 
> > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) > > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. 
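A minimal sketch of the reuse Hong suggests above, assuming the names K, M, F and u for the already assembled stiffness matrix, mass matrix, load vector and solution vector, and assuming the nonzero pattern of M is contained in that of K (usually the case when both come from the same finite element mesh); the frequency sweep values are placeholders:

   Mat            A;              /* A = K - omega^2*M, pattern fixed      */
   KSP            ksp;
   PC             pc;
   PetscErrorCode ierr;
   PetscReal      omega, omega0 = 1.0, domega = 1.0;
   PetscInt       i, nfreq = 100;

   ierr = MatDuplicate(K, MAT_COPY_VALUES, &A);CHKERRQ(ierr);

   ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
   ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
   ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
   ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
   ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
   ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

   for (i = 0; i < nfreq; i++) {
     omega = omega0 + i*domega;
     /* Rebuild A = K - omega^2*M without touching its nonzero pattern */
     ierr = MatCopy(K, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
     ierr = MatAXPY(A, -omega*omega, M, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr);
     /* The values of A changed but the pattern did not, so PETSc should
        only redo the numerical factorization here; the ordering and
        symbolic factorization from the first solve are reused. */
     ierr = KSPSolve(ksp, F, u);CHKERRQ(ierr);
     /* ... post-process u for this omega ... */
   }

The solver can still be selected at runtime with the options already used in this thread, e.g. -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist.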
> > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 3 09:33:16 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Aug 2015 09:33:16 -0500 Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr> References: <1219151237.6415983.1438609951363.JavaMail.zimbra@inria.fr> <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr> Message-ID: On Mon, Aug 3, 2015 at 9:13 AM, Nicolas Pozin wrote: > Hello everyone, > > I am having trouble using MatShellGetContext. > > Here's the simple test I did : > > > typedef struct{ > PetscInt testValue; > Mat matShell; > KSP currentCtx; > } AppCtx; > > > AppCtx context1; > KSPCreate(PETSC_COMM_WORLD,&context1.currentCtx); > context1.testValue=18; > MatCreateShell(PETSC_COMM_WORLD, nl, nl, nL, nL, context1.currentCtx, > &context1.matShell); > It looks like you want ''&context1" for the context argument. You are just passing the KSP pointer. > AppCtx context2; > MatShellGetContext(context1.matShell, (void*)&context2); > Here you better declare AppCtx *context2; and access it as context2->testValue; Matt > It happens that context2.testValue is different from 18. > > Any would have a clue on what I miss? > > thanks a lot, > Nicolas > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pozin at inria.fr Mon Aug 3 09:42:00 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Mon, 3 Aug 2015 16:42:00 +0200 (CEST) Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: References: <1219151237.6415983.1438609951363.JavaMail.zimbra@inria.fr> <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr> Message-ID: <748729499.6423014.1438612920482.JavaMail.zimbra@inria.fr> ----- Mail original ----- > De: "Matthew Knepley" > ?: "Nicolas Pozin" > Cc: "PETSc" > Envoy?: Lundi 3 Ao?t 2015 16:33:16 > Objet: Re: [petsc-users] problem with MatShellGetContext > On Mon, Aug 3, 2015 at 9:13 AM, Nicolas Pozin < nicolas.pozin at inria.fr > > wrote: > > Hello everyone, > > > I am having trouble using MatShellGetContext. > > > Here's the simple test I did : > > > typedef struct{ > > > PetscInt testValue; > > > Mat matShell; > > > KSP currentCtx; > > > } AppCtx; > > > AppCtx context1; > > > KSPCreate(PETSC_COMM_WORLD,&context1.currentCtx); > > > context1.testValue=18; > > > MatCreateShell(PETSC_COMM_WORLD, nl, nl, nL, nL, context1.currentCtx, > > &context1.matShell); > > It looks like you want ''&context1" for the context argument. You are just > passing the KSP pointer. > > AppCtx context2; > > > MatShellGetContext(context1.matShell, (void*)&context2); > > Here you better declare > AppCtx *context2; > and access it as > context2->testValue; Thanks, but It doesn't work better unfortunately > Matt > > It happens that context2.testValue is different from 18. > > > Any would have a clue on what I miss? > > > thanks a lot, > > > Nicolas > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. 
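For completeness, a small self-contained version of the pattern Matt describes, keeping the struct and variable names from the snippet above (the global size nL and the PetscPrintf at the end are just there to make it runnable):

   #include <petscksp.h>

   typedef struct {
     PetscInt testValue;
     Mat      matShell;
     KSP      currentCtx;
   } AppCtx;

   int main(int argc, char **argv)
   {
     AppCtx         context1, *context2 = NULL;
     PetscErrorCode ierr;
     PetscInt       nL = 10;   /* placeholder global size */

     ierr = PetscInitialize(&argc, &argv, NULL, NULL);CHKERRQ(ierr);

     context1.testValue = 18;
     ierr = KSPCreate(PETSC_COMM_WORLD, &context1.currentCtx);CHKERRQ(ierr);

     /* Pass the address of the whole struct as the shell context,
        not one of its members. */
     ierr = MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                           nL, nL, (void*)&context1,
                           &context1.matShell);CHKERRQ(ierr);

     /* MatShellGetContext hands back the stored pointer, so the receiving
        variable is a pointer to AppCtx, not a copy of the struct. */
     ierr = MatShellGetContext(context1.matShell,
                               (void*)&context2);CHKERRQ(ierr);
     ierr = PetscPrintf(PETSC_COMM_WORLD, "testValue = %D\n",
                        context2->testValue);CHKERRQ(ierr);

     ierr = MatDestroy(&context1.matShell);CHKERRQ(ierr);
     ierr = KSPDestroy(&context1.currentCtx);CHKERRQ(ierr);
     ierr = PetscFinalize();
     return 0;
   }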
> -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Aug 3 09:46:04 2015 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 3 Aug 2015 09:46:04 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- ... PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote: > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > That's why you get the following error: > Invalid ISPEC at line 484 in file get_perm_c.c > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > >> Hong and Sherry, >> >> >> >> I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: >> >> >> >> If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid >> ISPEC at line 484 in file get_perm_c.c >> >> If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the >> program crashes with: Calloc fails for SPA dense[]. at line 438 in file >> zdistribute.c >> >> >> >> Mahir >> >> >> >> *From:* Hong [mailto:hzhang at mcs.anl.gov] >> *Sent:* den 30 juli 2015 02:58 >> *To:* ?lker-Kaustell, Mahir >> *Cc:* Xiaoye Li; PETSc users list >> >> *Subject:* Fwd: [petsc-users] SuperLU MPI-problem >> >> >> >> Mahir, >> >> >> >> Sherry fixed several bugs in superlu_dist-v4.1. >> >> The current petsc-release interfaces with superlu_dist-v4.0. >> >> We do not know whether the reported issue (attached below) has been >> resolved or not. If not, can you test it with the latest superlu_dist-v4.1? >> >> >> >> Here is how to do it: >> >> 1. download superlu_dist v4.1 >> >> 2. remove existing PETSC_ARCH directory, then configure petsc with >> >> '--download-superlu_dist=superlu_dist_4.1.tar.gz' >> >> 3. build petsc >> >> >> >> Let us know if the issue remains. >> >> >> >> Hong >> >> >> >> >> >> ---------- Forwarded message ---------- >> From: *Xiaoye S. Li* >> Date: Wed, Jul 29, 2015 at 2:24 PM >> Subject: Fwd: [petsc-users] SuperLU MPI-problem >> To: Hong Zhang >> >> Hong, >> >> I am cleaning the mailbox, and saw this unresolved issue. 
I am not sure >> whether the new fix to parallel symbolic factorization solves the problem. >> What bothers be is that he is getting the following error: >> >> Invalid ISPEC at line 484 in file get_perm_c.c >> >> This has nothing to do with my bug fix. >> >> ? Shall we ask him to try the new version, or try to get him matrix? >> >> Sherry >> ? >> >> >> >> ---------- Forwarded message ---------- >> From: *Mahir.Ulker-Kaustell at tyrens.se * < >> Mahir.Ulker-Kaustell at tyrens.se> >> Date: Wed, Jul 22, 2015 at 1:32 PM >> Subject: RE: [petsc-users] SuperLU MPI-problem >> To: Hong , "Xiaoye S. Li" >> Cc: petsc-users >> >> The 1000 was just a conservative guess. The number of non-zeros per row >> is in the tens in general but certain constraints lead to non-diagonal >> streaks in the sparsity-pattern. >> >> Is it the reordering of the matrix that is killing me here? How can I set >> options.ColPerm? >> >> >> >> If i use -mat_superlu_dist_parsymbfact the program crashes with >> >> >> >> Invalid ISPEC at line 484 in file get_perm_c.c >> >> ------------------------------------------------------- >> >> Primary job terminated normally, but 1 process returned >> >> a non-zero exit code.. Per user-direction, the job has been aborted. >> >> ------------------------------------------------------- >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >> batch system) has told this process to end >> >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS >> X to find memory corruption errors >> >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> >> [0]PETSC ERROR: to get more information on the crash. >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Signal received >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> >> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 >> >> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by >> muk Wed Jul 22 21:59:23 2015 >> >> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 >> PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 >> --with-scalar-type=complex --download-fblaspack --download-mpich >> --download-scalapack --download-mumps --download-metis --download-parmetis >> --download-superlu --download-superlu_dist --download-fftw >> >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> [unset]: aborting job: >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> >> >> If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat >> later) with >> >> >> >> Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c >> >> col block 3006 ------------------------------------------------------- >> >> Primary job terminated normally, but 1 process returned >> >> a non-zero exit code.. Per user-direction, the job has been aborted. 
>> >> ------------------------------------------------------- >> >> col block 1924 [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >> batch system) has told this process to end >> >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS >> X to find memory corruption errors >> >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> >> [0]PETSC ERROR: to get more information on the crash. >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Signal received >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> >> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 >> >> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by >> muk Wed Jul 22 21:59:58 2015 >> >> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 >> PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 >> --with-scalar-type=complex --download-fblaspack --download-mpich >> --download-scalapack --download-mumps --download-metis --download-parmetis >> --download-superlu --download-superlu_dist --download-fftw >> >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> [unset]: aborting job: >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> >> >> >> >> /Mahir >> >> >> >> >> >> *From:* Hong [mailto:hzhang at mcs.anl.gov] >> >> *Sent:* den 22 juli 2015 21:34 >> *To:* Xiaoye S. Li >> *Cc:* ?lker-Kaustell, Mahir; petsc-users >> >> >> *Subject:* Re: [petsc-users] SuperLU MPI-problem >> >> >> >> In Petsc/superlu_dist interface, we set default >> >> >> >> options.ParSymbFact = NO; >> >> >> >> When user raises the flag "-mat_superlu_dist_parsymbfact", >> >> we set >> >> >> >> options.ParSymbFact = YES; >> >> options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for >> ParSymbFact regardless of user ordering setting */ >> >> >> >> We do not change anything else. >> >> >> >> Hong >> >> >> >> On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: >> >> I am trying to understand your problem. You said you are solving Naviers >> equation (elastodynamics) in the frequency domain, using finite element >> discretization. I wonder why you have about 1000 nonzeros per row. >> Usually in many PDE discretized matrices, the number of nonzeros per row is >> in the tens (even for 3D problems), not in the thousands. So, your matrix >> is quite a bit denser than many sparse matrices we deal with. >> >> >> >> The number of nonzeros in the L and U factors is much more than that in >> original matrix A -- typically we see 10-20x fill ratio for 2D, or can be >> as bad as 50-100x fill ratio for 3D. But since your matrix starts much >> denser (i.e., the underlying graph has many connections), it may not lend >> to any good ordering strategy to preserve sparsity of L and U; that is, the >> L and U fill ratio may be large. 
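To put these fill ratios against the figures earlier in the thread (a rough, back-of-the-envelope estimate only, assuming on the order of 50 nonzeros per row and complex double precision, i.e. 16 bytes per stored value): a matrix with 10^6 rows and ~50 nonzeros per row has about 5x10^7 entries, roughly 1 GB including column indices. A 3D-like fill ratio of 50x puts L+U at about 2.5x10^9 entries, i.e. around 40 GB for the numerical values alone, before any integer structure or MPI buffers are counted. So an estimate of 20 GB based on the size of A can be optimistic by a factor of two or more, even on a 128 GB machine.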
>> >> >> >> I don't understand why you get the following error when you use >> >> ?-mat_superlu_dist_parsymbfact?. >> >> >> >> Invalid ISPEC at line 484 in file get_perm_c.c >> >> >> >> Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. >> >> >> >> ?Hong -- in order to use parallel symbolic factorization, is it >> sufficient to specify only >> >> ?-mat_superlu_dist_parsymbfact? >> >> ? ? (the default is to use sequential symbolic factorization.) >> >> >> >> >> >> Sherry >> >> >> >> On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < >> Mahir.Ulker-Kaustell at tyrens.se> wrote: >> >> Thank you for your reply. >> >> As you have probably figured out already, I am not a computational >> scientist. I am a researcher in civil engineering (railways for high-speed >> traffic), trying to produce some, from my perspective, fairly large >> parametric studies based on finite element discretizations. >> >> I am working in a Windows-environment and have installed PETSc through >> Cygwin. >> Apparently, there is no support for Valgrind in this OS. >> >> If I have understood you correct, the memory issues are related to >> superLU and given my background, there is not much I can do. Is this >> correct? >> >> >> Best regards, >> Mahir >> >> ______________________________________________ >> Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, >> Tyr?ns AB >> 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se >> ______________________________________________ >> >> >> -----Original Message----- >> From: Barry Smith [mailto:bsmith at mcs.anl.gov] >> Sent: den 22 juli 2015 02:57 >> To: ?lker-Kaustell, Mahir >> Cc: Xiaoye S. Li; petsc-users >> Subject: Re: [petsc-users] SuperLU MPI-problem >> >> >> Run the program under valgrind >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I >> use the option -mat_superlu_dist_parsymbfact I get many scary memory >> problems some involving for example ddist_psymbtonum >> (pdsymbfact_distdata.c:1332) >> >> Note that I consider it unacceptable for running programs to EVER use >> uninitialized values; until these are all cleaned up I won't trust any runs >> like this. 
>> >> Barry >> >> >> >> >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) >> ==42050== by 0x101557F60: get_perm_c_parmetis >> (get_perm_c_parmetis.c:285) >> ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10155751B: get_perm_c_parmetis >> (get_perm_c_parmetis.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) >> ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) >> ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) >> ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) >> ==42050== by 0x101557F60: get_perm_c_parmetis >> (get_perm_c_parmetis.c:285) >> ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10155751B: get_perm_c_parmetis >> (get_perm_c_parmetis.c:96) >> ==42050== >> ==42049== Syscall param writev(vector[...]) points to uninitialised >> byte(s) >> ==42049== at 0x102DA1C3A: writev (in >> /usr/lib/system/libsystem_kernel.dylib) >> ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) >> ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) >> ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) >> ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) >> ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) >> ==42049== by 0x10277656E: MPI_Isend (isend.c:125) >> ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) >> ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) >> ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) >> ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) >> ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) >> ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42049== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== Syscall param writev(vector[...]) points to uninitialised >> byte(s) >> ==42049== by 
0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size >> 752,720 alloc'd >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) >> ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) >> ==42048== at 0x102DA1C3A: writev (in >> /usr/lib/system/libsystem_kernel.dylib) >> ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) >> ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) >> ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) >> ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) >> ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) >> ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) >> ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42049== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) >> ==42048== by 0x10277656E: MPI_Isend (isend.c:125) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) >> ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) >> ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) >> ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== Uninitialised value was created by a heap allocation >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) >> ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42048== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) >> ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) >> ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) >> ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42049== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== Address 0x10597a860 is 
1,408 bytes inside a block of size >> 752,720 alloc'd >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) >> ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) >> ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) >> ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) >> ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42048== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== Uninitialised value was created by a heap allocation >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) >> ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) >> ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) >> ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) >> ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) >> ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) >> ==42048== by 0x101557CFC: get_perm_c_parmetis >> (get_perm_c_parmetis.c:241) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== >> ==42048== Syscall param write(buf) points to uninitialised byte(s) >> ==42048== at 0x102DA1C22: write (in >> /usr/lib/system/libsystem_kernel.dylib) >> ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) >> ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) >> ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:257) >> ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) >> ==42048== by 0x10277A1FA: MPI_Send (send.c:127) >> ==42048== by 0x10155802F: get_perm_c_parmetis >> (get_perm_c_parmetis.c:299) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== Address 0x104810704 is on thread 1's stack >> ==42048== in 
frame #3, created by MPIDI_CH3_EagerContigShortSend >> (ch3u_eager.c:218) >> ==42048== Uninitialised value was created by a heap allocation >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42048== by 0x101557AB9: get_perm_c_parmetis >> (get_perm_c_parmetis.c:185) >> ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) 
>> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) >> ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) >> ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) >> ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) >> ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) >> ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a stack allocation >> ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) >> ==42050== >> ==42050== Syscall param writev(vector[...]) points to uninitialised >> byte(s) >> ==42050== at 0x102DA1C3A: writev (in >> /usr/lib/system/libsystem_kernel.dylib) >> ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) >> ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) >> ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) >> ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) >> ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) >> ==42050== by 0x10277656E: MPI_Isend (isend.c:125) >> ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) >> ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size >> 131,072 alloc'd >> ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) >> ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a heap allocation >> ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) >> ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== >> ==42048== Conditional jump or move depends on uninitialised value(s) >> ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) >> ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== Uninitialised value was created by a heap allocation >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> 
==42048== >> ==42049== Conditional jump or move depends on uninitialised value(s) >> ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) >> ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== Uninitialised value was created by a heap allocation >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== >> ==42048== Conditional jump or move depends on uninitialised value(s) >> ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) >> ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== Conditional jump or move depends on uninitialised value(s) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== Uninitialised value was created by a heap allocation >> ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) >> ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42048== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 
0x1011C49B7: SNESSolve (snes.c:3906) >> ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== Uninitialised value was created by a heap allocation >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== >> ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a heap allocation >> ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) >> ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== >> >> >> > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: >> > >> > Ok. So I have been creating the full factorization on each process. >> That gives me some hope! >> > >> > I followed your suggestion and tried to use the runtime option >> ?-mat_superlu_dist_parsymbfact?. >> > However, now the program crashes with: >> > >> > Invalid ISPEC at line 484 in file get_perm_c.c >> > >> > And so on? >> > >> > From the SuperLU manual; I should give the option either YES or NO, >> however -mat_superlu_dist_parsymbfact YES makes the program crash in the >> same way as above. >> > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the >> PETSc documentation >> > >> > Mahir >> > >> > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, >> Tyr?ns AB >> > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se >> > >> > From: Xiaoye S. 
Li [mailto:xsli at lbl.gov] >> > Sent: den 20 juli 2015 18:12 >> > To: ?lker-Kaustell, Mahir >> > Cc: Hong; petsc-users >> > Subject: Re: [petsc-users] SuperLU MPI-problem >> > >> > The default SuperLU_DIST setting is to serial symbolic factorization. >> Therefore, what matters is how much memory do you have per MPI task? >> > >> > The code failed to malloc memory during redistribution of matrix A to >> {L\U} data struction (using result of serial symbolic factorization.) >> > >> > You can use parallel symbolic factorization, by runtime option: >> '-mat_superlu_dist_parsymbfact' >> > >> > Sherry Li >> > >> > >> > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < >> Mahir.Ulker-Kaustell at tyrens.se> wrote: >> > Hong: >> > >> > Previous experiences with this equation have shown that it is very >> difficult to solve it iteratively. Hence the use of a direct solver. >> > >> > The large test problem I am trying to solve has slightly less than 10^6 >> degrees of freedom. The matrices are derived from finite elements so they >> are sparse. >> > The machine I am working on has 128GB ram. I have estimated the memory >> needed to less than 20GB, so if the solver needs twice or even three times >> as much, it should still work well. Or have I completely misunderstood >> something here? >> > >> > Mahir >> > >> > >> > >> > From: Hong [mailto:hzhang at mcs.anl.gov] >> > Sent: den 20 juli 2015 17:39 >> > To: ?lker-Kaustell, Mahir >> > Cc: petsc-users >> > Subject: Re: [petsc-users] SuperLU MPI-problem >> > >> > Mahir: >> > Direct solvers consume large amount of memory. Suggest to try >> followings: >> > >> > 1. A sparse iterative solver if [-omega^2M + K] is not too >> ill-conditioned. You may test it using the small matrix. >> > >> > 2. Incrementally increase your matrix sizes. Try different matrix >> orderings. >> > Do you get memory crash in the 1st symbolic factorization? >> > In your case, matrix data structure stays same when omega changes, so >> you only need to do one matrix symbolic factorization and reuse it. >> > >> > 3. Use a machine that gives larger memory. >> > >> > Hong >> > >> > Dear Petsc-Users, >> > >> > I am trying to use PETSc to solve a set of linear equations arising >> from Naviers equation (elastodynamics) in the frequency domain. >> > The frequency dependency of the problem requires that the system >> > >> > [-omega^2M + K]u = F >> > >> > where M and K are constant, square, positive definite matrices (mass >> and stiffness respectively) is solved for each frequency omega of interest. >> > K is a complex matrix, including material damping. >> > >> > I have written a PETSc program which solves this problem for a small >> (1000 degrees of freedom) test problem on one or several processors, but it >> keeps crashing when I try it on my full scale (in the order of 10^6 degrees >> of freedom) problem. >> > >> > The program crashes at KSPSetUp() and from what I can see in the error >> messages, it appears as if it consumes too much memory. >> > >> > I would guess that similar problems have occurred in this mail-list, so >> I am hoping that someone can push me in the right direction? >> > >> > Mahir >> >> >> >> >> >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From knepley at gmail.com Mon Aug 3 09:47:13 2015
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 3 Aug 2015 09:47:13 -0500
Subject: [petsc-users] problem with MatShellGetContext
In-Reply-To: <748729499.6423014.1438612920482.JavaMail.zimbra@inria.fr>
References: <1219151237.6415983.1438609951363.JavaMail.zimbra@inria.fr> <115885278.6419010.1438611188443.JavaMail.zimbra@inria.fr> <748729499.6423014.1438612920482.JavaMail.zimbra@inria.fr>
Message-ID:

On Mon, Aug 3, 2015 at 9:42 AM, Nicolas Pozin wrote:

> ------------------------------
>
> From: "Matthew Knepley"
> To: "Nicolas Pozin"
> Cc: "PETSc"
> Sent: Monday, 3 August 2015 16:33:16
> Subject: Re: [petsc-users] problem with MatShellGetContext
>
> On Mon, Aug 3, 2015 at 9:13 AM, Nicolas Pozin wrote:
>
>> Hello everyone,
>>
>> I am having trouble using MatShellGetContext.
>>
>> Here's the simple test I did:
>>
>> typedef struct{
>>   PetscInt testValue;
>>   Mat matShell;
>>   KSP currentCtx;
>> } AppCtx;
>>
>> AppCtx context1;
>> KSPCreate(PETSC_COMM_WORLD,&context1.currentCtx);
>> context1.testValue=18;
>> MatCreateShell(PETSC_COMM_WORLD, nl, nl, nL, nL, context1.currentCtx,
>> &context1.matShell);
>
> It looks like you want "&context1" for the context argument. You are just
> passing the KSP pointer.
>
>> AppCtx context2;
>> MatShellGetContext(context1.matShell, (void*)&context2);
>
> Here you better declare
>
>   AppCtx *context2;
>
> and access it as
>
>   context2->testValue;
>
> Thanks, but it doesn't work better unfortunately.
>

This tells me NOTHING. How can I help you with this information? This has
nothing to do with PETSc. It is simple C semantics. Write a test code and
send it in.

   Matt

> Matt
>
>> It happens that context2.testValue is different from 18.
>>
>> Would anyone have a clue about what I am missing?
>>
>> thanks a lot,
>> Nicolas
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Mahir.Ulker-Kaustell at tyrens.se Mon Aug 3 10:34:46 2015
From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se)
Date: Mon, 3 Aug 2015 15:34:46 +0000
Subject: [petsc-users] SuperLU MPI-problem
In-Reply-To:
References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se>
Message-ID:

Sherry and Hong,

If I use: -mat_superlu_dist_parsymbfact, I get:

Invalid ISPEC at line 484 in file get_perm_c.c

regardless of what I give to -mat_superlu_dist_matinput. I have not used -parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs.
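(For readers following the runtime options being debated here: the mpiexec command lines above and below drive the KSP/PC setup sketched next. This is a minimal illustrative sketch, not code taken from any message in this thread; the function names follow the PETSc 3.5/3.6-era C API used by the posters, and the routine name SolveWithSuperLUDist is invented for the example.)

#include <petscksp.h>

/* Sketch of -ksp_type preonly -pc_type lu
   -pc_factor_mat_solver_package superlu_dist expressed in code. */
PetscErrorCode SolveWithSuperLUDist(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);   /* no Krylov iterations, just the direct solve */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);        /* -ksp_xxx and -pc_xxx overrides */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);           /* LU factorization is built in PCSetUp here */
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  return 0;
}

The -mat_superlu_dist_matinput and -mat_superlu_dist_parsymbfact options are only queried once the factorization is set up, which happens inside PCSetUp/KSPSolve; that is why the failures reported in this thread surface under KSPSetUp, PCSetUp and MatLUFactorNumeric in the stack traces.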
If I use 2 processors, the program runs if I use -mat_superlu_dist_parsymbfact=1:

mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1

and

mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact=1

I guess this corresponds to not setting parsymbfact at all. Both programs consume the same amount of RAM and seem to run well.

If I use (what seems to be correct):

mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact

the result is:

Invalid ISPEC at line 484 in file get_perm_c.c

Mahir

From: Hong [mailto:hzhang at mcs.anl.gov]
Sent: 3 August 2015 16:46
To: Xiaoye S. Li
Cc: Ülker-Kaustell, Mahir; Hong; PETSc users list
Subject: Re: [petsc-users] SuperLU MPI-problem

Mahir,

Sherry found the culprit. I can reproduce it:

petsc/src/ksp/ksp/examples/tutorials
mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact

Invalid ISPEC at line 484 in file get_perm_c.c
Invalid ISPEC at line 484 in file get_perm_c.c
-------------------------------------------------------
Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
...

The PETSc-superlu_dist interface sets matinput=DISTRIBUTED as the default when using more than one process. Did you either use '-mat_superlu_dist_parsymbfact' for a sequential run or set matinput=GLOBAL for a parallel run? I'll add an error flag for these use cases.

Hong

On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote:

I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since the matrix is centralized). That's why you get the following error:

Invalid ISPEC at line 484 in file get_perm_c.c

You need to use the distributed matrix input interface pzgssvx() (without ABglobal).

Sherry

On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se wrote:

Hong and Sherry,

I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:

If I use -mat_superlu_dist_parsymbfact, the program crashes with:
Invalid ISPEC at line 484 in file get_perm_c.c

If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with:
Calloc fails for SPA dense[]. at line 438 in file zdistribute.c

Mahir

From: Hong [mailto:hzhang at mcs.anl.gov]
Sent: 30 July 2015 02:58
To: Ülker-Kaustell, Mahir
Cc: Xiaoye Li; PETSc users list
Subject: Fwd: [petsc-users] SuperLU MPI-problem

Mahir,

Sherry fixed several bugs in superlu_dist-v4.1. The current petsc release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1?

Here is how to do it:
1. download superlu_dist v4.1
2. remove the existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz'
3. build petsc

Let us know if the issue remains.

Hong

---------- Forwarded message ----------
From: Xiaoye S.
Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) 
Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to 
uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: 
MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) 
==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== 
Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) 
==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== 
by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. 
Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From knepley at gmail.com  Mon Aug  3 10:39:47 2015
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 3 Aug 2015 10:39:47 -0500
Subject: [petsc-users] SuperLU MPI-problem
In-Reply-To: 
References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se>
 <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se>
 <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov>
 <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se>
 <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se>
Message-ID: 

On Mon, Aug 3, 2015 at 10:34 AM, Mahir.Ulker-Kaustell at tyrens.se <
Mahir.Ulker-Kaustell at tyrens.se> wrote:

> Sherry and Hong,
>
> If I use: -mat_superlu_dist_parsymbfact, I get:
> Invalid ISPEC at line 484 in file get_perm_c.c
> regardless of what I give to -mat_superlu_dist_matinput
>
> I have not used -parsymbfact in sequential runs or set matinput=GLOBAL for
> parallel runs.
>
> If I use 2 processors, the program runs if I use
> *-mat_superlu_dist_parsymbfact=1*:
>

Do not use "=1" for any PETSc option. This is improper syntax. It will
ignore that option. You use "-option 1" since all option arguments are
separated by a space, not an =.

   Matt

> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu
> -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput
> GLOBAL -mat_superlu_dist_parsymbfact=1
>
> and
>
> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu
> -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput
> DISTRIBUTED -mat_superlu_dist_parsymbfact=1
>
> I guess this corresponds to not setting parsymbfact at all. Both programs
> consume the same amount of RAM and seem to run well.
>
> If I use (what seems to be correct):
>
> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu
> -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput
> DISTRIBUTED -mat_superlu_dist_parsymbfact
>
> the result is: Invalid ISPEC at line 484 in file get_perm_c.c
>
> Mahir
>
> *From:* Hong [mailto:hzhang at mcs.anl.gov]
> *Sent:* den 3 augusti 2015 16:46
> *To:* Xiaoye S. Li
> *Cc:* Ülker-Kaustell, Mahir; Hong; PETSc users list
> *Subject:* Re: [petsc-users] SuperLU MPI-problem
>
> Mahir,
>
> Sherry found the culprit. I can reproduce it:
> petsc/src/ksp/ksp/examples/tutorials
> mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist
> -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact
>
> Invalid ISPEC at line 484 in file get_perm_c.c
> Invalid ISPEC at line 484 in file get_perm_c.c
> -------------------------------------------------------
> Primary job terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> ...
>
> PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when
> using more than one processes.
> Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or
> set matinput=GLOBAL for parallel run?
>
> I'll add an error flag for these use cases.
>
> Hong
>
> On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote:
>
> I think I know the problem. Since zdistribute.c is called, I guess you
> are using the global (replicated) matrix input interface,
> pzgssvx_ABglobal(). This interface does not allow you to use parallel
> symbolic factorization (since matrix is centralized).
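A minimal sketch of Matt's point above about option syntax, assuming the PETSc 3.5/3.6-era C API used elsewhere in this thread (the two-argument form of PetscOptionsSetValue; later releases take a PetscOptions object as the first argument). The option names are the ones discussed in the messages; everything else is illustrative, not taken from the thread:

    /* Boolean options are passed as "-option" or "-option 1", never "-option=1",
     * e.g. (illustrative command line):
     *   mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu \
     *           -pc_factor_mat_solver_package superlu_dist \
     *           -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact
     */
    #include <petscsys.h>

    int main(int argc, char **argv)
    {
      PetscErrorCode ierr;

      ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
      /* Programmatic equivalent of "-mat_superlu_dist_parsymbfact 1" on the command line. */
      ierr = PetscOptionsSetValue("-mat_superlu_dist_parsymbfact", "1");CHKERRQ(ierr);
      /* ... create the Mat/KSP and call KSPSetFromOptions() so the option is honored ... */
      ierr = PetscFinalize();
      return ierr;
    }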
> > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid > ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 30 juli 2015 02:58 > *To:* ?lker-Kaustell, Mahir > *Cc:* Xiaoye Li; PETSc users list > > > *Subject:* Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > From: *Xiaoye S. Li* > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > ? > > > > ---------- Forwarded message ---------- > From: *Mahir.Ulker-Kaustell at tyrens.se * < > Mahir.Ulker-Kaustell at tyrens.se> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong , "Xiaoye S. Li" > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is > in the tens in general but certain constraints lead to non-diagonal streaks > in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I set > options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. 
> > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. 
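The flag-to-option mapping Hong describes above can be sketched against SuperLU_DIST's public options struct. This is only an illustration of that description, not the PETSc interface source; the helper name and the flag argument are made up:

    #include <superlu_ddefs.h>   /* superlu_options_t, set_default_options_dist() */

    /* Hypothetical helper mirroring the behavior described above. */
    static void setup_superlu_dist_options(superlu_options_t *options, int parsymbfact_requested)
    {
      set_default_options_dist(options);   /* defaults include ParSymbFact = NO */
      if (parsymbfact_requested) {
        options->ParSymbFact = YES;
        options->ColPerm     = PARMETIS;   /* forced whenever parallel symbolic factorization is on */
      }
    }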
> > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. > > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > 
==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > 
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: 
MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 
0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: 
MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: 
superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 
0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. 
> > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) > > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. 
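Hong's reuse suggestion quoted above (one symbolic factorization, re-done numerics per frequency) can be sketched as follows, assuming a complex-scalar PETSc 3.5/3.6-era build, M and K already assembled with the nonzero pattern of M contained in that of K, and F and u created to match. The function name, omega_list and nfreq are illustrative, and how much of the symbolic work is actually reused is left to the solver interface:

    #include <petscksp.h>

    /* Sketch: solve [K - omega^2 M] u = F for a list of frequencies, reusing one
     * KSP and one matrix A so only the numerical factorization is redone per omega. */
    static PetscErrorCode solve_frequency_sweep(Mat K, Mat M, Vec F, Vec u,
                                                const PetscScalar *omega_list, PetscInt nfreq)
    {
      Mat            A;
      KSP            ksp;
      PC             pc;
      PetscInt       i;
      PetscErrorCode ierr;

      ierr = MatDuplicate(K, MAT_COPY_VALUES, &A);CHKERRQ(ierr);
      ierr = KSPCreate(PetscObjectComm((PetscObject)K), &ksp);CHKERRQ(ierr);
      ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
      ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
      ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
      ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr); /* 3.5/3.6-era name */
      ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
      ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

      for (i = 0; i < nfreq; ++i) {
        /* Rebuild A = K - omega^2 M in place; the nonzero pattern is unchanged. */
        ierr = MatCopy(K, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
        ierr = MatAXPY(A, -omega_list[i]*omega_list[i], M, SUBSET_NONZERO_PATTERN);CHKERRQ(ierr);
        ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
        ierr = KSPSolve(ksp, F, u);CHKERRQ(ierr);
        /* ... post-process u for this frequency before the next solve overwrites it ... */
      }
      ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
      ierr = MatDestroy(&A);CHKERRQ(ierr);
      return 0;
    }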
> > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Mon Aug 3 10:45:25 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Mon, 3 Aug 2015 15:45:25 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Matt, Thank you for clarifying this. Mahir ________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ________________________________ From: Matthew Knepley [mailto:knepley at gmail.com] Sent: den 3 augusti 2015 17:40 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem On Mon, Aug 3, 2015 at 10:34 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Sherry and Hong, If I use: -mat_superlu_dist_parsymbfact, I get: Invalid ISPEC at line 484 in file get_perm_c.c regardless of what I give to ?mat_superlu_dist_matinput I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: Do not use "=1" for any PETSc option. This is improper syntax. It will ignore that option. You use "-option 1" since all option arguments are separated by a space, not an =. Matt mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 and mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact=1 I guess this corresponds to not setting parsymbfact at all. Both programs consume the same amount of RAM and seem to run well. If I use (what seems to be correct): mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact the result is: Invalid ISPEC at line 484 in file get_perm_c.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 16:46 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- ... 
PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). That's why you get the following error: Invalid ISPEC at line 484 in file get_perm_c.c You need to use distributed matrix input interface pzgssvx() (without ABglobal) Sherry On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Hong and Sherry, I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 30 juli 2015 02:58 To: ?lker-Kaustell, Mahir Cc: Xiaoye Li; PETSc users list Subject: Fwd: [petsc-users] SuperLU MPI-problem Mahir, Sherry fixed several bugs in superlu_dist-v4.1. The current petsc-release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? Here is how to do it: 1. download superlu_dist v4.1 2. remove existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz' 3. build petsc Let us know if the issue remains. Hong ---------- Forwarded message ---------- From: Xiaoye S. Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. 
Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) 
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 
0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve 
(snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 
0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside 
a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: 
MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. 
The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gtheler at cites-gss.com Mon Aug 3 10:50:36 2015 From: gtheler at cites-gss.com (Theler German Guillermo) Date: Mon, 3 Aug 2015 15:50:36 +0000 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: Hi Matt I get empty PetscEventPerfInfo structures after calling PetscLogEventGetPerfInfo(), i.e. both integers and floats are zero, as if the structure was just calloc'ed and never filled. However, I managed to get the overall stage CPU time (which is ok for me) by doing PetscLogGetStageLog(&stageLog); and then accessing stageLog->stageInfo[stage].perfInfo.time I attach a modified src/ksp/ksp/examples/tutorials/ex1.c that tries to illustrate my point. -- jeremy On Fri, 2015-07-31 at 09:00 -0500, Matthew Knepley wrote: > 2015-07-31 8:43 GMT-05:00 Theler German Guillermo > : > Is there a way to obtain as a PetscScalar the CPU time > associated to an > event or stage? > Something like PetscGetFlops() in an event or stage-based > basis? > > > Here is a test where I do that: > > > https://bitbucket.org/petsc/petsc/src/77c2d1544b79e11f3573a3360b35a7573ef4d1bf/src/dm/impls/plex/examples/tests/ex9.c?at=master#ex9.c-237 > > > ________________________________ Imprima este mensaje s?lo si es absolutamente necesario. 
Para imprimir, en lo posible utilice el papel de ambos lados. El Grupo Sancor Seguros se compromete con el cuidado del medioambiente. ************AVISO DE CONFIDENCIALIDAD************ El Grupo Sancor Seguros comunica que: Este mensaje y todos los archivos adjuntos a el son para uso exclusivo del destinatario y pueden contener informaci?n confidencial o propietaria, cuya divulgaci?n es sancionada por ley. Si usted recibi? este mensaje err?neamente, por favor notif?quenos respondiendo al remitente, borre el mensaje original y destruya las copias (impresas o grabadas en cualquier medio magn?tico) que pueda haber realizado del mismo. Todas las opiniones contenidas en este mail son propias del autor del mensaje. La publicaci?n, uso, copia o impresi?n total o parcial de este mensaje o documentos adjuntos queda prohibida. Disposici?n DNDP 10-2008. El titular de los datos personales tiene la facultad de ejercer el derecho de acceso a los mismos en forma gratuita a intervalos no inferiores a seis meses, salvo que acredite un inter?s leg?timo al efecto conforme lo establecido en el art?culo 14, inciso 3 de la Ley 25.326. La DIRECCI?N NACIONAL DE PROTECCI?N DE DATOS PERSONALES, Organo de Control de la Ley 25.326, tiene la atribuci?n de atender las denuncias y reclamos que se interpongan con relaci?n al incumplimiento de las normas sobre la protecci?n de datos personales. -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1.c Type: text/x-csrc Size: 8205 bytes Desc: ex1.c URL: From knepley at gmail.com Mon Aug 3 11:36:32 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 3 Aug 2015 11:36:32 -0500 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: 2015-08-03 10:50 GMT-05:00 Theler German Guillermo : > Hi Matt > > I get empty PetscEventPerfInfo structures after calling > PetscLogEventGetPerfInfo(), i.e. both integers and floats are zero, as > If you do not pass -log_summary, you have to call PetscLogBegin() after PetscInitialize() to get it to start logging. Thanks, Matt > if the structure was just calloc'ed and never filled. However, I managed > to get the overall stage CPU time (which is ok for me) by doing > > PetscLogGetStageLog(&stageLog); > > and then accessing stageLog->stageInfo[stage].perfInfo.time > > I attach a modified src/ksp/ksp/examples/tutorials/ex1.c that tries to > illustrate my point. > > -- > jeremy > > > On Fri, 2015-07-31 at 09:00 -0500, Matthew Knepley wrote: > > 2015-07-31 8:43 GMT-05:00 Theler German Guillermo > > : > > Is there a way to obtain as a PetscScalar the CPU time > > associated to an > > event or stage? > > Something like PetscGetFlops() in an event or stage-based > > basis? > > > > > > Here is a test where I do that: > > > > > > > https://bitbucket.org/petsc/petsc/src/77c2d1544b79e11f3573a3360b35a7573ef4d1bf/src/dm/impls/plex/examples/tests/ex9.c?at=master#ex9.c-237 > > > > > > > > ________________________________ > Imprima este mensaje s?lo si es absolutamente necesario. > Para imprimir, en lo posible utilice el papel de ambos lados. > El Grupo Sancor Seguros se compromete con el cuidado del medioambiente. > > > > ************AVISO DE CONFIDENCIALIDAD************ > > El Grupo Sancor Seguros comunica que: > > Este mensaje y todos los archivos adjuntos a el son para uso exclusivo del > destinatario y pueden contener informaci?n confidencial o propietaria, cuya > divulgaci?n es sancionada por ley. Si usted recibi? 
este mensaje > erróneamente, por favor notifíquenos respondiendo al remitente, borre el > mensaje original y destruya las copias (impresas o grabadas en cualquier > medio magnético) que pueda haber realizado del mismo. Todas las opiniones > contenidas en este mail son propias del autor del mensaje. La publicación, > uso, copia o impresión total o parcial de este mensaje o documentos > adjuntos queda prohibida. > > Disposición DNDP 10-2008. El titular de los datos personales tiene la > facultad de ejercer el derecho de acceso a los mismos en forma gratuita a > intervalos no inferiores a seis meses, salvo que acredite un interés > legítimo al efecto conforme lo establecido en el artículo 14, inciso 3 de > la Ley 25.326. La DIRECCIÓN NACIONAL DE PROTECCIÓN DE DATOS PERSONALES, > Organo de Control de la Ley 25.326, tiene la atribución de atender las > denuncias y reclamos que se interpongan con relación al incumplimiento de > las normas sobre la protección de datos personales. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gtheler at cites-gss.com Mon Aug 3 11:52:41 2015 From: gtheler at cites-gss.com (Theler German Guillermo) Date: Mon, 3 Aug 2015 16:52:41 +0000 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: > I get empty PetscEventPerfInfo structures after calling > PetscLogEventGetPerfInfo(), i.e. both integers and floats are > zero, as > If you do not pass -log_summary, you have to call PetscLogBegin() > after PetscInitialize() to > get it to start logging. Got it! Thanks. Maybe that sentence should be added to the description of PetscLogEventGetPerfInfo() and friends. -- jeremy
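A minimal sketch of the logging pattern discussed in this thread: call PetscLogBegin() right after PetscInitialize() when -log_summary is not passed, then read the accumulated timing with PetscLogEventGetPerfInfo(). This is an illustrative sketch against the PETSc 3.5/3.6-era C API, not code from the thread; the class name "MyApp", the event name "MyWork", the PetscSleep() stand-in for real work, and the use of stage 0 (the default "Main Stage") are all made-up choices for the example.

static char help[] = "Sketch: query per-event time without -log_summary.\n";

#include <petscsys.h>

int main(int argc, char **argv)
{
  PetscErrorCode     ierr;
  PetscClassId       classid;
  PetscLogEvent      my_event;   /* hypothetical user-defined event */
  PetscEventPerfInfo info;

  ierr = PetscInitialize(&argc, &argv, NULL, help); if (ierr) return ierr;
  ierr = PetscLogBegin();CHKERRQ(ierr);               /* start logging even without -log_summary */

  ierr = PetscClassIdRegister("MyApp", &classid);CHKERRQ(ierr);
  ierr = PetscLogEventRegister("MyWork", classid, &my_event);CHKERRQ(ierr);

  ierr = PetscLogEventBegin(my_event, 0, 0, 0, 0);CHKERRQ(ierr);
  ierr = PetscSleep(1);CHKERRQ(ierr);                 /* stand-in for the work being timed */
  ierr = PetscLogEventEnd(my_event, 0, 0, 0, 0);CHKERRQ(ierr);

  /* stage 0 is the default "Main Stage"; info.time holds accumulated wall-clock seconds on this rank */
  ierr = PetscLogEventGetPerfInfo(0, my_event, &info);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "MyWork: time %g s, flops %g\n", info.time, info.flops);CHKERRQ(ierr);

  ierr = PetscFinalize();
  return 0;
}

Built against a working PETSc installation and run without -log_summary, this should report a nonzero time for the event, which is the behavior the thread is after.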
From bsmith at mcs.anl.gov Mon Aug 3 12:00:45 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 3 Aug 2015 12:00:45 -0500 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: > On Aug 3, 2015, at 11:52 AM, Theler German Guillermo wrote: > > >> I get empty PetscEventPerfInfo structures after calling >> PetscLogEventGetPerfInfo(), i.e. both integers and floats are >> zero, as >> If you do not pass -log_summary, you have to call PetscLogBegin() >> after PetscInitialize() to >> get it to start logging. > > Got it! Thanks. > Maybe that sentence should be added to the description of > PetscLogEventGetPerfInfo() and friends. We should probably trigger an error, with a very helpful error message, if these are called but the initialization was never done. Barry > > -- > jeremy > ________________________________ > Imprima este mensaje s?lo si es absolutamente necesario. > Para imprimir, en lo posible utilice el papel de ambos lados. > El Grupo Sancor Seguros se compromete con el cuidado del medioambiente. > > > > ************AVISO DE CONFIDENCIALIDAD************ > > El Grupo Sancor Seguros comunica que: > > Este mensaje y todos los archivos adjuntos a el son para uso exclusivo del destinatario y pueden contener informaci?n confidencial o propietaria, cuya divulgaci?n es sancionada por ley. Si usted recibi? este mensaje err?neamente, por favor notif?quenos respondiendo al remitente, borre el mensaje original y destruya las copias (impresas o grabadas en cualquier medio magn?tico) que pueda haber realizado del mismo. Todas las opiniones contenidas en este mail son propias del autor del mensaje. La publicaci?n, uso, copia o impresi?n total o parcial de este mensaje o documentos adjuntos queda prohibida. > > Disposici?n DNDP 10-2008. El titular de los datos personales tiene la facultad de ejercer el derecho de acceso a los mismos en forma gratuita a intervalos no inferiores a seis meses, salvo que acredite un inter?s leg?timo al efecto conforme lo establecido en el art?culo 14, inciso 3 de la Ley 25.326. La DIRECCI?N NACIONAL DE PROTECCI?N DE DATOS PERSONALES, Organo de Control de la Ley 25.326, tiene la atribuci?n de atender las denuncias y reclamos que se interpongan con relaci?n al incumplimiento de las normas sobre la protecci?n de datos personales. From hzhang at mcs.anl.gov Mon Aug 3 12:06:26 2015 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 3 Aug 2015 12:06:26 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Mahir, > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for > parallel runs. > > > > If I use 2 processors, the program runs if I use > *?mat_superlu_dist_parsymbfact=1*: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > GLOBAL -mat_superlu_dist_parsymbfact=1 > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. Please run it with '-ksp_view' and see what 'SuperLU_DIST run parameters:' are being used, e.g. petsc/src/ksp/ksp/examples/tutorials (maint) $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view ... 
SuperLU_DIST run parameters: Process grid nprow 2 x npcol 1 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 2 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm I do not understand why your code uses matrix input mode = global. Hong > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 16:46 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; Hong; PETSc users list > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry found the culprit. I can reproduce it: > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist > -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > ... > > > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when > using more than one processes. > > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or > set matinput=GLOBAL for parallel run? > > > > I'll add an error flag for these use cases. > > > > Hong > > > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote: > > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid > ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 30 juli 2015 02:58 > *To:* ?lker-Kaustell, Mahir > *Cc:* Xiaoye Li; PETSc users list > > > *Subject:* Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > From: *Xiaoye S. 
Li* > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > ? > > > > ---------- Forwarded message ---------- > From: *Mahir.Ulker-Kaustell at tyrens.se * < > Mahir.Ulker-Kaustell at tyrens.se> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong , "Xiaoye S. Li" > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is > in the tens in general but certain constraints lead to non-diagonal streaks > in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I set > options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. 
Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. 
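For reference, a valgrind run like the one Barry describes is typically launched along the following lines. The example name ./ex19, the three MPI ranks, and the use of LU with SuperLU_DIST are consistent with the trace that follows; the exact solver options and valgrind flags here are illustrative:

    mpiexec -n 3 valgrind -q --tool=memcheck --track-origins=yes \
        ./ex19 -pc_type lu -pc_factor_mat_solver_package superlu_dist \
        -mat_superlu_dist_parsymbfact

The --track-origins=yes flag is what produces the "Uninitialised value was created by ..." lines in the report below.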
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes 
inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > 
==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 
0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > 
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc 
(vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: 
PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum 
(pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) 
> > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Aug 3 13:19:41 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 3 Aug 2015 13:19:41 -0500 Subject: [petsc-users] failed to compile HDF5 on vesta In-Reply-To: References: <78D07896-DA83-4572-B0AE-D022AE673D20@mcs.anl.gov> Message-ID: You can look at /soft/libraries/petsc/3.6.1.1/xl-opt/lib/petsc/conf/reconfigure-arch-xl-opt.py for currently used configure options for bgq install with xl compilers. Specifically: '--with-blas-lapack-lib=-L/soft/libraries/alcf/current/xl/LAPACK/lib -llapack -L/soft/libraries/alcf/current/xl/BLAS/lib -lblas', Satish On Sun, 2 Aug 2015, Fande Kong wrote: > Hi, Barry, > > Looks like they did not have fblaslapack installed. I could compile > the fblaslapack when I switched the compiler from XL to gcc. 
> > Thanks, > Fande Kong, > > On Sun, Aug 2, 2015 at 11:30 AM, Barry Smith wrote: > > > > > You shouldn't need --download-fblaslapack almost every system has it > > already installed. > > > > Barry > > > > Looks like the Fortran compiler on this system is rejecting the "old" > > Fortran in blas/lapack code. > > > > > > > On Aug 1, 2015, at 7:55 PM, Fande Kong wrote: > > > > > > HI barry, > > > > > > Thanks a lot. I could compile hdf5, but failed to compile fblaslapack. > > Log file is attached. > > > > > > Fande Kong, > > > > > > On Sat, Aug 1, 2015 at 3:15 PM, Barry Smith wrote: > > > > > > Try with --with-shared-libraries=0 The HDF5 build is having some > > issue with shared libraries > > > > > > Barry > > > > > > > On Aug 1, 2015, at 4:01 PM, Fande Kong wrote: > > > > > > > > Hi all, > > > > > > > > I want to install petsc on vesta (an IBM Blue Gene at Argonne). Failed > > to compile HDF5. The configure log file is attached. Any suggestions would > > be greatly appreciated. > > > > > > > > Thanks, > > > > > > > > Fande Kong, > > > > > > > > > > > > > > > > > > From solvercorleone at gmail.com Mon Aug 3 22:00:56 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 12:00:56 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM Message-ID: Hello, I am a PhD student using PETsc for my research. I am wondering if there is a way to implement SPMM (Sparse matrix-matrix multiplication) by using PETSc. for example: I want to get matrix B in AX=B, where A is a sparse matrix and both X and B are dense matrices. Thanks in advance Regards Cong Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Aug 4 01:27:54 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 00:27:54 -0600 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: Message-ID: <87egjjr2j9.fsf@jedbrown.org> Cong Li writes: > Hello, > > I am a PhD student using PETsc for my research. > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > multiplication) by using PETSc. http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From solvercorleone at gmail.com Tue Aug 4 01:42:14 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 15:42:14 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <87egjjr2j9.fsf@jedbrown.org> References: <87egjjr2j9.fsf@jedbrown.org> Message-ID: Thanks for your reply. I have an other question. I want to do SPMM several times and combine result matrices into one bigger matrix. for example I firstly calculate AX1=B1, AX2=B2 ... then I want to combine B1, B2.. to get a C, where C=[B1,B2...] Could you please suggest a way of how to do this. Thanks Cong Li On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > Cong Li writes: > > > Hello, > > > > I am a PhD student using PETsc for my research. > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > multiplication) by using PETSc. > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > -------------- next part -------------- An HTML attachment was scrubbed... 
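A minimal sketch of the sparse-times-dense product behind the MatMatMult manual page Jed points to above (error checking omitted; the matrix names follow the thread, everything else is illustrative):

    Mat A, X, B;
    /* A: sparse (e.g. MPIAIJ), X: dense with a few columns (e.g. MPIDENSE) */
    /* ... create and assemble A and X ... */
    MatMatMult(A, X, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &B); /* allocates B = A*X */
    /* if the values of A or X change later but their layouts do not,
       the same B can be refilled: */
    MatMatMult(A, X, MAT_REUSE_MATRIX, PETSC_DEFAULT, &B);
    MatDestroy(&B);

How to lay several such products out as column blocks of one larger matrix C is taken up in the replies below.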
URL: From patrick.sanan at gmail.com Tue Aug 4 03:45:48 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 4 Aug 2015 10:45:48 +0200 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> Message-ID: <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > Thanks for your reply. > > I have an other question. > I want to do SPMM several times and combine result matrices into one bigger > matrix. > for example > I firstly calculate AX1=B1, AX2=B2 ... > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > Could you please suggest a way of how to do this. This is just linear algebra, nothing to do with PETSc specifically. A * [X1, X2, ... ] = [AX1, AX2, ...] > > Thanks > > Cong Li > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > Cong Li writes: > > > > > Hello, > > > > > > I am a PhD student using PETsc for my research. > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > multiplication) by using PETSc. > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 473 bytes Desc: not available URL: From solvercorleone at gmail.com Tue Aug 4 04:09:30 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 18:09:30 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: I am sorry that I should have explained it more clearly. Actually I want to compute a recurrence. Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] Is there any way to do this efficiently. On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > Thanks for your reply. > > > > I have an other question. > > I want to do SPMM several times and combine result matrices into one > bigger > > matrix. > > for example > > I firstly calculate AX1=B1, AX2=B2 ... > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > Could you please suggest a way of how to do this. > This is just linear algebra, nothing to do with PETSc specifically. > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > Thanks > > > > Cong Li > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > Cong Li writes: > > > > > > > Hello, > > > > > > > > I am a PhD student using PETsc for my research. > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > multiplication) by using PETSc. > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From patrick.sanan at gmail.com Tue Aug 4 04:46:31 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Tue, 4 Aug 2015 11:46:31 +0200 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: > I am sorry that I should have explained it more clearly. > Actually I want to compute a recurrence. > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > Finally I want to combine all these results into a bigger matrix C=[B1,B2 > ...] > > Is there any way to do this efficiently. With no other information about your problem, one literal solution might be to use MATNEST to define C once you have computed B1,B2,.. However, this invites questions about what you plan to do with C and whether you require explicit representations of some or all of these matrices, and what problem sizes you are considering. > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > wrote: > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > Thanks for your reply. > > > > > > I have an other question. > > > I want to do SPMM several times and combine result matrices into one > > bigger > > > matrix. > > > for example > > > I firstly calculate AX1=B1, AX2=B2 ... > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > Could you please suggest a way of how to do this. > > This is just linear algebra, nothing to do with PETSc specifically. > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > Thanks > > > > > > Cong Li > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > Cong Li writes: > > > > > > > > > Hello, > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > I am wondering if there is a way to implement SPMM (Sparse > > matrix-matrix > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 473 bytes Desc: not available URL: From knepley at gmail.com Tue Aug 4 04:50:08 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 04:50:08 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: On Tue, Aug 4, 2015 at 4:09 AM, Cong Li wrote: > I am sorry that I should have explained it more clearly. > Actually I want to compute a recurrence. > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > Finally I want to combine all these results into a bigger matrix C=[B1,B2 > ...] > > Is there any way to do this efficiently. > You could use a MatNest, however now this seems like thw wrong way to calculate it. Why do you want to put a matrix polynomial into another matrix? Matt > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > wrote: > >> On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >> > Thanks for your reply. >> > >> > I have an other question. >> > I want to do SPMM several times and combine result matrices into one >> bigger >> > matrix. >> > for example >> > I firstly calculate AX1=B1, AX2=B2 ... 
>> > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >> > >> > Could you please suggest a way of how to do this. >> This is just linear algebra, nothing to do with PETSc specifically. >> A * [X1, X2, ... ] = [AX1, AX2, ...] >> > >> > Thanks >> > >> > Cong Li >> > >> > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >> > >> > > Cong Li writes: >> > > >> > > > Hello, >> > > > >> > > > I am a PhD student using PETsc for my research. >> > > > I am wondering if there is a way to implement SPMM (Sparse >> matrix-matrix >> > > > multiplication) by using PETSc. >> > > >> > > >> > > >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >> > > >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Tue Aug 4 05:31:57 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 19:31:57 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: Actually, I am trying to implement s-step krylov subspace method. I want to extend the Krylov subspace by s dimensions by using monomial, which can be defined as C={X, AX, A^2X, ... , A^sX}, in one loop. So, my plan now is to firstly calculate the recurrence, which is P_n(x)=xP_n-1(x), and then use the results to update the items in C. And then, in the next loop of Krylov subspace method, the C will be updated again. This means I need to update C in every iteration. This continues till the convergence criteria is satisfied. I suppose A is huge sparse SPD matrix with millions of rows, and X is tall-skinny dense matrix. Do you still think MATNEST is a good way to define C. Actually I am wondering if there is a way to do SPMM by using a submatrix of C and also store the result in a submatrix of C. If it is possible, I think we can remove some of cost of data movement. For example, C=[c_1, c_2,.., c_s], and I want to use the result of A*c_1 to update c_2, and then use he result of A*c_2(updated) to update c_3 and so on. I don't need the intermediate result separately, such as the result of A*c_1, A*c_2. And I only need the final C. Is there any SPMM function or strategies I can use to achievement this? Thanks Cong Li On Tue, Aug 4, 2015 at 6:46 PM, Patrick Sanan wrote: > On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: > > I am sorry that I should have explained it more clearly. > > Actually I want to compute a recurrence. > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > > A*B2=B3 and so on. > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 > > ...] > > > > Is there any way to do this efficiently. > With no other information about your problem, one literal solution might > be to use MATNEST to define C once you have computed B1,B2,.. > However, this invites questions about what you plan to do with C and > whether you require explicit representations of some or all of these > matrices, and what problem sizes you are considering. 
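For concreteness, the MATNEST construction Patrick mentions would look roughly as follows; whether MATNEST is the right tool for this recurrence is debated elsewhere in the thread. A sketch only, with four blocks as an example and error checking omitted:

    Mat B[4];   /* B[0..3] already computed, all with the same row layout */
    Mat C;
    /* one block row, four block columns; NULL index sets let PETSc derive
       the layouts from the blocks themselves.  C references the blocks, it
       does not copy their entries. */
    MatCreateNest(PETSC_COMM_WORLD, 1, NULL, 4, NULL, B, &C);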
> > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > > wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > Thanks for your reply. > > > > > > > > I have an other question. > > > > I want to do SPMM several times and combine result matrices into one > > > bigger > > > > matrix. > > > > for example > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > Could you please suggest a way of how to do this. > > > This is just linear algebra, nothing to do with PETSc specifically. > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > Cong Li writes: > > > > > > > > > > > Hello, > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > I am wondering if there is a way to implement SPMM (Sparse > > > matrix-matrix > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Tue Aug 4 05:36:15 2015 From: solvercorleone at gmail.com (Cong Li) Date: Tue, 4 Aug 2015 19:36:15 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: Hi As I answered in the last email. Actually, I am trying to implement s-step block krylov subspace method. So, I need to expand the Krylov subspace by putting matrix polynomials into another matrix. Cong Li On Tue, Aug 4, 2015 at 6:50 PM, Matthew Knepley wrote: > On Tue, Aug 4, 2015 at 4:09 AM, Cong Li wrote: > >> I am sorry that I should have explained it more clearly. >> Actually I want to compute a recurrence. >> >> Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >> A*B2=B3 and so on. >> Finally I want to combine all these results into a bigger matrix C=[B1,B2 >> ...] >> >> Is there any way to do this efficiently. >> > > You could use a MatNest, however now this seems like thw wrong way to > calculate it. Why > do you want to put a matrix polynomial into another matrix? > > Matt > > >> On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan >> wrote: >> >>> On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>> > Thanks for your reply. >>> > >>> > I have an other question. >>> > I want to do SPMM several times and combine result matrices into one >>> bigger >>> > matrix. >>> > for example >>> > I firstly calculate AX1=B1, AX2=B2 ... >>> > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>> > >>> > Could you please suggest a way of how to do this. >>> This is just linear algebra, nothing to do with PETSc specifically. >>> A * [X1, X2, ... ] = [AX1, AX2, ...] >>> > >>> > Thanks >>> > >>> > Cong Li >>> > >>> > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >>> > >>> > > Cong Li writes: >>> > > >>> > > > Hello, >>> > > > >>> > > > I am a PhD student using PETsc for my research. >>> > > > I am wondering if there is a way to implement SPMM (Sparse >>> matrix-matrix >>> > > > multiplication) by using PETSc. 
>>> > > >>> > > >>> > > >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>> > > >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 4 06:34:50 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 06:34:50 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: On Tue, Aug 4, 2015 at 5:31 AM, Cong Li wrote: > Actually, I am trying to implement s-step krylov subspace method. > I want to extend the Krylov subspace by s dimensions by using monomial, > which can be defined as C={X, AX, A^2X, ... , A^sX}, in one loop. So, my > plan now is to firstly calculate the recurrence, which is P_n(x)=xP_n-1(x), > and then use the results to update the items in C. And then, in the next > loop of Krylov subspace method, the C will be updated again. This means I > need to update C in every iteration. > This continues till the convergence criteria is satisfied. > > I suppose A is huge sparse SPD matrix with millions of rows, and X is > tall-skinny dense matrix. > > Do you still think MATNEST is a good way to define C. > > Actually I am wondering if there is a way to do SPMM by using a submatrix > of C and also store the result in a submatrix of C. If it is possible, I > think we can remove some of cost of data movement. > For example, C=[c_1, c_2,.., c_s], and I want to use the result of A*c_1 > to update c_2, and then use he result of A*c_2(updated) to update c_3 and > so on. > I don't need the intermediate result separately, such as the result of > A*c_1, A*c_2. And I only need the final C. > Is there any SPMM function or strategies I can use to achievement this? > So there are two optimizations here: 1) Communication: You only communicate every s steps. If you are solving a transport dominated problem, this can make sense. For elliptic problems, I think it makes no difference at all. 2) Computation: You can alleviate bandwidth pressure by acting on multiple vectors at once. I would first implement this naively with a collection of Vecs to check that 1) makes a difference for your problem. If it does, then I think 2) can best be accomplished by using a TAIJ matrix and a long Vec, where you shift the memory at each iterate. Thanks, Matt > Thanks > > Cong Li > > > > On Tue, Aug 4, 2015 at 6:46 PM, Patrick Sanan > wrote: > >> On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: >> > I am sorry that I should have explained it more clearly. >> > Actually I want to compute a recurrence. >> > >> > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >> > A*B2=B3 and so on. >> > Finally I want to combine all these results into a bigger matrix >> C=[B1,B2 >> > ...] >> > >> > Is there any way to do this efficiently. >> With no other information about your problem, one literal solution might >> be to use MATNEST to define C once you have computed B1,B2,.. >> However, this invites questions about what you plan to do with C and >> whether you require explicit representations of some or all of these >> matrices, and what problem sizes you are considering. 
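The "naive" variant with a collection of Vecs that Matt suggests above might look like this for a single starting vector; s, the names, and the absence of error checking are all for illustration:

    Mat      A;
    Vec      x, *c;
    PetscInt j, s = 4;
    /* ... create and assemble A, create and fill the starting vector x ... */
    VecDuplicateVecs(x, s + 1, &c);
    VecCopy(x, c[0]);
    for (j = 0; j < s; j++) {
      MatMult(A, c[j], c[j + 1]);   /* c[j+1] = A*c[j] */
    }
    /* ... use the basis c[0..s], e.g. orthogonalize it ... */
    VecDestroyVecs(s + 1, &c);

A block version would run the same loop over each column of X.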
>> > >> > >> > >> > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan >> > wrote: >> > >> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >> > > > Thanks for your reply. >> > > > >> > > > I have an other question. >> > > > I want to do SPMM several times and combine result matrices into one >> > > bigger >> > > > matrix. >> > > > for example >> > > > I firstly calculate AX1=B1, AX2=B2 ... >> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >> > > > >> > > > Could you please suggest a way of how to do this. >> > > This is just linear algebra, nothing to do with PETSc specifically. >> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >> > > > >> > > > Thanks >> > > > >> > > > Cong Li >> > > > >> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >> > > > >> > > > > Cong Li writes: >> > > > > >> > > > > > Hello, >> > > > > > >> > > > > > I am a PhD student using PETsc for my research. >> > > > > > I am wondering if there is a way to implement SPMM (Sparse >> > > matrix-matrix >> > > > > > multiplication) by using PETSc. >> > > > > >> > > > > >> > > > > >> > > >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >> > > > > >> > > >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 4 11:27:39 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 11:27:39 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> Message-ID: <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > I am sorry that I should have explained it more clearly. > Actually I want to compute a recurrence. > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] First create C with MatCreateDense(,&C). Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. Barry > > Is there any way to do this efficiently. > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > Thanks for your reply. > > > > I have an other question. > > I want to do SPMM several times and combine result matrices into one bigger > > matrix. > > for example > > I firstly calculate AX1=B1, AX2=B2 ... > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > Could you please suggest a way of how to do this. > This is just linear algebra, nothing to do with PETSc specifically. > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > Thanks > > > > Cong Li > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > Cong Li writes: > > > > > > > Hello, > > > > > > > > I am a PhD student using PETsc for my research. > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > multiplication) by using PETSc. 
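A sketch of the column-sharing construction Barry describes above, with four blocks of k columns each as an example (M and k are illustrative and error checking is omitted):

    Mat          C, B[4];
    PetscScalar *array;
    PetscInt     i, m, M = 1000000, k = 10;
    /* C holds all 4*k columns */
    MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE,
                   M, 4*k, NULL, &C);
    MatGetLocalSize(C, &m, NULL);   /* local rows of C, and of each Bi */
    MatDenseGetArray(C, &array);
    for (i = 0; i < 4; i++) {
      /* block i views columns i*k .. (i+1)*k-1 of C; PETSc dense storage is
         column-major with leading dimension m, hence the local shift i*m*k */
      MatCreateDense(PETSC_COMM_WORLD, m, PETSC_DECIDE, M, k,
                     array + i*m*k, &B[i]);
    }
    MatDenseRestoreArray(C, &array);
    /* the B[i] keep pointing into C's storage, so filling B[i] fills the
       corresponding columns of C; keep C alive while the B[i] are in use */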
> > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > From solvercorleone at gmail.com Tue Aug 4 11:59:31 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 01:59:31 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> Message-ID: Thanks very much. This answer is very helpful. And I have a following question. If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. PetscErrorCode MatMatMult (Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. Thanks Cong Li On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > I am sorry that I should have explained it more clearly. > > Actually I want to compute a recurrence. > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the > number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., > each Bi contains its columns of the C matrix. > > Barry > > > > > > > Is there any way to do this efficiently. > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > wrote: > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > Thanks for your reply. > > > > > > I have an other question. > > > I want to do SPMM several times and combine result matrices into one > bigger > > > matrix. > > > for example > > > I firstly calculate AX1=B1, AX2=B2 ... > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > Could you please suggest a way of how to do this. > > This is just linear algebra, nothing to do with PETSc specifically. > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > Thanks > > > > > > Cong Li > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > Cong Li writes: > > > > > > > > > Hello, > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Tue Aug 4 12:08:50 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 02:08:50 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: Thanks very much for your suggestions. Actually I am also considering using communication-avoiding matrix power kernel (CA-MPK) to do SPMM. However, the communication pattern of CA-MPK depends on the sparsity pattern. 
So the implementation could be very complex for some of problems.The efficient implementation of CA-MPK is actually one of problems I want to solve during my PhD course. As to the optimisation 2 you suggested, is it the same idea as what Barry Smith suggested? I am sorry that I am a Rookie to PETSc, so I am not quite familiar with PETSc implementation strategies. Thanks Cong Li On Tue, Aug 4, 2015 at 8:34 PM, Matthew Knepley wrote: > On Tue, Aug 4, 2015 at 5:31 AM, Cong Li wrote: > >> Actually, I am trying to implement s-step krylov subspace method. >> I want to extend the Krylov subspace by s dimensions by using monomial, >> which can be defined as C={X, AX, A^2X, ... , A^sX}, in one loop. So, my >> plan now is to firstly calculate the recurrence, which is P_n(x)=xP_n-1(x), >> and then use the results to update the items in C. And then, in the next >> loop of Krylov subspace method, the C will be updated again. This means I >> need to update C in every iteration. >> This continues till the convergence criteria is satisfied. >> >> I suppose A is huge sparse SPD matrix with millions of rows, and X is >> tall-skinny dense matrix. >> >> Do you still think MATNEST is a good way to define C. >> >> Actually I am wondering if there is a way to do SPMM by using a submatrix >> of C and also store the result in a submatrix of C. If it is possible, I >> think we can remove some of cost of data movement. >> For example, C=[c_1, c_2,.., c_s], and I want to use the result of A*c_1 >> to update c_2, and then use he result of A*c_2(updated) to update c_3 and >> so on. >> I don't need the intermediate result separately, such as the result of >> A*c_1, A*c_2. And I only need the final C. >> Is there any SPMM function or strategies I can use to achievement this? >> > > So there are two optimizations here: > > 1) Communication: You only communicate every s steps. If you are solving > a transport dominated problem, this can make sense. > For elliptic problems, I think it makes no difference at all. > > 2) Computation: You can alleviate bandwidth pressure by acting on > multiple vectors at once. > > I would first implement this naively with a collection of Vecs to check > that 1) makes a difference for your problem. > If it does, then I think 2) can best be accomplished by using a TAIJ > matrix and a long Vec, where you shift the > memory at each iterate. > > Thanks, > > Matt > > >> Thanks >> >> Cong Li >> >> >> >> On Tue, Aug 4, 2015 at 6:46 PM, Patrick Sanan >> wrote: >> >>> On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: >>> > I am sorry that I should have explained it more clearly. >>> > Actually I want to compute a recurrence. >>> > >>> > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >>> > A*B2=B3 and so on. >>> > Finally I want to combine all these results into a bigger matrix >>> C=[B1,B2 >>> > ...] >>> > >>> > Is there any way to do this efficiently. >>> With no other information about your problem, one literal solution might >>> be to use MATNEST to define C once you have computed B1,B2,.. >>> However, this invites questions about what you plan to do with C and >>> whether you require explicit representations of some or all of these >>> matrices, and what problem sizes you are considering. >>> > >>> > >>> > >>> > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan >> > >>> > wrote: >>> > >>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>> > > > Thanks for your reply. >>> > > > >>> > > > I have an other question. 
>>> > > > I want to do SPMM several times and combine result matrices into >>> one >>> > > bigger >>> > > > matrix. >>> > > > for example >>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>> > > > >>> > > > Could you please suggest a way of how to do this. >>> > > This is just linear algebra, nothing to do with PETSc specifically. >>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>> > > > >>> > > > Thanks >>> > > > >>> > > > Cong Li >>> > > > >>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>> wrote: >>> > > > >>> > > > > Cong Li writes: >>> > > > > >>> > > > > > Hello, >>> > > > > > >>> > > > > > I am a PhD student using PETsc for my research. >>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>> > > matrix-matrix >>> > > > > > multiplication) by using PETSc. >>> > > > > >>> > > > > >>> > > > > >>> > > >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>> > > > > >>> > > >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 4 12:11:40 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 12:11:40 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: On Tue, Aug 4, 2015 at 12:08 PM, Cong Li wrote: > Thanks very much for your suggestions. > > Actually I am also considering using communication-avoiding matrix power > kernel (CA-MPK) to do SPMM. However, the communication pattern of CA-MPK > depends on the sparsity pattern. So the implementation could be very > complex for some of problems.The efficient implementation of CA-MPK is > actually one of problems I want to solve during my PhD course. > > As to the optimisation 2 you suggested, is it the same idea as what Barry > Smith suggested? > I am sorry that I am a Rookie to PETSc, so I am not quite familiar with > PETSc implementation strategies. > Yes, that is what Barry is suggesting. Thanks, Matt > Thanks > > Cong Li > > On Tue, Aug 4, 2015 at 8:34 PM, Matthew Knepley wrote: > >> On Tue, Aug 4, 2015 at 5:31 AM, Cong Li wrote: >> >>> Actually, I am trying to implement s-step krylov subspace method. >>> I want to extend the Krylov subspace by s dimensions by using monomial, >>> which can be defined as C={X, AX, A^2X, ... , A^sX}, in one loop. So, my >>> plan now is to firstly calculate the recurrence, which is P_n(x)=xP_n-1(x), >>> and then use the results to update the items in C. And then, in the next >>> loop of Krylov subspace method, the C will be updated again. This means I >>> need to update C in every iteration. >>> This continues till the convergence criteria is satisfied. >>> >>> I suppose A is huge sparse SPD matrix with millions of rows, and X is >>> tall-skinny dense matrix. >>> >>> Do you still think MATNEST is a good way to define C. >>> >>> Actually I am wondering if there is a way to do SPMM by using a >>> submatrix of C and also store the result in a submatrix of C. If it is >>> possible, I think we can remove some of cost of data movement. 
>>> For example, C=[c_1, c_2,.., c_s], and I want to use the result of A*c_1 >>> to update c_2, and then use he result of A*c_2(updated) to update c_3 and >>> so on. >>> I don't need the intermediate result separately, such as the result of >>> A*c_1, A*c_2. And I only need the final C. >>> Is there any SPMM function or strategies I can use to achievement this? >>> >> >> So there are two optimizations here: >> >> 1) Communication: You only communicate every s steps. If you are >> solving a transport dominated problem, this can make sense. >> For elliptic problems, I think it makes no difference at all. >> >> 2) Computation: You can alleviate bandwidth pressure by acting on >> multiple vectors at once. >> >> I would first implement this naively with a collection of Vecs to check >> that 1) makes a difference for your problem. >> If it does, then I think 2) can best be accomplished by using a TAIJ >> matrix and a long Vec, where you shift the >> memory at each iterate. >> >> Thanks, >> >> Matt >> >> >>> Thanks >>> >>> Cong Li >>> >>> >>> >>> On Tue, Aug 4, 2015 at 6:46 PM, Patrick Sanan >>> wrote: >>> >>>> On Tue, Aug 04, 2015 at 06:09:30PM +0900, Cong Li wrote: >>>> > I am sorry that I should have explained it more clearly. >>>> > Actually I want to compute a recurrence. >>>> > >>>> > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >>>> > A*B2=B3 and so on. >>>> > Finally I want to combine all these results into a bigger matrix >>>> C=[B1,B2 >>>> > ...] >>>> > >>>> > Is there any way to do this efficiently. >>>> With no other information about your problem, one literal solution >>>> might be to use MATNEST to define C once you have computed B1,B2,.. >>>> However, this invites questions about what you plan to do with C and >>>> whether you require explicit representations of some or all of these >>>> matrices, and what problem sizes you are considering. >>>> > >>>> > >>>> > >>>> > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>> patrick.sanan at gmail.com> >>>> > wrote: >>>> > >>>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>> > > > Thanks for your reply. >>>> > > > >>>> > > > I have an other question. >>>> > > > I want to do SPMM several times and combine result matrices into >>>> one >>>> > > bigger >>>> > > > matrix. >>>> > > > for example >>>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>> > > > >>>> > > > Could you please suggest a way of how to do this. >>>> > > This is just linear algebra, nothing to do with PETSc specifically. >>>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>> > > > >>>> > > > Thanks >>>> > > > >>>> > > > Cong Li >>>> > > > >>>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>> wrote: >>>> > > > >>>> > > > > Cong Li writes: >>>> > > > > >>>> > > > > > Hello, >>>> > > > > > >>>> > > > > > I am a PhD student using PETsc for my research. >>>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>> > > matrix-matrix >>>> > > > > > multiplication) by using PETSc. >>>> > > > > >>>> > > > > >>>> > > > > >>>> > > >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>> > > > > >>>> > > >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.vymazal at vki.ac.be Tue Aug 4 12:15:24 2015 From: martin.vymazal at vki.ac.be (Martin Vymazal) Date: Tue, 04 Aug 2015 18:15:24 +0100 Subject: [petsc-users] C++ wrapper for petsc vector Message-ID: <1585215.z8oGCl3ZR4@tinlaptop> Hello, I'm trying to create a small C++ class to wrap the 'Vec' object. This class has an internal pointer to a member variable of type Vec, and in its destructor, it calls VecDestroy. Unfortunately, my test program segfaults and this seems to be due to the fact that the destructor of the wrapper class is called after main() calls PetscFinalize(). Apparently VecDestroy performs some collective communication, so calling it after PetscFinalize() is too late. How can I fix this? Thank you, Martin Vymazal From jed at jedbrown.org Tue Aug 4 12:20:19 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 11:20:19 -0600 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> Message-ID: <87pp33otrg.fsf@jedbrown.org> Cong Li writes: > Thanks very much for your suggestions. > > Actually I am also considering using communication-avoiding matrix power > kernel (CA-MPK) to do SPMM. However, the communication pattern of CA-MPK > depends on the sparsity pattern. So the implementation could be very > complex for some of problems.The efficient implementation of CA-MPK is > actually one of problems I want to solve during my PhD course. You can get the pattern with MatIncreaseOverlap. Lack of useful preconditioning all but destroys the practical utility of these matrix powers kernels for solvers, even if the surface area/volume ratio is favorable (an odd corner of the relevant problem space). I consider it a fashion that has already gotten more attention than it deserves and should be on its way out. I would recommend a different thesis topic if you want to do work with a tangible impact on practical computational science and engineering. If you're really excited about this niche, by all means, have fun. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From gpau at lbl.gov Tue Aug 4 12:22:18 2015 From: gpau at lbl.gov (George Pau) Date: Tue, 4 Aug 2015 10:22:18 -0700 Subject: [petsc-users] configure error with --with-shared-libraries=0 --download-elemental Message-ID: Hi, I am configuring petsc on NERSC/Edison with the following configure arguments: --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp e/2.3.1/bin/ftn but it seems like the --with-shared-libraries=0 is not propagated when building elemental. 
In the end I get the following error: gmake[3]: Leaving directory `/global/u1/g/gpau/clm-rom/build/tpl-build/petsc/petsc-3.6.1-sou rce/arch-linux2-c-debug/externalpackages/Elemental-0.85-p1/build'/usr/bin/ld: /usr/common/us g/darshan/2.3.0/lib/libdarshan-mpi-io.a(darshan-mpi-io.o): relocation R_X86_64_32 against `. rodata' can not be used when making a shared object; recompile with -fPIC Any help will be appreciated. Attached is the configure log file. Thanks, George -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74-120 Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/about/staff/georgepau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc-configure-out.log Type: application/octet-stream Size: 210962 bytes Desc: not available URL: From knepley at gmail.com Tue Aug 4 12:22:54 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 12:22:54 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <87pp33otrg.fsf@jedbrown.org> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> <87pp33otrg.fsf@jedbrown.org> Message-ID: On Tue, Aug 4, 2015 at 12:20 PM, Jed Brown wrote: > Cong Li writes: > > > Thanks very much for your suggestions. > > > > Actually I am also considering using communication-avoiding matrix power > > kernel (CA-MPK) to do SPMM. However, the communication pattern of CA-MPK > > depends on the sparsity pattern. So the implementation could be very > > complex for some of problems.The efficient implementation of CA-MPK is > > actually one of problems I want to solve during my PhD course. > > You can get the pattern with MatIncreaseOverlap. > > Lack of useful preconditioning all but destroys the practical utility of > these matrix powers kernels for solvers, even if the surface area/volume > ratio is favorable (an odd corner of the relevant problem space). I > consider it a fashion that has already gotten more attention than it > deserves and should be on its way out. I would recommend a different > thesis topic if you want to do work with a tangible impact on practical > computational science and engineering. If you're really excited about > this niche, by all means, have fun. > Totally true if you are doing this for solvers, particularly elliptic solvers. If you are planning to use it for something like Tall-Skinny QR, or maybe some graph problems it could make more sense. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 4 12:24:14 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 12:24:14 -0500 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <1585215.z8oGCl3ZR4@tinlaptop> References: <1585215.z8oGCl3ZR4@tinlaptop> Message-ID: On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal wrote: > Hello, > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > class > has an internal pointer to a member variable of type Vec, and in its > destructor, it calls VecDestroy. 
Unfortunately, my test program segfaults > and > this seems to be due to the fact that the destructor of the wrapper class > is > called after main() calls PetscFinalize(). Apparently VecDestroy performs > some > collective communication, so calling it after PetscFinalize() is too late. > How > can I fix this? > 1) Declare your C++ in a scope, so that it goes out of scope before PetscFinalize() 2) Is there any utility to this wrapper since everything can be called directly from C++? Thanks, Matt > Thank you, > > Martin Vymazal > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Aug 4 12:24:38 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 11:24:38 -0600 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <1585215.z8oGCl3ZR4@tinlaptop> References: <1585215.z8oGCl3ZR4@tinlaptop> Message-ID: <87mvy7otk9.fsf@jedbrown.org> Martin Vymazal writes: > Hello, > > I'm trying to create a small C++ class to wrap the 'Vec' object. A word of warning: Lots of people try this, but I've never seen an implementation that wasn't a leaky, high-maintenance abstraction with purely cosmetic value. > This class has an internal pointer to a member variable of type Vec, > and in its destructor, it calls VecDestroy. Unfortunately, my test > program segfaults and this seems to be due to the fact that the > destructor of the wrapper class is called after main() calls > PetscFinalize(). Apparently VecDestroy performs some collective > communication, so calling it after PetscFinalize() is too late. How > can I fix this? Scope so your objects are destroyed before PetscFinalize. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Tue Aug 4 12:26:39 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 11:26:39 -0600 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <20150804094631.GF52392@Patricks-MacBook-Pro-3.local> <87pp33otrg.fsf@jedbrown.org> Message-ID: <87io8votgw.fsf@jedbrown.org> Matthew Knepley writes: > Totally true if you are doing this for solvers, particularly elliptic > solvers. If you are planning to use it for something like Tall-Skinny > QR, or maybe some graph problems it could make more sense. Note that TSQR does not involve matrix powers. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From martin.vymazal at vki.ac.be Tue Aug 4 12:30:58 2015 From: martin.vymazal at vki.ac.be (Martin Vymazal) Date: Tue, 04 Aug 2015 18:30:58 +0100 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: References: <1585215.z8oGCl3ZR4@tinlaptop> Message-ID: <3775667.cNPH29TcoF@tinlaptop> Hello, 1) thank you for the suggestion. 2) suppose you want to be able to switch between solver implementations provided by different libraries (e.g. petsc/trilinos). One obvious approach is through inheritance, but in order to keep child interfaces conforming to base class signatures, I need to wrap the solvers. 
If you can think of a better approach that would keep switching between solvers easy, I'm open to suggestions. I don't really need both trilinos and petsc, this is just a matter of curiosity. Best regards, Martin Vymazal On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal > > wrote: > > Hello, > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > class > > has an internal pointer to a member variable of type Vec, and in its > > destructor, it calls VecDestroy. Unfortunately, my test program segfaults > > and > > this seems to be due to the fact that the destructor of the wrapper class > > is > > called after main() calls PetscFinalize(). Apparently VecDestroy performs > > some > > collective communication, so calling it after PetscFinalize() is too late. > > How > > can I fix this? > > 1) Declare your C++ in a scope, so that it goes out of scope before > PetscFinalize() > > 2) Is there any utility to this wrapper since everything can be called > directly from C++? > > Thanks, > > Matt > > > Thank you, > > > > Martin Vymazal From knepley at gmail.com Tue Aug 4 12:34:01 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 12:34:01 -0500 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <3775667.cNPH29TcoF@tinlaptop> References: <1585215.z8oGCl3ZR4@tinlaptop> <3775667.cNPH29TcoF@tinlaptop> Message-ID: On Tue, Aug 4, 2015 at 12:30 PM, Martin Vymazal wrote: > Hello, > > 1) thank you for the suggestion. > 2) suppose you want to be able to switch between solver implementations > provided by different libraries (e.g. petsc/trilinos). One obvious > approach is > through inheritance, but in order to keep child interfaces conforming to > base > class signatures, I need to wrap the solvers. If you can think of a better > approach that would keep switching between solvers easy, I'm open to > suggestions. I don't really need both trilinos and petsc, this is just a > matter of curiosity. > I think this is a bad way of doing that. You would introduce a whole bunch of types at the top level which are meaningless (just like Trilinos). If you want another solver, just wrap it up in the PETSc PCShell object (two calls at most). Its an easier to write wrapper, which also fits in with all the debugging and profiling. We wrap a bunch of things this way like Hypre (70+ packages last time I checked). Matt > Best regards, > > Martin Vymazal > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal < > martin.vymazal at vki.ac.be> > > > > wrote: > > > Hello, > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > > > class > > > has an internal pointer to a member variable of type Vec, and in its > > > destructor, it calls VecDestroy. Unfortunately, my test program > segfaults > > > and > > > this seems to be due to the fact that the destructor of the wrapper > class > > > is > > > called after main() calls PetscFinalize(). Apparently VecDestroy > performs > > > some > > > collective communication, so calling it after PetscFinalize() is too > late. > > > How > > > can I fix this? > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > PetscFinalize() > > > > 2) Is there any utility to this wrapper since everything can be called > > directly from C++? 
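To make suggestion 1) concrete, here is a minimal sketch of a scoped wrapper; the class name is illustrative and error checking is omitted for brevity. The only point is that the destructor, and hence VecDestroy(), runs at the closing brace of the inner block, before PetscFinalize() is reached.

  #include <petscvec.h>

  class VecWrapper {
  public:
    VecWrapper(MPI_Comm comm, PetscInt n) {
      VecCreate(comm, &v_);
      VecSetSizes(v_, PETSC_DECIDE, n);
      VecSetFromOptions(v_);
    }
    ~VecWrapper() { VecDestroy(&v_); }   /* collective, so it must run before PetscFinalize() */
    Vec get() const { return v_; }
  private:
    Vec v_;                              /* copy/assignment handling omitted from the sketch */
  };

  int main(int argc, char **argv)
  {
    PetscInitialize(&argc, &argv, NULL, NULL);
    {                                    /* inner scope */
      VecWrapper w(PETSC_COMM_WORLD, 100);
      VecSet(w.get(), 1.0);
    }                                    /* ~VecWrapper() -> VecDestroy() happens here */
    PetscFinalize();                     /* safe: no live PETSc objects remain */
    return 0;
  }

A real wrapper would also have to decide what copy and assignment mean, which is where such thin abstractions tend to start leaking.
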
> > > > Thanks, > > > > Matt > > > > > Thank you, > > > > > > Martin Vymazal > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From martin.vymazal at vki.ac.be Tue Aug 4 13:07:56 2015 From: martin.vymazal at vki.ac.be (Martin Vymazal) Date: Tue, 04 Aug 2015 19:07:56 +0100 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: References: <1585215.z8oGCl3ZR4@tinlaptop> <3775667.cNPH29TcoF@tinlaptop> Message-ID: <2182692.n8DuiFgnrM@tinlaptop> On Tuesday, August 04, 2015 12:34:01 PM Matthew Knepley wrote: > On Tue, Aug 4, 2015 at 12:30 PM, Martin Vymazal > > wrote: > > Hello, > > > > 1) thank you for the suggestion. > > 2) suppose you want to be able to switch between solver implementations > > > > provided by different libraries (e.g. petsc/trilinos). One obvious > > approach is > > through inheritance, but in order to keep child interfaces conforming to > > base > > class signatures, I need to wrap the solvers. If you can think of a better > > approach that would keep switching between solvers easy, I'm open to > > suggestions. I don't really need both trilinos and petsc, this is just a > > matter of curiosity. > > I think this is a bad way of doing that. You would introduce a whole bunch > of types > at the top level which are meaningless (just like Trilinos). If you want > another solver, > just wrap it up in the PETSc PCShell object (two calls at most). Its an > easier to write > wrapper, which also fits in with all the debugging and profiling. We wrap a > bunch of > things this way like Hypre (70+ packages last time I checked). > > Matt OK, I was not aware of PCShell (I'm new to PETSc). I don't know Trilinos well enough to judge whether it's good from software engineering point of view or not, but allow me one last question. What would happen if I wrap all 'other' solvers in PCShell and then for some reason, PETSc is not available. None of the other solvers would be accessible (unless I modify the source code), so wrapping everything using PCShell creates a strong dependency on one particular library (PETSc), doesn't it? Martin > > > Best regards, > > > > Martin Vymazal > > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal < > > > > martin.vymazal at vki.ac.be> > > > > > wrote: > > > > Hello, > > > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > > > > > class > > > > has an internal pointer to a member variable of type Vec, and in its > > > > destructor, it calls VecDestroy. Unfortunately, my test program > > > > segfaults > > > > > > and > > > > this seems to be due to the fact that the destructor of the wrapper > > > > class > > > > > > is > > > > called after main() calls PetscFinalize(). Apparently VecDestroy > > > > performs > > > > > > some > > > > collective communication, so calling it after PetscFinalize() is too > > > > late. > > > > > > How > > > > can I fix this? > > > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > > PetscFinalize() > > > > > > 2) Is there any utility to this wrapper since everything can be called > > > directly from C++? 
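For reference, the PCShell route mentioned above looks roughly like the sketch below. MyExternalSolverApply is a placeholder for whatever apply routine the third-party library actually provides, not a real API, and the VecCopy stand-in is only there so the sketch is self-contained.

  #include <petscksp.h>

  static PetscErrorCode MyShellApply(PC pc, Vec x, Vec y)
  {
    void           *ctx;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = PCShellGetContext(pc, &ctx);CHKERRQ(ierr);
    /* hand x to the external solver via ctx and put the result in y, e.g.
       MyExternalSolverApply(ctx, x, y);   <- placeholder, not a real call */
    ierr = VecCopy(x, y);CHKERRQ(ierr);    /* stand-in so the sketch compiles */
    PetscFunctionReturn(0);
  }

  static PetscErrorCode AttachExternalPC(KSP ksp, void *external_ctx)
  {
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBeginUser;
    ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
    ierr = PCSetType(pc, PCSHELL);CHKERRQ(ierr);
    ierr = PCShellSetContext(pc, external_ctx);CHKERRQ(ierr);
    ierr = PCShellSetApply(pc, MyShellApply);CHKERRQ(ierr);
    PetscFunctionReturn(0);
  }

The context pointer is an opaque handle for the external library's state; PETSc never inspects it.
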
> > > > > > Thanks, > > > > > > Matt > > > > > > > > Thank you, > > > > > > > > Martin Vymazal From bsmith at mcs.anl.gov Tue Aug 4 13:09:04 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 13:09:04 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> Message-ID: <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. Barry > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > Thanks very much. This answer is very helpful. > And I have a following question. > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > Thanks > > Cong Li > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > I am sorry that I should have explained it more clearly. > > Actually I want to compute a recurrence. > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > First create C with MatCreateDense(,&C). Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > Barry > > > > > > > Is there any way to do this efficiently. > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > Thanks for your reply. > > > > > > I have an other question. > > > I want to do SPMM several times and combine result matrices into one bigger > > > matrix. > > > for example > > > I firstly calculate AX1=B1, AX2=B2 ... > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > Could you please suggest a way of how to do this. > > This is just linear algebra, nothing to do with PETSc specifically. > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > Thanks > > > > > > Cong Li > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > Cong Li writes: > > > > > > > > > Hello, > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > multiplication) by using PETSc. 
> > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > From bsmith at mcs.anl.gov Tue Aug 4 13:19:33 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 13:19:33 -0500 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: References: <1585215.z8oGCl3ZR4@tinlaptop> <3775667.cNPH29TcoF@tinlaptop> Message-ID: <644E3374-62A7-4CF7-B116-3AAA95C46F20@mcs.anl.gov> > On Aug 4, 2015, at 12:34 PM, Matthew Knepley wrote: > > On Tue, Aug 4, 2015 at 12:30 PM, Martin Vymazal wrote: > Hello, > > 1) thank you for the suggestion. > 2) suppose you want to be able to switch between solver implementations > provided by different libraries (e.g. petsc/trilinos). One obvious approach is > through inheritance, but in order to keep child interfaces conforming to base > class signatures, I need to wrap the solvers. If you can think of a better > approach that would keep switching between solvers easy, I'm open to > suggestions. I don't really need both trilinos and petsc, this is just a > matter of curiosity. > > I think this is a bad way of doing that. You would introduce a whole bunch of types > at the top level which are meaningless (just like Trilinos). If you want another solver, > just wrap it up in the PETSc PCShell object (two calls at most). Its an easier to write > wrapper, which also fits in with all the debugging and profiling. We wrap a bunch of > things this way like Hypre (70+ packages last time I checked). As Matt points out PETSc is already a wrapper library in that it is designed to easily wrap around other solver libraries to use the common PETSc API. So what you are doing is writing a wrapper library around a wrapper library, certainly possible but of questionable value. Note also that what makes particular solvers powerful is their use of "extra information" to obtain fast convergence over the only information being the "matrix values"; so for example the near null space for some algebraic multigrid methods, the geometric information for geometric multigrid, the "block structure" for "block preconditioners (what we call PCFIELDSPLIT in PETSc), etc. Do you really want to handle all of this "extra information" in your wrapper class? We already understand these details and provide APIs for them, it would be a huge thankless project for you to reproduce them all in your API. Barry > > Matt > > Best regards, > > Martin Vymazal > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal > > > > wrote: > > > Hello, > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > > > class > > > has an internal pointer to a member variable of type Vec, and in its > > > destructor, it calls VecDestroy. Unfortunately, my test program segfaults > > > and > > > this seems to be due to the fact that the destructor of the wrapper class > > > is > > > called after main() calls PetscFinalize(). Apparently VecDestroy performs > > > some > > > collective communication, so calling it after PetscFinalize() is too late. > > > How > > > can I fix this? > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > PetscFinalize() > > > > 2) Is there any utility to this wrapper since everything can be called > > directly from C++? 
> > > > Thanks, > > > > Matt > > > > > Thank you, > > > > > > Martin Vymazal > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From martin.vymazal at vki.ac.be Tue Aug 4 13:48:02 2015 From: martin.vymazal at vki.ac.be (Martin Vymazal) Date: Tue, 04 Aug 2015 19:48:02 +0100 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <644E3374-62A7-4CF7-B116-3AAA95C46F20@mcs.anl.gov> References: <1585215.z8oGCl3ZR4@tinlaptop> <644E3374-62A7-4CF7-B116-3AAA95C46F20@mcs.anl.gov> Message-ID: <1568148.yQyVaWRmv2@tinlaptop> On Tuesday, August 04, 2015 01:19:33 PM Barry Smith wrote: > > On Aug 4, 2015, at 12:34 PM, Matthew Knepley wrote: > > > > On Tue, Aug 4, 2015 at 12:30 PM, Martin Vymazal > > wrote: Hello, > > > > 1) thank you for the suggestion. > > 2) suppose you want to be able to switch between solver implementations > > > > provided by different libraries (e.g. petsc/trilinos). One obvious > > approach is through inheritance, but in order to keep child interfaces > > conforming to base class signatures, I need to wrap the solvers. If you > > can think of a better approach that would keep switching between solvers > > easy, I'm open to suggestions. I don't really need both trilinos and > > petsc, this is just a matter of curiosity. > > > > I think this is a bad way of doing that. You would introduce a whole bunch > > of types at the top level which are meaningless (just like Trilinos). If > > you want another solver, just wrap it up in the PETSc PCShell object (two > > calls at most). Its an easier to write wrapper, which also fits in with > > all the debugging and profiling. We wrap a bunch of things this way like > > Hypre (70+ packages last time I checked). > > As Matt points out PETSc is already a wrapper library in that it is > designed to easily wrap around other solver libraries to use the common > PETSc API. So what you are doing is writing a wrapper library around a > wrapper library, certainly possible but of questionable value. > > Note also that what makes particular solvers powerful is their use of > "extra information" to obtain fast convergence over the only information > being the "matrix values"; so for example the near null space for some > algebraic multigrid methods, the geometric information for geometric > multigrid, the "block structure" for "block preconditioners (what we call > PCFIELDSPLIT in PETSc), etc. Do you really want to handle all of this > "extra information" in your wrapper class? We already understand these > details and provide APIs for them, it would be a huge thankless project for > you to reproduce them all in your API. > > Barry Of course I prefer to rely on other people's expertise in the domain instead of doing the job over again (with probably worse result). What you say about PETSc being a wrapper for other libraries makes sense. Martin > > > Matt > > > > Best regards, > > > > Martin Vymazal > > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal > > > > > > > > > wrote: > > > > Hello, > > > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. This > > > > > > > > class > > > > has an internal pointer to a member variable of type Vec, and in its > > > > destructor, it calls VecDestroy. 
Unfortunately, my test program > > > > segfaults > > > > and > > > > this seems to be due to the fact that the destructor of the wrapper > > > > class > > > > is > > > > called after main() calls PetscFinalize(). Apparently VecDestroy > > > > performs > > > > some > > > > collective communication, so calling it after PetscFinalize() is too > > > > late. > > > > How > > > > can I fix this? > > > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > > PetscFinalize() > > > > > > 2) Is there any utility to this wrapper since everything can be called > > > directly from C++? > > > > > > Thanks, > > > > > > Matt > > > > > > > > Thank you, > > > > > > > > Martin Vymazal From bsmith at mcs.anl.gov Tue Aug 4 14:33:31 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 14:33:31 -0500 Subject: [petsc-users] configure error with --with-shared-libraries=0 --download-elemental In-Reply-To: References: Message-ID: Aghh, looks like CMake does not have a universal standard for indicating shared libraries or not. Please try the attached elemental.py file and see if that resolves your difficulties. -------------- next part -------------- A non-text attachment was scrubbed... Name: elemental.py Type: text/x-python-script Size: 2705 bytes Desc: not available URL: -------------- next part -------------- Barry > On Aug 4, 2015, at 12:22 PM, George Pau wrote: > > Hi, > > I am configuring petsc on NERSC/Edison with the following configure arguments: > > --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do > wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp > e/2.3.1/bin/ftn > > but it seems like the --with-shared-libraries=0 is not propagated when building elemental. In the end I get the following error: > > gmake[3]: Leaving directory `/global/u1/g/gpau/clm-rom/build/tpl-build/petsc/petsc-3.6.1-sou > rce/arch-linux2-c-debug/externalpackages/Elemental-0.85-p1/build'/usr/bin/ld: /usr/common/us > g/darshan/2.3.0/lib/libdarshan-mpi-io.a(darshan-mpi-io.o): relocation R_X86_64_32 against `. > rodata' can not be used when making a shared object; recompile with -fPIC > > Any help will be appreciated. Attached is the configure log file. > > Thanks, > George > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74-120 > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/about/staff/georgepau/ > From gpau at lbl.gov Tue Aug 4 17:06:55 2015 From: gpau at lbl.gov (George Pau) Date: Tue, 4 Aug 2015 15:06:55 -0700 Subject: [petsc-users] configure error with --with-shared-libraries=0 --download-elemental In-Reply-To: References: Message-ID: Barry, Thanks. The patch works. George On Tue, Aug 4, 2015 at 12:33 PM, Barry Smith wrote: > > Aghh, looks like CMake does not have a universal standard for indicating > shared libraries or not. > > Please try the attached elemental.py file and see if that resolves your > difficulties. 
> > > > Barry > > > On Aug 4, 2015, at 12:22 PM, George Pau wrote: > > > > Hi, > > > > I am configuring petsc on NERSC/Edison with the following configure > arguments: > > > > --with-debugging=1 --with-shared-libraries=0 > --prefix=/global/homes/g/gpau/clm-rom/install/t > > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps > --download-scalapack --do > > wnload-parmetis --download-metis --download-hdf5 --download-netcdf > --with-x=0 --with-cc=/opt > > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC > --with-fc=/opt/cray/crayp > > e/2.3.1/bin/ftn > > > > but it seems like the --with-shared-libraries=0 is not propagated when > building elemental. In the end I get the following error: > > > > gmake[3]: Leaving directory > `/global/u1/g/gpau/clm-rom/build/tpl-build/petsc/petsc-3.6.1-sou > > > rce/arch-linux2-c-debug/externalpackages/Elemental-0.85-p1/build'/usr/bin/ld: > /usr/common/us > > g/darshan/2.3.0/lib/libdarshan-mpi-io.a(darshan-mpi-io.o): relocation > R_X86_64_32 against `. > > rodata' can not be used when making a shared object; recompile with -fPIC > > > > Any help will be appreciated. Attached is the configure log file. > > > > Thanks, > > George > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74-120 > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/about/staff/georgepau/ > > > > > -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74-120 Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/about/staff/georgepau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 4 17:08:36 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 17:08:36 -0500 Subject: [petsc-users] Get CPU time from events In-Reply-To: References: Message-ID: <53676E55-FEA5-4B0A-B85B-DBC0DF8421C7@mcs.anl.gov> I have added error checking in the branches maint, master, and next so no else will waste their time trying to figure out why the routine is returning nothing useful. Thanks for reporting the issue, Barry > On Aug 3, 2015, at 12:00 PM, Barry Smith wrote: > >> >> On Aug 3, 2015, at 11:52 AM, Theler German Guillermo wrote: >> >> >>> I get empty PetscEventPerfInfo structures after calling >>> PetscLogEventGetPerfInfo(), i.e. both integers and floats are >>> zero, as >>> If you do not pass -log_summary, you have to call PetscLogBegin() >>> after PetscInitialize() to >>> get it to start logging. >> >> Got it! Thanks. >> Maybe that sentence should be added to the description of >> PetscLogEventGetPerfInfo() and friends. > > We should probably trigger an error, with a very helpful error message, if these are called but the initialization was never done. > > Barry > >> >> -- >> jeremy >> ________________________________ >> Imprima este mensaje s?lo si es absolutamente necesario. >> Para imprimir, en lo posible utilice el papel de ambos lados. >> El Grupo Sancor Seguros se compromete con el cuidado del medioambiente. >> >> >> >> ************AVISO DE CONFIDENCIALIDAD************ >> >> El Grupo Sancor Seguros comunica que: >> >> Este mensaje y todos los archivos adjuntos a el son para uso exclusivo del destinatario y pueden contener informaci?n confidencial o propietaria, cuya divulgaci?n es sancionada por ley. Si usted recibi? 
este mensaje err?neamente, por favor notif?quenos respondiendo al remitente, borre el mensaje original y destruya las copias (impresas o grabadas en cualquier medio magn?tico) que pueda haber realizado del mismo. Todas las opiniones contenidas en este mail son propias del autor del mensaje. La publicaci?n, uso, copia o impresi?n total o parcial de este mensaje o documentos adjuntos queda prohibida. >> >> Disposici?n DNDP 10-2008. El titular de los datos personales tiene la facultad de ejercer el derecho de acceso a los mismos en forma gratuita a intervalos no inferiores a seis meses, salvo que acredite un inter?s leg?timo al efecto conforme lo establecido en el art?culo 14, inciso 3 de la Ley 25.326. La DIRECCI?N NACIONAL DE PROTECCI?N DE DATOS PERSONALES, Organo de Control de la Ley 25.326, tiene la atribuci?n de atender las denuncias y reclamos que se interpongan con relaci?n al incumplimiento de las normas sobre la protecci?n de datos personales. From jychang48 at gmail.com Tue Aug 4 17:09:11 2015 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 4 Aug 2015 17:09:11 -0500 Subject: [petsc-users] Profiling/checkpoints Message-ID: Hi all, Not sure what to title this mail, but let me begin with an analogy of what I am looking for: In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. Is there a way to do something like this within PETSc? I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). Or Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 4 17:09:59 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 17:09:59 -0500 Subject: [petsc-users] configure error with --with-shared-libraries=0 --download-elemental In-Reply-To: References: Message-ID: <2387A54D-8E42-497C-94AF-21D484B87FF8@mcs.anl.gov> George, Thanks for letting us know. Now fixed in maint, master, next and will be in our next patch release Barry commit 5219fea8a3666c8b7c80c852d85e563864a43d28 Author: Barry Smith Date: Tue Aug 4 14:33:41 2015 -0500 Turn off elemental shared libraries if --with-shared-libraries=0 is used Reported-by: George Pau > On Aug 4, 2015, at 5:06 PM, George Pau wrote: > > Barry, > > Thanks. The patch works. > > George > > > On Tue, Aug 4, 2015 at 12:33 PM, Barry Smith wrote: > > Aghh, looks like CMake does not have a universal standard for indicating shared libraries or not. > > Please try the attached elemental.py file and see if that resolves your difficulties. 
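Tying back to the "Get CPU time from events" exchange just above, a minimal sketch of the programmatic route is: call PetscLogBegin() right after PetscInitialize(), then query the event afterwards. The event and class names and the use of stage 0 (the default stage) are illustrative, error checking is omitted, and the exact logging signatures can differ between PETSc versions, so treat this as a sketch rather than a reference.

  #include <petscsys.h>

  int main(int argc, char **argv)
  {
    PetscClassId       classid;
    PetscLogEvent      MY_EVENT;
    PetscEventPerfInfo info;

    PetscInitialize(&argc, &argv, NULL, NULL);
    PetscLogBegin();                              /* start logging even without -log_summary */
    PetscClassIdRegister("MyClass", &classid);
    PetscLogEventRegister("MyEvent", classid, &MY_EVENT);

    PetscLogEventBegin(MY_EVENT, 0, 0, 0, 0);
    /* ... the work to be timed goes here ... */
    PetscLogEventEnd(MY_EVENT, 0, 0, 0, 0);

    PetscLogEventGetPerfInfo(0, MY_EVENT, &info); /* stage 0 is the default stage */
    PetscPrintf(PETSC_COMM_WORLD, "count %d time %g\n", info.count, (double)info.time);

    PetscFinalize();
    return 0;
  }
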
> > > > Barry > > > On Aug 4, 2015, at 12:22 PM, George Pau wrote: > > > > Hi, > > > > I am configuring petsc on NERSC/Edison with the following configure arguments: > > > > --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t > > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do > > wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt > > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp > > e/2.3.1/bin/ftn > > > > but it seems like the --with-shared-libraries=0 is not propagated when building elemental. In the end I get the following error: > > > > gmake[3]: Leaving directory `/global/u1/g/gpau/clm-rom/build/tpl-build/petsc/petsc-3.6.1-sou > > rce/arch-linux2-c-debug/externalpackages/Elemental-0.85-p1/build'/usr/bin/ld: /usr/common/us > > g/darshan/2.3.0/lib/libdarshan-mpi-io.a(darshan-mpi-io.o): relocation R_X86_64_32 against `. > > rodata' can not be used when making a shared object; recompile with -fPIC > > > > Any help will be appreciated. Attached is the configure log file. > > > > Thanks, > > George > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74-120 > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/about/staff/georgepau/ > > > > > > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74-120 > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/about/staff/georgepau/ From jed at jedbrown.org Tue Aug 4 17:14:01 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 16:14:01 -0600 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: References: Message-ID: <87wpxazopi.fsf@jedbrown.org> Justin Chang writes: > Hi all, > > Not sure what to title this mail, but let me begin with an analogy of what > I am looking for: > > In MATLAB, we could insert breakpoints into the code, such that when we run > the program, we could pause the execution and see what the variables > contain and what is going on exactly within your function calls. Is there a > way to do something like this within PETSc? Yes, they're called breakpoints and available with any debugger. http://www.sourceware.org/gdb/onlinedocs/gdb/Set-Breaks.html Compile with debugging symbols. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Aug 4 17:22:48 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 17:22:48 -0500 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: References: Message-ID: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> I do this by running in the debugger and putting in breakpoints. At the breakpoint you can look directly at variables like the n in call to VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it will print out the information about the object right then. Calling VecView() or MatView() directly will of course cause it to print the entire object which generally you don't want but you can do PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView to have it print size information etc about the object instead of the full object. 
In parallel instead of passing 0 for the viewer you need to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that share the object call the routine in the debugger but it is possible. Let us know how it goes and we can try to improve the experience with your suggestions, Barry > On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: > > Hi all, > > Not sure what to title this mail, but let me begin with an analogy of what I am looking for: > > In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. Is there a way to do something like this within PETSc? > > I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). > > Or > > Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? > > Thanks, > Justin From bsmith at mcs.anl.gov Tue Aug 4 17:36:08 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 17:36:08 -0500 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> References: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> Message-ID: Correction, even in parallel you should be able to use a 0 for the viewer for calls to KSPView() etc; just make sure you do the same call on each process that shares the object. To change the viewer format you do need to use PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), PETSC_VIEWER_ASCII_INFO) to change the format for parallel objects that live on PETSC_COMM_WORLD. Barry PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only effects the format of the sequential ASCII viewer. > On Aug 4, 2015, at 5:22 PM, Barry Smith wrote: > > > I do this by running in the debugger and putting in breakpoints. At the breakpoint you can look directly at variables like the n in call to VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it will print out the information about the object right then. Calling VecView() or MatView() directly will of course cause it to print the entire object which generally you don't want but you can do PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView to have it print size information etc about the object instead of the full object. In parallel instead of passing 0 for the viewer you need to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that share the object call the routine in the debugger but it is possible. > > Let us know how it goes and we can try to improve the experience with your suggestions, > > Barry > >> On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: >> >> Hi all, >> >> Not sure what to title this mail, but let me begin with an analogy of what I am looking for: >> >> In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. 
Is there a way to do something like this within PETSc? >> >> I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). >> >> Or >> >> Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? >> >> Thanks, >> Justin > From patrick.sanan at gmail.com Tue Aug 4 18:20:22 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 5 Aug 2015 01:20:22 +0200 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: References: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> Message-ID: <8CD18807-1074-4FE1-A6C5-181C550480E9@gmail.com> And note that it is possible to run gdb/lldb on each of several MPI processes, useful when you hit a bug that only appears in parallel. For example, this FAQ describes a couple of ways to do this: https://www.open-mpi.org/faq/?category=debugging#serial-debuggers > Am 05.08.2015 um 00:36 schrieb Barry Smith : > > > Correction, even in parallel you should be able to use a 0 for the viewer for calls to KSPView() etc; just make sure you do the same call on each process that shares the object. > > To change the viewer format you do need to use PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), PETSC_VIEWER_ASCII_INFO) to change the format for parallel objects that live on PETSC_COMM_WORLD. > > > Barry > > PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only effects the format of the sequential ASCII viewer. > >> On Aug 4, 2015, at 5:22 PM, Barry Smith wrote: >> >> >> I do this by running in the debugger and putting in breakpoints. At the breakpoint you can look directly at variables like the n in call to VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it will print out the information about the object right then. Calling VecView() or MatView() directly will of course cause it to print the entire object which generally you don't want but you can do PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView to have it print size information etc about the object instead of the full object. In parallel instead of passing 0 for the viewer you need to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that share the object call the routine in the debugger but it is possible. >> >> Let us know how it goes and we can try to improve the experience with your suggestions, >> >> Barry >> >>> On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: >>> >>> Hi all, >>> >>> Not sure what to title this mail, but let me begin with an analogy of what I am looking for: >>> >>> In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. Is there a way to do something like this within PETSc? >>> >>> I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. 
Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). >>> >>> Or >>> >>> Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? >>> >>> Thanks, >>> Justin > From bsmith at mcs.anl.gov Tue Aug 4 18:33:34 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 4 Aug 2015 18:33:34 -0500 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: <8CD18807-1074-4FE1-A6C5-181C550480E9@gmail.com> References: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> <8CD18807-1074-4FE1-A6C5-181C550480E9@gmail.com> Message-ID: <2C727A93-1276-4CB2-AC6D-C8D97139CB77@mcs.anl.gov> > On Aug 4, 2015, at 6:20 PM, Patrick Sanan wrote: > > And note that it is possible to run gdb/lldb on each of several MPI processes, useful when you hit a bug that only appears in parallel. For example, this FAQ describes a couple of ways to do this: > > https://www.open-mpi.org/faq/?category=debugging#serial-debuggers You can also use the PETSc option -start_in_debugger which can work under some circumstances (like all MPI processes have access to the X server). Barry > > >> Am 05.08.2015 um 00:36 schrieb Barry Smith : >> >> >> Correction, even in parallel you should be able to use a 0 for the viewer for calls to KSPView() etc; just make sure you do the same call on each process that shares the object. >> >> To change the viewer format you do need to use PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), PETSC_VIEWER_ASCII_INFO) to change the format for parallel objects that live on PETSC_COMM_WORLD. >> >> >> Barry >> >> PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only effects the format of the sequential ASCII viewer. >> >>> On Aug 4, 2015, at 5:22 PM, Barry Smith wrote: >>> >>> >>> I do this by running in the debugger and putting in breakpoints. At the breakpoint you can look directly at variables like the n in call to VecMDot() you can also call KSPView() etc on any PETSc object (with a viewer of 0) and it will print out the information about the object right then. Calling VecView() or MatView() directly will of course cause it to print the entire object which generally you don't want but you can do PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or MatView to have it print size information etc about the object instead of the full object. In parallel instead of passing 0 for the viewer you need to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes that share the object call the routine in the debugger but it is possible. >>> >>> Let us know how it goes and we can try to improve the experience with your suggestions, >>> >>> Barry >>> >>>> On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: >>>> >>>> Hi all, >>>> >>>> Not sure what to title this mail, but let me begin with an analogy of what I am looking for: >>>> >>>> In MATLAB, we could insert breakpoints into the code, such that when we run the program, we could pause the execution and see what the variables contain and what is going on exactly within your function calls. Is there a way to do something like this within PETSc? >>>> >>>> I want to see what's going on within certain PETSc functions within KSPSolve. For instance, -log_summary says that my solver invokes calls to functions like VecMDot and VecMAXPY but I would like to know exactly how many vectors each of these functions are working with. 
Morever, I would also like to get a general overview of the properties of the matrices MatPtAP and MatMatMult are playing with (e.g., dimensions, number of nonzeros, etc). >>>> >>>> Or >>>> >>>> Above functions happen to be invoked from gamg, so is it possible to tell just from the parameters fed into PETSc what the answers to the above may be? >>>> >>>> Thanks, >>>> Justin >> From knepley at gmail.com Tue Aug 4 18:43:55 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 4 Aug 2015 18:43:55 -0500 Subject: [petsc-users] Profiling/checkpoints In-Reply-To: <2C727A93-1276-4CB2-AC6D-C8D97139CB77@mcs.anl.gov> References: <76FC9743-5DE8-4AFA-9C1C-14865AAECD88@mcs.anl.gov> <8CD18807-1074-4FE1-A6C5-181C550480E9@gmail.com> <2C727A93-1276-4CB2-AC6D-C8D97139CB77@mcs.anl.gov> Message-ID: On Tue, Aug 4, 2015 at 6:33 PM, Barry Smith wrote: > > > On Aug 4, 2015, at 6:20 PM, Patrick Sanan > wrote: > > > > And note that it is possible to run gdb/lldb on each of several MPI > processes, useful when you hit a bug that only appears in parallel. For > example, this FAQ describes a couple of ways to do this: > > > > https://www.open-mpi.org/faq/?category=debugging#serial-debuggers > > You can also use the PETSc option -start_in_debugger which can work > under some circumstances (like all MPI processes have access to the X > server). and you can start debuggers on only some processes using -debugger_nodes 1,3,7 Thanks, Matt > > Barry > > > > > > >> Am 05.08.2015 um 00:36 schrieb Barry Smith : > >> > >> > >> Correction, even in parallel you should be able to use a 0 for the > viewer for calls to KSPView() etc; just make sure you do the same call on > each process that shares the object. > >> > >> To change the viewer format you do need to use > PetscViewerSetFormat(PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD), > PETSC_VIEWER_ASCII_INFO) to change the format for parallel objects that > live on PETSC_COMM_WORLD. > >> > >> > >> Barry > >> > >> PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) only effects the > format of the sequential ASCII viewer. > >> > >>> On Aug 4, 2015, at 5:22 PM, Barry Smith wrote: > >>> > >>> > >>> I do this by running in the debugger and putting in breakpoints. At > the breakpoint you can look directly at variables like the n in call to > VecMDot() you can also call KSPView() etc on any PETSc object (with a > viewer of 0) and it will print out the information about the object right > then. Calling VecView() or MatView() directly will of course cause it to > print the entire object which generally you don't want but you can do > PetscViewerSetFormat(0, PETSC_VIEWER_ASCII_INFO) and then VecView or > MatView to have it print size information etc about the object instead of > the full object. In parallel instead of passing 0 for the viewer you need > to pass PETSC_VIEWER_STDOUT_(PETSC_COMM_WORLD) and make sure all processes > that share the object call the routine in the debugger but it is possible. > >>> > >>> Let us know how it goes and we can try to improve the experience with > your suggestions, > >>> > >>> Barry > >>> > >>>> On Aug 4, 2015, at 5:09 PM, Justin Chang wrote: > >>>> > >>>> Hi all, > >>>> > >>>> Not sure what to title this mail, but let me begin with an analogy of > what I am looking for: > >>>> > >>>> In MATLAB, we could insert breakpoints into the code, such that when > we run the program, we could pause the execution and see what the variables > contain and what is going on exactly within your function calls. Is there a > way to do something like this within PETSc? 
> >>>> > >>>> I want to see what's going on within certain PETSc functions within > KSPSolve. For instance, -log_summary says that my solver invokes calls to > functions like VecMDot and VecMAXPY but I would like to know exactly how > many vectors each of these functions are working with. Morever, I would > also like to get a general overview of the properties of the matrices > MatPtAP and MatMatMult are playing with (e.g., dimensions, number of > nonzeros, etc). > >>>> > >>>> Or > >>>> > >>>> Above functions happen to be invoked from gamg, so is it possible to > tell just from the parameters fed into PETSc what the answers to the above > may be? > >>>> > >>>> Thanks, > >>>> Justin > >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Tue Aug 4 20:53:43 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 10:53:43 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Thank you very much for your help and suggestions. With your help, finally I could continue my project. Regards Cong Li On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > Note that since your B and C matrices are dense the issue of sparsity > pattern of C is not relevant. > > Barry > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > Thanks very much. This answer is very helpful. > > And I have a following question. > > If I create B1, B2, .. by the way you suggested and then use MatMatMult > to do SPMM. > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat > *C) > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > I am sorry that I should have explained it more clearly. > > > Actually I want to compute a recurrence. > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the > number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > Barry > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan > wrote: > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > Thanks for your reply. > > > > > > > > I have an other question. 
> > > > I want to do SPMM several times and combine result matrices into one > bigger > > > > matrix. > > > > for example > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > Could you please suggest a way of how to do this. > > > This is just linear algebra, nothing to do with PETSc specifically. > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > Cong Li writes: > > > > > > > > > > > Hello, > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Tue Aug 4 22:33:28 2015 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Wed, 05 Aug 2015 11:33:28 +0800 Subject: [petsc-users] Fail to Configure petsc-3.6.1 Message-ID: <55C18408.5040500@gmail.com> Hi there, I tried to configure the petsc-3.6.1 on my laptop but failed with the following error. The configure.log is attached. Any suggestions? Thanks. configure: error: Can't find or link to the hdf5 library. Use --disable-netcdf-4, or see config.log for errors. Best, Rongliang -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 2927981 bytes Desc: not available URL: From jed at jedbrown.org Tue Aug 4 23:26:35 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 04 Aug 2015 22:26:35 -0600 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <55C18408.5040500@gmail.com> References: <55C18408.5040500@gmail.com> Message-ID: <87r3niz7gk.fsf@jedbrown.org> Rongliang Chen writes: > Hi there, > > I tried to configure the petsc-3.6.1 on my laptop but failed with the > following error. The configure.log is attached. Any suggestions? Thanks. > > configure: error: Can't find or link to the hdf5 library. Use > --disable-netcdf-4, or see config.log for errors. Looks like you'll have to check NetCDF's config.log for the details. Either something is wrong with the HDF5 install or the wrong options are being passed to NetCDF configure. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From gbisht at lbl.gov Tue Aug 4 23:27:24 2015 From: gbisht at lbl.gov (Gautam Bisht) Date: Tue, 4 Aug 2015 21:27:24 -0700 Subject: [petsc-users] Error running DMPlex example Message-ID: Hi, I'm getting the following error while running the following DMPlex example. Any suggestion what is going wrong? Attached are example.log and configure.log. 
python2.7 ./config/builder2.py check src/snes/examples/tutorials/ex12.c Namespace(args=[], files=['src/snes/examples/tutorials/ex12.c'], func=, numProcs=None, regParams=None, replace=False, retain=False, testnum=None) Running 52 tests Building ['/Users/gbisht/projects/petsc/petsc_f0284fa/src/snes/examples/tutorials/ex12.c'] Running #0: /opt/local/bin/mpiexec-mpich-gcc49 -host localhost -n 1 darwin-gnu-fort-debug/lib/ex12-obj/ex12 -run_type test -refinement_limit 0.0 -bc_type dirichlet -interpolate 0 -petscspace_order 1 -s TEST ERROR: Failed to execute darwin-gnu-fort-debug/lib/ex12-obj/ex12 =================================================================================== = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES = PID 69451 RUNNING AT localhost = EXIT CODE: 59 = CLEANING UP REMAINING PROCESSES = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES =================================================================================== [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] DMPlexGenerate_Triangle line 217 src/dm/impls/plex/plexgenerate.c [0]PETSC ERROR: [0] DMPlexGenerate line 1056 src/dm/impls/plex/plexgenerate.c [0]PETSC ERROR: [0] DMPlexCreateBoxMesh line 897 src/dm/impls/plex/plexcreate.c [0]PETSC ERROR: [0] CreateMesh line 347 src/snes/examples/tutorials/ex12.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.6.1-238-gf0284fa GIT Date: 2015-07-27 13:34:26 -0500 [0]PETSC ERROR: darwin-gnu-fort-debug/lib/ex12-obj/ex12 on a darwin-gnu-fort-debug named gautam-laptop by gbisht Tue Aug 4 21:09:37 2015 [0]PETSC ERROR: Configure options --download-hdf5=1 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --download-parmetis=yes --download-metis=yes --with-c [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 TEST RUN FAILED (check example.log for details) -Gautam. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 4688020 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: example.log Type: application/octet-stream Size: 7189 bytes Desc: not available URL: From rongliang.chan at gmail.com Wed Aug 5 01:05:55 2015 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Wed, 05 Aug 2015 14:05:55 +0800 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <87r3niz7gk.fsf@jedbrown.org> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> Message-ID: <55C1A7C3.7030209@gmail.com> Hi Jed, Thanks for your reply. I checked the netcdf and hdf5's config.log and could not find any possible solutions. Can you help me check these two files again? The two files are attached. Thanks. Best regards, Rongliang On 08/05/2015 12:26 PM, Jed Brown wrote: > Rongliang Chen writes: > >> Hi there, >> >> I tried to configure the petsc-3.6.1 on my laptop but failed with the >> following error. The configure.log is attached. Any suggestions? Thanks. >> >> configure: error: Can't find or link to the hdf5 library. Use >> --disable-netcdf-4, or see config.log for errors. > Looks like you'll have to check NetCDF's config.log for the details. > Either something is wrong with the HDF5 install or the wrong options are > being passed to NetCDF configure. -------------- next part -------------- A non-text attachment was scrubbed... Name: config-hdf5.log Type: text/x-log Size: 1804340 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: config-netcdf.log Type: text/x-log Size: 127324 bytes Desc: not available URL: From solvercorleone at gmail.com Wed Aug 5 01:23:14 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 15:23:14 +0900 Subject: [petsc-users] Questions about creation of matrix and setting its values Message-ID: Hi, I am wondering if it is necessary to call MatAssemblyBegin() and MatAssemblyEnd() after MatDuplicate() with the option of MAT_DO_NOT_COPY_VALUES. For example, if I have an assembled matrix A, and I call MatDuplicate() to create B, which is a duplication of A. Do I need to call MatAssemblyBegin() and MatAssemblyEnd() for B. And 2nd question is : just after the MatCreateDense() call and before MatAssemblyBegin() and MatAssemblyEnd() calls, can I use MatGetArray() ? The 3rd question is: before the MatAssemblyBegin() and MatAssemblyEnd() calls, should I use INSERT_VALUES or ADD_VALUES for MatSetValues call? And why ? Actually I have read the manual, but I still feel confused about the means of INSERT_VALUES and ADD_VALUES. Thanks Cong Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 5 03:37:41 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 5 Aug 2015 10:37:41 +0200 Subject: [petsc-users] C++ wrapper for petsc vector In-Reply-To: <2182692.n8DuiFgnrM@tinlaptop> References: <1585215.z8oGCl3ZR4@tinlaptop> <3775667.cNPH29TcoF@tinlaptop> <2182692.n8DuiFgnrM@tinlaptop> Message-ID: > > OK, I was not aware of PCShell (I'm new to PETSc). I don't know Trilinos > well > enough to judge whether it's good from software engineering point of view > or > not, but allow me one last question. What would happen if I wrap all > 'other' > solvers in PCShell and then for some reason, PETSc is not available. This is a fair comment. However, in my experience PETSc builds everywhere. If PETSc isn't provided as a module on the resource you have access to, it is relatively straight forward to built the entire library yourself. 
The --download-XXX feature of PETSc's configure is pretty damn good and also will on most (if not all) machines. If configure does fail on your machine of choice, send the configure.log file to petsc-maint at mcs.anl.gov. The PETSc guys will sort out the problem. In 12 years, I haven't found a single machine which I couldn't get petsc compiled on. I think you are safe if you wrap everything within PETSc. :D Cheers Dave > None of > the other solvers would be accessible (unless I modify the source code), so > wrapping everything using PCShell creates a strong dependency on one > particular library (PETSc), doesn't it? > > Martin > > > > > > > Best regards, > > > > > > Martin Vymazal > > > > > > On Tuesday, August 04, 2015 12:24:14 PM Matthew Knepley wrote: > > > > On Tue, Aug 4, 2015 at 12:15 PM, Martin Vymazal < > > > > > > martin.vymazal at vki.ac.be> > > > > > > > wrote: > > > > > Hello, > > > > > > > > > > I'm trying to create a small C++ class to wrap the 'Vec' object. > This > > > > > > > > > > class > > > > > has an internal pointer to a member variable of type Vec, and in > its > > > > > destructor, it calls VecDestroy. Unfortunately, my test program > > > > > > segfaults > > > > > > > > and > > > > > this seems to be due to the fact that the destructor of the wrapper > > > > > > class > > > > > > > > is > > > > > called after main() calls PetscFinalize(). Apparently VecDestroy > > > > > > performs > > > > > > > > some > > > > > collective communication, so calling it after PetscFinalize() is > too > > > > > > late. > > > > > > > > How > > > > > can I fix this? > > > > > > > > 1) Declare your C++ in a scope, so that it goes out of scope before > > > > PetscFinalize() > > > > > > > > 2) Is there any utility to this wrapper since everything can be > called > > > > directly from C++? > > > > > > > > Thanks, > > > > > > > > Matt > > > > > > > > > > Thank you, > > > > > > > > > > Martin Vymazal > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pozin at inria.fr Wed Aug 5 04:15:16 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 5 Aug 2015 11:15:16 +0200 (CEST) Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> Message-ID: <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Hello, I'm trying to solve a system with a matrix free operator and through conjugate gradient method. To make ideas clear, I set up the following simple example (I am using petsc-3.6) and I get this error message : " [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Invalid argument! [0]PETSC ERROR: Wrong type of object: Parameter # 1! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./test on a ubuntu_release named pl-59080 by npozin Wed Aug 5 10:55:26 2015 [0]PETSC ERROR: Libraries linked from /home/npozin/Felisce_libraries/petsc_3.4.3/ubuntu_release/lib [0]PETSC ERROR: Configure run at Wed Jul 22 16:18:36 2015 [0]PETSC ERROR: Configure options PETSC_ARCH=ubuntu_release --with-cxx=g++ --with-fc=gfortran --with-cc=gcc --with-x=0 --download-openmpi --download-f-blas-lapack --download-superlu --download-superlu_dist --with-superlu_dist=1 --download-metis --download-mumps --download-parmetis --with-superlu_dist=1 --download-boost --with-boost=1 --download-scalapack with-external-packages-dir=/home/npozin/Felisce_libraries/petsc_3.4.3/packages [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: MatShellGetContext() line 202 in /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/impls/shell/shell.c End userMult [0]PETSC ERROR: MatMult() line 2179 in /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/interface/matrix.c [0]PETSC ERROR: KSP_MatMult() line 204 in /home/npozin/Felisce_libraries/petsc_3.4.3/include/petsc-private/kspimpl.h [0]PETSC ERROR: KSPSolve_CG() line 219 in /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/impls/cg/cg.c [0]PETSC ERROR: KSPSolve() line 441 in /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/interface/itfunc.c " I don't understand where the problem comes from with the matrix argument of MatShellGetContext. Any idea on what I do wrong? Thanks a lot, Nicolas #include #include using namespace std; typedef struct { int val; } MyCtx; class ShellClass { Mat matShell; KSP ksp; PC pc; Vec x; Vec b; public: void userMult(Mat Amat, Vec x, Vec y) { cout << "Inside userMult" << endl; MyCtx *ctx; MatShellGetContext(Amat, (void *) ctx); cout << "End userMult" << endl; } void solveShell() { // context MyCtx *ctx = new MyCtx; ctx->val = 42; // pc PCCreate(PETSC_COMM_WORLD, &pc); PCSetType(pc, PCNONE); // ksp KSPCreate(PETSC_COMM_WORLD, &ksp); KSPSetType(ksp, KSPCG); KSPSetPC(ksp, pc); KSPSetFromOptions(ksp); // matshell int m = 10; int n = 10; MatCreateShell(PETSC_COMM_WORLD, m, n, PETSC_DETERMINE, PETSC_DETERMINE, ctx, &matShell); MatShellSetOperation(matShell, MATOP_MULT, (void(*)(void))&ShellClass::userMult); // create vectors MatCreateVecs(matShell, &x, 0); VecDuplicate(x, &b); VecSet(b, 1.); // set operators KSPSetOperators(ksp, matShell, matShell); // solve (call to userMult) KSPSolve(ksp, b, x); } }; int main(int argc, char** argv) { PetscInitialize(&argc, &argv, NULL, NULL); ShellClass foo; foo.solveShell(); PetscFinalize(); return 0; } -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: makefile Type: text/x-makefile Size: 171 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test.cpp Type: text/x-c++src Size: 1372 bytes Desc: not available URL: From solvercorleone at gmail.com Wed Aug 5 04:42:16 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 18:42:16 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Hi I tried the method you suggested. 
However, I got the error message. My code and message are below. K is the big matrix containing column matrices. code: call MatGetArray(K,KArray,KArrayOffset,ierr) call MatGetLocalSize(R,local_RRow,local_RCol) call MatGetArray(R,RArray,RArrayOffset,ierr) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) localRsize = local_RRow * local_RCol do genIdx= 1, localRsize KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) end do call MatRestoreArray(R,RArray,RArrayOffset,ierr) call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) do stepIdx= 2, step_k blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) end do call MatRestoreArray(K,KArray,KArrayOffset,ierr) do stepIdx= 2, step_k call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) end do And I got the error message as below: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. 
[0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range ---------------------------------------------------- [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file -------------------------------------------------------------------------- [mpi::mpi-api::mpi-abort] MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] [p01-024:26516] ./kmath.bcbcg [0x1bf620] [p01-024:26516] ./kmath.bcbcg [0x1bf20c] [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] [p01-024:26516] [(nil)] [p01-024:26516] ./kmath.bcbcg [0x1a2054] [p01-024:26516] ./kmath.bcbcg [0x1064f8] [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] [p01-024:26516] ./kmath.bcbcg [0x1051ec] [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. 
[0]PETSC ERROR: --------------------- Error Message ------------------------------------ [0]PETSC ERROR: Signal received! [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 [0]PETSC ERROR: See docs/changes/index.html for recent updates. [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. [0]PETSC ERROR: See docs/index.html for manual pages. [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) However, if I change from call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) to call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) everything is fine. could you please suggest some way to solve this? Thanks Cong Li On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > Thank you very much for your help and suggestions. > With your help, finally I could continue my project. > > Regards > > Cong Li > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > >> >> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >> created. >> >> Since you want to use the C that is passed in you should use >> MAT_REUSE_MATRIX. >> >> Note that since your B and C matrices are dense the issue of sparsity >> pattern of C is not relevant. >> >> Barry >> >> > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: >> > >> > Thanks very much. This answer is very helpful. >> > And I have a following question. >> > If I create B1, B2, .. by the way you suggested and then use MatMatMult >> to do SPMM. >> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >> fill,Mat *C) >> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >> > >> > Thanks >> > >> > Cong Li >> > >> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: >> > >> > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: >> > > >> > > I am sorry that I should have explained it more clearly. >> > > Actually I want to compute a recurrence. 
>> > > >> > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >> A*B2=B3 and so on. >> > > Finally I want to combine all these results into a bigger matrix >> C=[B1,B2 ...] >> > >> > First create C with MatCreateDense(,&C). Then call >> MatDenseGetArray(C,&array); then create B1 with >> MatCreateDense(....,array,&B1); then create >> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the >> number of __local__ rows in B1 times the number of columns in B1, then >> create B3 with a larger shift etc. >> > >> > Note that you are "sharing" the array space of C with B1, B2, B3, >> ..., each Bi contains its columns of the C matrix. >> > >> > Barry >> > >> > >> > >> > > >> > > Is there any way to do this efficiently. >> > > >> > > >> > > >> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >> patrick.sanan at gmail.com> wrote: >> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >> > > > Thanks for your reply. >> > > > >> > > > I have an other question. >> > > > I want to do SPMM several times and combine result matrices into >> one bigger >> > > > matrix. >> > > > for example >> > > > I firstly calculate AX1=B1, AX2=B2 ... >> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >> > > > >> > > > Could you please suggest a way of how to do this. >> > > This is just linear algebra, nothing to do with PETSc specifically. >> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >> > > > >> > > > Thanks >> > > > >> > > > Cong Li >> > > > >> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >> > > > >> > > > > Cong Li writes: >> > > > > >> > > > > > Hello, >> > > > > > >> > > > > > I am a PhD student using PETsc for my research. >> > > > > > I am wondering if there is a way to implement SPMM (Sparse >> matrix-matrix >> > > > > > multiplication) by using PETSc. >> > > > > >> > > > > >> > > > > >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >> > > > > >> > > >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Wed Aug 5 04:47:53 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 18:47:53 +0900 Subject: [petsc-users] Questions about creation of matrix and setting its values In-Reply-To: References: Message-ID: Thanks, Patrick. I think I got it now. Cong Li On Wed, Aug 5, 2015 at 3:45 PM, Patrick Sanan wrote: > > > > > Am 05.08.2015 um 08:23 schrieb Cong Li : > > Hi, > > I am wondering if it is necessary to call > MatAssemblyBegin() and MatAssemblyEnd() after MatDuplicate() with the > option of MAT_DO_NOT_COPY_VALUES. > For example, if I have an assembled matrix A, and I call MatDuplicate() to > create B, which is a duplication of A. > Do I need to call MatAssemblyBegin() and MatAssemblyEnd() for B. > > And 2nd question is : > just after the MatCreateDense() call and before MatAssemblyBegin() > and MatAssemblyEnd() calls, can I use MatGetArray() ? > > The 3rd question is: > before the MatAssemblyBegin() and MatAssemblyEnd() calls, should I use > INSERT_VALUES or ADD_VALUES for MatSetValues call? And why ? > Actually I have read the manual, but I still feel confused about the means > of INSERT_VALUES and ADD_VALUES. 
> > There are a couple of reasons that you need to make these > MatAssemblyBegin/End calls: > - entries can be set which should be stored on a different process, so > these need to be communicated > - for compressed formats like CSR (as used in MATAIJ and others) the > entries need to be processed into their compressed form > In general, the entries of the matrix are not stored in their "usable" > forms until you make the MatAssembleEnd call. Rather they are kept in some > easy-to-insert-into intermediate storage. INSERT_VALUES means that old > values that might be in the matrix are overwritten , and ADD_VALUES means > that the new entries from intermediate storage are added to the old values. > > > > Thanks > > Cong Li > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 5 06:38:20 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 06:38:20 -0500 Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> References: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Message-ID: On Wed, Aug 5, 2015 at 4:15 AM, Nicolas Pozin wrote: > Hello, > > I'm trying to solve a system with a matrix free operator and through > conjugate gradient method. > To make ideas clear, I set up the following simple example (I am using > petsc-3.6) and I get this error message : > Yes, you are passing a C++ function userMult, so the compiler sticks "this" in as the first argument. We do not recommend this kind of wrapping. Thanks, Matt > " > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./test on a ubuntu_release named pl-59080 by npozin Wed > Aug 5 10:55:26 2015 > [0]PETSC ERROR: Libraries linked from > /home/npozin/Felisce_libraries/petsc_3.4.3/ubuntu_release/lib > [0]PETSC ERROR: Configure run at Wed Jul 22 16:18:36 2015 > [0]PETSC ERROR: Configure options PETSC_ARCH=ubuntu_release --with-cxx=g++ > --with-fc=gfortran --with-cc=gcc --with-x=0 --download-openmpi > --download-f-blas-lapack --download-superlu --download-superlu_dist > --with-superlu_dist=1 --download-metis --download-mumps --download-parmetis > --with-superlu_dist=1 --download-boost --with-boost=1 --download-scalapack > with-external-packages-dir=/home/npozin/Felisce_libraries/petsc_3.4.3/packages > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: MatShellGetContext() line 202 in > /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/impls/shell/shell.c > End userMult > [0]PETSC ERROR: MatMult() line 2179 in > /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/interface/matrix.c > [0]PETSC ERROR: KSP_MatMult() line 204 in > /home/npozin/Felisce_libraries/petsc_3.4.3/include/petsc-private/kspimpl.h > [0]PETSC ERROR: KSPSolve_CG() line 219 in > /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/impls/cg/cg.c > [0]PETSC ERROR: KSPSolve() line 441 in > /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/interface/itfunc.c > " > > I don't understand where the problem comes from with the matrix argument > of MatShellGetContext. > Any idea on what I do wrong? > > Thanks a lot, > Nicolas > > > > #include > #include > > using namespace std; > > > typedef struct { > int val; > } MyCtx; > > > class ShellClass { > Mat matShell; > KSP ksp; > PC pc; > Vec x; > Vec b; > > public: > void userMult(Mat Amat, Vec x, Vec y) { > cout << "Inside userMult" << endl; > > MyCtx *ctx; > MatShellGetContext(Amat, (void *) ctx); > > cout << "End userMult" << endl; > } > > void solveShell() { > // context > MyCtx *ctx = new MyCtx; > ctx->val = 42; > > // pc > PCCreate(PETSC_COMM_WORLD, &pc); > PCSetType(pc, PCNONE); > > // ksp > KSPCreate(PETSC_COMM_WORLD, &ksp); > KSPSetType(ksp, KSPCG); > KSPSetPC(ksp, pc); > KSPSetFromOptions(ksp); > > // matshell > int m = 10; > int n = 10; > MatCreateShell(PETSC_COMM_WORLD, m, n, PETSC_DETERMINE, > PETSC_DETERMINE, ctx, &matShell); > MatShellSetOperation(matShell, MATOP_MULT, > (void(*)(void))&ShellClass::userMult); > > > // create vectors > MatCreateVecs(matShell, &x, 0); > VecDuplicate(x, &b); > VecSet(b, 1.); > > // set operators > KSPSetOperators(ksp, matShell, matShell); > > // solve (call to userMult) > KSPSolve(ksp, b, x); > } > }; > > > > int main(int argc, char** argv) { > PetscInitialize(&argc, &argv, NULL, NULL); > > ShellClass foo; > foo.solveShell(); > > PetscFinalize(); > return 0; > } > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 5 06:45:02 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 06:45:02 -0500 Subject: [petsc-users] Error running DMPlex example In-Reply-To: References: Message-ID: On Tue, Aug 4, 2015 at 11:27 PM, Gautam Bisht wrote: > Hi, > > I'm getting the following error while running the following DMPlex > example. 
Any suggestion what is going wrong? Attached are example.log and > configure.log. > Can you run the test by hand either with the debugger and get a stack trace: /opt/local/bin/mpiexec-mpich-gcc49 -host localhost -n 1 darwin-gnu-fort-debug/lib/ex12-obj/ex12 -run_type test -refinement_limit 0.0 -bc_type dirichlet -interpolate 0 -petscspace_order 1 -show_initial -dm_plex_print_fem 1 -start_in_debugger or using valgrind? I cannot reproduce the problem here. Thanks, Matt > python2.7 ./config/builder2.py check src/snes/examples/tutorials/ex12.c > Namespace(args=[], files=['src/snes/examples/tutorials/ex12.c'], > func=, numProcs=None, regParams=None, > replace=False, retain=False, testnum=None) > Running 52 tests > Building > ['/Users/gbisht/projects/petsc/petsc_f0284fa/src/snes/examples/tutorials/ex12.c'] > Running #0: /opt/local/bin/mpiexec-mpich-gcc49 -host localhost -n 1 > darwin-gnu-fort-debug/lib/ex12-obj/ex12 -run_type test -refinement_limit > 0.0 -bc_type dirichlet -interpolate 0 -petscspace_order 1 -s > TEST ERROR: Failed to execute darwin-gnu-fort-debug/lib/ex12-obj/ex12 > =================================================================================== > > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES > > > = PID 69451 RUNNING AT localhost > > > = EXIT CODE: 59 > > > = CLEANING UP REMAINING PROCESSES > > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES > > > > =================================================================================== > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: likely location of problem given in stack below > > > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > > [0]PETSC ERROR: INSTEAD the line number of the start of the function > > > [0]PETSC ERROR: is given. > > > [0]PETSC ERROR: [0] DMPlexGenerate_Triangle line 217 > src/dm/impls/plex/plexgenerate.c > > [0]PETSC ERROR: [0] DMPlexGenerate line 1056 > src/dm/impls/plex/plexgenerate.c > > > [0]PETSC ERROR: [0] DMPlexCreateBoxMesh line 897 > src/dm/impls/plex/plexcreate.c > > > [0]PETSC ERROR: [0] CreateMesh line 347 src/snes/examples/tutorials/ex12.c > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Signal received > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Development GIT revision: v3.6.1-238-gf0284fa GIT > Date: 2015-07-27 13:34:26 -0500 > > [0]PETSC ERROR: darwin-gnu-fort-debug/lib/ex12-obj/ex12 on a > darwin-gnu-fort-debug named gautam-laptop by gbisht Tue Aug 4 21:09:37 > 2015 > [0]PETSC ERROR: Configure options --download-hdf5=1 > --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate > --download-parmetis=yes --download-metis=yes --with-c > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > > > > TEST RUN FAILED (check example.log for details) > > > -Gautam. > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Aug 5 09:23:04 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 5 Aug 2015 09:23:04 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Cong: You cannot use "MAT_REUSE_MATRIX" on arbitrary matrix product. The correct process is call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_ DEFAULT_INTEGER,C, ierr) call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,C, ierr) i.e., C has data structure of A*Km(stepIdx-1) and is created in the first call. C can be reused in the 2nd call when A or Km(stepIdx-1) changed values, but not the structures. In your case, Km(stepIdx) = A*Km(stepIdx-1). You should do 'call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)' directly. Hong On Wed, Aug 5, 2015 at 4:42 AM, Cong Li wrote: > Hi > > I tried the method you suggested. However, I got the error message. > My code and message are below. > > K is the big matrix containing column matrices. 
> > code: > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > do stepIdx= 2, step_k > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > end do > > > And I got the error message as below: > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 > CDT 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > ---------------------------------------------------- > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed > Aug 5 18:24:40 2015 > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > -------------------------------------------------------------------------- > [mpi::mpi-api::mpi-abort] > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. 
> -------------------------------------------------------------------------- > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > [p01-024:26516] [(nil)] > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the > batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 > CDT 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. 
> [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed > Aug 5 18:24:40 2015 > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > However, if I change from > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > to > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX > ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > everything is fine. > > could you please suggest some way to solve this? > > Thanks > > Cong Li > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > >> Thank you very much for your help and suggestions. >> With your help, finally I could continue my project. >> >> Regards >> >> Cong Li >> >> >> >> On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >> >>> >>> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>> created. >>> >>> Since you want to use the C that is passed in you should use >>> MAT_REUSE_MATRIX. >>> >>> Note that since your B and C matrices are dense the issue of sparsity >>> pattern of C is not relevant. >>> >>> Barry >>> >>> > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: >>> > >>> > Thanks very much. This answer is very helpful. >>> > And I have a following question. >>> > If I create B1, B2, .. by the way you suggested and then use >>> MatMatMult to do SPMM. >>> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>> fill,Mat *C) >>> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>> > >>> > Thanks >>> > >>> > Cong Li >>> > >>> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>> wrote: >>> > >>> > > On Aug 4, 2015, at 4:09 AM, Cong Li >>> wrote: >>> > > >>> > > I am sorry that I should have explained it more clearly. >>> > > Actually I want to compute a recurrence. >>> > > >>> > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, >>> A*B2=B3 and so on. >>> > > Finally I want to combine all these results into a bigger matrix >>> C=[B1,B2 ...] >>> > >>> > First create C with MatCreateDense(,&C). 
Then call >>> MatDenseGetArray(C,&array); then create B1 with >>> MatCreateDense(....,array,&B1); then create >>> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the >>> number of __local__ rows in B1 times the number of columns in B1, then >>> create B3 with a larger shift etc. >>> > >>> > Note that you are "sharing" the array space of C with B1, B2, B3, >>> ..., each Bi contains its columns of the C matrix. >>> > >>> > Barry >>> > >>> > >>> > >>> > > >>> > > Is there any way to do this efficiently. >>> > > >>> > > >>> > > >>> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>> patrick.sanan at gmail.com> wrote: >>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>> > > > Thanks for your reply. >>> > > > >>> > > > I have an other question. >>> > > > I want to do SPMM several times and combine result matrices into >>> one bigger >>> > > > matrix. >>> > > > for example >>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>> > > > >>> > > > Could you please suggest a way of how to do this. >>> > > This is just linear algebra, nothing to do with PETSc specifically. >>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>> > > > >>> > > > Thanks >>> > > > >>> > > > Cong Li >>> > > > >>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>> wrote: >>> > > > >>> > > > > Cong Li writes: >>> > > > > >>> > > > > > Hello, >>> > > > > > >>> > > > > > I am a PhD student using PETsc for my research. >>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>> matrix-matrix >>> > > > > > multiplication) by using PETSc. >>> > > > > >>> > > > > >>> > > > > >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>> > > > > >>> > > >>> > >>> > >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Wed Aug 5 09:43:51 2015 From: solvercorleone at gmail.com (Cong Li) Date: Wed, 5 Aug 2015 23:43:51 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Hong, Thanks for your answer. However, in my problem, I have a pre-allocated matrix K, and its columns are associated with Km(1), .. Km(step_k) respectively. What I want to do is to update Km(2) by using the result of A*Km(1), and then to update Km(3) by using the product of A and updated Km(2) and so on. So, I think I need to use MAT_REUSE_MATRIX from the beginning, since even when it is the first time I call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_ DEFAULT_INTEGER,Km(stepIdx), ierr)', Km(stepIdx) have actually already been allocated (in K). Do you think it is possible that I can do this, and could you please suggest some possible ways. Thanks Cong Li On Wed, Aug 5, 2015 at 11:23 PM, Hong wrote: > Cong: > You cannot use "MAT_REUSE_MATRIX" on arbitrary matrix product. > The correct process is > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_ > DEFAULT_INTEGER,C, ierr) > call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_ > DEFAULT_INTEGER,C, ierr) > i.e., C has data structure of A*Km(stepIdx-1) and is created in the first > call. C can be reused in the 2nd call when A or Km(stepIdx-1) changed > values, but not the structures. > > In your case, Km(stepIdx) = A*Km(stepIdx-1). 
You should do > 'call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX > ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)' > directly. > > Hong > > On Wed, Aug 5, 2015 at 4:42 AM, Cong Li wrote: > >> Hi >> >> I tried the method you suggested. However, I got the error message. >> My code and message are below. >> >> K is the big matrix containing column matrices. >> >> code: >> >> call MatGetArray(K,KArray,KArrayOffset,ierr) >> >> call MatGetLocalSize(R,local_RRow,local_RCol) >> >> call MatGetArray(R,RArray,RArrayOffset,ierr) >> >> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> >> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) >> >> localRsize = local_RRow * local_RCol >> do genIdx= 1, localRsize >> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >> end do >> >> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >> >> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >> >> do stepIdx= 2, step_k >> >> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >> >> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> >> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) >> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> end do >> >> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >> >> do stepIdx= 2, step_k >> >> >> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >> end do >> >> >> And I got the error message as below: >> >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> [0]PETSC ERROR: to get more information on the crash. >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Signal received! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 >> CDT 2013 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: --------------------[1]PETSC ERROR: >> ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> ---------------------------------------------------- >> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed >> Aug 5 18:24:40 2015 >> [0]PETSC ERROR: Libraries linked from >> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> -------------------------------------------------------------------------- >> [mpi::mpi-api::mpi-abort] >> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >> with errorcode 59. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. 
>> -------------------------------------------------------------------------- >> [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >> [0xffffffff0091f684] >> [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >> [0xffffffff006c389c] >> [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >> [0xffffffff006db3ac] >> [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >> [0xffffffff00281bf0] >> [p01-024:26516] ./kmath.bcbcg [0x1bf620] >> [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >> [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >> [p01-024:26516] [(nil)] >> [p01-024:26516] ./kmath.bcbcg [0x1a2054] >> [p01-024:26516] ./kmath.bcbcg [0x1064f8] >> [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >> [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >> [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >> [0xffffffff02d3b81c] >> [p01-024:26516] ./kmath.bcbcg [0x1051ec] >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the >> batch system) has told this process to end >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> [0]PETSC ERROR: to get more information on the crash. >> [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> [0]PETSC ERROR: Signal received! >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 >> CDT 2013 >> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> [0]PETSC ERROR: See docs/index.html for manual pages. 
>> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed >> Aug 5 18:24:40 2015 >> [0]PETSC ERROR: Libraries linked from >> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> [ERR.] PLE 0019 plexec One of MPI processes was >> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >> >> However, if I change from >> >> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >> to >> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX >> ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >> >> everything is fine. >> >> could you please suggest some way to solve this? >> >> Thanks >> >> Cong Li >> >> On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >> wrote: >> >>> Thank you very much for your help and suggestions. >>> With your help, finally I could continue my project. >>> >>> Regards >>> >>> Cong Li >>> >>> >>> >>> On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >>> >>>> >>>> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>>> created. >>>> >>>> Since you want to use the C that is passed in you should use >>>> MAT_REUSE_MATRIX. >>>> >>>> Note that since your B and C matrices are dense the issue of sparsity >>>> pattern of C is not relevant. >>>> >>>> Barry >>>> >>>> > On Aug 4, 2015, at 11:59 AM, Cong Li >>>> wrote: >>>> > >>>> > Thanks very much. This answer is very helpful. >>>> > And I have a following question. >>>> > If I create B1, B2, .. by the way you suggested and then use >>>> MatMatMult to do SPMM. >>>> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>>> fill,Mat *C) >>>> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>> > >>>> > Thanks >>>> > >>>> > Cong Li >>>> > >>>> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>>> wrote: >>>> > >>>> > > On Aug 4, 2015, at 4:09 AM, Cong Li >>>> wrote: >>>> > > >>>> > > I am sorry that I should have explained it more clearly. >>>> > > Actually I want to compute a recurrence. >>>> > > >>>> > > Like, I want to firstly compute A*X1=B1, and then calculate >>>> A*B1=B2, A*B2=B3 and so on. >>>> > > Finally I want to combine all these results into a bigger matrix >>>> C=[B1,B2 ...] >>>> > >>>> > First create C with MatCreateDense(,&C). 
Then call >>>> MatDenseGetArray(C,&array); then create B1 with >>>> MatCreateDense(....,array,&B1); then create >>>> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>>> the number of __local__ rows in B1 times the number of columns in B1, then >>>> create B3 with a larger shift etc. >>>> > >>>> > Note that you are "sharing" the array space of C with B1, B2, B3, >>>> ..., each Bi contains its columns of the C matrix. >>>> > >>>> > Barry >>>> > >>>> > >>>> > >>>> > > >>>> > > Is there any way to do this efficiently. >>>> > > >>>> > > >>>> > > >>>> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>> patrick.sanan at gmail.com> wrote: >>>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>> > > > Thanks for your reply. >>>> > > > >>>> > > > I have an other question. >>>> > > > I want to do SPMM several times and combine result matrices into >>>> one bigger >>>> > > > matrix. >>>> > > > for example >>>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>> > > > >>>> > > > Could you please suggest a way of how to do this. >>>> > > This is just linear algebra, nothing to do with PETSc specifically. >>>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>> > > > >>>> > > > Thanks >>>> > > > >>>> > > > Cong Li >>>> > > > >>>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>> wrote: >>>> > > > >>>> > > > > Cong Li writes: >>>> > > > > >>>> > > > > > Hello, >>>> > > > > > >>>> > > > > > I am a PhD student using PETsc for my research. >>>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>> matrix-matrix >>>> > > > > > multiplication) by using PETSc. >>>> > > > > >>>> > > > > >>>> > > > > >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>> > > > > >>>> > > >>>> > >>>> > >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Wed Aug 5 09:46:24 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Wed, 5 Aug 2015 14:46:24 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Hong, If I set parsymbfact: $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[63679,1],0] Exit code: 255 -------------------------------------------------------------------------- Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. 
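For reference, the PETSc side of such a run is small. The following is a minimal sketch, not the actual ./solve driver (whose source is not shown in the thread); the routine name solve_direct and the assumption that A, b and x already exist are illustrative only. It shows how a direct solve is typically wired to SuperLU_DIST so that the runtime options used above (-ksp_type, -pc_type, -pc_factor_mat_solver_package, -mat_superlu_dist_*) reach the factorization, and it creates the KSP on the communicator of the matrix.

#include <petscksp.h>

/* Minimal sketch: direct solve of A x = b through SuperLU_DIST.
   A, b and x are assumed to be created elsewhere; the KSP lives on
   whatever communicator A was created on. */
PetscErrorCode solve_direct(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPCreate(PetscObjectComm((PetscObject)A), &ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  /* Pick up -ksp_view, -mat_superlu_dist_matinput, -mat_superlu_dist_parsymbfact, ... */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  /* The LU factorization (and any SuperLU_DIST failure) happens inside KSPSolve. */
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The "N MPI processes" and "Process grid nprow x npcol" lines printed by -ksp_view reflect the communicator on which the matrix and KSP were created, so a report of "1 MPI processes" with "type: seqaij" under mpiexec -n 2 usually indicates that each rank has built its own sequential problem rather than one distributed system.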
If I do not set it, I get a serial run even if I specify ?n 2: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view ? KSP Object: 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 total: nonzeros=34223, allocated nonzeros=34223 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 668 nodes, limit used is 5 I am running PETSc via Cygwin on a windows machine. When I installed PETSc the tests with different numbers of processes ran well. Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 19:06 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. Please run it with '-ksp_view' and see what 'SuperLU_DIST run parameters:' are being used, e.g. petsc/src/ksp/ksp/examples/tutorials (maint) $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view ... SuperLU_DIST run parameters: Process grid nprow 2 x npcol 1 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 2 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm I do not understand why your code uses matrix input mode = global. Hong From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 16:46 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- ... PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). That's why you get the following error: Invalid ISPEC at line 484 in file get_perm_c.c You need to use distributed matrix input interface pzgssvx() (without ABglobal) Sherry On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Hong and Sherry, I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 30 juli 2015 02:58 To: ?lker-Kaustell, Mahir Cc: Xiaoye Li; PETSc users list Subject: Fwd: [petsc-users] SuperLU MPI-problem Mahir, Sherry fixed several bugs in superlu_dist-v4.1. The current petsc-release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? Here is how to do it: 1. download superlu_dist v4.1 2. remove existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz' 3. build petsc Let us know if the issue remains. Hong ---------- Forwarded message ---------- From: Xiaoye S. Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. 
Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) 
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 
0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve 
(snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 
0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside 
a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: 
MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. 
The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Aug 5 10:10:58 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 5 Aug 2015 10:10:58 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: Mahir: As you noticed, you ran the code in serial mode, not parallel. Check your code on input communicator, e.g., what input communicator do you use in KSPCreate(comm,&ksp)? I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' in serial mode, this option is ignored with a warning. Hong Hong, > > > > If I set parsymbfact: > > > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. 
> > ------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec detected that one or more processes exited with non-zero status, > thus causing > > the job to be terminated. The first process to do so was: > > > > Process name: [[63679,1],0] > > Exit code: 255 > > -------------------------------------------------------------------------- > > > > Since the program does not finish the call to KSPSolve(), we do not get > any information about the KSP from ?ksp_view. > > > > If I do not set it, I get a serial run even if I specify ?n 2: > > > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -ksp_view > > ? > > KSP Object: 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > total: nonzeros=34223, allocated nonzeros=34223 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 668 nodes, limit used is 5 > > > > I am running PETSc via Cygwin on a windows machine. > > When I installed PETSc the tests with different numbers of processes ran > well. > > > > Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 19:06 > *To:* ?lker-Kaustell, Mahir > *Cc:* Hong; Xiaoye S. Li; PETSc users list > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for > parallel runs. > > > > If I use 2 processors, the program runs if I use > *?mat_superlu_dist_parsymbfact=1*: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > GLOBAL -mat_superlu_dist_parsymbfact=1 > > > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so > your code runs well without parsymbfact. > > > > Please run it with '-ksp_view' and see what > > 'SuperLU_DIST run parameters:' are being used, e.g. > > petsc/src/ksp/ksp/examples/tutorials (maint) > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > > > ... 
> > SuperLU_DIST run parameters: > > Process grid nprow 2 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 1 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 2 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > > > I do not understand why your code uses matrix input mode = global. > > > > Hong > > > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 16:46 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; Hong; PETSc users list > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry found the culprit. I can reproduce it: > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist > -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > ... > > > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when > using more than one processes. > > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or > set matinput=GLOBAL for parallel run? > > > > I'll add an error flag for these use cases. > > > > Hong > > > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote: > > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid > ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 30 juli 2015 02:58 > *To:* ?lker-Kaustell, Mahir > *Cc:* Xiaoye Li; PETSc users list > > > *Subject:* Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > From: *Xiaoye S. 
Li* > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > ? > > > > ---------- Forwarded message ---------- > From: *Mahir.Ulker-Kaustell at tyrens.se * < > Mahir.Ulker-Kaustell at tyrens.se> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong , "Xiaoye S. Li" > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is > in the tens in general but certain constraints lead to non-diagonal streaks > in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I set > options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. 
Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. 
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes 
inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > 
==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 
0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > 
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc 
(vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: 
PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum 
(pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) 
> > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pozin at inria.fr Wed Aug 5 10:20:41 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 5 Aug 2015 17:20:41 +0200 (CEST) Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: References: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Message-ID: <417177684.6699042.1438788041611.JavaMail.zimbra@inria.fr> Thank you! ----- Mail original ----- > De: "Matthew Knepley" > ?: "Nicolas Pozin" > Cc: "PETSc" > Envoy?: Mercredi 5 Ao?t 2015 13:38:20 > Objet: Re: [petsc-users] problem with MatShellGetContext > On Wed, Aug 5, 2015 at 4:15 AM, Nicolas Pozin < nicolas.pozin at inria.fr > > wrote: > > Hello, > > > I'm trying to solve a system with a matrix free operator and through > > conjugate gradient method. 
> > > To make ideas clear, I set up the following simple example (I am using > > petsc-3.6) and I get this error message : > > Yes, you are passing a C++ function userMult, so the compiler sticks "this" > in as the first argument. We do not > recommend this kind of wrapping. > Thanks, > Matt > > " > > > [0]PETSC ERROR: --------------------- Error Message > > ------------------------------------ > > > [0]PETSC ERROR: Invalid argument! > > > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.4.3, Oct, 15, 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./test on a ubuntu_release named pl-59080 by npozin Wed Aug > > 5 > > 10:55:26 2015 > > > [0]PETSC ERROR: Libraries linked from > > /home/npozin/Felisce_libraries/petsc_3.4.3/ubuntu_release/lib > > > [0]PETSC ERROR: Configure run at Wed Jul 22 16:18:36 2015 > > > [0]PETSC ERROR: Configure options PETSC_ARCH=ubuntu_release --with-cxx=g++ > > --with-fc=gfortran --with-cc=gcc --with-x=0 --download-openmpi > > --download-f-blas-lapack --download-superlu --download-superlu_dist > > --with-superlu_dist=1 --download-metis --download-mumps --download-parmetis > > --with-superlu_dist=1 --download-boost --with-boost=1 --download-scalapack > > with-external-packages-dir=/home/npozin/Felisce_libraries/petsc_3.4.3/packages > > > [0]PETSC ERROR: > > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: MatShellGetContext() line 202 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/impls/shell/shell.c > > > End userMult > > > [0]PETSC ERROR: MatMult() line 2179 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/src/mat/interface/matrix.c > > > [0]PETSC ERROR: KSP_MatMult() line 204 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/include/petsc-private/kspimpl.h > > > [0]PETSC ERROR: KSPSolve_CG() line 219 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/impls/cg/cg.c > > > [0]PETSC ERROR: KSPSolve() line 441 in > > /home/npozin/Felisce_libraries/petsc_3.4.3/src/ksp/ksp/interface/itfunc.c > > > " > > > I don't understand where the problem comes from with the matrix argument of > > MatShellGetContext. > > > Any idea on what I do wrong? 
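For reference, a minimal sketch of the pattern Matt is recommending instead of the C++ wrapping: the MATOP_MULT callback must be a plain function (file scope, or a static member of the class), and the context is retrieved by passing the address of the context pointer. This is an illustration only, not code from this thread:

   #include <petscmat.h>

   typedef struct { int val; } MyCtx;

   /* A file-scope function (a static member of ShellClass would work the same way);
      a non-static member cannot be used because of the implicit 'this' argument. */
   static PetscErrorCode UserMult(Mat Amat, Vec x, Vec y)
   {
     MyCtx          *ctx;
     PetscErrorCode ierr;
     ierr = MatShellGetContext(Amat, (void**)&ctx); CHKERRQ(ierr); /* note: address of ctx */
     /* ... apply the operator to x, store the result in y, using ctx ... */
     return 0;
   }

   /* registration stays the same:
      MatShellSetOperation(matShell, MATOP_MULT, (void(*)(void))UserMult); */
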
> > > Thanks a lot, > > > Nicolas > > > #include > > > #include > > > using namespace std; > > > typedef struct { > > > int val; > > > } MyCtx; > > > class ShellClass { > > > Mat matShell; > > > KSP ksp; > > > PC pc; > > > Vec x; > > > Vec b; > > > public: > > > void userMult(Mat Amat, Vec x, Vec y) { > > > cout << "Inside userMult" << endl; > > > MyCtx *ctx; > > > MatShellGetContext(Amat, (void *) ctx); > > > cout << "End userMult" << endl; > > > } > > > void solveShell() { > > > // context > > > MyCtx *ctx = new MyCtx; > > > ctx->val = 42; > > > // pc > > > PCCreate(PETSC_COMM_WORLD, &pc); > > > PCSetType(pc, PCNONE); > > > // ksp > > > KSPCreate(PETSC_COMM_WORLD, &ksp); > > > KSPSetType(ksp, KSPCG); > > > KSPSetPC(ksp, pc); > > > KSPSetFromOptions(ksp); > > > // matshell > > > int m = 10; > > > int n = 10; > > > MatCreateShell(PETSC_COMM_WORLD, m, n, PETSC_DETERMINE, PETSC_DETERMINE, > > ctx, > > &matShell); > > > MatShellSetOperation(matShell, MATOP_MULT, > > (void(*)(void))&ShellClass::userMult); > > > // create vectors > > > MatCreateVecs(matShell, &x, 0); > > > VecDuplicate(x, &b); > > > VecSet(b, 1.); > > > // set operators > > > KSPSetOperators(ksp, matShell, matShell); > > > // solve (call to userMult) > > > KSPSolve(ksp, b, x); > > > } > > > }; > > > int main(int argc, char** argv) { > > > PetscInitialize(&argc, &argv, NULL, NULL); > > > ShellClass foo; > > > foo.solveShell(); > > > PetscFinalize(); > > > return 0; > > > } > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Aug 5 10:28:34 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 5 Aug 2015 10:28:34 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Cong, For the first loop: do stepIdx= 2, step_k blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) ... end do Do you use Km(stepIdx) here? If not, replace MatCreateDense() with MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,...). Is matrix A dense or sparse? Hong On Wed, Aug 5, 2015 at 9:43 AM, Cong Li wrote: > Hong, > > Thanks for your answer. > However, in my problem, I have a pre-allocated matrix K, and its columns > are associated with Km(1), .. Km(step_k) respectively. What I want to do is > to update Km(2) by using the result of A*Km(1), and then to update Km(3) by > using the product of A and updated Km(2) and so on. > > So, I think I need to use MAT_REUSE_MATRIX from the beginning, since even > when it is the first time I call > MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_ > DEFAULT_INTEGER,Km(stepIdx), ierr)', > > Km(stepIdx) have actually already been allocated (in K). > > Do you think it is possible that I can do this, and could you please > suggest some possible ways. > > Thanks > > Cong Li > > On Wed, Aug 5, 2015 at 11:23 PM, Hong wrote: > >> Cong: >> You cannot use "MAT_REUSE_MATRIX" on arbitrary matrix product. 
>> The correct process is >> >> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_ >> DEFAULT_INTEGER,C, ierr) >> call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_ >> DEFAULT_INTEGER,C, ierr) >> i.e., C has data structure of A*Km(stepIdx-1) and is created in the >> first call. C can be reused in the 2nd call when A or Km(stepIdx-1) >> changed values, but not the structures. >> >> In your case, Km(stepIdx) = A*Km(stepIdx-1). You should do >> 'call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX >> ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)' >> directly. >> >> Hong >> >> On Wed, Aug 5, 2015 at 4:42 AM, Cong Li wrote: >> >>> Hi >>> >>> I tried the method you suggested. However, I got the error message. >>> My code and message are below. >>> >>> K is the big matrix containing column matrices. >>> >>> code: >>> >>> call MatGetArray(K,KArray,KArrayOffset,ierr) >>> >>> call MatGetLocalSize(R,local_RRow,local_RCol) >>> >>> call MatGetArray(R,RArray,RArrayOffset,ierr) >>> >>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> >>> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) >>> >>> localRsize = local_RRow * local_RCol >>> do genIdx= 1, localRsize >>> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>> end do >>> >>> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>> >>> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> >>> do stepIdx= 2, step_k >>> >>> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>> >>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> >>> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> end do >>> >>> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>> >>> do stepIdx= 2, step_k >>> >>> >>> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>> end do >>> >>> >>> And I got the error message as below: >>> >>> >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> [0]PETSC ERROR: to get more information on the crash. >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Signal received! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>> 22:15:24 CDT 2013 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> [0]PETSC ERROR: --------------------[1]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> ---------------------------------------------------- >>> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed >>> Aug 5 18:24:40 2015 >>> [0]PETSC ERROR: Libraries linked from >>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> >>> -------------------------------------------------------------------------- >>> [mpi::mpi-api::mpi-abort] >>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>> with errorcode 59. >>> >>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>> You may or may not see output from other processes, depending on >>> exactly when Open MPI kills them. 
>>> >>> -------------------------------------------------------------------------- >>> [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>> [0xffffffff0091f684] >>> [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>> [0xffffffff006c389c] >>> [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >>> [0xffffffff006db3ac] >>> [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>> [0xffffffff00281bf0] >>> [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>> [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>> [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>> [p01-024:26516] [(nil)] >>> [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>> [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>> [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>> [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>> [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >>> [0xffffffff02d3b81c] >>> [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the >>> batch system) has told this process to end >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> [0]PETSC ERROR: to get more information on the crash. >>> [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> [0]PETSC ERROR: Signal received! >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>> 22:15:24 CDT 2013 >>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed >>> Aug 5 18:24:40 2015 >>> [0]PETSC ERROR: Libraries linked from >>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> [ERR.] PLE 0019 plexec One of MPI processes was >>> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>> >>> However, if I change from >>> >>> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>> to >>> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX >>> ,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>> >>> everything is fine. >>> >>> could you please suggest some way to solve this? >>> >>> Thanks >>> >>> Cong Li >>> >>> On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >>> wrote: >>> >>>> Thank you very much for your help and suggestions. >>>> With your help, finally I could continue my project. >>>> >>>> Regards >>>> >>>> Cong Li >>>> >>>> >>>> >>>> On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >>>> >>>>> >>>>> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>>>> created. >>>>> >>>>> Since you want to use the C that is passed in you should use >>>>> MAT_REUSE_MATRIX. >>>>> >>>>> Note that since your B and C matrices are dense the issue of >>>>> sparsity pattern of C is not relevant. >>>>> >>>>> Barry >>>>> >>>>> > On Aug 4, 2015, at 11:59 AM, Cong Li >>>>> wrote: >>>>> > >>>>> > Thanks very much. This answer is very helpful. >>>>> > And I have a following question. >>>>> > If I create B1, B2, .. by the way you suggested and then use >>>>> MatMatMult to do SPMM. >>>>> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>>>> fill,Mat *C) >>>>> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>>> > >>>>> > Thanks >>>>> > >>>>> > Cong Li >>>>> > >>>>> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>>>> wrote: >>>>> > >>>>> > > On Aug 4, 2015, at 4:09 AM, Cong Li >>>>> wrote: >>>>> > > >>>>> > > I am sorry that I should have explained it more clearly. >>>>> > > Actually I want to compute a recurrence. >>>>> > > >>>>> > > Like, I want to firstly compute A*X1=B1, and then calculate >>>>> A*B1=B2, A*B2=B3 and so on. >>>>> > > Finally I want to combine all these results into a bigger matrix >>>>> C=[B1,B2 ...] 
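For reference, a compact sketch (in C; sizes and names are illustrative, error checking omitted) of the shared-storage construction described in the reply just below, where C owns the storage and B1, B2, ... are dense matrices laid over consecutive column blocks of C's local array:

   Mat         Cmat, B1, B2;
   PetscScalar *array;
   PetscInt    M = 100, ncols = 4, mloc;

   MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, M, 3*ncols, NULL, &Cmat);
   MatGetLocalSize(Cmat, &mloc, NULL);
   MatDenseGetArray(Cmat, &array);
   MatCreateDense(PETSC_COMM_WORLD, mloc, PETSC_DECIDE, M, ncols, array,              &B1);
   MatCreateDense(PETSC_COMM_WORLD, mloc, PETSC_DECIDE, M, ncols, array + mloc*ncols, &B2);
   /* B3, B4, ... follow with shifts 2*mloc*ncols, 3*mloc*ncols, ...; each Bi then views
      its own columns of Cmat, so filling a Bi fills the corresponding block of Cmat.
      The array should eventually be returned with MatDenseRestoreArray(Cmat,&array). */
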
>>>>> > >>>>> > First create C with MatCreateDense(,&C). Then call >>>>> MatDenseGetArray(C,&array); then create B1 with >>>>> MatCreateDense(....,array,&B1); then create >>>>> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>>>> the number of __local__ rows in B1 times the number of columns in B1, then >>>>> create B3 with a larger shift etc. >>>>> > >>>>> > Note that you are "sharing" the array space of C with B1, B2, B3, >>>>> ..., each Bi contains its columns of the C matrix. >>>>> > >>>>> > Barry >>>>> > >>>>> > >>>>> > >>>>> > > >>>>> > > Is there any way to do this efficiently. >>>>> > > >>>>> > > >>>>> > > >>>>> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>>> patrick.sanan at gmail.com> wrote: >>>>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>>> > > > Thanks for your reply. >>>>> > > > >>>>> > > > I have an other question. >>>>> > > > I want to do SPMM several times and combine result matrices into >>>>> one bigger >>>>> > > > matrix. >>>>> > > > for example >>>>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>>>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>>> > > > >>>>> > > > Could you please suggest a way of how to do this. >>>>> > > This is just linear algebra, nothing to do with PETSc specifically. >>>>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>>> > > > >>>>> > > > Thanks >>>>> > > > >>>>> > > > Cong Li >>>>> > > > >>>>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>>> wrote: >>>>> > > > >>>>> > > > > Cong Li writes: >>>>> > > > > >>>>> > > > > > Hello, >>>>> > > > > > >>>>> > > > > > I am a PhD student using PETsc for my research. >>>>> > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>>> matrix-matrix >>>>> > > > > > multiplication) by using PETSc. >>>>> > > > > >>>>> > > > > >>>>> > > > > >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>>> > > > > >>>>> > > >>>>> > >>>>> > >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 5 10:29:22 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Aug 2015 09:29:22 -0600 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <55C1A7C3.7030209@gmail.com> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> Message-ID: <87lhdpzrcd.fsf@jedbrown.org> Rongliang Chen writes: > Hi Jed, > > Thanks for your reply. > > I checked the netcdf and hdf5's config.log and could not find any > possible solutions. Can you help me check these two files again? The two > files are attached. Thanks. It looks to me like libhdf5.a needs to be linked with -ldl, which partly defeats the intent of static linking. PETSc folks, do we blame this on HDF5 with --disable-shared not being a truly static build? Should we pass LDLIBS=-ldl so that NetCDF can link? This likely all works if you use shared libraries. (I can't believe this is still a debate in 2015.) 
configure:16585: mpicc -o conftest -g3 -O0 -I/home/rlchen/soft/petsc-3.6.1/64bit-debug/include conftest.c -lhdf5 -lm -Wl,-rpath,/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -L/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz >&5 /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__open': /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:535: undefined reference to `dlopen' /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:536: undefined reference to `dlerror' /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:544: undefined reference to `dlsym' /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__search_table': /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:627: undefined reference to `dlsym' /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__close': /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:661: undefined reference to `dlclose' collect2: error: ld returned 1 exit status -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jed at jedbrown.org Wed Aug 5 11:35:05 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Aug 2015 10:35:05 -0600 Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: References: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Message-ID: <87d1z1zoau.fsf@jedbrown.org> Matthew Knepley writes: > Yes, you are passing a C++ function userMult, so the compiler sticks "this" > in as the first argument. We do not > recommend this kind of wrapping. I.e., either make it a stand-alone function or make the class function static. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From solvercorleone at gmail.com Wed Aug 5 11:50:59 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 01:50:59 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: <0F56C97B-8D7E-470C-A82E-3EBE05824F83@gmail.com> Hong, A is a sparse matrix. In the first loop, I don't use Km(stepIdx) here. However, I want to let Km(stepIdx) matrix be associated with the some of columns of K here. In the second loop, I want to update Km(stepIdx) by using A*Km(stepIdx-1) so that the corresponding columns of K can be updated simultaneously . If I use MAT_INITIAL_MATRIX, I guess I have to copy the values of updated Km(stepIdx) back to the corresponding columns of K after SPMM call. But this copy phrase costs bandwidth, I think. Do you have any idea by which I can do SPMM as well as remove the copy phrase. Thanks Cong Li iPhone???? 2015/08/06 0:28?Hong ??????: > Cong, > For the first loop: > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > ... 
> end do > > Do you use Km(stepIdx) here? > If not, replace MatCreateDense() with > MatMatMult(A,Km(stepIdx-1),.MAT_INITIAL_MATRIX,..). > Is matrix A dense or sparse? > > Hong > > >> On Wed, Aug 5, 2015 at 9:43 AM, Cong Li wrote: >> Hong, >> >> Thanks for your answer. >> However, in my problem, I have a pre-allocated matrix K, and its columns are associated with Km(1), .. Km(step_k) respectively. What I want to do is to update Km(2) by using the result of A*Km(1), and then to update Km(3) by using the product of A and updated Km(2) and so on. >> >> So, I think I need to use MAT_REUSE_MATRIX from the beginning, since even when it is the first time I call >> MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)', >> >> Km(stepIdx) have actually already been allocated (in K). >> >> Do you think it is possible that I can do this, and could you please suggest some possible ways. >> >> Thanks >> >> Cong Li >> >>> On Wed, Aug 5, 2015 at 11:23 PM, Hong wrote: >>> Cong: >>> You cannot use "MAT_REUSE_MATRIX" on arbitrary matrix product. >>> The correct process is >>> >>> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,C, ierr) >>> call MatMatMult(A,Km(stepIdx-1), MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,C, ierr) >>> i.e., C has data structure of A*Km(stepIdx-1) and is created in the first call. C can be reused in the 2nd call when A or Km(stepIdx-1) changed values, but not the structures. >>> >>> In your case, Km(stepIdx) = A*Km(stepIdx-1). You should do >>> 'call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr)' >>> directly. >>> >>> Hong >>> >>>> On Wed, Aug 5, 2015 at 4:42 AM, Cong Li wrote: >>>> Hi >>>> >>>> I tried the method you suggested. However, I got the error message. >>>> My code and message are below. >>>> >>>> K is the big matrix containing column matrices. 
>>>> >>>> code: >>>> >>>> call MatGetArray(K,KArray,KArrayOffset,ierr) >>>> >>>> call MatGetLocalSize(R,local_RRow,local_RCol) >>>> >>>> call MatGetArray(R,RArray,RArrayOffset,ierr) >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) >>>> >>>> localRsize = local_RRow * local_RCol >>>> do genIdx= 1, localRsize >>>> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>>> end do >>>> >>>> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>>> >>>> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> do stepIdx= 2, step_k >>>> >>>> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>>> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> end do >>>> >>>> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>>> >>>> do stepIdx= 2, step_k >>>> >>>> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>> end do >>>> >>>> >>>> And I got the error message as below: >>>> >>>> >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>> [0]PETSC ERROR: to get more information on the crash. >>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------ >>>> [0]PETSC ERROR: Signal received! >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>> [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ >>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> ---------------------------------------------------- >>>> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 >>>> [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file >>>> -------------------------------------------------------------------------- >>>> [mpi::mpi-api::mpi-abort] >>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>> with errorcode 59. >>>> >>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>> You may or may not see output from other processes, depending on >>>> exactly when Open MPI kills them. 
>>>> -------------------------------------------------------------------------- >>>> [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] >>>> [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] >>>> [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] >>>> [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] >>>> [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>>> [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>>> [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>>> [p01-024:26516] [(nil)] >>>> [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>>> [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>>> [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>>> [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>>> [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] >>>> [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end >>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run >>>> [0]PETSC ERROR: to get more information on the crash. >>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------ >>>> [0]PETSC ERROR: Signal received! >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 >>>> [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 >>>> [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>> [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>> [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file >>>> [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>>> >>>> However, if I change from >>>> call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>> to >>>> call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>> >>>> everything is fine. >>>> >>>> could you please suggest some way to solve this? >>>> >>>> Thanks >>>> >>>> Cong Li >>>> >>>>> On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: >>>>> Thank you very much for your help and suggestions. >>>>> With your help, finally I could continue my project. >>>>> >>>>> Regards >>>>> >>>>> Cong Li >>>>> >>>>> >>>>> >>>>>> On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >>>>>> >>>>>> From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. >>>>>> >>>>>> Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. >>>>>> >>>>>> Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. >>>>>> >>>>>> Barry >>>>>> >>>>>> > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: >>>>>> > >>>>>> > Thanks very much. This answer is very helpful. >>>>>> > And I have a following question. >>>>>> > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. >>>>>> > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) >>>>>> > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>>>> > >>>>>> > Thanks >>>>>> > >>>>>> > Cong Li >>>>>> > >>>>>> > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: >>>>>> > >>>>>> > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: >>>>>> > > >>>>>> > > I am sorry that I should have explained it more clearly. >>>>>> > > Actually I want to compute a recurrence. >>>>>> > > >>>>>> > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. >>>>>> > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] >>>>>> > >>>>>> > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create >>>>>> > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. >>>>>> > >>>>>> > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. >>>>>> > >>>>>> > Barry >>>>>> > >>>>>> > >>>>>> > >>>>>> > > >>>>>> > > Is there any way to do this efficiently. >>>>>> > > >>>>>> > > >>>>>> > > >>>>>> > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: >>>>>> > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>>>> > > > Thanks for your reply. >>>>>> > > > >>>>>> > > > I have an other question. >>>>>> > > > I want to do SPMM several times and combine result matrices into one bigger >>>>>> > > > matrix. >>>>>> > > > for example >>>>>> > > > I firstly calculate AX1=B1, AX2=B2 ... >>>>>> > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>>>> > > > >>>>>> > > > Could you please suggest a way of how to do this. >>>>>> > > This is just linear algebra, nothing to do with PETSc specifically. >>>>>> > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>>>> > > > >>>>>> > > > Thanks >>>>>> > > > >>>>>> > > > Cong Li >>>>>> > > > >>>>>> > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: >>>>>> > > > >>>>>> > > > > Cong Li writes: >>>>>> > > > > >>>>>> > > > > > Hello, >>>>>> > > > > > >>>>>> > > > > > I am a PhD student using PETsc for my research. >>>>>> > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix >>>>>> > > > > > multiplication) by using PETSc. >>>>>> > > > > >>>>>> > > > > >>>>>> > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>>>> > > > > >>>>>> > > >>>>>> > >>>>>> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 5 12:35:52 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 12:35:52 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <87lhdpzrcd.fsf@jedbrown.org> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> Message-ID: On Wed, Aug 5, 2015 at 10:29 AM, Jed Brown wrote: > Rongliang Chen writes: > > > Hi Jed, > > > > Thanks for your reply. > > > > I checked the netcdf and hdf5's config.log and could not find any > > possible solutions. Can you help me check these two files again? The two > > files are attached. Thanks. > > It looks to me like libhdf5.a needs to be linked with -ldl, which partly > defeats the intent of static linking. PETSc folks, do we blame this on > HDF5 with --disable-shared not being a truly static build? Should we > Yes, this is an error in the HDF5 buildsystem. > pass LDLIBS=-ldl so that NetCDF can link? > That would work I think, but looks very strange for a static build (as you said). It appears to me that HDF5 is not suitable for a static build, and I would reconsider this strategy. Thanks, Matt > This likely all works if you use shared libraries. (I can't believe > this is still a debate in 2015.) 
> > configure:16585: mpicc -o conftest -g3 -O0 > -I/home/rlchen/soft/petsc-3.6.1/64bit-debug/include conftest.c -lhdf5 -lm > -Wl,-rpath,/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib > -L/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -lhdf5hl_fortran > -lhdf5_fortran -lhdf5_hl -lhdf5 -lz >&5 > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In > function `H5PL__open': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:535: > undefined reference to `dlopen' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:536: > undefined reference to `dlerror' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:544: > undefined reference to `dlsym' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In > function `H5PL__search_table': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:627: > undefined reference to `dlsym' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In > function `H5PL__close': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:661: > undefined reference to `dlclose' > collect2: error: ld returned 1 exit status -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 5 13:01:44 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 13:01:44 -0500 Subject: [petsc-users] Questions about creation of matrix and setting its values In-Reply-To: References: Message-ID: <1AD73AA4-8437-40BC-AFD2-EF1471B27E34@mcs.anl.gov> > On Aug 5, 2015, at 4:47 AM, Cong Li wrote: > >> Hi, >> >> I am wondering if it is necessary to call >> MatAssemblyBegin() and MatAssemblyEnd() after MatDuplicate() with the option of MAT_DO_NOT_COPY_VALUES. >> For example, if I have an assembled matrix A, and I call MatDuplicate() to create B, which is a duplication of A. >> Do I need to call MatAssemblyBegin() and MatAssemblyEnd() for B. You should not need to. But note if you use the flag MAT_DO_NOT_COPY_VALUES the new matrix will have zero for all the numerical entries. > >> >> And 2nd question is : >> just after the MatCreateDense() call and before MatAssemblyBegin() and MatAssemblyEnd() calls, can I use MatGetArray() ? Dense matrices are a special case because room is always allocated for all the matrix entries and one can use MatDenseGetArray() to either access or set any local value. So if you are only setting/accessing local values you don't actually need to use MatSetValues() (though you can) you can just access the locations directly after using MatDenseGetArray(). There is no harm in calling the MatAssemblyBegin/End() "extra" times for dense matrices. >> >> The 3rd question is: >> before the MatAssemblyBegin() and MatAssemblyEnd() calls, should I use INSERT_VALUES or ADD_VALUES for MatSetValues call? And why ? >> Actually I have read the manual, but I still feel confused about the means of INSERT_VALUES and ADD_VALUES. 
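For reference, a minimal sketch (in C; sizes illustrative, error checking omitted) of the dense-matrix shortcut described in the answer above: local entries of a MATDENSE matrix can be written directly through the array returned by MatDenseGetArray(), without going through MatSetValues(), and the assembly calls are harmless:

   Mat         B;
   PetscScalar *b;
   PetscInt    i, j, mloc, N = 8;

   MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, 100, N, NULL, &B);
   MatGetLocalSize(B, &mloc, NULL);
   MatDenseGetArray(B, &b);
   for (j = 0; j < N; j++)            /* local block is column-major,            */
     for (i = 0; i < mloc; i++)       /* leading dimension = number of local rows */
       b[i + j*mloc] = 1.0;
   MatDenseRestoreArray(B, &b);
   MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY); /* "extra" assembly calls do no harm */
   MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY);
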
> There are a couple of reasons that you need to make these MatAssemblyBegin/End calls: > - entries can be set which should be stored on a different process, so these need to be communicated > - for compressed formats like CSR (as used in MATAIJ and others) the entries need to be processed into their compressed form > In general, the entries of the matrix are not stored in their "usable" forms until you make the MatAssembleEnd call. Rather they are kept in some easy-to-insert-into intermediate storage. INSERT_VALUES means that old values that might be in the matrix are overwritten , and ADD_VALUES means that the new entries from intermediate storage are added to the old values. > > >> >> Thanks >> >> Cong Li > From bsmith at mcs.anl.gov Wed Aug 5 13:30:58 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 13:30:58 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Send the entire code so that we can compile it and run it ourselves to see what is going wrong. Barry > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > Hi > > I tried the method you suggested. However, I got the error message. > My code and message are below. > > K is the big matrix containing column matrices. > > code: > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > do stepIdx= 2, step_k > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > end do > > > And I got the error message as below: > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Signal received! 
> [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > ---------------------------------------------------- > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > -------------------------------------------------------------------------- > [mpi::mpi-api::mpi-abort] > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. 
> -------------------------------------------------------------------------- > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > [p01-024:26516] [(nil)] > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > [0]PETSC ERROR: Signal received! > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > [0]PETSC ERROR: See docs/index.html for manual pages. > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > [ERR.] 
PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > However, if I change from > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > to > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > everything is fine. > > could you please suggest some way to solve this? > > Thanks > > Cong Li > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > Thank you very much for your help and suggestions. > With your help, finally I could continue my project. > > Regards > > Cong Li > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > Barry > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > Thanks very much. This answer is very helpful. > > And I have a following question. > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > I am sorry that I should have explained it more clearly. > > > Actually I want to compute a recurrence. > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > First create C with MatCreateDense(,&C). Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > Barry > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > Thanks for your reply. > > > > > > > > I have an other question. > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > matrix. > > > > for example > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > Could you please suggest a way of how to do this. > > > This is just linear algebra, nothing to do with PETSc specifically. > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > Cong Li writes: > > > > > > > > > > > Hello, > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > multiplication) by using PETSc. 
> > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > From gpau at lbl.gov Wed Aug 5 13:35:46 2015 From: gpau at lbl.gov (George Pau) Date: Wed, 5 Aug 2015 11:35:46 -0700 Subject: [petsc-users] mumps compile error Message-ID: Hi, I am now having issues with mumps. Similar to my configure options in my previous email: --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp e/2.3.1/bin/ftn I am having now having problem with mumps but I couldn't figure out what is wrong. I have this problem on both NERSC/Edison (using Intel compiler) and on Ubuntu (using gcc compiler): mumps_c.c(136): error: identifier "MUMPS_INT8" is undefined MUMPS_INT8 *keep8, ^ mumps_c.c(284): error: identifier "MUMPS_INT8" is undefined MUMPS_INT8 *keep8; ^ mumps_c.c(284): error: identifier "keep8" is undefined MUMPS_INT8 *keep8; The error messages are longer and can be found in the attached log file. However, if I leave out the --prefix option, then everything is fine. MUMPS will configure correctly. It seems like a linking issue. -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74-120 Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/about/staff/georgepau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 7357916 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Aug 5 14:13:40 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 14:13:40 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> Message-ID: <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> > On Aug 5, 2015, at 12:35 PM, Matthew Knepley wrote: > > On Wed, Aug 5, 2015 at 10:29 AM, Jed Brown wrote: > Rongliang Chen writes: > > > Hi Jed, > > > > Thanks for your reply. > > > > I checked the netcdf and hdf5's config.log and could not find any > > possible solutions. Can you help me check these two files again? The two > > files are attached. Thanks. > > It looks to me like libhdf5.a needs to be linked with -ldl, which partly > defeats the intent of static linking. PETSc folks, do we blame this on > HDF5 with --disable-shared not being a truly static build? Should we > > Yes, this is an error in the HDF5 buildsystem. > > pass LDLIBS=-ldl so that NetCDF can link? > > That would work I think, but looks very strange for a static build (as you said). It appears to me > that HDF5 is not suitable for a static build, and I would reconsider this strategy. Our approach is always to work around bugs and stupidity in other packages design, so if HDF5 needs to link against -ldl (and -lm it looks like) even with static libraries then we just make that a dependency in hdf5.py We sure don't require people to know that they should "pass LDLIBS=-ldl so that NetCDF can link?" Why is my answer not obvious? 
Barry BTW: needsmath should probably eliminated and handled properly where math is just another package that some packages depend on. > > Thanks, > > Matt > > This likely all works if you use shared libraries. (I can't believe > this is still a debate in 2015.) > > configure:16585: mpicc -o conftest -g3 -O0 -I/home/rlchen/soft/petsc-3.6.1/64bit-debug/include conftest.c -lhdf5 -lm -Wl,-rpath,/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -L/home/rlchen/soft/petsc-3.6.1/64bit-debug/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lz >&5 > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__open': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:535: undefined reference to `dlopen' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:536: undefined reference to `dlerror' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:544: undefined reference to `dlsym' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__search_table': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:627: undefined reference to `dlsym' > /home/rlchen/soft/petsc-3.6.1/64bit-debug/lib/libhdf5.a(H5PL.o): In function `H5PL__close': > /home/rlchen/soft/petsc-3.6.1/64bit-debug/externalpackages/hdf5-1.8.12/src/H5PL.c:661: undefined reference to `dlclose' > collect2: error: ld returned 1 exit status > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Wed Aug 5 14:23:57 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 14:23:57 -0500 Subject: [petsc-users] mumps compile error In-Reply-To: References: Message-ID: George, Try running with a completely empty directory for the --prefix (perhaps it is picking up some incorrect/outdated stuff there). Also send us the configure.log file from running without a prefix (so we can see what the differences are). I ran a --prefix configure build with MUMPS just now and it was fine. Barry > On Aug 5, 2015, at 1:35 PM, George Pau wrote: > > Hi, > > I am now having issues with mumps. Similar to my configure options in my previous email: > > --with-debugging=1 --with-shared-libraries=0 --prefix=/global/homes/g/gpau/clm-rom/install/t > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps --download-scalapack --do > wnload-parmetis --download-metis --download-hdf5 --download-netcdf --with-x=0 --with-cc=/opt > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC --with-fc=/opt/cray/crayp > e/2.3.1/bin/ftn > > I am having now having problem with mumps but I couldn't figure out what is wrong. I have this problem on both NERSC/Edison (using Intel compiler) and on Ubuntu (using gcc compiler): > > mumps_c.c(136): error: identifier "MUMPS_INT8" is undefined > MUMPS_INT8 *keep8, > ^ > > mumps_c.c(284): error: identifier "MUMPS_INT8" is undefined > MUMPS_INT8 *keep8; > ^ > > mumps_c.c(284): error: identifier "keep8" is undefined > MUMPS_INT8 *keep8; > > The error messages are longer and can be found in the attached log file. > > However, if I leave out the --prefix option, then everything is fine. MUMPS will configure correctly. It seems like a linking issue. 
> > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74-120 > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/about/staff/georgepau/ > From jed at jedbrown.org Wed Aug 5 14:26:43 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Aug 2015 13:26:43 -0600 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> Message-ID: <87pp31y1sc.fsf@jedbrown.org> Barry Smith writes: > Our approach is always to work around bugs and stupidity in other packages design, Do we report it to them as a bug? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From bsmith at mcs.anl.gov Wed Aug 5 15:11:42 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 15:11:42 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <87pp31y1sc.fsf@jedbrown.org> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> Message-ID: <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: > > Barry Smith writes: >> Our approach is always to work around bugs and stupidity in other packages design, > > Do we report it to them as a bug? When there is a place to report them then we should and sometimes do. Barry From jychang48 at gmail.com Wed Aug 5 15:43:26 2015 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 5 Aug 2015 15:43:26 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: Hi everyone, Not sure how related this may be, but I am also having trouble installing petsc 3.6.1 with hdf5. In fact ./configure hangs at "Running make on HDF5; this may take several minutes". I grew impatient after 30 minutes so I had to kill it. Attached is the configure.log. Can y'all figure out what's going on here? Thanks, Justin On Wed, Aug 5, 2015 at 3:11 PM, Barry Smith wrote: > > > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: > > > > Barry Smith writes: > >> Our approach is always to work around bugs and stupidity in other > packages design, > > > > Do we report it to them as a bug? > > When there is a place to report them then we should and sometimes do. > > Barry > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2401571 bytes Desc: not available URL: From gpau at lbl.gov Wed Aug 5 15:44:37 2015 From: gpau at lbl.gov (George Pau) Date: Wed, 5 Aug 2015 13:44:37 -0700 Subject: [petsc-users] mumps compile error In-Reply-To: References: Message-ID: Hi Barry, Thanks. That is indeed the reason. Once I deleted the old directory, everything is configured correctly. 
Thanks, George On Wed, Aug 5, 2015 at 12:23 PM, Barry Smith wrote: > > George, > > Try running with a completely empty directory for the --prefix (perhaps > it is picking up some incorrect/outdated stuff there). > > Also send us the configure.log file from running without a prefix (so > we can see what the differences are). > > I ran a --prefix configure build with MUMPS just now and it was fine. > > Barry > > > On Aug 5, 2015, at 1:35 PM, George Pau wrote: > > > > Hi, > > > > I am now having issues with mumps. Similar to my configure options in > my previous email: > > > > --with-debugging=1 --with-shared-libraries=0 > --prefix=/global/homes/g/gpau/clm-rom/install/t > > pls --with-cxx-dialect=C++11 --download-elemental --download-mumps > --download-scalapack --do > > wnload-parmetis --download-metis --download-hdf5 --download-netcdf > --with-x=0 --with-cc=/opt > > /cray/craype/2.3.1/bin/cc --with-cxx=/opt/cray/craype/2.3.1/bin/CC > --with-fc=/opt/cray/crayp > > e/2.3.1/bin/ftn > > > > I am having now having problem with mumps but I couldn't figure out what > is wrong. I have this problem on both NERSC/Edison (using Intel compiler) > and on Ubuntu (using gcc compiler): > > > > mumps_c.c(136): error: identifier "MUMPS_INT8" is undefined > > MUMPS_INT8 *keep8, > > ^ > > > > mumps_c.c(284): error: identifier "MUMPS_INT8" is undefined > > MUMPS_INT8 *keep8; > > ^ > > > > mumps_c.c(284): error: identifier "keep8" is undefined > > MUMPS_INT8 *keep8; > > > > The error messages are longer and can be found in the attached log file. > > > > However, if I leave out the --prefix option, then everything is fine. > MUMPS will configure correctly. It seems like a linking issue. > > > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74-120 > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/about/staff/georgepau/ > > > > -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74-120 Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/about/staff/georgepau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 5 15:49:35 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 15:49:35 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: On Wed, Aug 5, 2015 at 3:43 PM, Justin Chang wrote: > Hi everyone, > > Not sure how related this may be, but I am also having trouble installing > petsc 3.6.1 with hdf5. In fact ./configure hangs at "Running make on HDF5; > this may take several minutes". I grew impatient after 30 minutes so I had > to kill it. > > Attached is the configure.log. Can y'all figure out what's going on here? > Is this being built on a system with nonlocal disk? This can make builds take forever. The timeout on this operation is 100 minutes. Thanks, Matt > Thanks, > Justin > > On Wed, Aug 5, 2015 at 3:11 PM, Barry Smith wrote: > >> >> > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >> > >> > Barry Smith writes: >> >> Our approach is always to work around bugs and stupidity in other >> packages design, >> > >> > Do we report it to them as a bug? 
>> >> When there is a place to report them then we should and sometimes do. >> >> Barry >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 5 15:49:52 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 05 Aug 2015 14:49:52 -0600 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: <87egjhxxxr.fsf@jedbrown.org> Barry Smith writes: >> On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >> >> Barry Smith writes: >>> Our approach is always to work around bugs and stupidity in other packages design, >> >> Do we report it to them as a bug? > > When there is a place to report them then we should and sometimes do. Nominally, help at hdfgroup.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jychang48 at gmail.com Wed Aug 5 15:51:25 2015 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 5 Aug 2015 15:51:25 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: no this is being built on my macbook laptop. The difference between yesterday and today is that I updated my gcc compiler (downloaded via brew) to gcc-5.0 and reinstalled openmpi accordingly On Wed, Aug 5, 2015 at 3:49 PM, Matthew Knepley wrote: > On Wed, Aug 5, 2015 at 3:43 PM, Justin Chang wrote: > >> Hi everyone, >> >> Not sure how related this may be, but I am also having trouble installing >> petsc 3.6.1 with hdf5. In fact ./configure hangs at "Running make on HDF5; >> this may take several minutes". I grew impatient after 30 minutes so I had >> to kill it. >> >> Attached is the configure.log. Can y'all figure out what's going on here? >> > > Is this being built on a system with nonlocal disk? This can make builds > take forever. The timeout on this operation is > 100 minutes. > > Thanks, > > Matt > > >> Thanks, >> Justin >> >> On Wed, Aug 5, 2015 at 3:11 PM, Barry Smith wrote: >> >>> >>> > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >>> > >>> > Barry Smith writes: >>> >> Our approach is always to work around bugs and stupidity in other >>> packages design, >>> > >>> > Do we report it to them as a bug? >>> >>> When there is a place to report them then we should and sometimes do. >>> >>> Barry >>> >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Aug 5 15:56:19 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Aug 2015 15:56:19 -0500 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: On Wed, Aug 5, 2015 at 3:51 PM, Justin Chang wrote: > no this is being built on my macbook laptop. The difference between > yesterday and today is that I updated my gcc compiler (downloaded via brew) > to gcc-5.0 and reinstalled openmpi accordingly > Can you go to the directory and execute it manually? cd /Users/justin/Software/petsc/arch-darwin-c-opt-firedrake/externalpackages/hdf5-1.8.12 && /usr/bin/make -j 4 Maybe there is a problem using multiple make threads on this machine... Thanks, Matt > On Wed, Aug 5, 2015 at 3:49 PM, Matthew Knepley wrote: > >> On Wed, Aug 5, 2015 at 3:43 PM, Justin Chang wrote: >> >>> Hi everyone, >>> >>> Not sure how related this may be, but I am also having trouble >>> installing petsc 3.6.1 with hdf5. In fact ./configure hangs at "Running >>> make on HDF5; this may take several minutes". I grew impatient after 30 >>> minutes so I had to kill it. >>> >>> Attached is the configure.log. Can y'all figure out what's going on here? >>> >> >> Is this being built on a system with nonlocal disk? This can make builds >> take forever. The timeout on this operation is >> 100 minutes. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Justin >>> >>> On Wed, Aug 5, 2015 at 3:11 PM, Barry Smith wrote: >>> >>>> >>>> > On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >>>> > >>>> > Barry Smith writes: >>>> >> Our approach is always to work around bugs and stupidity in other >>>> packages design, >>>> > >>>> > Do we report it to them as a bug? >>>> >>>> When there is a place to report them then we should and sometimes do. >>>> >>>> Barry >>>> >>>> >>>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Wed Aug 5 20:56:24 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 10:56:24 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: The entire source code files are attached. Also I copy and paste the here in this email thanks program test implicit none #include #include #include #include PetscViewer :: view ! sparse matrix Mat :: A ! distributed dense matrix of size n x m Mat :: B, X, R, QDlt, AQDlt ! distributed dense matrix of size n x (m x k) Mat :: Q, K, AQ_p, AQ ! 
local dense matrix (every process keep the identical copies), (m x k) x (m x k) Mat :: AConjPara, QtAQ, QtAQ_p, Dlt PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize PetscInt :: ownRowS,ownRowE PetscScalar, allocatable :: XInit(:,:) PetscInt :: XInitI, XInitJ PetscScalar :: v=1.0 PetscBool :: flg PetscMPIInt :: size, rank character(128) :: fin, rhsfin call PetscInitialize(PETSC_NULL_CHARACTER,ierr) call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) ! read binary matrix file call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) call MatCreate(PETSC_COMM_WORLD,A,ierr) call MatSetType(A,MATAIJ,ierr) call MatLoad(A,view,ierr) call PetscViewerDestroy(view,ierr) ! for the time being, assume mDim == nDim is true call MatGetSize(A, nDim, mDim, ierr) if (rank == 0) then print*,'Mat Size = ', nDim, mDim end if call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) ! create right-and-side matrix ! for the time being, choose row-wise decomposition ! for the time being, assume nDim%size = 0 call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) call MatLoad(B,view,ierr) call PetscViewerDestroy(view,ierr) call MatGetSize(B, rhsMDim, rhsNDim, ierr) if (rank == 0) then print*,'MRHS Size actually are:', rhsMDim, rhsNDim print*,'MRHS Size should be:', nDim, bsize end if call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) ! inintial value guses X allocate(XInit(nDim,bsize)) do XInitI=1, nDim do XInitJ=1, bsize XInit(XInitI,XInitJ) = 1.0 end do end do call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & bsize, nDim, bsize,XInit, X, ierr) call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) ! B, X, R, QDlt, AQDlt call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) ! 
Q, K, AQ_p, AQ of size n x (m x k) call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& PETSC_NULL_SCALAR, QtAQ, ierr) call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) ! calculation for R ! call matrix powers kernel call mpk_monomial (K, A, R, step_k, rank,size) ! destory matrices deallocate(XInit) call MatDestroy(B, ierr) call MatDestroy(X, ierr) call MatDestroy(R, ierr) call MatDestroy(QDlt, ierr) call MatDestroy(AQDlt, ierr) call MatDestroy(Q, ierr) call MatDestroy(K, ierr) call MatDestroy(AQ_p, ierr) call MatDestroy(AQ, ierr) call MatDestroy(QtAQ, ierr) call MatDestroy(QtAQ_p, ierr) call MatDestroy(Dlt, ierr) call PetscFinalize(ierr) stop end program test subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) implicit none #include #include #include #include Mat :: K, Km(step_k) Mat :: A, R PetscMPIInt :: sizeMPI, rank PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx PetscInt :: ierr PetscInt :: stepIdx, blockShift, localRsize PetscScalar :: KArray(1), RArray(1), PetscScalarSize PetscOffset :: KArrayOffset, RArrayOffset call MatGetSize(R, nDim, bsize, ierr) if (rank == 0) then print*,'Mat Size = ', nDim, bsize end if call MatGetArray(K,KArray,KArrayOffset,ierr) call MatGetLocalSize(R,local_RRow,local_RCol) ! print *, "local_RRow,local_RCol", local_RRow,local_RCol ! get arry from R to add values to K(1) call MatGetArray(R,RArray,RArrayOffset,ierr) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & ! 
,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) localRsize = local_RRow * local_RCol do genIdx= 1, localRsize KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) end do call MatRestoreArray(R,RArray,RArrayOffset,ierr) call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) do stepIdx= 2, step_k blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) end do call MatRestoreArray(K,KArray,KArrayOffset,ierr) ! do stepIdx= 2, step_k do stepIdx= 2,2 call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) ! call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) end do ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) end subroutine mpk_monomial Cong Li On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > Send the entire code so that we can compile it and run it ourselves to > see what is going wrong. > > Barry > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > Hi > > > > I tried the method you suggested. However, I got the error message. > > My code and message are below. > > > > K is the big matrix containing column matrices. > > > > code: > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > do stepIdx= 2, step_k > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > end do > > > > > > And I got the error message as below: > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > [0]PETSC ERROR: Signal received! 
> > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > ---------------------------------------------------- > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed > Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > -------------------------------------------------------------------------- > > [mpi::mpi-api::mpi-abort] > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > with errorcode 59. > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. 
> > > -------------------------------------------------------------------------- > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > [p01-024:26516] [(nil)] > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed > Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > However, if I change from > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > to > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > everything is fine. > > > > could you please suggest some way to solve this? > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > Thank you very much for your help and suggestions. > > With your help, finally I could continue my project. > > > > Regards > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > Note that since your B and C matrices are dense the issue of sparsity > pattern of C is not relevant. > > > > Barry > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > Thanks very much. This answer is very helpful. > > > And I have a following question. > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > I am sorry that I should have explained it more clearly. > > > > Actually I want to compute a recurrence. > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, > A*B2=B3 and so on. > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] > > > > > > First create C with MatCreateDense(,&C). 
Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the > number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > > > Barry > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > Thanks for your reply. > > > > > > > > > > I have an other question. > > > > > I want to do SPMM several times and combine result matrices into > one bigger > > > > > matrix. > > > > > for example > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > Could you please suggest a way of how to do this. > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > Thanks > > > > > > > > > > Cong Li > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mainprogram.f90 Type: application/octet-stream Size: 5681 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mpk_monomial.f90 Type: application/octet-stream Size: 2294 bytes Desc: not available URL: From solvercorleone at gmail.com Wed Aug 5 21:00:37 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 11:00:37 +0900 Subject: [petsc-users] Questions about creation of matrix and setting its values In-Reply-To: <1AD73AA4-8437-40BC-AFD2-EF1471B27E34@mcs.anl.gov> References: <1AD73AA4-8437-40BC-AFD2-EF1471B27E34@mcs.anl.gov> Message-ID: Barry, Thanks. I think I understood. Cong Li On Thu, Aug 6, 2015 at 3:01 AM, Barry Smith wrote: > > > On Aug 5, 2015, at 4:47 AM, Cong Li wrote: > > > >> Hi, > >> > >> I am wondering if it is necessary to call > >> MatAssemblyBegin() and MatAssemblyEnd() after MatDuplicate() with the > option of MAT_DO_NOT_COPY_VALUES. > >> For example, if I have an assembled matrix A, and I call MatDuplicate() > to create B, which is a duplication of A. > >> Do I need to call MatAssemblyBegin() and MatAssemblyEnd() for B. > > You should not need to. But note if you use the flag > MAT_DO_NOT_COPY_VALUES the new matrix will have zero for all the numerical > entries. > > > >> > >> And 2nd question is : > >> just after the MatCreateDense() call and before MatAssemblyBegin() and > MatAssemblyEnd() calls, can I use MatGetArray() ? 
> > Dense matrices are a special case because room is always allocated for > all the matrix entries and one can use MatDenseGetArray() to either access > or set any local value. So if you are only setting/accessing local values > you don't actually need to use MatSetValues() (though you can) you can just > access the locations directly after using MatDenseGetArray(). There is no > harm in calling the MatAssemblyBegin/End() "extra" times for dense matrices. > > >> > >> The 3rd question is: > >> before the MatAssemblyBegin() and MatAssemblyEnd() calls, should I use > INSERT_VALUES or ADD_VALUES for MatSetValues call? And why ? > >> Actually I have read the manual, but I still feel confused about the > means of INSERT_VALUES and ADD_VALUES. > > There are a couple of reasons that you need to make these > MatAssemblyBegin/End calls: > > - entries can be set which should be stored on a different process, so > these need to be communicated > > - for compressed formats like CSR (as used in MATAIJ and others) the > entries need to be processed into their compressed form > > In general, the entries of the matrix are not stored in their "usable" > forms until you make the MatAssembleEnd call. Rather they are kept in some > easy-to-insert-into intermediate storage. INSERT_VALUES means that old > values that might be in the matrix are overwritten , and ADD_VALUES means > that the new entries from intermediate storage are added to the old values. > > > > > >> > >> Thanks > >> > >> Cong Li > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 5 21:43:35 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 21:43:35 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Send the input files so I can actually run the thing (and make sure they are not very big, debugging with large data sets is silly and unproductive). Thanks Barry > On Aug 5, 2015, at 8:56 PM, Cong Li wrote: > > The entire source code files are attached. > > Also I copy and paste the here in this email > > thanks > > program test > > implicit none > > #include > #include > #include > #include > > > PetscViewer :: view > ! sparse matrix > Mat :: A > ! distributed dense matrix of size n x m > Mat :: B, X, R, QDlt, AQDlt > ! distributed dense matrix of size n x (m x k) > Mat :: Q, K, AQ_p, AQ > ! local dense matrix (every process keep the identical copies), (m x k) x (m x k) > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize > PetscInt :: ownRowS,ownRowE > PetscScalar, allocatable :: XInit(:,:) > PetscInt :: XInitI, XInitJ > PetscScalar :: v=1.0 > PetscBool :: flg > PetscMPIInt :: size, rank > > character(128) :: fin, rhsfin > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > ! 
read binary matrix file > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > call MatCreate(PETSC_COMM_WORLD,A,ierr) > call MatSetType(A,MATAIJ,ierr) > call MatLoad(A,view,ierr) > call PetscViewerDestroy(view,ierr) > ! for the time being, assume mDim == nDim is true > call MatGetSize(A, nDim, mDim, ierr) > > if (rank == 0) then > print*,'Mat Size = ', nDim, mDim > end if > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > ! create right-and-side matrix > ! for the time being, choose row-wise decomposition > ! for the time being, assume nDim%size = 0 > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > call MatLoad(B,view,ierr) > call PetscViewerDestroy(view,ierr) > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > if (rank == 0) then > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > print*,'MRHS Size should be:', nDim, bsize > end if > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > ! inintial value guses X > allocate(XInit(nDim,bsize)) > do XInitI=1, nDim > do XInitJ=1, bsize > XInit(XInitI,XInitJ) = 1.0 > end do > end do > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,XInit, X, ierr) > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > ! B, X, R, QDlt, AQDlt > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > ! Q, K, AQ_p, AQ of size n x (m x k) > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > ! 
QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > PETSC_NULL_SCALAR, QtAQ, ierr) > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > ! calculation for R > > ! call matrix powers kernel > call mpk_monomial (K, A, R, step_k, rank,size) > > ! destory matrices > deallocate(XInit) > > call MatDestroy(B, ierr) > call MatDestroy(X, ierr) > call MatDestroy(R, ierr) > call MatDestroy(QDlt, ierr) > call MatDestroy(AQDlt, ierr) > call MatDestroy(Q, ierr) > call MatDestroy(K, ierr) > call MatDestroy(AQ_p, ierr) > call MatDestroy(AQ, ierr) > call MatDestroy(QtAQ, ierr) > call MatDestroy(QtAQ_p, ierr) > call MatDestroy(Dlt, ierr) > > > call PetscFinalize(ierr) > > stop > > end program test > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > implicit none > > #include > #include > #include > #include > > Mat :: K, Km(step_k) > Mat :: A, R > PetscMPIInt :: sizeMPI, rank > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > PetscInt :: ierr > PetscInt :: stepIdx, blockShift, localRsize > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > PetscOffset :: KArrayOffset, RArrayOffset > > call MatGetSize(R, nDim, bsize, ierr) > if (rank == 0) then > print*,'Mat Size = ', nDim, bsize > end if > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > ! get arry from R to add values to K(1) > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > ! ,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > ! do stepIdx= 2, step_k > do stepIdx= 2,2 > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > ! call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > end do > > ! 
call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > end subroutine mpk_monomial > > > > Cong Li > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > Send the entire code so that we can compile it and run it ourselves to see what is going wrong. > > Barry > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > Hi > > > > I tried the method you suggested. However, I got the error message. > > My code and message are below. > > > > K is the big matrix containing column matrices. > > > > code: > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > do stepIdx= 2, step_k > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > end do > > > > > > And I got the error message as below: > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > ---------------------------------------------------- > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > -------------------------------------------------------------------------- > > [mpi::mpi-api::mpi-abort] > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > with errorcode 59. > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. 
> > -------------------------------------------------------------------------- > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > [p01-024:26516] [(nil)] > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > However, if I change from > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > to > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > everything is fine. > > > > could you please suggest some way to solve this? > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > > Thank you very much for your help and suggestions. > > With your help, finally I could continue my project. > > > > Regards > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > > > Barry > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > Thanks very much. This answer is very helpful. > > > And I have a following question. > > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > > > I am sorry that I should have explained it more clearly. > > > > Actually I want to compute a recurrence. > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > > > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > > > Barry > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > Thanks for your reply. > > > > > > > > > > I have an other question. > > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > > matrix. > > > > > for example > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > Could you please suggest a way of how to do this. > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > Thanks > > > > > > > > > > Cong Li > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > From gbisht at lbl.gov Wed Aug 5 22:21:18 2015 From: gbisht at lbl.gov (Gautam Bisht) Date: Wed, 5 Aug 2015 20:21:18 -0700 Subject: [petsc-users] Error running DMPlex example In-Reply-To: References: Message-ID: Hi Matt, Instead of using gcc4.9, I reinstalled PETSc using clang on mac os x 10.10 and the example runs fine. Btw, are there any examples that use DMPlex+DMComposite? Thanks, -Gautam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Wed Aug 5 22:23:58 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 5 Aug 2015 22:23:58 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Cong, Can you write out math equations for mpk_monomial (), list input and output parameters. Note: 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) Hong On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > The entire source code files are attached. > > Also I copy and paste the here in this email > > thanks > > program test > > implicit none > > #include > #include > #include > #include > > > PetscViewer :: view > ! sparse matrix > Mat :: A > ! distributed dense matrix of size n x m > Mat :: B, X, R, QDlt, AQDlt > ! distributed dense matrix of size n x (m x k) > Mat :: Q, K, AQ_p, AQ > ! 
local dense matrix (every process keep the identical copies), (m x k) > x (m x k) > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, > step_k,bsize > PetscInt :: ownRowS,ownRowE > PetscScalar, allocatable :: XInit(:,:) > PetscInt :: XInitI, XInitJ > PetscScalar :: v=1.0 > PetscBool :: flg > PetscMPIInt :: size, rank > > character(128) :: fin, rhsfin > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > ! read binary matrix file > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > call MatCreate(PETSC_COMM_WORLD,A,ierr) > call MatSetType(A,MATAIJ,ierr) > call MatLoad(A,view,ierr) > call PetscViewerDestroy(view,ierr) > ! for the time being, assume mDim == nDim is true > call MatGetSize(A, nDim, mDim, ierr) > > if (rank == 0) then > print*,'Mat Size = ', nDim, mDim > end if > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > ! create right-and-side matrix > ! for the time being, choose row-wise decomposition > ! for the time being, assume nDim%size = 0 > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, > ierr) > call MatLoad(B,view,ierr) > call PetscViewerDestroy(view,ierr) > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > if (rank == 0) then > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > print*,'MRHS Size should be:', nDim, bsize > end if > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > ! inintial value guses X > allocate(XInit(nDim,bsize)) > do XInitI=1, nDim > do XInitJ=1, bsize > XInit(XInitI,XInitJ) = 1.0 > end do > end do > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,XInit, X, ierr) > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > ! B, X, R, QDlt, AQDlt > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > ! 
Q, K, AQ_p, AQ of size n x (m x k) > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > PETSC_NULL_SCALAR, QtAQ, ierr) > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > ! calculation for R > > ! call matrix powers kernel > call mpk_monomial (K, A, R, step_k, rank,size) > > ! destory matrices > deallocate(XInit) > > call MatDestroy(B, ierr) > call MatDestroy(X, ierr) > call MatDestroy(R, ierr) > call MatDestroy(QDlt, ierr) > call MatDestroy(AQDlt, ierr) > call MatDestroy(Q, ierr) > call MatDestroy(K, ierr) > call MatDestroy(AQ_p, ierr) > call MatDestroy(AQ, ierr) > call MatDestroy(QtAQ, ierr) > call MatDestroy(QtAQ_p, ierr) > call MatDestroy(Dlt, ierr) > > > call PetscFinalize(ierr) > > stop > > end program test > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > implicit none > > #include > #include > #include > #include > > Mat :: K, Km(step_k) > Mat :: A, R > PetscMPIInt :: sizeMPI, rank > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > PetscInt :: ierr > PetscInt :: stepIdx, blockShift, localRsize > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > PetscOffset :: KArrayOffset, RArrayOffset > > call MatGetSize(R, nDim, bsize, ierr) > if (rank == 0) then > print*,'Mat Size = ', nDim, bsize > end if > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > ! get arry from R to add values to K(1) > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > ! 
,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > ! do stepIdx= 2, step_k > do stepIdx= 2,2 > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > ! call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > end do > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > end subroutine mpk_monomial > > > > Cong Li > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > >> >> Send the entire code so that we can compile it and run it ourselves to >> see what is going wrong. >> >> Barry >> >> > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: >> > >> > Hi >> > >> > I tried the method you suggested. However, I got the error message. >> > My code and message are below. >> > >> > K is the big matrix containing column matrices. >> > >> > code: >> > >> > call MatGetArray(K,KArray,KArrayOffset,ierr) >> > >> > call MatGetLocalSize(R,local_RRow,local_RCol) >> > >> > call MatGetArray(R,RArray,RArrayOffset,ierr) >> > >> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset >> + 1), Km(1), ierr) >> > >> > localRsize = local_RRow * local_RCol >> > do genIdx= 1, localRsize >> > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >> > end do >> > >> > call MatRestoreArray(R,RArray,RArrayOffset,ierr) >> > >> > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >> > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >> > >> > do stepIdx= 2, step_k >> > >> > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >> > >> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> > PETSC_DECIDE , nDim, >> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >> > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> > end do >> > >> > call MatRestoreArray(K,KArray,KArrayOffset,ierr) >> > >> > do stepIdx= 2, step_k >> > >> > call >> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >> ierr) >> > end do >> > >> > >> > And I got the error message as below: >> > >> > >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> > [0]PETSC ERROR: to get more information on the crash. 
>> > [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> > [0]PETSC ERROR: Signal received! >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >> 22:15:24 CDT 2013 >> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> > [0]PETSC ERROR: See docs/index.html for manual pages. >> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> > ---------------------------------------------------- >> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >> Wed Aug 5 18:24:40 2015 >> > [0]PETSC ERROR: Libraries linked from >> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> > >> -------------------------------------------------------------------------- >> > [mpi::mpi-api::mpi-abort] >> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >> > with errorcode 59. >> > >> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> > You may or may not see output from other processes, depending on >> > exactly when Open MPI kills them. 
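(For reference on the MatReuse question running through this thread: MAT_INITIAL_MATRIX asks PETSc to allocate the product matrix C itself, while MAT_REUSE_MATRIX is meant for later calls that refill a C obtained from an earlier MAT_INITIAL_MATRIX call, which is Hong's note in this thread; Barry's point is that, because the result is dense and needs no symbolic stage, reusing a user-created C should in principle also work, even though it crashes here. A minimal Fortran sketch of the documented pattern, in the same style as the code posted in this thread, is below. The subroutine name, nIter, and the include paths are placeholders rather than part of the posted code, and the spelling of the default-fill constant varies between PETSc releases.)

subroutine matmatmult_reuse_sketch(A, B, nIter)
  implicit none
#include <finclude/petscsys.h>
#include <finclude/petscmat.h>
  Mat            :: A, B, C
  PetscInt       :: it, nIter
  PetscErrorCode :: ierr

  ! First product: MAT_INITIAL_MATRIX lets PETSc create and size C.
  ! The fill argument is a PetscReal; on the 3.3 series used in this
  ! thread the default constant is PETSC_DEFAULT_DOUBLE_PRECISION.
  call MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT_REAL, C, ierr)
  do it = 2, nIter
    ! ... refill B with new data here ...
    ! Later products: MAT_REUSE_MATRIX refills the C created above.
    call MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT_REAL, C, ierr)
  end do
  call MatDestroy(C, ierr)
end subroutine matmatmult_reuse_sketch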
>> > >> -------------------------------------------------------------------------- >> > [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >> [0xffffffff0091f684] >> > [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >> [0xffffffff006c389c] >> > [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >> [0xffffffff006db3ac] >> > [p01-024:26516] >> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >> [0xffffffff00281bf0] >> > [p01-024:26516] ./kmath.bcbcg [0x1bf620] >> > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >> > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >> > [p01-024:26516] [(nil)] >> > [p01-024:26516] ./kmath.bcbcg [0x1a2054] >> > [p01-024:26516] ./kmath.bcbcg [0x1064f8] >> > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >> > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >> > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >> [0xffffffff02d3b81c] >> > [p01-024:26516] ./kmath.bcbcg [0x1051ec] >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or >> the batch system) has told this process to end >> > [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> > [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >> find memory corruption errors >> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> > [0]PETSC ERROR: to get more information on the crash. >> > [0]PETSC ERROR: --------------------- Error Message >> ------------------------------------ >> > [0]PETSC ERROR: Signal received! >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >> 22:15:24 CDT 2013 >> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >> Wed Aug 5 18:24:40 2015 >> > [0]PETSC ERROR: Libraries linked from >> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >> > [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >> unknown file >> > [ERR.] PLE 0019 plexec One of MPI processes was >> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >> > >> > However, if I change from >> > call >> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >> ierr) >> > to >> > call MatMatMult(A,Km(stepIdx-1), >> MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >> > >> > everything is fine. >> > >> > could you please suggest some way to solve this? >> > >> > Thanks >> > >> > Cong Li >> > >> > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >> wrote: >> > Thank you very much for your help and suggestions. >> > With your help, finally I could continue my project. >> > >> > Regards >> > >> > Cong Li >> > >> > >> > >> > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: >> > >> > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >> created. >> > >> > Since you want to use the C that is passed in you should use >> MAT_REUSE_MATRIX. >> > >> > Note that since your B and C matrices are dense the issue of sparsity >> pattern of C is not relevant. >> > >> > Barry >> > >> > > On Aug 4, 2015, at 11:59 AM, Cong Li >> wrote: >> > > >> > > Thanks very much. This answer is very helpful. >> > > And I have a following question. >> > > If I create B1, B2, .. by the way you suggested and then use >> MatMatMult to do SPMM. >> > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >> fill,Mat *C) >> > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >> > > >> > > Thanks >> > > >> > > Cong Li >> > > >> > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >> wrote: >> > > >> > > > On Aug 4, 2015, at 4:09 AM, Cong Li >> wrote: >> > > > >> > > > I am sorry that I should have explained it more clearly. >> > > > Actually I want to compute a recurrence. >> > > > >> > > > Like, I want to firstly compute A*X1=B1, and then calculate >> A*B1=B2, A*B2=B3 and so on. >> > > > Finally I want to combine all these results into a bigger matrix >> C=[B1,B2 ...] >> > > >> > > First create C with MatCreateDense(,&C). 
Then call >> MatDenseGetArray(C,&array); then create B1 with >> MatCreateDense(....,array,&B1); then create >> > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >> the number of __local__ rows in B1 times the number of columns in B1, then >> create B3 with a larger shift etc. >> > > >> > > Note that you are "sharing" the array space of C with B1, B2, B3, >> ..., each Bi contains its columns of the C matrix. >> > > >> > > Barry >> > > >> > > >> > > >> > > > >> > > > Is there any way to do this efficiently. >> > > > >> > > > >> > > > >> > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >> patrick.sanan at gmail.com> wrote: >> > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >> > > > > Thanks for your reply. >> > > > > >> > > > > I have an other question. >> > > > > I want to do SPMM several times and combine result matrices into >> one bigger >> > > > > matrix. >> > > > > for example >> > > > > I firstly calculate AX1=B1, AX2=B2 ... >> > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >> > > > > >> > > > > Could you please suggest a way of how to do this. >> > > > This is just linear algebra, nothing to do with PETSc specifically. >> > > > A * [X1, X2, ... ] = [AX1, AX2, ...] >> > > > > >> > > > > Thanks >> > > > > >> > > > > Cong Li >> > > > > >> > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >> wrote: >> > > > > >> > > > > > Cong Li writes: >> > > > > > >> > > > > > > Hello, >> > > > > > > >> > > > > > > I am a PhD student using PETsc for my research. >> > > > > > > I am wondering if there is a way to implement SPMM (Sparse >> matrix-matrix >> > > > > > > multiplication) by using PETSc. >> > > > > > >> > > > > > >> > > > > > >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >> > > > > > >> > > > >> > > >> > > >> > >> > >> > >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Aug 5 23:29:29 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 5 Aug 2015 23:29:29 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: > On Aug 5, 2015, at 10:23 PM, Hong wrote: > > Cong, > > Can you write out math equations for mpk_monomial (), > list input and output parameters. > > Note: > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) Hong, we want to reuse the space in the Km(stepIdx-1) from which it was created which means that MAT_INITIAL_MATRIX cannot be used. Since the result is always dense it is not the difficult case when a symbolic computation needs to be done initially so, at least in theory, he should not have to use MAT_INITIAL_MATRIX the first time through. Barry > > Hong > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > The entire source code files are attached. > > Also I copy and paste the here in this email > > thanks > > program test > > implicit none > > #include > #include > #include > #include > > > PetscViewer :: view > ! sparse matrix > Mat :: A > ! distributed dense matrix of size n x m > Mat :: B, X, R, QDlt, AQDlt > ! distributed dense matrix of size n x (m x k) > Mat :: Q, K, AQ_p, AQ > ! 
local dense matrix (every process keep the identical copies), (m x k) x (m x k) > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize > PetscInt :: ownRowS,ownRowE > PetscScalar, allocatable :: XInit(:,:) > PetscInt :: XInitI, XInitJ > PetscScalar :: v=1.0 > PetscBool :: flg > PetscMPIInt :: size, rank > > character(128) :: fin, rhsfin > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > ! read binary matrix file > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > call MatCreate(PETSC_COMM_WORLD,A,ierr) > call MatSetType(A,MATAIJ,ierr) > call MatLoad(A,view,ierr) > call PetscViewerDestroy(view,ierr) > ! for the time being, assume mDim == nDim is true > call MatGetSize(A, nDim, mDim, ierr) > > if (rank == 0) then > print*,'Mat Size = ', nDim, mDim > end if > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > ! create right-and-side matrix > ! for the time being, choose row-wise decomposition > ! for the time being, assume nDim%size = 0 > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > call MatLoad(B,view,ierr) > call PetscViewerDestroy(view,ierr) > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > if (rank == 0) then > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > print*,'MRHS Size should be:', nDim, bsize > end if > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > ! inintial value guses X > allocate(XInit(nDim,bsize)) > do XInitI=1, nDim > do XInitJ=1, bsize > XInit(XInitI,XInitJ) = 1.0 > end do > end do > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > bsize, nDim, bsize,XInit, X, ierr) > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > ! B, X, R, QDlt, AQDlt > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > ! 
Q, K, AQ_p, AQ of size n x (m x k) > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > PETSC_NULL_SCALAR, QtAQ, ierr) > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > ! calculation for R > > ! call matrix powers kernel > call mpk_monomial (K, A, R, step_k, rank,size) > > ! destory matrices > deallocate(XInit) > > call MatDestroy(B, ierr) > call MatDestroy(X, ierr) > call MatDestroy(R, ierr) > call MatDestroy(QDlt, ierr) > call MatDestroy(AQDlt, ierr) > call MatDestroy(Q, ierr) > call MatDestroy(K, ierr) > call MatDestroy(AQ_p, ierr) > call MatDestroy(AQ, ierr) > call MatDestroy(QtAQ, ierr) > call MatDestroy(QtAQ_p, ierr) > call MatDestroy(Dlt, ierr) > > > call PetscFinalize(ierr) > > stop > > end program test > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > implicit none > > #include > #include > #include > #include > > Mat :: K, Km(step_k) > Mat :: A, R > PetscMPIInt :: sizeMPI, rank > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > PetscInt :: ierr > PetscInt :: stepIdx, blockShift, localRsize > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > PetscOffset :: KArrayOffset, RArrayOffset > > call MatGetSize(R, nDim, bsize, ierr) > if (rank == 0) then > print*,'Mat Size = ', nDim, bsize > end if > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > call MatGetLocalSize(R,local_RRow,local_RCol) > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > ! get arry from R to add values to K(1) > call MatGetArray(R,RArray,RArrayOffset,ierr) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > ! 
,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) > > localRsize = local_RRow * local_RCol > do genIdx= 1, localRsize > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > end do > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > do stepIdx= 2, step_k > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > ! do stepIdx= 2, step_k > do stepIdx= 2,2 > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > ! call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > end do > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > end subroutine mpk_monomial > > > > Cong Li > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > Send the entire code so that we can compile it and run it ourselves to see what is going wrong. > > Barry > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > Hi > > > > I tried the method you suggested. However, I got the error message. > > My code and message are below. > > > > K is the big matrix containing column matrices. > > > > code: > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > do stepIdx= 2, step_k > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > end do > > > > > > And I got the error message as below: > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > [0]PETSC ERROR: Signal received! 
> > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > ---------------------------------------------------- > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > -------------------------------------------------------------------------- > > [mpi::mpi-api::mpi-abort] > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > with errorcode 59. > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. 
> > -------------------------------------------------------------------------- > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > [p01-024:26516] [(nil)] > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > [0]PETSC ERROR: Signal received! > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > However, if I change from > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > to > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > everything is fine. > > > > could you please suggest some way to solve this? > > > > Thanks > > > > Cong Li > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > > Thank you very much for your help and suggestions. > > With your help, finally I could continue my project. > > > > Regards > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > > > Barry > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > Thanks very much. This answer is very helpful. > > > And I have a following question. > > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > > > I am sorry that I should have explained it more clearly. > > > > Actually I want to compute a recurrence. > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > > > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > > > Barry > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > Thanks for your reply. > > > > > > > > > > I have an other question. > > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > > matrix. > > > > > for example > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > Could you please suggest a way of how to do this. > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > Thanks > > > > > > > > > > Cong Li > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > From solvercorleone at gmail.com Thu Aug 6 00:12:47 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 14:12:47 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Sure. Attached are input files. mesh1e1.mtx.pbin is the binary petsc file for a 48x48 S.P.D. sparse matrix, which is A in the program. b.m48.n2.dat is the binary petsc file of a 2 column right-hand side matrix, which is B in the program. I use mpiexec -n 2 ./progrma.name -f ~/mesh1e1.mtx.pbin -r ~/b.m48.n2.dat -k 2 -i 2 -w 2 to run the program Thanks Cong Li On Thu, Aug 6, 2015 at 11:43 AM, Barry Smith wrote: > > Send the input files so I can actually run the thing (and make sure they > are not very big, debugging with large data sets is silly and unproductive). > > Thanks > > Barry > > > On Aug 5, 2015, at 8:56 PM, Cong Li wrote: > > > > The entire source code files are attached. > > > > Also I copy and paste the here in this email > > > > thanks > > > > program test > > > > implicit none > > > > #include > > #include > > #include > > #include > > > > > > PetscViewer :: view > > ! sparse matrix > > Mat :: A > > ! distributed dense matrix of size n x m > > Mat :: B, X, R, QDlt, AQDlt > > ! distributed dense matrix of size n x (m x k) > > Mat :: Q, K, AQ_p, AQ > > ! 
local dense matrix (every process keep the identical copies), (m x > k) x (m x k) > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, > step_k,bsize > > PetscInt :: ownRowS,ownRowE > > PetscScalar, allocatable :: XInit(:,:) > > PetscInt :: XInitI, XInitJ > > PetscScalar :: v=1.0 > > PetscBool :: flg > > PetscMPIInt :: size, rank > > > > character(128) :: fin, rhsfin > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > ! read binary matrix file > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetType(A,MATAIJ,ierr) > > call MatLoad(A,view,ierr) > > call PetscViewerDestroy(view,ierr) > > ! for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! 
Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, > genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! 
,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > ! call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves > to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. > > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset > + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using 
--with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. > > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > > -------------------------------------------------------------------------- > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or > the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith > wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of > sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li > wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate > A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] 
> > > > > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals > the number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into > one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: b.m48.n2.dat Type: application/octet-stream Size: 1360 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mesh1e1.mtx.pbin Type: application/octet-stream Size: 3880 bytes Desc: not available URL: From solvercorleone at gmail.com Thu Aug 6 00:22:20 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 14:22:20 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Hong, Sure. I want to extend the Krylov subspace by step_k dimensions by using a monomial basis, which can be defined as K = {Km(1), Km(2), ..., Km(step_k)} = {Km(1), AKm(1), AKm(2), ..., AKm(step_k-1)} = {R, AR, A^2R, ..., A^(step_k-1)R}, in one loop. So, my plan now is to first calculate the recurrence, which is P_n(x) = x P_{n-1}(x), and then use the results to update the items in K. Then, in the next loop of the Krylov subspace method, K will be updated again. The input of the mpk_monomial subroutine is a preallocated dense matrix K; A and R are used to update K inside the subroutine.
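For readers following the thread, here is a minimal sketch of this construction in PETSc's C interface (using the newer MatDenseGetArray/MatDenseRestoreArray names that correspond to MatGetArray/MatRestoreArray in the petsc-3.3 Fortran code above; the function and variable names are illustrative only, error checking is omitted, and it assumes R and K use the same row distribution). Note that the local storage of an MPI dense matrix is (local rows) x (global columns), which is the "shift" Barry describes. Whether MAT_REUSE_MATRIX may be used directly on these user-array blocks is exactly what the rest of the thread debates.

#include <petscmat.h>

/* Sketch only: build K = [R, A*R, A^2*R, ..., A^(step_k-1)*R] by wrapping
   column blocks of a preallocated dense K so that they share its storage. */
PetscErrorCode mpk_monomial_sketch(Mat A, Mat R, Mat K, PetscInt step_k)
{
  PetscScalar *karray, *rarray;
  PetscInt    mloc, N, i, j, blocklen;
  Mat         *Km;

  MatGetLocalSize(R, &mloc, NULL);   /* local rows of R (and of each block)  */
  MatGetSize(R, NULL, &N);           /* global columns of R = width of block */
  blocklen = mloc * N;               /* local storage length of one block    */

  PetscMalloc1(step_k, &Km);
  MatDenseGetArray(K, &karray);      /* K's local storage, column-major      */
  for (i = 0; i < step_k; i++) {
    /* Km[i] is a view of columns i*N .. (i+1)*N-1 of K; no data is copied */
    MatCreateDense(PETSC_COMM_WORLD, mloc, PETSC_DECIDE, PETSC_DECIDE, N,
                   karray + i*blocklen, &Km[i]);
    MatAssemblyBegin(Km[i], MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(Km[i], MAT_FINAL_ASSEMBLY);
  }

  /* first block: copy the local part of R into K */
  MatDenseGetArray(R, &rarray);
  for (j = 0; j < blocklen; j++) karray[j] = rarray[j];
  MatDenseRestoreArray(R, &rarray);

  /* recurrence Km[i] = A * Km[i-1]; which MatReuse flag is legal here for a
     user-array dense result is the open question in this thread */
  for (i = 1; i < step_k; i++) {
    MatMatMult(A, Km[i-1], MAT_REUSE_MATRIX, PETSC_DEFAULT, &Km[i]);
  }

  for (i = 0; i < step_k; i++) MatDestroy(&Km[i]); /* K's storage is not freed */
  PetscFree(Km);
  MatDenseRestoreArray(K, &karray);
  return 0;
}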
Thanks Cong Li On Thu, Aug 6, 2015 at 12:23 PM, Hong wrote: > Cong, > > Can you write out math equations for mpk_monomial (), > list input and output parameters. > > Note: > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) > > Hong > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > >> The entire source code files are attached. >> >> Also I copy and paste the here in this email >> >> thanks >> >> program test >> >> implicit none >> >> #include >> #include >> #include >> #include >> >> >> PetscViewer :: view >> ! sparse matrix >> Mat :: A >> ! distributed dense matrix of size n x m >> Mat :: B, X, R, QDlt, AQDlt >> ! distributed dense matrix of size n x (m x k) >> Mat :: Q, K, AQ_p, AQ >> ! local dense matrix (every process keep the identical copies), (m x k) >> x (m x k) >> Mat :: AConjPara, QtAQ, QtAQ_p, Dlt >> >> PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, >> step_k,bsize >> PetscInt :: ownRowS,ownRowE >> PetscScalar, allocatable :: XInit(:,:) >> PetscInt :: XInitI, XInitJ >> PetscScalar :: v=1.0 >> PetscBool :: flg >> PetscMPIInt :: size, rank >> >> character(128) :: fin, rhsfin >> >> >> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) >> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) >> >> ! read binary matrix file >> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) >> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) >> >> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) >> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) >> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) >> >> >> call >> PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) >> call MatCreate(PETSC_COMM_WORLD,A,ierr) >> call MatSetType(A,MATAIJ,ierr) >> call MatLoad(A,view,ierr) >> call PetscViewerDestroy(view,ierr) >> ! for the time being, assume mDim == nDim is true >> call MatGetSize(A, nDim, mDim, ierr) >> >> if (rank == 0) then >> print*,'Mat Size = ', nDim, mDim >> end if >> >> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >> call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) >> >> ! create right-and-side matrix >> ! for the time being, choose row-wise decomposition >> ! for the time being, assume nDim%size = 0 >> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >> bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) >> call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, >> ierr) >> call MatLoad(B,view,ierr) >> call PetscViewerDestroy(view,ierr) >> call MatGetSize(B, rhsMDim, rhsNDim, ierr) >> if (rank == 0) then >> print*,'MRHS Size actually are:', rhsMDim, rhsNDim >> print*,'MRHS Size should be:', nDim, bsize >> end if >> call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) >> >> ! inintial value guses X >> allocate(XInit(nDim,bsize)) >> do XInitI=1, nDim >> do XInitJ=1, bsize >> XInit(XInitI,XInitJ) = 1.0 >> end do >> end do >> >> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >> bsize, nDim, bsize,XInit, X, ierr) >> >> call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) >> >> >> ! 
B, X, R, QDlt, AQDlt >> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) >> call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) >> call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) >> call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) >> >> ! Q, K, AQ_p, AQ of size n x (m x k) >> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >> (bsize*step_k), nDim, >> (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) >> call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) >> call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) >> call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) >> call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) >> >> ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) >> call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& >> PETSC_NULL_SCALAR, QtAQ, ierr) >> call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) >> call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) >> call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) >> >> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) >> call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) >> >> ! calculation for R >> >> ! call matrix powers kernel >> call mpk_monomial (K, A, R, step_k, rank,size) >> >> ! destory matrices >> deallocate(XInit) >> >> call MatDestroy(B, ierr) >> call MatDestroy(X, ierr) >> call MatDestroy(R, ierr) >> call MatDestroy(QDlt, ierr) >> call MatDestroy(AQDlt, ierr) >> call MatDestroy(Q, ierr) >> call MatDestroy(K, ierr) >> call MatDestroy(AQ_p, ierr) >> call MatDestroy(AQ, ierr) >> call MatDestroy(QtAQ, ierr) >> call MatDestroy(QtAQ_p, ierr) >> call MatDestroy(Dlt, ierr) >> >> >> call PetscFinalize(ierr) >> >> stop >> >> end program test >> >> >> subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) >> implicit none >> >> #include >> #include >> #include >> #include >> >> Mat :: K, Km(step_k) >> Mat :: A, R >> PetscMPIInt :: sizeMPI, rank >> PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx >> PetscInt :: ierr >> PetscInt :: stepIdx, blockShift, localRsize >> PetscScalar :: KArray(1), RArray(1), PetscScalarSize >> PetscOffset :: KArrayOffset, RArrayOffset >> >> call MatGetSize(R, nDim, bsize, ierr) >> if (rank == 0) then >> print*,'Mat Size = ', nDim, bsize >> end if >> >> call MatGetArray(K,KArray,KArrayOffset,ierr) >> >> call MatGetLocalSize(R,local_RRow,local_RCol) >> ! print *, "local_RRow,local_RCol", local_RRow,local_RCol >> >> ! 
get arry from R to add values to K(1) >> call MatGetArray(R,RArray,RArrayOffset,ierr) >> >> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + >> 1), Km(1), ierr) >> >> >> ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & >> ! ,local_RRow * local_RCol * >> STORAGE_SIZE(PetscScalarSize), ierr) >> >> localRsize = local_RRow * local_RCol >> do genIdx= 1, localRsize >> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >> end do >> >> >> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >> >> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >> >> do stepIdx= 2, step_k >> >> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >> >> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), >> Km(stepIdx), ierr) >> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >> >> end do >> >> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >> >> ! do stepIdx= 2, step_k >> do stepIdx= 2,2 >> >> call >> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >> ierr) >> ! call >> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >> ierr) >> end do >> >> ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) >> >> end subroutine mpk_monomial >> >> >> >> Cong Li >> >> On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: >> >>> >>> Send the entire code so that we can compile it and run it ourselves >>> to see what is going wrong. >>> >>> Barry >>> >>> > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: >>> > >>> > Hi >>> > >>> > I tried the method you suggested. However, I got the error message. >>> > My code and message are below. >>> > >>> > K is the big matrix containing column matrices. 
>>> > >>> > code: >>> > >>> > call MatGetArray(K,KArray,KArrayOffset,ierr) >>> > >>> > call MatGetLocalSize(R,local_RRow,local_RCol) >>> > >>> > call MatGetArray(R,RArray,RArrayOffset,ierr) >>> > >>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset >>> + 1), Km(1), ierr) >>> > >>> > localRsize = local_RRow * local_RCol >>> > do genIdx= 1, localRsize >>> > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>> > end do >>> > >>> > call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>> > >>> > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> > >>> > do stepIdx= 2, step_k >>> > >>> > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>> > >>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> > PETSC_DECIDE , nDim, >>> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>> > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> > end do >>> > >>> > call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>> > >>> > do stepIdx= 2, step_k >>> > >>> > call >>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>> ierr) >>> > end do >>> > >>> > >>> > And I got the error message as below: >>> > >>> > >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> > [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> > [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> > [0]PETSC ERROR: to get more information on the crash. >>> > [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> > [0]PETSC ERROR: Signal received! >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>> 22:15:24 CDT 2013 >>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> > ---------------------------------------------------- >>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>> Wed Aug 5 18:24:40 2015 >>> > [0]PETSC ERROR: Libraries linked from >>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> > >>> -------------------------------------------------------------------------- >>> > [mpi::mpi-api::mpi-abort] >>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>> > with errorcode 59. >>> > >>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>> > You may or may not see output from other processes, depending on >>> > exactly when Open MPI kills them. 
>>> > >>> -------------------------------------------------------------------------- >>> > [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>> [0xffffffff0091f684] >>> > [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>> [0xffffffff006c389c] >>> > [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >>> [0xffffffff006db3ac] >>> > [p01-024:26516] >>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>> [0xffffffff00281bf0] >>> > [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>> > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>> > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>> > [p01-024:26516] [(nil)] >>> > [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>> > [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>> > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>> > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>> > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >>> [0xffffffff02d3b81c] >>> > [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or >>> the batch system) has told this process to end >>> > [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> > [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>> find memory corruption errors >>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> > [0]PETSC ERROR: to get more information on the crash. >>> > [0]PETSC ERROR: --------------------- Error Message >>> ------------------------------------ >>> > [0]PETSC ERROR: Signal received! >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>> 22:15:24 CDT 2013 >>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>> Wed Aug 5 18:24:40 2015 >>> > [0]PETSC ERROR: Libraries linked from >>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>> > [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>> unknown file >>> > [ERR.] PLE 0019 plexec One of MPI processes was >>> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>> > >>> > However, if I change from >>> > call >>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>> ierr) >>> > to >>> > call MatMatMult(A,Km(stepIdx-1), >>> MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>> > >>> > everything is fine. >>> > >>> > could you please suggest some way to solve this? >>> > >>> > Thanks >>> > >>> > Cong Li >>> > >>> > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >>> wrote: >>> > Thank you very much for your help and suggestions. >>> > With your help, finally I could continue my project. >>> > >>> > Regards >>> > >>> > Cong Li >>> > >>> > >>> > >>> > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith >>> wrote: >>> > >>> > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>> created. >>> > >>> > Since you want to use the C that is passed in you should use >>> MAT_REUSE_MATRIX. >>> > >>> > Note that since your B and C matrices are dense the issue of >>> sparsity pattern of C is not relevant. >>> > >>> > Barry >>> > >>> > > On Aug 4, 2015, at 11:59 AM, Cong Li >>> wrote: >>> > > >>> > > Thanks very much. This answer is very helpful. >>> > > And I have a following question. >>> > > If I create B1, B2, .. by the way you suggested and then use >>> MatMatMult to do SPMM. >>> > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>> fill,Mat *C) >>> > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>> > > >>> > > Thanks >>> > > >>> > > Cong Li >>> > > >>> > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>> wrote: >>> > > >>> > > > On Aug 4, 2015, at 4:09 AM, Cong Li >>> wrote: >>> > > > >>> > > > I am sorry that I should have explained it more clearly. >>> > > > Actually I want to compute a recurrence. >>> > > > >>> > > > Like, I want to firstly compute A*X1=B1, and then calculate >>> A*B1=B2, A*B2=B3 and so on. 
>>> > > > Finally I want to combine all these results into a bigger matrix >>> C=[B1,B2 ...] >>> > > >>> > > First create C with MatCreateDense(,&C). Then call >>> MatDenseGetArray(C,&array); then create B1 with >>> MatCreateDense(....,array,&B1); then create >>> > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>> the number of __local__ rows in B1 times the number of columns in B1, then >>> create B3 with a larger shift etc. >>> > > >>> > > Note that you are "sharing" the array space of C with B1, B2, B3, >>> ..., each Bi contains its columns of the C matrix. >>> > > >>> > > Barry >>> > > >>> > > >>> > > >>> > > > >>> > > > Is there any way to do this efficiently. >>> > > > >>> > > > >>> > > > >>> > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>> patrick.sanan at gmail.com> wrote: >>> > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>> > > > > Thanks for your reply. >>> > > > > >>> > > > > I have an other question. >>> > > > > I want to do SPMM several times and combine result matrices into >>> one bigger >>> > > > > matrix. >>> > > > > for example >>> > > > > I firstly calculate AX1=B1, AX2=B2 ... >>> > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>> > > > > >>> > > > > Could you please suggest a way of how to do this. >>> > > > This is just linear algebra, nothing to do with PETSc specifically. >>> > > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>> > > > > >>> > > > > Thanks >>> > > > > >>> > > > > Cong Li >>> > > > > >>> > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>> wrote: >>> > > > > >>> > > > > > Cong Li writes: >>> > > > > > >>> > > > > > > Hello, >>> > > > > > > >>> > > > > > > I am a PhD student using PETsc for my research. >>> > > > > > > I am wondering if there is a way to implement SPMM (Sparse >>> matrix-matrix >>> > > > > > > multiplication) by using PETSc. >>> > > > > > >>> > > > > > >>> > > > > > >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>> > > > > > >>> > > > >>> > > >>> > > >>> > >>> > >>> > >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Thu Aug 6 00:27:52 2015 From: solvercorleone at gmail.com (Cong Li) Date: Thu, 6 Aug 2015 14:27:52 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Barry, Exactly. And thanks for the explaination. Cong Li On Thu, Aug 6, 2015 at 1:29 PM, Barry Smith wrote: > > > On Aug 5, 2015, at 10:23 PM, Hong wrote: > > > > Cong, > > > > Can you write out math equations for mpk_monomial (), > > list input and output parameters. > > > > Note: > > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it was > created which means that MAT_INITIAL_MATRIX cannot be used. Since the > result is always dense it is not the difficult case when a symbolic > computation needs to be done initially so, at least in theory, he should > not have to use MAT_INITIAL_MATRIX the first time through. 
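To make the two call sequences being compared concrete, here is a small sketch in PETSc's C interface (names are placeholders, error checking is omitted; which pattern is actually required when the result is a user-array dense matrix is precisely what is being discussed here):

#include <petscmat.h>

/* Sketch: the two MatMatMult usage patterns discussed in this thread. */
PetscErrorCode product_patterns(Mat A, Mat B, Mat Kprev, Mat Knext)
{
  Mat C;

  /* Hong's ordering: the first product uses MAT_INITIAL_MATRIX and creates C;
     later products of the same sizes reuse it. */
  MatMatMult(A, B, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &C);
  MatMatMult(A, B, MAT_REUSE_MATRIX, PETSC_DEFAULT, &C);
  MatDestroy(&C);

  /* The case in this thread: Knext already exists as a dense matrix wrapping
     user storage, so the product is asked to reuse it directly. Barry's point
     is that, the result being dense, no symbolic stage should be needed; the
     segfault reported above suggests the petsc-3.3 code path still expects a
     preceding MAT_INITIAL_MATRIX call. */
  MatMatMult(A, Kprev, MAT_REUSE_MATRIX, PETSC_DEFAULT, &Knext);
  return 0;
}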
> > Barry > > > > > Hong > > > > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li > wrote: > > The entire source code files are attached. > > > > Also I copy and paste the here in this email > > > > thanks > > > > program test > > > > implicit none > > > > #include > > #include > > #include > > #include > > > > > > PetscViewer :: view > > ! sparse matrix > > Mat :: A > > ! distributed dense matrix of size n x m > > Mat :: B, X, R, QDlt, AQDlt > > ! distributed dense matrix of size n x (m x k) > > Mat :: Q, K, AQ_p, AQ > > ! local dense matrix (every process keep the identical copies), (m x > k) x (m x k) > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, > step_k,bsize > > PetscInt :: ownRowS,ownRowE > > PetscScalar, allocatable :: XInit(:,:) > > PetscInt :: XInitI, XInitJ > > PetscScalar :: v=1.0 > > PetscBool :: flg > > PetscMPIInt :: size, rank > > > > character(128) :: fin, rhsfin > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > ! read binary matrix file > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetType(A,MATAIJ,ierr) > > call MatLoad(A,view,ierr) > > call PetscViewerDestroy(view,ierr) > > ! for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! 
B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, > genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! 
print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! ,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > ! call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves > to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. 
> > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset > + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > > -------------------------------------------------------------------------- > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or > the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith > wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of > sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li > wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate > A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] 
> > > > > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals > the number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into > one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rongliang.chan at gmail.com Thu Aug 6 01:36:56 2015 From: rongliang.chan at gmail.com (Rongliang Chen) Date: Thu, 06 Aug 2015 14:36:56 +0800 Subject: [petsc-users] Fail to Configure petsc-3.6.1 In-Reply-To: <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> References: <55C18408.5040500@gmail.com> <87r3niz7gk.fsf@jedbrown.org> <55C1A7C3.7030209@gmail.com> <87lhdpzrcd.fsf@jedbrown.org> <49599094-1C85-499D-A847-7EC2C72D4430@mcs.anl.gov> <87pp31y1sc.fsf@jedbrown.org> <37BDE715-40F7-4DEB-9651-B5D298866F3F@mcs.anl.gov> Message-ID: <55C30088.3000503@gmail.com> Thanks for all your helps! The problem has been solved by using shared libraries. Best, Rongliang On 08/06/2015 04:11 AM, Barry Smith wrote: >> On Aug 5, 2015, at 2:26 PM, Jed Brown wrote: >> >> Barry Smith writes: >>> Our approach is always to work around bugs and stupidity in other packages design, >> Do we report it to them as a bug? > When there is a place to report them then we should and sometimes do. 
> > Barry > > From dave.mayhem23 at gmail.com Thu Aug 6 02:54:21 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 6 Aug 2015 09:54:21 +0200 Subject: [petsc-users] problem with MatShellGetContext In-Reply-To: <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> References: <832877335.6632754.1438765243311.JavaMail.zimbra@inria.fr> <624969556.6635901.1438766116326.JavaMail.zimbra@inria.fr> Message-ID: On 5 August 2015 at 11:15, Nicolas Pozin wrote: > Hello, > > I'm trying to solve a system with a matrix free operator and through > conjugate gradient method. > To make ideas clear, I set up the following simple example (I am using > petsc-3.6) and I get this error message : > " > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > [0]PETSC ERROR: Invalid argument! > [0]PETSC ERROR: Wrong type of object: Parameter # 1! > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: *Petsc Release Version 3.4.3*, Oct, 15, 2013 > Also it appears that you are linking against petsc 3.4, not petsc 3.6. -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Aug 6 04:17:24 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 6 Aug 2015 11:17:24 +0200 Subject: [petsc-users] KSP changes for successive solver In-Reply-To: References: <1437083588.21829.18.camel@kolmog5> <94823D83-AABD-4C22-8BF3-EBB0F1B1F7AA@mcs.anl.gov> <1437086528.21829.27.camel@kolmog5> <1437092337.21829.42.camel@kolmog5> <1437762913.17123.11.camel@kolmog5> <1437767070.17123.17.camel@kolmog5> <04691CE0-B35E-4F46-ABCA-6B05EA033F19@mcs.anl.gov> Message-ID: > I agree with you more than the "consensus". I think the consensus does > it just because it is perceived as too difficult or we don't have the right > infrastructure to do it "correctly" > > > > In the end that is what I want to do. :D > > > > I would be happy to contribute a similar repartitioning preconditioner > to petsc. > > We'd love to have this reduced processor repartitioning for both > DMDA/PCMG and for PCGAMG in PETSc. > > Hi Barry, I've created a pull-request which defines such a preconditiner. I've tentatively called it SemiRedundant - but I don't think it is a great name in the sense it doesn't really describe what the preconditioner actually can do. I hate naming things. Possibly "Repart" or "Repartition" would be better names. Given the existence of "Redistribute", "Redundant", it is likely that it will be hard for a new user to know what the actual difference is between all these preconditioners.... Cheers, Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Thu Aug 6 06:34:45 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Thu, 6 Aug 2015 11:34:45 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> Message-ID: <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> Hong, I have been using PETSC_COMM_WORLD. Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 5 augusti 2015 17:11 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. 
Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir: As you noticed, you ran the code in serial mode, not parallel. Check your code on input communicator, e.g., what input communicator do you use in KSPCreate(comm,&ksp)? I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' in serial mode, this option is ignored with a warning. Hong Hong, If I set parsymbfact: $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[63679,1],0] Exit code: 255 -------------------------------------------------------------------------- Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. If I do not set it, I get a serial run even if I specify ?n 2: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view ? KSP Object: 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 total: nonzeros=34223, allocated nonzeros=34223 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 668 nodes, limit used is 5 I am running PETSc via Cygwin on a windows machine. When I installed PETSc the tests with different numbers of processes ran well. Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 19:06 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. 
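The "1 MPI processes" -ksp_view output above is exactly what one sees when the solver objects live on a communicator that spans only one rank. A minimal communicator sanity check (a sketch, not taken from the actual code; error checking omitted) would be:

  #include <petscksp.h>

  int main(int argc, char **argv)
  {
    KSP         ksp;
    PetscMPIInt size;

    PetscInitialize(&argc, &argv, NULL, NULL);
    MPI_Comm_size(PETSC_COMM_WORLD, &size);
    PetscPrintf(PETSC_COMM_WORLD, "running on %d MPI processes\n", size);

    /* the Mat, the Vecs and the KSP must all be created on PETSC_COMM_WORLD,
       otherwise "mpiexec -n 2" just runs two independent serial solves       */
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    /* ... KSPSetOperators(ksp, A, A); KSPSetFromOptions(ksp); KSPSolve(ksp, b, x); ... */
    KSPDestroy(&ksp);
    PetscFinalize();
    return 0;
  }

If this prints 1 under "mpiexec -n 2" even though everything is on PETSC_COMM_WORLD, a common cause is launching with an mpiexec that does not match the MPI library PETSc was built against (e.g. a system mpiexec instead of the one from --download-mpich), so each process believes it is rank 0 of a size-1 world.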
If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. Please run it with '-ksp_view' and see what 'SuperLU_DIST run parameters:' are being used, e.g. petsc/src/ksp/ksp/examples/tutorials (maint) $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view ... SuperLU_DIST run parameters: Process grid nprow 2 x npcol 1 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 2 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm I do not understand why your code uses matrix input mode = global. Hong From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 16:46 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- ... PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). That's why you get the following error: Invalid ISPEC at line 484 in file get_perm_c.c You need to use distributed matrix input interface pzgssvx() (without ABglobal) Sherry On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Hong and Sherry, I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 30 juli 2015 02:58 To: ?lker-Kaustell, Mahir Cc: Xiaoye Li; PETSc users list Subject: Fwd: [petsc-users] SuperLU MPI-problem Mahir, Sherry fixed several bugs in superlu_dist-v4.1. The current petsc-release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? Here is how to do it: 1. download superlu_dist v4.1 2. 
remove existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz' 3. build petsc Let us know if the issue remains. Hong ---------- Forwarded message ---------- From: Xiaoye S. Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. 
Per user-direction, the job has been aborted. ------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? 
(the default is to use sequential symbolic factorization.) Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== 
==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx 
(pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 
0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== 
by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: 
MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 
0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: 
PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. 
Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 6 06:44:02 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Aug 2015 06:44:02 -0500 Subject: [petsc-users] Error running DMPlex example In-Reply-To: References: Message-ID: On Wed, Aug 5, 2015 at 10:21 PM, Gautam Bisht wrote: > Hi Matt, > > Instead of using gcc4.9, I reinstalled PETSc using clang on mac os x 10.10 > and the example runs fine. > > Btw, are there any examples that use DMPlex+DMComposite? > I don't think so. What would you anticipate using it for? Thanks, Matt > Thanks, > -Gautam. 
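Hong's point 2 in the SuperLU thread above (the matrix structure does not change with omega, so the symbolic factorization only needs to be done once and reused) could be sketched roughly as below. The function name is hypothetical; A, M, K, F and u are assumed to be created and assembled elsewhere with identical nonzero patterns for A, M and K; error checking is omitted:

  #include <petscksp.h>

  PetscErrorCode SolveFrequencySweep(Mat A, Mat M, Mat K, Vec F, Vec u,
                                     const PetscReal *omega, PetscInt nomega)
  {
    KSP      ksp;
    PC       pc;
    PetscInt i;

    KSPCreate(PETSC_COMM_WORLD, &ksp);
    KSPSetType(ksp, KSPPREONLY);
    KSPGetPC(ksp, &pc);
    PCSetType(pc, PCLU);
    PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);
    KSPSetFromOptions(ksp);

    for (i = 0; i < nomega; i++) {
      MatCopy(K, A, SAME_NONZERO_PATTERN);                       /* A  = K           */
      MatAXPY(A, -omega[i]*omega[i], M, SAME_NONZERO_PATTERN);   /* A -= omega^2 * M */
      KSPSetOperators(ksp, A, A);  /* same nonzero pattern: only the numeric factorization is redone */
      KSPSolve(ksp, F, u);
      /* ... store or post-process u for this frequency ... */
    }
    KSPDestroy(&ksp);
    return 0;
  }

The -ksp_view output shown earlier in the thread ("Repeated factorization SamePattern_SameRowPerm") is what makes the repeated factorizations cheaper than starting from scratch at every frequency.

On the 20 GB memory estimate: taking roughly 40 nonzeros per row (only illustrative; the thread says "in the tens") for just under 10^6 complex unknowns gives about 4e7 nonzeros in A. If the fill ratio is anywhere near the 3D range Sherry quotes (50-100x), the L and U factors hold 2-4e9 entries, i.e. roughly 32-64 GB of complex values at 16 bytes each before counting the integer index arrays. That is well above 20 GB and, at the high end, uncomfortably close to the 128 GB machine, which would be consistent with the Calloc/Malloc failures seen in zdistribute.c.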
> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Aug 6 09:36:24 2015 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 6 Aug 2015 09:36:24 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> Message-ID: Mahir: > > > > I have been using PETSC_COMM_WORLD. > What do you get by running a petsc example, e.g., petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view KSP Object: 2 MPI processes type: gmres ... Hong > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 5 augusti 2015 17:11 > *To:* ?lker-Kaustell, Mahir > *Cc:* Hong; Xiaoye S. Li; PETSc users list > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > As you noticed, you ran the code in serial mode, not parallel. > > Check your code on input communicator, e.g., what input communicator do > you use in > > KSPCreate(comm,&ksp)? > > > > I have added error flag to superlu_dist interface (released version). When > user uses '-mat_superlu_dist_parsymbfact' > > in serial mode, this option is ignored with a warning. > > > > Hong > > > > Hong, > > > > If I set parsymbfact: > > > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > -------------------------------------------------------------------------- > > mpiexec detected that one or more processes exited with non-zero status, > thus causing > > the job to be terminated. The first process to do so was: > > > > Process name: [[63679,1],0] > > Exit code: 255 > > -------------------------------------------------------------------------- > > > > Since the program does not finish the call to KSPSolve(), we do not get > any information about the KSP from ?ksp_view. > > > > If I do not set it, I get a serial run even if I specify ?n 2: > > > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -ksp_view > > ? 
> > KSP Object: 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > total: nonzeros=34223, allocated nonzeros=34223 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 668 nodes, limit used is 5 > > > > I am running PETSc via Cygwin on a windows machine. > > When I installed PETSc the tests with different numbers of processes ran > well. > > > > Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 19:06 > *To:* ?lker-Kaustell, Mahir > *Cc:* Hong; Xiaoye S. Li; PETSc users list > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for > parallel runs. > > > > If I use 2 processors, the program runs if I use > *?mat_superlu_dist_parsymbfact=1*: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > GLOBAL -mat_superlu_dist_parsymbfact=1 > > > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so > your code runs well without parsymbfact. > > > > Please run it with '-ksp_view' and see what > > 'SuperLU_DIST run parameters:' are being used, e.g. > > petsc/src/ksp/ksp/examples/tutorials (maint) > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > > > ... > > SuperLU_DIST run parameters: > > Process grid nprow 2 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 1 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 2 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > > > I do not understand why your code uses matrix input mode = global. > > > > Hong > > > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 3 augusti 2015 16:46 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; Hong; PETSc users list > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry found the culprit. 
I can reproduce it: > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist > -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > ... > > > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when > using more than one processes. > > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or > set matinput=GLOBAL for parallel run? > > > > I'll add an error flag for these use cases. > > > > Hong > > > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li wrote: > > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid > ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > *Sent:* den 30 juli 2015 02:58 > *To:* ?lker-Kaustell, Mahir > *Cc:* Xiaoye Li; PETSc users list > > > *Subject:* Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > From: *Xiaoye S. Li* > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > ? > > > > ---------- Forwarded message ---------- > From: *Mahir.Ulker-Kaustell at tyrens.se * < > Mahir.Ulker-Kaustell at tyrens.se> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong , "Xiaoye S. 
Li" > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is > in the tens in general but certain constraints lead to non-diagonal streaks > in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I set > options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. 
> > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > *From:* Hong [mailto:hzhang at mcs.anl.gov] > > *Sent:* den 22 juli 2015 21:34 > *To:* Xiaoye S. Li > *Cc:* ?lker-Kaustell, Mahir; petsc-users > > > *Subject:* Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. 
> > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient > to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Thank you for your reply. > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU > and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. 
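For reference, a typical way to run a PETSc example under valgrind with MPI looks like the command below (standard valgrind options along the lines of the FAQ page linked above; the example name and solver options are placeholders matching this thread, not a prescribed command):

   mpiexec -n 2 valgrind --tool=memcheck -q --num-callers=20 --track-origins=yes --log-file=valgrind.log.%p ./ex19 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact

Each rank then writes its report to its own valgrind.log.<pid> file, which makes traces like the ones below easier to attribute to a particular process.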
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes 
inside a block of size > 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > 
==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 
0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > 
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > ==42050== at 0x1000183B1: malloc 
(vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: 
PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum 
(pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That > gives me some hope! > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the > PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) 
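As an aside, SuperLU_DIST can report how much memory the factorization actually needed. PETSc's SuperLU_DIST interface exposes this through the runtime option -mat_superlu_dist_statprint, and the column ordering through -mat_superlu_dist_colperm (for example PARMETIS); these are standard options of the interface rather than commands quoted in this thread, so check the -help output of your executable to confirm the names. A sketch of such a run, with the process count and executable name as placeholders:

   mpiexec -n 16 ./solve -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_statprint -mat_superlu_dist_colperm PARMETIS

The printed statistics include the memory used by the factorization, which helps answer the per-MPI-task memory question above.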
> > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se < > Mahir.Ulker-Kaustell at tyrens.se> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 > degrees of freedom. The matrices are derived from finite elements so they > are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from > Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and > stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so > I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gbisht at lbl.gov Thu Aug 6 10:08:58 2015 From: gbisht at lbl.gov (Gautam Bisht) Date: Thu, 6 Aug 2015 08:08:58 -0700 Subject: [petsc-users] Error running DMPlex example In-Reply-To: References: Message-ID: I'm going to move this thread over to dev mailing list. -Gautam. On Thu, Aug 6, 2015 at 4:44 AM, Matthew Knepley wrote: > On Wed, Aug 5, 2015 at 10:21 PM, Gautam Bisht wrote: > >> Hi Matt, >> >> Instead of using gcc4.9, I reinstalled PETSc using clang on mac os x >> 10.10 and the example runs fine. >> >> Btw, are there any examples that use DMPlex+DMComposite? >> > > I don't think so. What would you anticipate using it for? > > Thanks, > > Matt > > >> Thanks, >> -Gautam. 
>> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > From hzhang at mcs.anl.gov Thu Aug 6 10:09:02 2015 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 6 Aug 2015 10:09:02 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Barry: > > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it was > created, which means that MAT_INITIAL_MATRIX cannot be used. Since the > result is always dense, it is not the difficult case where a symbolic computation needs to be done initially, so, at least in theory, > he should not have to use MAT_INITIAL_MATRIX the first time through. 
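For context, the calling pattern MatMatMult() normally expects when the same product matrix is reused is sketched below. This is a minimal Fortran sketch in the spirit of the code quoted later in this thread; P, iter and maxIter are placeholder names, PETSC_DEFAULT_DOUBLE_PRECISION stands in for the fill estimate, and it is not a fix for the shared-storage scheme being discussed here.

      ! First call: PETSc creates the product matrix P and attaches the
      ! internal data that later numeric products will reuse.
      call MatMatMult(A, Km(1), MAT_INITIAL_MATRIX, PETSC_DEFAULT_DOUBLE_PRECISION, P, ierr)
      ! Subsequent calls with the same P reuse that data.
      do iter = 2, maxIter
         call MatMatMult(A, Km(1), MAT_REUSE_MATRIX, PETSC_DEFAULT_DOUBLE_PRECISION, P, ierr)
      end do

The conflict in this thread is that each Km(stepIdx) is created by the user around a slice of K's array rather than by an initial MatMatMult() call, so there is no such internal data for MAT_REUSE_MATRIX to fall back on.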
for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! 
QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, > genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + > 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! ,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), > Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > ! 
call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves > to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. > > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset > + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > > -------------------------------------------------------------------------- > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or > the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith > wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of > sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li > wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate > A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] 
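Below is a condensed Fortran sketch of the column-sharing layout described in the reply that follows; names and sizes are placeholders, and the calls mirror the MatGetArray/MatCreateDense usage appearing elsewhere in this thread. C owns the storage, and each B(i) wraps one bsize-wide block of columns of C's local array.

      Mat         :: C, B(step_k)
      PetscScalar :: CArray(1)
      PetscOffset :: COffset
      PetscInt    :: localRows, localCols, shift, i, ierr

      call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, &
                          nDim, bsize*step_k, PETSC_NULL_SCALAR, C, ierr)
      call MatGetArray(C, CArray, COffset, ierr)
      call MatGetLocalSize(C, localRows, localCols, ierr)
      do i = 1, step_k
         ! shift = local rows of C times the number of columns per block
         shift = (i-1) * localRows * bsize
         call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, &
                             nDim, bsize, CArray(COffset + 1 + shift), B(i), ierr)
         call MatAssemblyBegin(B(i), MAT_FINAL_ASSEMBLY, ierr)
         call MatAssemblyEnd(B(i), MAT_FINAL_ASSEMBLY, ierr)
      end do
      call MatRestoreArray(C, CArray, COffset, ierr)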
> > > > > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals > the number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, > ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into > one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Thu Aug 6 10:20:42 2015 From: hzhang at mcs.anl.gov (Hong) Date: Thu, 6 Aug 2015 10:20:42 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Cong: > Hong, > > Sure. > > I want to extend the Krylov subspace by step_k dimensions by using > monomial, which can be defined as > > K={Km(1)m Km(2), ..., Km(step_k)} > ={Km(1), AKm(1), AKm(2), ... , AKm(step_k-1)} > ={R, AR, A^2R, ... A^(step_k-1)R} > A subspace with dense matrices as basis? How large step_k and your matrices will be? Hong > > On Thu, Aug 6, 2015 at 12:23 PM, Hong wrote: > >> Cong, >> >> Can you write out math equations for mpk_monomial (), >> list input and output parameters. >> >> Note: >> 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End >> 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after >> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) >> >> Hong >> >> >> On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: >> >>> The entire source code files are attached. >>> >>> Also I copy and paste the here in this email >>> >>> thanks >>> >>> program test >>> >>> implicit none >>> >>> #include >>> #include >>> #include >>> #include >>> >>> >>> PetscViewer :: view >>> ! sparse matrix >>> Mat :: A >>> ! 
distributed dense matrix of size n x m >>> Mat :: B, X, R, QDlt, AQDlt >>> ! distributed dense matrix of size n x (m x k) >>> Mat :: Q, K, AQ_p, AQ >>> ! local dense matrix (every process keep the identical copies), (m x >>> k) x (m x k) >>> Mat :: AConjPara, QtAQ, QtAQ_p, Dlt >>> >>> PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, >>> step_k,bsize >>> PetscInt :: ownRowS,ownRowE >>> PetscScalar, allocatable :: XInit(:,:) >>> PetscInt :: XInitI, XInitJ >>> PetscScalar :: v=1.0 >>> PetscBool :: flg >>> PetscMPIInt :: size, rank >>> >>> character(128) :: fin, rhsfin >>> >>> >>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) >>> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) >>> >>> ! read binary matrix file >>> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) >>> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) >>> >>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) >>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) >>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) >>> >>> >>> call >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) >>> call MatCreate(PETSC_COMM_WORLD,A,ierr) >>> call MatSetType(A,MATAIJ,ierr) >>> call MatLoad(A,view,ierr) >>> call PetscViewerDestroy(view,ierr) >>> ! for the time being, assume mDim == nDim is true >>> call MatGetSize(A, nDim, mDim, ierr) >>> >>> if (rank == 0) then >>> print*,'Mat Size = ', nDim, mDim >>> end if >>> >>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>> call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) >>> >>> ! create right-and-side matrix >>> ! for the time being, choose row-wise decomposition >>> ! for the time being, assume nDim%size = 0 >>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>> bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) >>> call >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) >>> call MatLoad(B,view,ierr) >>> call PetscViewerDestroy(view,ierr) >>> call MatGetSize(B, rhsMDim, rhsNDim, ierr) >>> if (rank == 0) then >>> print*,'MRHS Size actually are:', rhsMDim, rhsNDim >>> print*,'MRHS Size should be:', nDim, bsize >>> end if >>> call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) >>> >>> ! inintial value guses X >>> allocate(XInit(nDim,bsize)) >>> do XInitI=1, nDim >>> do XInitJ=1, bsize >>> XInit(XInitI,XInitJ) = 1.0 >>> end do >>> end do >>> >>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>> bsize, nDim, bsize,XInit, X, ierr) >>> >>> call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) >>> >>> >>> ! B, X, R, QDlt, AQDlt >>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) >>> call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) >>> call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) >>> call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) >>> >>> ! 
Q, K, AQ_p, AQ of size n x (m x k) >>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>> (bsize*step_k), nDim, >>> (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) >>> call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) >>> call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) >>> call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) >>> call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) >>> >>> ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) >>> call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& >>> PETSC_NULL_SCALAR, QtAQ, ierr) >>> call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) >>> call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) >>> call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) >>> >>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) >>> call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) >>> >>> ! calculation for R >>> >>> ! call matrix powers kernel >>> call mpk_monomial (K, A, R, step_k, rank,size) >>> >>> ! destory matrices >>> deallocate(XInit) >>> >>> call MatDestroy(B, ierr) >>> call MatDestroy(X, ierr) >>> call MatDestroy(R, ierr) >>> call MatDestroy(QDlt, ierr) >>> call MatDestroy(AQDlt, ierr) >>> call MatDestroy(Q, ierr) >>> call MatDestroy(K, ierr) >>> call MatDestroy(AQ_p, ierr) >>> call MatDestroy(AQ, ierr) >>> call MatDestroy(QtAQ, ierr) >>> call MatDestroy(QtAQ_p, ierr) >>> call MatDestroy(Dlt, ierr) >>> >>> >>> call PetscFinalize(ierr) >>> >>> stop >>> >>> end program test >>> >>> >>> subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) >>> implicit none >>> >>> #include >>> #include >>> #include >>> #include >>> >>> Mat :: K, Km(step_k) >>> Mat :: A, R >>> PetscMPIInt :: sizeMPI, rank >>> PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx >>> PetscInt :: ierr >>> PetscInt :: stepIdx, blockShift, localRsize >>> PetscScalar :: KArray(1), RArray(1), PetscScalarSize >>> PetscOffset :: KArrayOffset, RArrayOffset >>> >>> call MatGetSize(R, nDim, bsize, ierr) >>> if (rank == 0) then >>> print*,'Mat Size = ', nDim, bsize >>> end if >>> >>> call MatGetArray(K,KArray,KArrayOffset,ierr) >>> >>> call MatGetLocalSize(R,local_RRow,local_RCol) >>> ! print *, "local_RRow,local_RCol", local_RRow,local_RCol >>> >>> ! get arry from R to add values to K(1) >>> call MatGetArray(R,RArray,RArrayOffset,ierr) >>> >>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + >>> 1), Km(1), ierr) >>> >>> >>> ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & >>> ! 
,local_RRow * local_RCol * >>> STORAGE_SIZE(PetscScalarSize), ierr) >>> >>> localRsize = local_RRow * local_RCol >>> do genIdx= 1, localRsize >>> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>> end do >>> >>> >>> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>> >>> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>> >>> do stepIdx= 2, step_k >>> >>> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>> >>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>> PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), >>> Km(stepIdx), ierr) >>> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>> >>> end do >>> >>> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>> >>> ! do stepIdx= 2, step_k >>> do stepIdx= 2,2 >>> >>> call >>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>> ierr) >>> ! call >>> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>> ierr) >>> end do >>> >>> ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) >>> >>> end subroutine mpk_monomial >>> >>> >>> >>> Cong Li >>> >>> On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: >>> >>>> >>>> Send the entire code so that we can compile it and run it ourselves >>>> to see what is going wrong. >>>> >>>> Barry >>>> >>>> > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: >>>> > >>>> > Hi >>>> > >>>> > I tried the method you suggested. However, I got the error message. >>>> > My code and message are below. >>>> > >>>> > K is the big matrix containing column matrices. >>>> > >>>> > code: >>>> > >>>> > call MatGetArray(K,KArray,KArrayOffset,ierr) >>>> > >>>> > call MatGetLocalSize(R,local_RRow,local_RCol) >>>> > >>>> > call MatGetArray(R,RArray,RArrayOffset,ierr) >>>> > >>>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> > PETSC_DECIDE , nDim, >>>> bsize,KArray(KArrayOffset + 1), Km(1), ierr) >>>> > >>>> > localRsize = local_RRow * local_RCol >>>> > do genIdx= 1, localRsize >>>> > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>>> > end do >>>> > >>>> > call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>>> > >>>> > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> > >>>> > do stepIdx= 2, step_k >>>> > >>>> > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * >>>> local_RCol) >>>> > >>>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> > PETSC_DECIDE , nDim, >>>> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>>> > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> > end do >>>> > >>>> > call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>>> > >>>> > do stepIdx= 2, step_k >>>> > >>>> > call >>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>> ierr) >>>> > end do >>>> > >>>> > >>>> > And I got the error message as below: >>>> > >>>> > >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> > [0]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> > [0]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>>> ERROR: or try http://valgrind.org 
on GNU/linux and Apple Mac OS X to >>>> find memory corruption errors >>>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>> link, and run >>>> > [0]PETSC ERROR: to get more information on the crash. >>>> > [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> > [0]PETSC ERROR: Signal received! >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>>> 22:15:24 CDT 2013 >>>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> > [0]PETSC ERROR: See docs/index.html for manual pages. >>>> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>> probably memory access out of range >>>> > ---------------------------------------------------- >>>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>>> Wed Aug 5 18:24:40 2015 >>>> > [0]PETSC ERROR: Libraries linked from >>>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file >>>> > >>>> -------------------------------------------------------------------------- >>>> > [mpi::mpi-api::mpi-abort] >>>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>> > with errorcode 59. >>>> > >>>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>> > You may or may not see output from other processes, depending on >>>> > exactly when Open MPI kills them. 
>>>> > >>>> -------------------------------------------------------------------------- >>>> > [p01-024:26516] >>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>> [0xffffffff0091f684] >>>> > [p01-024:26516] >>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>> [0xffffffff006c389c] >>>> > [p01-024:26516] >>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>> [0xffffffff006db3ac] >>>> > [p01-024:26516] >>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>> [0xffffffff00281bf0] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>>> > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>>> > [p01-024:26516] [(nil)] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>>> > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>>> > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>>> > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >>>> [0xffffffff02d3b81c] >>>> > [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or >>>> the batch system) has told this process to end >>>> > [0]PETSC ERROR: Try option -start_in_debugger or >>>> -on_error_attach_debugger >>>> > [0]PETSC ERROR: or see >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>>> find memory corruption errors >>>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>> link, and run >>>> > [0]PETSC ERROR: to get more information on the crash. >>>> > [0]PETSC ERROR: --------------------- Error Message >>>> ------------------------------------ >>>> > [0]PETSC ERROR: Signal received! >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>>> 22:15:24 CDT 2013 >>>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>>> Wed Aug 5 18:24:40 2015 >>>> > [0]PETSC ERROR: Libraries linked from >>>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>> > [0]PETSC ERROR: >>>> ------------------------------------------------------------------------ >>>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>> unknown file >>>> > [ERR.] PLE 0019 plexec One of MPI processes was >>>> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>>> > >>>> > However, if I change from >>>> > call >>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>> ierr) >>>> > to >>>> > call MatMatMult(A,Km(stepIdx-1), >>>> MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>> > >>>> > everything is fine. >>>> > >>>> > could you please suggest some way to solve this? >>>> > >>>> > Thanks >>>> > >>>> > Cong Li >>>> > >>>> > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >>>> wrote: >>>> > Thank you very much for your help and suggestions. >>>> > With your help, finally I could continue my project. >>>> > >>>> > Regards >>>> > >>>> > Cong Li >>>> > >>>> > >>>> > >>>> > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith >>>> wrote: >>>> > >>>> > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>>> created. >>>> > >>>> > Since you want to use the C that is passed in you should use >>>> MAT_REUSE_MATRIX. >>>> > >>>> > Note that since your B and C matrices are dense the issue of >>>> sparsity pattern of C is not relevant. >>>> > >>>> > Barry >>>> > >>>> > > On Aug 4, 2015, at 11:59 AM, Cong Li >>>> wrote: >>>> > > >>>> > > Thanks very much. This answer is very helpful. >>>> > > And I have a following question. >>>> > > If I create B1, B2, .. by the way you suggested and then use >>>> MatMatMult to do SPMM. >>>> > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>>> fill,Mat *C) >>>> > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>> > > >>>> > > Thanks >>>> > > >>>> > > Cong Li >>>> > > >>>> > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>>> wrote: >>>> > > >>>> > > > On Aug 4, 2015, at 4:09 AM, Cong Li >>>> wrote: >>>> > > > >>>> > > > I am sorry that I should have explained it more clearly. >>>> > > > Actually I want to compute a recurrence. 
>>>> > > > >>>> > > > Like, I want to firstly compute A*X1=B1, and then calculate >>>> A*B1=B2, A*B2=B3 and so on. >>>> > > > Finally I want to combine all these results into a bigger matrix >>>> C=[B1,B2 ...] >>>> > > >>>> > > First create C with MatCreateDense(,&C). Then call >>>> MatDenseGetArray(C,&array); then create B1 with >>>> MatCreateDense(....,array,&B1); then create >>>> > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>>> the number of __local__ rows in B1 times the number of columns in B1, then >>>> create B3 with a larger shift etc. >>>> > > >>>> > > Note that you are "sharing" the array space of C with B1, B2, >>>> B3, ..., each Bi contains its columns of the C matrix. >>>> > > >>>> > > Barry >>>> > > >>>> > > >>>> > > >>>> > > > >>>> > > > Is there any way to do this efficiently. >>>> > > > >>>> > > > >>>> > > > >>>> > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>> patrick.sanan at gmail.com> wrote: >>>> > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>> > > > > Thanks for your reply. >>>> > > > > >>>> > > > > I have an other question. >>>> > > > > I want to do SPMM several times and combine result matrices >>>> into one bigger >>>> > > > > matrix. >>>> > > > > for example >>>> > > > > I firstly calculate AX1=B1, AX2=B2 ... >>>> > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>> > > > > >>>> > > > > Could you please suggest a way of how to do this. >>>> > > > This is just linear algebra, nothing to do with PETSc >>>> specifically. >>>> > > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>> > > > > >>>> > > > > Thanks >>>> > > > > >>>> > > > > Cong Li >>>> > > > > >>>> > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>> wrote: >>>> > > > > >>>> > > > > > Cong Li writes: >>>> > > > > > >>>> > > > > > > Hello, >>>> > > > > > > >>>> > > > > > > I am a PhD student using PETsc for my research. >>>> > > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>> matrix-matrix >>>> > > > > > > multiplication) by using PETSc. >>>> > > > > > >>>> > > > > > >>>> > > > > > >>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>> > > > > > >>>> > > > >>>> > > >>>> > > >>>> > >>>> > >>>> > >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Thu Aug 6 11:22:29 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Thu, 6 Aug 2015 11:22:29 -0500 Subject: [petsc-users] Vec Allgather operation in PETSc Message-ID: Hi all, For a parallel Vec whose components are stored in N processes, I would like to have an "Allgatherv" operation to obtain a whole copy on each process. Can anyone tell me which function I should use? Thanks Xujun -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Aug 6 11:26:53 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 6 Aug 2015 18:26:53 +0200 Subject: [petsc-users] Vec Allgather operation in PETSc In-Reply-To: References: Message-ID: Use this http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreateToAll.html On 6 August 2015 at 18:22, Xujun Zhao wrote: > Hi all, > > For a parallel Vec whose components are stored in N processes, I would > like to have an "Allgatherv" operation to obtain a whole copy on each > process. Can anyone tell me which function I should use? Thanks > > Xujun > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Thu Aug 6 11:35:59 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 6 Aug 2015 11:35:59 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: > On Aug 6, 2015, at 10:09 AM, Hong wrote: > > Barry: > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it was created which means that MAT_INITIAL_MATRIX cannot be used. Since the result is always dense it is not the difficult case when > a symbolic computation needs to be done initially so, at least in theory, he should not have to use MAT_INITIAL_MATRIX the first time through. > > Petsc implementation of MatMatMult() assumes user call > MatMatMultSymbolic() first, in which, we define which > MatMatMultNumeric() routine to be followed, and most importantly, we create specific data structure to be reused by MatMatMultNumeric(). > In MatMatMultSymbolic_MPIAIJ_MPIDense(), we create a 'container'. > > Without calling the case of MAT_INITIAL_MATRIX, these info are missing, and code simply crashes. Sure but in this case (with dense matrices) the container is very simple and we can create it the first time in if MAT_REUSE_MATRIX is passed in but the container is not there already. For the sparse result case you are right it doesn't make sense since the nonzero structure of the matrix needs to be figured out by symbolic factorization. Barry > > Hong > > > > > > Hong > > > > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > > The entire source code files are attached. > > > > Also I copy and paste the here in this email > > > > thanks > > > > program test > > > > implicit none > > > > #include > > #include > > #include > > #include > > > > > > PetscViewer :: view > > ! sparse matrix > > Mat :: A > > ! distributed dense matrix of size n x m > > Mat :: B, X, R, QDlt, AQDlt > > ! distributed dense matrix of size n x (m x k) > > Mat :: Q, K, AQ_p, AQ > > ! local dense matrix (every process keep the identical copies), (m x k) x (m x k) > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize > > PetscInt :: ownRowS,ownRowE > > PetscScalar, allocatable :: XInit(:,:) > > PetscInt :: XInitI, XInitJ > > PetscScalar :: v=1.0 > > PetscBool :: flg > > PetscMPIInt :: size, rank > > > > character(128) :: fin, rhsfin > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > ! read binary matrix file > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetType(A,MATAIJ,ierr) > > call MatLoad(A,view,ierr) > > call PetscViewerDestroy(view,ierr) > > ! 
for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! 
QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! ,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > ! 
call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. > > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > -------------------------------------------------------------------------- > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > > > > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > > > > > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From xzhao99 at gmail.com Thu Aug 6 14:57:52 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Thu, 6 Aug 2015 14:57:52 -0500 Subject: [petsc-users] Vec Allgather operation in PETSc In-Reply-To: References: Message-ID: Hi Dave, Thank you. This solves my problem! Xujun On Thu, Aug 6, 2015 at 11:26 AM, Dave May wrote: > Use this > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreateToAll.html > > On 6 August 2015 at 18:22, Xujun Zhao wrote: > >> Hi all, >> >> For a parallel Vec whose components are stored in N processes, I would >> like to have an "Allgatherv" operation to obtain a whole copy on each >> process. Can anyone tell me which function I should use? Thanks >> >> Xujun >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From juris.vencels at gmail.com Thu Aug 6 16:16:57 2015 From: juris.vencels at gmail.com (Juris Vencels) Date: Thu, 06 Aug 2015 15:16:57 -0600 Subject: [petsc-users] Remove Jacobian matrix values less than tolerance Message-ID: <55C3CEC9.2000705@gmail.com> Hi Users, When I construct analytical Jacobian matrix it has many small values of order 1E-16. How can I remove these values that are less than a given tolerance, let's say 1E-10? I tried to use MatChop together with MatCopy and MatDuplicate, but none of these functions ignores zeros. Thanks! 
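A minimal sketch of the kind of manual thresholding being described, using only the generic Mat row routines (MatGetOwnershipRange, MatGetRow, MatSetValues). It is an illustration, not code from this thread: it assumes a second matrix Afilt has already been created and preallocated with the same row distribution as A, and the function name FilterSmallEntries and the tolerance argument tol are made up for the example. Whether dropping such entries is worthwhile at all is taken up in the replies below.

#include <petscmat.h>

/* Copy only the entries of A whose magnitude exceeds tol into Afilt.
   Afilt must already exist with the same row distribution as A and
   enough preallocated space for the surviving entries.              */
PetscErrorCode FilterSmallEntries(Mat A, Mat Afilt, PetscReal tol)
{
  PetscErrorCode    ierr;
  PetscInt          rstart, rend, row, ncols, j;
  const PetscInt    *cols;
  const PetscScalar *vals;

  /* loop only over the rows owned by this process */
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (row = rstart; row < rend; row++) {
    ierr = MatGetRow(A, row, &ncols, &cols, &vals);CHKERRQ(ierr);
    for (j = 0; j < ncols; j++) {
      /* keep an entry only if its magnitude is above the threshold */
      if (PetscAbsScalar(vals[j]) > tol) {
        ierr = MatSetValues(Afilt, 1, &row, 1, &cols[j], &vals[j], INSERT_VALUES);CHKERRQ(ierr);
      }
    }
    ierr = MatRestoreRow(A, row, &ncols, &cols, &vals);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(Afilt, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Afilt, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  return 0;
}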
From knepley at gmail.com Thu Aug 6 16:22:36 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 6 Aug 2015 16:22:36 -0500 Subject: [petsc-users] Remove Jacobian matrix values less than tolerance In-Reply-To: <55C3CEC9.2000705@gmail.com> References: <55C3CEC9.2000705@gmail.com> Message-ID: On Thu, Aug 6, 2015 at 4:16 PM, Juris Vencels wrote: > Hi Users, > > > When I construct analytical Jacobian matrix it has many small values of > order 1E-16. > > How can I remove these values that are less than a given tolerance, let's > say 1E-10? > > I tried to use MatChop together with MatCopy and MatDuplicate, but none of > these functions ignores zeros. > Do you mean that you want to change the sparsity pattern? We do not have a function which does this. It would require a copy to regain the lost memory, and its not normally worth the trouble. Do you have some data or a model that tells you its worth it in your case? This is a question I always ask myself before programming. Thanks, Matt > Thanks! -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 6 16:55:11 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 6 Aug 2015 16:55:11 -0500 Subject: [petsc-users] Remove Jacobian matrix values less than tolerance In-Reply-To: <55C3CEC9.2000705@gmail.com> References: <55C3CEC9.2000705@gmail.com> Message-ID: > On Aug 6, 2015, at 4:16 PM, Juris Vencels wrote: > > Hi Users, > > > When I construct analytical Jacobian matrix it has many small values of order 1E-16. Are the values at those locations always that small or at different Newton steps or time-steps will they be larger? Unless there are a huge number of these and you know they are always small then I would not try to take them out. If you don't want them in there then don't put them in orginally; that is don't call MatSetValues() at all for those really small locations and don't allocate space for them. Barry > > How can I remove these values that are less than a given tolerance, let's say 1E-10? > > I tried to use MatChop together with MatCopy and MatDuplicate, but none of these functions ignores zeros. > > > Thanks! From bsmith at mcs.anl.gov Thu Aug 6 18:47:39 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 6 Aug 2015 18:47:39 -0500 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: <340E63F1-4389-4C3B-8221-4F119330764F@mcs.anl.gov> Cong Li, I have updated PETSc to support the use of MatMatMult() per your needs. You will need to switch to the master development branch http://www.mcs.anl.gov/petsc/developers/index.html of PETSc so install that first. I found a number of bugs in your code that I needed to fix to get it to run successfully on 1 and 2 processes to correctly load the matrices and do everything else it was doing -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1f.F90 Type: application/octet-stream Size: 4403 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: mpk_monomial.F90 Type: application/octet-stream Size: 2366 bytes Desc: not available URL: -------------- next part -------------- with the MatMatMult() (note I do not think it generates the right numbers but at least it doesn't crash and does successfully do the MatMatMult(). I've attached the fixed files. Barry > On Aug 6, 2015, at 12:27 AM, Cong Li wrote: > > Barry, > > Exactly. And thanks for the explaination. > > Cong Li > > On Thu, Aug 6, 2015 at 1:29 PM, Barry Smith wrote: > > > On Aug 5, 2015, at 10:23 PM, Hong wrote: > > > > Cong, > > > > Can you write out math equations for mpk_monomial (), > > list input and output parameters. > > > > Note: > > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it was created which means that MAT_INITIAL_MATRIX cannot be used. Since the result is always dense it is not the difficult case when a symbolic computation needs to be done initially so, at least in theory, he should not have to use MAT_INITIAL_MATRIX the first time through. > > Barry > > > > > Hong > > > > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li wrote: > > The entire source code files are attached. > > > > Also I copy and paste the here in this email > > > > thanks > > > > program test > > > > implicit none > > > > #include > > #include > > #include > > #include > > > > > > PetscViewer :: view > > ! sparse matrix > > Mat :: A > > ! distributed dense matrix of size n x m > > Mat :: B, X, R, QDlt, AQDlt > > ! distributed dense matrix of size n x (m x k) > > Mat :: Q, K, AQ_p, AQ > > ! local dense matrix (every process keep the identical copies), (m x k) x (m x k) > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, step_k,bsize > > PetscInt :: ownRowS,ownRowE > > PetscScalar, allocatable :: XInit(:,:) > > PetscInt :: XInitI, XInitJ > > PetscScalar :: v=1.0 > > PetscBool :: flg > > PetscMPIInt :: size, rank > > > > character(128) :: fin, rhsfin > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > ! read binary matrix file > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > call MatSetType(A,MATAIJ,ierr) > > call MatLoad(A,view,ierr) > > call PetscViewerDestroy(view,ierr) > > ! for the time being, assume mDim == nDim is true > > call MatGetSize(A, nDim, mDim, ierr) > > > > if (rank == 0) then > > print*,'Mat Size = ', nDim, mDim > > end if > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > ! create right-and-side matrix > > ! for the time being, choose row-wise decomposition > > ! 
for the time being, assume nDim%size = 0 > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > call MatLoad(B,view,ierr) > > call PetscViewerDestroy(view,ierr) > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > if (rank == 0) then > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > print*,'MRHS Size should be:', nDim, bsize > > end if > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > ! inintial value guses X > > allocate(XInit(nDim,bsize)) > > do XInitI=1, nDim > > do XInitJ=1, bsize > > XInit(XInitI,XInitJ) = 1.0 > > end do > > end do > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > bsize, nDim, bsize,XInit, X, ierr) > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! B, X, R, QDlt, AQDlt > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > (bsize*step_k), nDim, (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > PETSC_NULL_SCALAR, QtAQ, ierr) > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > ! calculation for R > > > > ! call matrix powers kernel > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > ! 
destory matrices > > deallocate(XInit) > > > > call MatDestroy(B, ierr) > > call MatDestroy(X, ierr) > > call MatDestroy(R, ierr) > > call MatDestroy(QDlt, ierr) > > call MatDestroy(AQDlt, ierr) > > call MatDestroy(Q, ierr) > > call MatDestroy(K, ierr) > > call MatDestroy(AQ_p, ierr) > > call MatDestroy(AQ, ierr) > > call MatDestroy(QtAQ, ierr) > > call MatDestroy(QtAQ_p, ierr) > > call MatDestroy(Dlt, ierr) > > > > > > call PetscFinalize(ierr) > > > > stop > > > > end program test > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > implicit none > > > > #include > > #include > > #include > > #include > > > > Mat :: K, Km(step_k) > > Mat :: A, R > > PetscMPIInt :: sizeMPI, rank > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx > > PetscInt :: ierr > > PetscInt :: stepIdx, blockShift, localRsize > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > PetscOffset :: KArrayOffset, RArrayOffset > > > > call MatGetSize(R, nDim, bsize, ierr) > > if (rank == 0) then > > print*,'Mat Size = ', nDim, bsize > > end if > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > ! get arry from R to add values to K(1) > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) & > > ! ,local_RRow * local_RCol * STORAGE_SIZE(PetscScalarSize), ierr) > > > > localRsize = local_RRow * local_RCol > > do genIdx= 1, localRsize > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > do stepIdx= 2, step_k > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > ! do stepIdx= 2, step_k > > do stepIdx= 2,2 > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > ! call MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > end do > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > end subroutine mpk_monomial > > > > > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: > > > > Send the entire code so that we can compile it and run it ourselves to see what is going wrong. > > > > Barry > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li wrote: > > > > > > Hi > > > > > > I tried the method you suggested. However, I got the error message. > > > My code and message are below. > > > > > > K is the big matrix containing column matrices. 
> > > > > > code: > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > do stepIdx= 2, step_k > > > > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > end do > > > > > > > > > And I got the error message as below: > > > > > > > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: ------------------------------------------------------------------------ > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > > > ---------------------------------------------------- > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > > -------------------------------------------------------------------------- > > > [mpi::mpi-api::mpi-abort] > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > with errorcode 59. > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > You may or may not see output from other processes, depending on > > > exactly when Open MPI kills them. 
> > > -------------------------------------------------------------------------- > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) [0xffffffff0091f684] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) [0xffffffff006c389c] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) [0xffffffff006db3ac] > > > [p01-024:26516] /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) [0xffffffff00281bf0] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > [p01-024:26516] [(nil)] > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) [0xffffffff02d3b81c] > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or the batch system) has told this process to end > > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > > > [0]PETSC ERROR: to get more information on the crash. > > > [0]PETSC ERROR: --------------------- Error Message ------------------------------------ > > > [0]PETSC ERROR: Signal received! > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 22:15:24 CDT 2013 > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 Wed Aug 5 18:24:40 2015 > > > [0]PETSC ERROR: Libraries linked from /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" --with-x=0 --with-c++-support --with-batch=1 --with-info=1 --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file > > > [ERR.] PLE 0019 plexec One of MPI processes was aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > However, if I change from > > > call MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > to > > > call MatMatMult(A,Km(stepIdx-1), MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > everything is fine. > > > > > > could you please suggest some way to solve this? > > > > > > Thanks > > > > > > Cong Li > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li wrote: > > > Thank you very much for your help and suggestions. > > > With your help, finally I could continue my project. > > > > > > Regards > > > > > > Cong Li > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith wrote: > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be created. > > > > > > Since you want to use the C that is passed in you should use MAT_REUSE_MATRIX. > > > > > > Note that since your B and C matrices are dense the issue of sparsity pattern of C is not relevant. > > > > > > Barry > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li wrote: > > > > > > > > Thanks very much. This answer is very helpful. > > > > And I have a following question. > > > > If I create B1, B2, .. by the way you suggested and then use MatMatMult to do SPMM. > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal fill,Mat *C) > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith wrote: > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li wrote: > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > Actually I want to compute a recurrence. > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate A*B1=B2, A*B2=B3 and so on. > > > > > Finally I want to combine all these results into a bigger matrix C=[B1,B2 ...] > > > > > > > > First create C with MatCreateDense(,&C). 
Then call MatDenseGetArray(C,&array); then create B1 with MatCreateDense(....,array,&B1); then create > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals the number of __local__ rows in B1 times the number of columns in B1, then create B3 with a larger shift etc. > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, B3, ..., each Bi contains its columns of the C matrix. > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan wrote: > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > Thanks for your reply. > > > > > > > > > > > > I have an other question. > > > > > > I want to do SPMM several times and combine result matrices into one bigger > > > > > > matrix. > > > > > > for example > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > This is just linear algebra, nothing to do with PETSc specifically. > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > Thanks > > > > > > > > > > > > Cong Li > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown wrote: > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse matrix-matrix > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From solvercorleone at gmail.com Thu Aug 6 20:08:25 2015 From: solvercorleone at gmail.com (Cong Li) Date: Fri, 7 Aug 2015 10:08:25 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> Message-ID: Hong, >>A subspace with dense matrices as basis? >>How large step_k and your matrices will be? So far, I are not very sure how large it's gonna be in the future. But I use less than 50 right now. However, I hope it can be as large as possible. Cong Li On Fri, Aug 7, 2015 at 12:20 AM, Hong wrote: > Cong: > >> Hong, >> >> Sure. >> >> I want to extend the Krylov subspace by step_k dimensions by using >> monomial, which can be defined as >> >> K={Km(1)m Km(2), ..., Km(step_k)} >> ={Km(1), AKm(1), AKm(2), ... , AKm(step_k-1)} >> ={R, AR, A^2R, ... A^(step_k-1)R} >> > > A subspace with dense matrices as basis? > How large step_k and your matrices will be? > > Hong > >> >> On Thu, Aug 6, 2015 at 12:23 PM, Hong wrote: >> >>> Cong, >>> >>> Can you write out math equations for mpk_monomial (), >>> list input and output parameters. >>> >>> Note: >>> 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End >>> 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after >>> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) >>> >>> Hong >>> >>> >>> On Wed, Aug 5, 2015 at 8:56 PM, Cong Li >>> wrote: >>> >>>> The entire source code files are attached. 
>>>> >>>> Also I copy and paste the here in this email >>>> >>>> thanks >>>> >>>> program test >>>> >>>> implicit none >>>> >>>> #include >>>> #include >>>> #include >>>> #include >>>> >>>> >>>> PetscViewer :: view >>>> ! sparse matrix >>>> Mat :: A >>>> ! distributed dense matrix of size n x m >>>> Mat :: B, X, R, QDlt, AQDlt >>>> ! distributed dense matrix of size n x (m x k) >>>> Mat :: Q, K, AQ_p, AQ >>>> ! local dense matrix (every process keep the identical copies), (m x >>>> k) x (m x k) >>>> Mat :: AConjPara, QtAQ, QtAQ_p, Dlt >>>> >>>> PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, >>>> step_k,bsize >>>> PetscInt :: ownRowS,ownRowE >>>> PetscScalar, allocatable :: XInit(:,:) >>>> PetscInt :: XInitI, XInitJ >>>> PetscScalar :: v=1.0 >>>> PetscBool :: flg >>>> PetscMPIInt :: size, rank >>>> >>>> character(128) :: fin, rhsfin >>>> >>>> >>>> call PetscInitialize(PETSC_NULL_CHARACTER,ierr) >>>> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) >>>> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) >>>> >>>> ! read binary matrix file >>>> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) >>>> call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) >>>> >>>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) >>>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) >>>> call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) >>>> >>>> >>>> call >>>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) >>>> call MatCreate(PETSC_COMM_WORLD,A,ierr) >>>> call MatSetType(A,MATAIJ,ierr) >>>> call MatLoad(A,view,ierr) >>>> call PetscViewerDestroy(view,ierr) >>>> ! for the time being, assume mDim == nDim is true >>>> call MatGetSize(A, nDim, mDim, ierr) >>>> >>>> if (rank == 0) then >>>> print*,'Mat Size = ', nDim, mDim >>>> end if >>>> >>>> call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) >>>> >>>> ! create right-and-side matrix >>>> ! for the time being, choose row-wise decomposition >>>> ! for the time being, assume nDim%size = 0 >>>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>>> bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) >>>> call >>>> PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) >>>> call MatLoad(B,view,ierr) >>>> call PetscViewerDestroy(view,ierr) >>>> call MatGetSize(B, rhsMDim, rhsNDim, ierr) >>>> if (rank == 0) then >>>> print*,'MRHS Size actually are:', rhsMDim, rhsNDim >>>> print*,'MRHS Size should be:', nDim, bsize >>>> end if >>>> call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> ! inintial value guses X >>>> allocate(XInit(nDim,bsize)) >>>> do XInitI=1, nDim >>>> do XInitJ=1, bsize >>>> XInit(XInitI,XInitJ) = 1.0 >>>> end do >>>> end do >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>>> bsize, nDim, bsize,XInit, X, ierr) >>>> >>>> call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> >>>> ! 
B, X, R, QDlt, AQDlt >>>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) >>>> call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) >>>> call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) >>>> call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> ! Q, K, AQ_p, AQ of size n x (m x k) >>>> call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & >>>> (bsize*step_k), nDim, >>>> (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) >>>> call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) >>>> call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) >>>> call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) >>>> call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> ! QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) >>>> call MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& >>>> PETSC_NULL_SCALAR, QtAQ, ierr) >>>> call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) >>>> call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) >>>> call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) >>>> call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> ! calculation for R >>>> >>>> ! call matrix powers kernel >>>> call mpk_monomial (K, A, R, step_k, rank,size) >>>> >>>> ! 
destory matrices >>>> deallocate(XInit) >>>> >>>> call MatDestroy(B, ierr) >>>> call MatDestroy(X, ierr) >>>> call MatDestroy(R, ierr) >>>> call MatDestroy(QDlt, ierr) >>>> call MatDestroy(AQDlt, ierr) >>>> call MatDestroy(Q, ierr) >>>> call MatDestroy(K, ierr) >>>> call MatDestroy(AQ_p, ierr) >>>> call MatDestroy(AQ, ierr) >>>> call MatDestroy(QtAQ, ierr) >>>> call MatDestroy(QtAQ_p, ierr) >>>> call MatDestroy(Dlt, ierr) >>>> >>>> >>>> call PetscFinalize(ierr) >>>> >>>> stop >>>> >>>> end program test >>>> >>>> >>>> subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) >>>> implicit none >>>> >>>> #include >>>> #include >>>> #include >>>> #include >>>> >>>> Mat :: K, Km(step_k) >>>> Mat :: A, R >>>> PetscMPIInt :: sizeMPI, rank >>>> PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, genIdx >>>> PetscInt :: ierr >>>> PetscInt :: stepIdx, blockShift, localRsize >>>> PetscScalar :: KArray(1), RArray(1), PetscScalarSize >>>> PetscOffset :: KArrayOffset, RArrayOffset >>>> >>>> call MatGetSize(R, nDim, bsize, ierr) >>>> if (rank == 0) then >>>> print*,'Mat Size = ', nDim, bsize >>>> end if >>>> >>>> call MatGetArray(K,KArray,KArrayOffset,ierr) >>>> >>>> call MatGetLocalSize(R,local_RRow,local_RCol) >>>> ! print *, "local_RRow,local_RCol", local_RRow,local_RCol >>>> >>>> ! get arry from R to add values to K(1) >>>> call MatGetArray(R,RArray,RArrayOffset,ierr) >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset >>>> + 1), Km(1), ierr) >>>> >>>> >>>> ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + 1) >>>> & >>>> ! ,local_RRow * local_RCol * >>>> STORAGE_SIZE(PetscScalarSize), ierr) >>>> >>>> localRsize = local_RRow * local_RCol >>>> do genIdx= 1, localRsize >>>> KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>>> end do >>>> >>>> >>>> call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>>> >>>> call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> do stepIdx= 2, step_k >>>> >>>> blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) >>>> >>>> call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>> PETSC_DECIDE , nDim, >>>> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>>> call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>> >>>> end do >>>> >>>> call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>>> >>>> ! do stepIdx= 2, step_k >>>> do stepIdx= 2,2 >>>> >>>> call >>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>> ierr) >>>> ! call >>>> MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>> ierr) >>>> end do >>>> >>>> ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) >>>> >>>> end subroutine mpk_monomial >>>> >>>> >>>> >>>> Cong Li >>>> >>>> On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith wrote: >>>> >>>>> >>>>> Send the entire code so that we can compile it and run it ourselves >>>>> to see what is going wrong. >>>>> >>>>> Barry >>>>> >>>>> > On Aug 5, 2015, at 4:42 AM, Cong Li >>>>> wrote: >>>>> > >>>>> > Hi >>>>> > >>>>> > I tried the method you suggested. However, I got the error message. >>>>> > My code and message are below. >>>>> > >>>>> > K is the big matrix containing column matrices. 
>>>>> > >>>>> > code: >>>>> > >>>>> > call MatGetArray(K,KArray,KArrayOffset,ierr) >>>>> > >>>>> > call MatGetLocalSize(R,local_RRow,local_RCol) >>>>> > >>>>> > call MatGetArray(R,RArray,RArrayOffset,ierr) >>>>> > >>>>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>>> > PETSC_DECIDE , nDim, >>>>> bsize,KArray(KArrayOffset + 1), Km(1), ierr) >>>>> > >>>>> > localRsize = local_RRow * local_RCol >>>>> > do genIdx= 1, localRsize >>>>> > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) >>>>> > end do >>>>> > >>>>> > call MatRestoreArray(R,RArray,RArrayOffset,ierr) >>>>> > >>>>> > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>>> > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) >>>>> > >>>>> > do stepIdx= 2, step_k >>>>> > >>>>> > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * >>>>> local_RCol) >>>>> > >>>>> > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & >>>>> > PETSC_DECIDE , nDim, >>>>> bsize,KArray(blockShift+1), Km(stepIdx), ierr) >>>>> > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>>> > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) >>>>> > end do >>>>> > >>>>> > call MatRestoreArray(K,KArray,KArrayOffset,ierr) >>>>> > >>>>> > do stepIdx= 2, step_k >>>>> > >>>>> > call >>>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>>> ierr) >>>>> > end do >>>>> > >>>>> > >>>>> > And I got the error message as below: >>>>> > >>>>> > >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>> Violation, probably memory access out of range >>>>> > [0]PETSC ERROR: Try option -start_in_debugger or >>>>> -on_error_attach_debugger >>>>> > [0]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>>>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>>>> find memory corruption errors >>>>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>>> link, and run >>>>> > [0]PETSC ERROR: to get more information on the crash. >>>>> > [0]PETSC ERROR: --------------------- Error Message >>>>> ------------------------------------ >>>>> > [0]PETSC ERROR: Signal received! >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>>>> 22:15:24 CDT 2013 >>>>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>>> > [0]PETSC ERROR: --------------------[1]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>> Violation, probably memory access out of range >>>>> > ---------------------------------------------------- >>>>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>>>> Wed Aug 5 18:24:40 2015 >>>>> > [0]PETSC ERROR: Libraries linked from >>>>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>>>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>>>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>>>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>>>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>>>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>>>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>>>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>>>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>>>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>>>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>>>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>>> unknown file >>>>> > >>>>> -------------------------------------------------------------------------- >>>>> > [mpi::mpi-api::mpi-abort] >>>>> > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>>> > with errorcode 59. >>>>> > >>>>> > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>> > You may or may not see output from other processes, depending on >>>>> > exactly when Open MPI kills them. 
>>>>> > >>>>> -------------------------------------------------------------------------- >>>>> > [p01-024:26516] >>>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>> [0xffffffff0091f684] >>>>> > [p01-024:26516] >>>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>> [0xffffffff006c389c] >>>>> > [p01-024:26516] >>>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>> [0xffffffff006db3ac] >>>>> > [p01-024:26516] >>>>> /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>> [0xffffffff00281bf0] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1bf620] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] >>>>> > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] >>>>> > [p01-024:26516] [(nil)] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1a2054] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1064f8] >>>>> > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] >>>>> > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] >>>>> > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) >>>>> [0xffffffff02d3b81c] >>>>> > [p01-024:26516] ./kmath.bcbcg [0x1051ec] >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or >>>>> the batch system) has told this process to end >>>>> > [0]PETSC ERROR: Try option -start_in_debugger or >>>>> -on_error_attach_debugger >>>>> > [0]PETSC ERROR: or see >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC >>>>> ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to >>>>> find memory corruption errors >>>>> > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, >>>>> link, and run >>>>> > [0]PETSC ERROR: to get more information on the crash. >>>>> > [0]PETSC ERROR: --------------------- Error Message >>>>> ------------------------------------ >>>>> > [0]PETSC ERROR: Signal received! >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 >>>>> 22:15:24 CDT 2013 >>>>> > [0]PETSC ERROR: See docs/changes/index.html for recent updates. >>>>> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. >>>>> > [0]PETSC ERROR: See docs/index.html for manual pages. 
>>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 >>>>> Wed Aug 5 18:24:40 2015 >>>>> > [0]PETSC ERROR: Libraries linked from >>>>> /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib >>>>> > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 >>>>> > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 >>>>> --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 >>>>> --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 >>>>> --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 >>>>> --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 >>>>> --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 >>>>> --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 >>>>> --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" >>>>> --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt >>>>> --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe >>>>> --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" >>>>> --with-x=0 --with-c++-support --with-batch=1 --with-info=1 >>>>> --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 >>>>> > [0]PETSC ERROR: >>>>> ------------------------------------------------------------------------ >>>>> > [0]PETSC ERROR: User provided function() line 0 in unknown directory >>>>> unknown file >>>>> > [ERR.] PLE 0019 plexec One of MPI processes was >>>>> aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) >>>>> > >>>>> > However, if I change from >>>>> > call >>>>> MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), >>>>> ierr) >>>>> > to >>>>> > call MatMatMult(A,Km(stepIdx-1), >>>>> MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) >>>>> > >>>>> > everything is fine. >>>>> > >>>>> > could you please suggest some way to solve this? >>>>> > >>>>> > Thanks >>>>> > >>>>> > Cong Li >>>>> > >>>>> > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li >>>>> wrote: >>>>> > Thank you very much for your help and suggestions. >>>>> > With your help, finally I could continue my project. >>>>> > >>>>> > Regards >>>>> > >>>>> > Cong Li >>>>> > >>>>> > >>>>> > >>>>> > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith >>>>> wrote: >>>>> > >>>>> > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be >>>>> created. >>>>> > >>>>> > Since you want to use the C that is passed in you should use >>>>> MAT_REUSE_MATRIX. >>>>> > >>>>> > Note that since your B and C matrices are dense the issue of >>>>> sparsity pattern of C is not relevant. >>>>> > >>>>> > Barry >>>>> > >>>>> > > On Aug 4, 2015, at 11:59 AM, Cong Li >>>>> wrote: >>>>> > > >>>>> > > Thanks very much. This answer is very helpful. >>>>> > > And I have a following question. >>>>> > > If I create B1, B2, .. by the way you suggested and then use >>>>> MatMatMult to do SPMM. >>>>> > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal >>>>> fill,Mat *C) >>>>> > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. >>>>> > > >>>>> > > Thanks >>>>> > > >>>>> > > Cong Li >>>>> > > >>>>> > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith >>>>> wrote: >>>>> > > >>>>> > > > On Aug 4, 2015, at 4:09 AM, Cong Li >>>>> wrote: >>>>> > > > >>>>> > > > I am sorry that I should have explained it more clearly. >>>>> > > > Actually I want to compute a recurrence. 
>>>>> > > > >>>>> > > > Like, I want to firstly compute A*X1=B1, and then calculate >>>>> A*B1=B2, A*B2=B3 and so on. >>>>> > > > Finally I want to combine all these results into a bigger matrix >>>>> C=[B1,B2 ...] >>>>> > > >>>>> > > First create C with MatCreateDense(,&C). Then call >>>>> MatDenseGetArray(C,&array); then create B1 with >>>>> MatCreateDense(....,array,&B1); then create >>>>> > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals >>>>> the number of __local__ rows in B1 times the number of columns in B1, then >>>>> create B3 with a larger shift etc. >>>>> > > >>>>> > > Note that you are "sharing" the array space of C with B1, B2, >>>>> B3, ..., each Bi contains its columns of the C matrix. >>>>> > > >>>>> > > Barry >>>>> > > >>>>> > > >>>>> > > >>>>> > > > >>>>> > > > Is there any way to do this efficiently. >>>>> > > > >>>>> > > > >>>>> > > > >>>>> > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < >>>>> patrick.sanan at gmail.com> wrote: >>>>> > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: >>>>> > > > > Thanks for your reply. >>>>> > > > > >>>>> > > > > I have an other question. >>>>> > > > > I want to do SPMM several times and combine result matrices >>>>> into one bigger >>>>> > > > > matrix. >>>>> > > > > for example >>>>> > > > > I firstly calculate AX1=B1, AX2=B2 ... >>>>> > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] >>>>> > > > > >>>>> > > > > Could you please suggest a way of how to do this. >>>>> > > > This is just linear algebra, nothing to do with PETSc >>>>> specifically. >>>>> > > > A * [X1, X2, ... ] = [AX1, AX2, ...] >>>>> > > > > >>>>> > > > > Thanks >>>>> > > > > >>>>> > > > > Cong Li >>>>> > > > > >>>>> > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown >>>>> wrote: >>>>> > > > > >>>>> > > > > > Cong Li writes: >>>>> > > > > > >>>>> > > > > > > Hello, >>>>> > > > > > > >>>>> > > > > > > I am a PhD student using PETsc for my research. >>>>> > > > > > > I am wondering if there is a way to implement SPMM (Sparse >>>>> matrix-matrix >>>>> > > > > > > multiplication) by using PETSc. >>>>> > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html >>>>> > > > > > >>>>> > > > >>>>> > > >>>>> > > >>>>> > >>>>> > >>>>> > >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From solvercorleone at gmail.com Thu Aug 6 20:21:12 2015 From: solvercorleone at gmail.com (Cong Li) Date: Fri, 7 Aug 2015 10:21:12 +0900 Subject: [petsc-users] I am wondering if there is a way to implement SPMM In-Reply-To: <340E63F1-4389-4C3B-8221-4F119330764F@mcs.anl.gov> References: <87egjjr2j9.fsf@jedbrown.org> <20150804084548.GB52392@Patricks-MacBook-Pro-3.local> <07456300-9874-41EF-AF5E-16BC0CB0423D@mcs.anl.gov> <06426FD6-16F6-429A-8EEB-8BE31CECC8F4@mcs.anl.gov> <340E63F1-4389-4C3B-8221-4F119330764F@mcs.anl.gov> Message-ID: Barry, Thank you very much. I will install and try the updated version. Regards Cong Li On Fri, Aug 7, 2015 at 8:47 AM, Barry Smith wrote: > > Cong Li, > > I have updated PETSc to support the use of MatMatMult() per your > needs. You will need to switch to the master development branch > http://www.mcs.anl.gov/petsc/developers/index.html of PETSc so install > that first. 
> > I found a number of bugs in your code that I needed to fix to get it to > run successfully on 1 and 2 processes to correctly load the matrices and do > everything else it was doing > with the MatMatMult() (note I do not think it generates the right numbers > but at least it doesn't crash and does successfully do the MatMatMult(). > I've attached the fixed files. > > Barry > > > > On Aug 6, 2015, at 12:27 AM, Cong Li wrote: > > > > Barry, > > > > Exactly. And thanks for the explaination. > > > > Cong Li > > > > On Thu, Aug 6, 2015 at 1:29 PM, Barry Smith wrote: > > > > > On Aug 5, 2015, at 10:23 PM, Hong wrote: > > > > > > Cong, > > > > > > Can you write out math equations for mpk_monomial (), > > > list input and output parameters. > > > > > > Note: > > > 1. MatDuplicate() does not need to be followed by MatAssemblyBegin/End > > > 2. MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,..) must be called after > > > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,..) > > > > Hong, we want to reuse the space in the Km(stepIdx-1) from which it > was created which means that MAT_INITIAL_MATRIX cannot be used. Since the > result is always dense it is not the difficult case when a symbolic > computation needs to be done initially so, at least in theory, he should > not have to use MAT_INITIAL_MATRIX the first time through. > > > > Barry > > > > > > > > Hong > > > > > > > > > On Wed, Aug 5, 2015 at 8:56 PM, Cong Li > wrote: > > > The entire source code files are attached. > > > > > > Also I copy and paste the here in this email > > > > > > thanks > > > > > > program test > > > > > > implicit none > > > > > > #include > > > #include > > > #include > > > #include > > > > > > > > > PetscViewer :: view > > > ! sparse matrix > > > Mat :: A > > > ! distributed dense matrix of size n x m > > > Mat :: B, X, R, QDlt, AQDlt > > > ! distributed dense matrix of size n x (m x k) > > > Mat :: Q, K, AQ_p, AQ > > > ! local dense matrix (every process keep the identical copies), (m x > k) x (m x k) > > > Mat :: AConjPara, QtAQ, QtAQ_p, Dlt > > > > > > PetscInt :: nDim, mDim, rhsNDim,rhsMDim,ierr, maxIter, iter, > step_k,bsize > > > PetscInt :: ownRowS,ownRowE > > > PetscScalar, allocatable :: XInit(:,:) > > > PetscInt :: XInitI, XInitJ > > > PetscScalar :: v=1.0 > > > PetscBool :: flg > > > PetscMPIInt :: size, rank > > > > > > character(128) :: fin, rhsfin > > > > > > > > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr) > > > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr) > > > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr) > > > > > > ! read binary matrix file > > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-f',fin,flg,ierr) > > > call PetscOptionsGetString(PETSC_NULL_CHARACTER,'-r',rhsfin,flg,ierr) > > > > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-i',maxIter,flg,ierr) > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-k',step_k,flg,ierr) > > > call PetscOptionsGetInt(PETSC_NULL_CHARACTER,'-w',bsize,flg,ierr) > > > > > > > > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,fin,FILE_MODE_READ,view,ierr) > > > call MatCreate(PETSC_COMM_WORLD,A,ierr) > > > call MatSetType(A,MATAIJ,ierr) > > > call MatLoad(A,view,ierr) > > > call PetscViewerDestroy(view,ierr) > > > ! 
for the time being, assume mDim == nDim is true > > > call MatGetSize(A, nDim, mDim, ierr) > > > > > > if (rank == 0) then > > > print*,'Mat Size = ', nDim, mDim > > > end if > > > > > > call MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY, ierr) > > > call MatGetOwnershipRange(A,ownRowS,ownRowE, ierr) > > > > > > ! create right-and-side matrix > > > ! for the time being, choose row-wise decomposition > > > ! for the time being, assume nDim%size = 0 > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > > bsize, nDim, bsize,PETSC_NULL_SCALAR, B, ierr) > > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,rhsfin,FILE_MODE_READ,view, ierr) > > > call MatLoad(B,view,ierr) > > > call PetscViewerDestroy(view,ierr) > > > call MatGetSize(B, rhsMDim, rhsNDim, ierr) > > > if (rank == 0) then > > > print*,'MRHS Size actually are:', rhsMDim, rhsNDim > > > print*,'MRHS Size should be:', nDim, bsize > > > end if > > > call MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! inintial value guses X > > > allocate(XInit(nDim,bsize)) > > > do XInitI=1, nDim > > > do XInitJ=1, bsize > > > XInit(XInitI,XInitJ) = 1.0 > > > end do > > > end do > > > > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > > bsize, nDim, bsize,XInit, X, ierr) > > > > > > call MatAssemblyBegin(X, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (X, MAT_FINAL_ASSEMBLY, ierr) > > > > > > > > > ! B, X, R, QDlt, AQDlt > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, R, ierr) > > > call MatAssemblyBegin(R, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (R, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, QDlt, ierr) > > > call MatAssemblyBegin(QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (QDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(B, MAT_DO_NOT_COPY_VALUES, AQDlt, ierr) > > > call MatAssemblyBegin(AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (AQDlt, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! Q, K, AQ_p, AQ of size n x (m x k) > > > call MatCreateDense(PETSC_COMM_WORLD, (ownRowE - ownRowS), & > > > (bsize*step_k), nDim, > (bsize*step_k),PETSC_NULL_SCALAR, Q, ierr) > > > call MatAssemblyBegin(Q, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(Q, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, K, ierr) > > > call MatAssemblyBegin(K, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(K, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ_p, ierr) > > > call MatAssemblyBegin(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(AQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(Q, MAT_DO_NOT_COPY_VALUES, AQ, ierr) > > > call MatAssemblyBegin(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd(AQ, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! 
QtAQ, QtAQ_p, Dlt of size (m x k) x (m x k) > > > call > MatCreateSeqDense(PETSC_COMM_SELF,(bsize*step_k),(bsize*step_k),& > > > PETSC_NULL_SCALAR, QtAQ, ierr) > > > call MatAssemblyBegin(QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (QtAQ, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, QtAQ_p , ierr) > > > call MatAssemblyBegin(QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (QtAQ_p, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, Dlt , ierr) > > > call MatAssemblyBegin(Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Dlt, MAT_FINAL_ASSEMBLY, ierr) > > > > > > call MatDuplicate(QtAQ, MAT_DO_NOT_COPY_VALUES, AConjPara , ierr) > > > call MatAssemblyBegin(AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (AConjPara, MAT_FINAL_ASSEMBLY, ierr) > > > > > > ! calculation for R > > > > > > ! call matrix powers kernel > > > call mpk_monomial (K, A, R, step_k, rank,size) > > > > > > ! destory matrices > > > deallocate(XInit) > > > > > > call MatDestroy(B, ierr) > > > call MatDestroy(X, ierr) > > > call MatDestroy(R, ierr) > > > call MatDestroy(QDlt, ierr) > > > call MatDestroy(AQDlt, ierr) > > > call MatDestroy(Q, ierr) > > > call MatDestroy(K, ierr) > > > call MatDestroy(AQ_p, ierr) > > > call MatDestroy(AQ, ierr) > > > call MatDestroy(QtAQ, ierr) > > > call MatDestroy(QtAQ_p, ierr) > > > call MatDestroy(Dlt, ierr) > > > > > > > > > call PetscFinalize(ierr) > > > > > > stop > > > > > > end program test > > > > > > > > > subroutine mpk_monomial (K, A, R, step_k, rank, sizeMPI) > > > implicit none > > > > > > #include > > > #include > > > #include > > > #include > > > > > > Mat :: K, Km(step_k) > > > Mat :: A, R > > > PetscMPIInt :: sizeMPI, rank > > > PetscInt :: nDim, bsize, step_k, local_RRow, local_RCol, > genIdx > > > PetscInt :: ierr > > > PetscInt :: stepIdx, blockShift, localRsize > > > PetscScalar :: KArray(1), RArray(1), PetscScalarSize > > > PetscOffset :: KArrayOffset, RArrayOffset > > > > > > call MatGetSize(R, nDim, bsize, ierr) > > > if (rank == 0) then > > > print*,'Mat Size = ', nDim, bsize > > > end if > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > ! print *, "local_RRow,local_RCol", local_RRow,local_RCol > > > > > > ! get arry from R to add values to K(1) > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, bsize,KArray(KArrayOffset > + 1), Km(1), ierr) > > > > > > > > > ! call PetscMemmove(KArray(KArrayOffset + 1),RArray(RArrayOffset + > 1) & > > > ! 
,local_RRow * local_RCol * > STORAGE_SIZE(PetscScalarSize), ierr) > > > > > > localRsize = local_RRow * local_RCol > > > do genIdx= 1, localRsize > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > end do > > > > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > do stepIdx= 2, step_k > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * local_RCol) > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > > > end do > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > ! do stepIdx= 2, step_k > > > do stepIdx= 2,2 > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > ! call > MatMatMult(A,Km(stepIdx-1),MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > end do > > > > > > ! call MatView(K,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > end subroutine mpk_monomial > > > > > > > > > > > > Cong Li > > > > > > On Thu, Aug 6, 2015 at 3:30 AM, Barry Smith > wrote: > > > > > > Send the entire code so that we can compile it and run it ourselves > to see what is going wrong. > > > > > > Barry > > > > > > > On Aug 5, 2015, at 4:42 AM, Cong Li > wrote: > > > > > > > > Hi > > > > > > > > I tried the method you suggested. However, I got the error message. > > > > My code and message are below. > > > > > > > > K is the big matrix containing column matrices. > > > > > > > > code: > > > > > > > > call MatGetArray(K,KArray,KArrayOffset,ierr) > > > > > > > > call MatGetLocalSize(R,local_RRow,local_RCol) > > > > > > > > call MatGetArray(R,RArray,RArrayOffset,ierr) > > > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > > PETSC_DECIDE , nDim, > bsize,KArray(KArrayOffset + 1), Km(1), ierr) > > > > > > > > localRsize = local_RRow * local_RCol > > > > do genIdx= 1, localRsize > > > > KArray(KArrayOffset + genIdx) = RArray(RArrayOffset + genIdx) > > > > end do > > > > > > > > call MatRestoreArray(R,RArray,RArrayOffset,ierr) > > > > > > > > call MatAssemblyBegin(Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > call MatAssemblyEnd (Km(1), MAT_FINAL_ASSEMBLY, ierr) > > > > > > > > do stepIdx= 2, step_k > > > > > > > > blockShift = KArrayOffset + (stepIdx-1) * (local_RRow * > local_RCol) > > > > > > > > call MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, & > > > > PETSC_DECIDE , nDim, > bsize,KArray(blockShift+1), Km(stepIdx), ierr) > > > > call MatAssemblyBegin(Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > call MatAssemblyEnd (Km(stepIdx), MAT_FINAL_ASSEMBLY, ierr) > > > > end do > > > > > > > > call MatRestoreArray(K,KArray,KArrayOffset,ierr) > > > > > > > > do stepIdx= 2, step_k > > > > > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > > end do > > > > > > > > > > > > And I got the error message as below: > > > > > > > > > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > [0]PETSC ERROR: or see > 
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, > link, and run > > > > [0]PETSC ERROR: to get more information on the crash. > > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > > [0]PETSC ERROR: Signal received! > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > > [0]PETSC ERROR: See docs/index.html for manual pages. > > > > [0]PETSC ERROR: --------------------[1]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation > Violation, probably memory access out of range > > > > ---------------------------------------------------- > > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > > -------------------------------------------------------------------------- > > > > [mpi::mpi-api::mpi-abort] > > > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD > > > > with errorcode 59. > > > > > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > > > You may or may not see output from other processes, depending on > > > > exactly when Open MPI kills them. 
> > > > > -------------------------------------------------------------------------- > > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) > [0xffffffff0091f684] > > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) > [0xffffffff006c389c] > > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libmpi.so.0(MPI_Abort+0x6c) > [0xffffffff006db3ac] > > > > [p01-024:26516] > /opt/FJSVtclang/GM-1.2.0-18/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) > [0xffffffff00281bf0] > > > > [p01-024:26516] ./kmath.bcbcg [0x1bf620] > > > > [p01-024:26516] ./kmath.bcbcg [0x1bf20c] > > > > [p01-024:26516] /lib64/libc.so.6(killpg+0x48) [0xffffffff02d52600] > > > > [p01-024:26516] [(nil)] > > > > [p01-024:26516] ./kmath.bcbcg [0x1a2054] > > > > [p01-024:26516] ./kmath.bcbcg [0x1064f8] > > > > [p01-024:26516] ./kmath.bcbcg(MAIN__+0x9dc) [0x105d1c] > > > > [p01-024:26516] ./kmath.bcbcg(main+0xec) [0x8a329c] > > > > [p01-024:26516] /lib64/libc.so.6(__libc_start_main+0x194) > [0xffffffff02d3b81c] > > > > [p01-024:26516] ./kmath.bcbcg [0x1051ec] > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Somet process (or > the batch system) has told this process to end > > > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC > ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find > memory corruption errors > > > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, > link, and run > > > > [0]PETSC ERROR: to get more information on the crash. > > > > [0]PETSC ERROR: --------------------- Error Message > ------------------------------------ > > > > [0]PETSC ERROR: Signal received! > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: Petsc Release Version 3.3.0, Patch 7, Sat May 11 > 22:15:24 CDT 2013 > > > > [0]PETSC ERROR: See docs/changes/index.html for recent updates. > > > > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting. > > > > [0]PETSC ERROR: See docs/index.html for manual pages. 
> > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: ./kmath.bcbcg on a arch-fuji named p01-024 by a03293 > Wed Aug 5 18:24:40 2015 > > > > [0]PETSC ERROR: Libraries linked from > /volume1/home/ra000005/a03293/kmathlibbuild/petsc-3.3-p7/arch-fujitsu-sparc64fx-opt/lib > > > > [0]PETSC ERROR: Configure run at Tue Jul 28 19:23:51 2015 > > > > [0]PETSC ERROR: Configure options --known-level1-dcache-size=32768 > --known-level1-dcache-linesize=32 --known-level1-dcache-assoc=0 > --known-memcmp-ok=1 --known-sizeof-char=1 --known-sizeof-void-p=8 > --known-sizeof-short=2 --known-sizeof-int=4 --known-sizeof-long=8 > --known-sizeof-long-long=8 --known-sizeof-float=4 --known-sizeof-double=8 > --known-sizeof-size_t=8 --known-bits-per-byte=8 --known-sizeof-MPI_Comm=8 > --known-sizeof-MPI_Fint=4 --known-mpi-long-double=1 > --known-mpi-c-double-complex=1 --with-cc=mpifccpx --CFLAGS="-mt -Xg" > --COPTFLAGS=-Kfast,openmp --with-cxx=mpiFCCpx --CXXFLAGS=-mt > --CXXOPTFLAGS=-Kfast,openmp --with-fc=mpifrtpx --FFLAGS=-Kthreadsafe > --FOPTFLAGS=-Kfast,openmp --with-blas-lapack-lib="-SCALAPACK -SSL2" > --with-x=0 --with-c++-support --with-batch=1 --with-info=1 > --with-debugging=0 --known-mpi-shared-libraries=0 --with-valgrind=0 > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > [0]PETSC ERROR: User provided function() line 0 in unknown directory > unknown file > > > > [ERR.] PLE 0019 plexec One of MPI processes was > aborted.(rank=0)(nid=0x020a0028)(CODE=1938,793745140674134016,15104) > > > > > > > > However, if I change from > > > > call > MatMatMult(A,Km(stepIdx-1),MAT_REUSE_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), > ierr) > > > > to > > > > call MatMatMult(A,Km(stepIdx-1), > MAT_INITIAL_MATRIX,PETSC_DEFAULT_INTEGER,Km(stepIdx), ierr) > > > > > > > > everything is fine. > > > > > > > > could you please suggest some way to solve this? > > > > > > > > Thanks > > > > > > > > Cong Li > > > > > > > > On Wed, Aug 5, 2015 at 10:53 AM, Cong Li > wrote: > > > > Thank you very much for your help and suggestions. > > > > With your help, finally I could continue my project. > > > > > > > > Regards > > > > > > > > Cong Li > > > > > > > > > > > > > > > > On Wed, Aug 5, 2015 at 3:09 AM, Barry Smith > wrote: > > > > > > > > From the manual page: Unless scall is MAT_REUSE_MATRIX C will be > created. > > > > > > > > Since you want to use the C that is passed in you should use > MAT_REUSE_MATRIX. > > > > > > > > Note that since your B and C matrices are dense the issue of > sparsity pattern of C is not relevant. > > > > > > > > Barry > > > > > > > > > On Aug 4, 2015, at 11:59 AM, Cong Li > wrote: > > > > > > > > > > Thanks very much. This answer is very helpful. > > > > > And I have a following question. > > > > > If I create B1, B2, .. by the way you suggested and then use > MatMatMult to do SPMM. > > > > > PetscErrorCode MatMatMult(Mat A,Mat B,MatReuse scall,PetscReal > fill,Mat *C) > > > > > should I use MAT_REUSE_MATRIX for MatReuse part of the arguement. > > > > > > > > > > Thanks > > > > > > > > > > Cong Li > > > > > > > > > > On Wed, Aug 5, 2015 at 1:27 AM, Barry Smith > wrote: > > > > > > > > > > > On Aug 4, 2015, at 4:09 AM, Cong Li > wrote: > > > > > > > > > > > > I am sorry that I should have explained it more clearly. > > > > > > Actually I want to compute a recurrence. > > > > > > > > > > > > Like, I want to firstly compute A*X1=B1, and then calculate > A*B1=B2, A*B2=B3 and so on. 
> > > > > > Finally I want to combine all these results into a bigger matrix > C=[B1,B2 ...] > > > > > > > > > > First create C with MatCreateDense(,&C). Then call > MatDenseGetArray(C,&array); then create B1 with > MatCreateDense(....,array,&B1); then create > > > > > B2 with MatCreateDense(...,array+shift,&B2) etc where shift equals > the number of __local__ rows in B1 times the number of columns in B1, then > create B3 with a larger shift etc. > > > > > > > > > > Note that you are "sharing" the array space of C with B1, B2, > B3, ..., each Bi contains its columns of the C matrix. > > > > > > > > > > Barry > > > > > > > > > > > > > > > > > > > > > > > > > > > Is there any way to do this efficiently. > > > > > > > > > > > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 5:45 PM, Patrick Sanan < > patrick.sanan at gmail.com> wrote: > > > > > > On Tue, Aug 04, 2015 at 03:42:14PM +0900, Cong Li wrote: > > > > > > > Thanks for your reply. > > > > > > > > > > > > > > I have an other question. > > > > > > > I want to do SPMM several times and combine result matrices > into one bigger > > > > > > > matrix. > > > > > > > for example > > > > > > > I firstly calculate AX1=B1, AX2=B2 ... > > > > > > > then I want to combine B1, B2.. to get a C, where C=[B1,B2...] > > > > > > > > > > > > > > Could you please suggest a way of how to do this. > > > > > > This is just linear algebra, nothing to do with PETSc > specifically. > > > > > > A * [X1, X2, ... ] = [AX1, AX2, ...] > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > Cong Li > > > > > > > > > > > > > > On Tue, Aug 4, 2015 at 3:27 PM, Jed Brown > wrote: > > > > > > > > > > > > > > > Cong Li writes: > > > > > > > > > > > > > > > > > Hello, > > > > > > > > > > > > > > > > > > I am a PhD student using PETsc for my research. > > > > > > > > > I am wondering if there is a way to implement SPMM (Sparse > matrix-matrix > > > > > > > > > multiplication) by using PETSc. > > > > > > > > > > > > > > > > > > > > > > > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatMatMult.html > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Thu Aug 6 21:16:49 2015 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 6 Aug 2015 21:16:49 -0500 Subject: [petsc-users] Issues running with intel MPI compiler Message-ID: Hi all, I configured PETSc using my university's intel compilers. I configured with these options: ./configure --download-chaco --download-ctetgen --download-exodusii --download-fblaslapack --download-hdf5 --download-hypre --download-metis --download-netcdf --download-parmetis --download-triangle --with-cmake=cmake --with-mpi-dir=/share/apps/intel/impi/5.0.2.044/intel64 --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 PETSC_ARCH=arch-linux2-c-opt --with-debugging=0 when I run any examples via make, i get the following error: > mpiexec_opuntia.cacds.uh.edu: cannot connect to local mpd (/tmp/mpd2.console_jchang23); possible causes: > 1. no mpd is running on this host > 2. an mpd is running but was started without a "console" (-n option) However, if I simply run /share/apps/intel/impi/ 5.0.2.044/intel64/bin/mpiexec -n 1 my_program, it works fine. Anyone know why this is happening? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... 
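A minimal sketch of the shared-storage layout Barry describes in the SPMM thread above: C is one dense matrix and B[0..2] are views of consecutive column blocks of C's local array. The code is not from the thread; all names and sizes are illustrative, A and X1 are assumed to be assembled elsewhere, and error checking (CHKERRQ) is omitted. Note that, as reported at the top of that thread, the MAT_REUSE_MATRIX call in this pattern segfaulted for the poster under petsc-3.3 while MAT_INITIAL_MATRIX worked.

/* C is Nglobal x 3k; each B[i] shares C's storage, offset by one column block */
Mat          A, X1, C, B[3];
PetscScalar *array;
PetscInt     i, mloc, Nglobal, k;   /* Nglobal, k set by the application */

MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, Nglobal, 3*k, NULL, &C);
MatDenseGetArray(C, &array);
MatGetLocalSize(C, &mloc, NULL);    /* local rows of C (and of each B[i]) */
for (i = 0; i < 3; i++) {
  /* shift = local rows times columns per block, as in Barry's description */
  MatCreateDense(PETSC_COMM_WORLD, mloc, PETSC_DECIDE, Nglobal, k,
                 array + i*mloc*k, &B[i]);
}

/* the recurrence B1 = A*X1, B2 = A*B1, ... writes straight into C's storage */
MatMatMult(A, X1, MAT_REUSE_MATRIX, PETSC_DEFAULT, &B[0]);
for (i = 1; i < 3; i++) {
  MatMatMult(A, B[i-1], MAT_REUSE_MATRIX, PETSC_DEFAULT, &B[i]);
}
MatDenseRestoreArray(C, &array);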
URL: From balay at mcs.anl.gov Fri Aug 7 00:04:10 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 7 Aug 2015 00:04:10 -0500 Subject: [petsc-users] Issues running with intel MPI compiler In-Reply-To: References: Message-ID: perhaps you'll see the issue with: /share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec -n 2 my_program In this case - you can retry with: /share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec.hydra -n 2 my_program wrt running example with make - you can try equivalent of make MPIEXEC=/share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec.hydra test Satish On Thu, 6 Aug 2015, Justin Chang wrote: > Hi all, > > I configured PETSc using my university's intel compilers. I configured with > these options: > > ./configure --download-chaco --download-ctetgen --download-exodusii > --download-fblaslapack --download-hdf5 --download-hypre --download-metis > --download-netcdf --download-parmetis --download-triangle > --with-cmake=cmake --with-mpi-dir=/share/apps/intel/impi/5.0.2.044/intel64 > --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 > PETSC_ARCH=arch-linux2-c-opt --with-debugging=0 > > when I run any examples via make, i get the following error: > > > mpiexec_opuntia.cacds.uh.edu: cannot connect to local mpd > (/tmp/mpd2.console_jchang23); possible causes: > > > 1. no mpd is running on this host > > > 2. an mpd is running but was started without a "console" (-n option) > > > However, if I simply run /share/apps/intel/impi/ > 5.0.2.044/intel64/bin/mpiexec -n 1 my_program, it works fine. Anyone know > why this is happening? > > Thanks, > Justin > From Mahir.Ulker-Kaustell at tyrens.se Fri Aug 7 04:59:05 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Fri, 7 Aug 2015 09:59:05 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <051d9816c3bd4a3eac37fc51004ebce1@STHWS42.tyrens.se> <7345cece365942d1a06deeac56cf1d72@STHWS42.tyrens.se> <19A5B30A-64E2-44E1-8F73-F67AE628F175@mcs.anl.gov> <03369975ff0a46a388920f1b3372d25c@STHWS42.tyrens.se> <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> Message-ID: <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> Hong, Running example 2 with the command line given below gives me two uniprocessor runs!? 
$ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=56, cols=56 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=56, cols=56 total: nonzeros=250, allocated nonzeros=280 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 5.21214e-15 iterations 1 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=56, cols=56 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=56, cols=56 total: nonzeros=250, allocated nonzeros=280 total number of mallocs used during MatSetValues calls =0 not using I-node routines Norm of error 5.21214e-15 iterations 1 Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 6 augusti 2015 16:36 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir: I have been using PETSC_COMM_WORLD. What do you get by running a petsc example, e.g., petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view KSP Object: 2 MPI processes type: gmres ... Hong From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 5 augusti 2015 17:11 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir: As you noticed, you ran the code in serial mode, not parallel. 
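A quick way to confirm this from inside the program (a sketch, not code from the thread) is to print the size of PETSC_COMM_WORLD right after PetscInitialize(); with a launcher that does not match the MPI library PETSc was built with, every process reports a size of 1 and the "parallel" run is really two independent serial runs.

#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscMPIInt size, rank;
  KSP         ksp;

  PetscInitialize(&argc, &argv, NULL, NULL);
  MPI_Comm_size(PETSC_COMM_WORLD, &size);
  MPI_Comm_rank(PETSC_COMM_WORLD, &rank);
  /* a mismatched mpiexec shows up here as "size = 1" on every process */
  PetscSynchronizedPrintf(PETSC_COMM_WORLD, "[%d] PETSC_COMM_WORLD size = %d\n", rank, size);
  PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT);

  /* the solver must also live on the parallel communicator, not PETSC_COMM_SELF */
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  /* ... set operators, solve ... */
  KSPDestroy(&ksp);
  PetscFinalize();
  return 0;
}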
Check your code on input communicator, e.g., what input communicator do you use in KSPCreate(comm,&ksp)? I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' in serial mode, this option is ignored with a warning. Hong Hong, If I set parsymbfact: $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- -------------------------------------------------------------------------- mpiexec detected that one or more processes exited with non-zero status, thus causing the job to be terminated. The first process to do so was: Process name: [[63679,1],0] Exit code: 255 -------------------------------------------------------------------------- Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. If I do not set it, I get a serial run even if I specify ?n 2: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view ? KSP Object: 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 0, needed 0 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 1 x npcol 1 Equilibrate matrix TRUE Matrix input mode 0 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 1 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=954, cols=954 total: nonzeros=34223, allocated nonzeros=34223 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 668 nodes, limit used is 5 I am running PETSc via Cygwin on a windows machine. When I installed PETSc the tests with different numbers of processes ran well. Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 19:06 To: ?lker-Kaustell, Mahir Cc: Hong; Xiaoye S. Li; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. Please run it with '-ksp_view' and see what 'SuperLU_DIST run parameters:' are being used, e.g. 
petsc/src/ksp/ksp/examples/tutorials (maint) $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view ... SuperLU_DIST run parameters: Process grid nprow 2 x npcol 1 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots TRUE Use iterative refinement FALSE Processors in row 2 col partition 1 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern_SameRowPerm I do not understand why your code uses matrix input mode = global. Hong From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 3 augusti 2015 16:46 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem Mahir, Sherry found the culprit. I can reproduce it: petsc/src/ksp/ksp/examples/tutorials mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact Invalid ISPEC at line 484 in file get_perm_c.c Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- ... PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? I'll add an error flag for these use cases. Hong On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). That's why you get the following error: Invalid ISPEC at line 484 in file get_perm_c.c You need to use distributed matrix input interface pzgssvx() (without ABglobal) Sherry On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Hong and Sherry, I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 30 juli 2015 02:58 To: ?lker-Kaustell, Mahir Cc: Xiaoye Li; PETSc users list Subject: Fwd: [petsc-users] SuperLU MPI-problem Mahir, Sherry fixed several bugs in superlu_dist-v4.1. The current petsc-release interfaces with superlu_dist-v4.0. We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? Here is how to do it: 1. download superlu_dist v4.1 2. remove existing PETSC_ARCH directory, then configure petsc with '--download-superlu_dist=superlu_dist_4.1.tar.gz' 3. build petsc Let us know if the issue remains. Hong ---------- Forwarded message ---------- From: Xiaoye S. Li > Date: Wed, Jul 29, 2015 at 2:24 PM Subject: Fwd: [petsc-users] SuperLU MPI-problem To: Hong Zhang > Hong, I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. 
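(For reference, Hong's three rebuild steps above as a shell sketch; the tarball location and the trailing configure options are placeholders for whatever was used in the original build, with the PETSC_ARCH name taken from the error output earlier in this thread.)

# step 1: download superlu_dist_4.1.tar.gz into the current directory
cd $PETSC_DIR
rm -rf cygwin-complex-nodebug                      # step 2: remove the old PETSC_ARCH directory
./configure PETSC_ARCH=cygwin-complex-nodebug \
    --download-superlu_dist=superlu_dist_4.1.tar.gz \
    --with-scalar-type=complex --with-debugging=0  # ... remaining original options
make PETSC_ARCH=cygwin-complex-nodebug all         # step 3: build petsc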
What bothers be is that he is getting the following error: Invalid ISPEC at line 484 in file get_perm_c.c This has nothing to do with my bug fix. ? Shall we ask him to try the new version, or try to get him matrix? Sherry ? ---------- Forwarded message ---------- From: Mahir.Ulker-Kaustell at tyrens.se > Date: Wed, Jul 22, 2015 at 1:32 PM Subject: RE: [petsc-users] SuperLU MPI-problem To: Hong >, "Xiaoye S. Li" > Cc: petsc-users > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? If i use -mat_superlu_dist_parsymbfact the program crashes with Invalid ISPEC at line 484 in file get_perm_c.c ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. ------------------------------------------------------- [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c col block 3006 ------------------------------------------------------- Primary job terminated normally, but 1 process returned a non-zero exit code.. Per user-direction, the job has been aborted. 
------------------------------------------------------- col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [unset]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [0]PETSC ERROR: ------------------------------------------------------------------------ /Mahir From: Hong [mailto:hzhang at mcs.anl.gov] Sent: den 22 juli 2015 21:34 To: Xiaoye S. Li Cc: ?lker-Kaustell, Mahir; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem In Petsc/superlu_dist interface, we set default options.ParSymbFact = NO; When user raises the flag "-mat_superlu_dist_parsymbfact", we set options.ParSymbFact = YES; options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ We do not change anything else. Hong On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. I don't understand why you get the following error when you use ?-mat_superlu_dist_parsymbfact?. Invalid ISPEC at line 484 in file get_perm_c.c Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only ?-mat_superlu_dist_parsymbfact? ? ? (the default is to use sequential symbolic factorization.) 
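As an aside on the two questions above (how to set options.ColPerm, and whether the flag alone is enough): the petsc-3.5/3.6 SuperLU_DIST interface exposes these settings as runtime options, and the exact names and accepted values should be confirmed with -help. A hedged sketch of the intended usage:

# Option names/values as exposed by the petsc-3.5/3.6 SuperLU_DIST interface;
# confirm the exact list with ./solve -help | grep superlu_dist.
# -mat_superlu_dist_parsymbfact is taken as a plain flag (as Hong notes above,
# the "=1" form is not recognized), and -mat_superlu_dist_colperm sets
# options.ColPerm; parallel symbolic factorization needs the PARMETIS ordering.
mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu \
    -pc_factor_mat_solver_package superlu_dist \
    -mat_superlu_dist_parsymbfact \
    -mat_superlu_dist_colperm PARMETIS \
    -ksp_view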
Sherry On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Thank you for your reply. As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. I am working in a Windows-environment and have installed PETSc through Cygwin. Apparently, there is no support for Valgrind in this OS. If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? Best regards, Mahir ______________________________________________ Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se ______________________________________________ -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: den 22 juli 2015 02:57 To: ?lker-Kaustell, Mahir Cc: Xiaoye S. Li; petsc-users Subject: Re: [petsc-users] SuperLU MPI-problem Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. Barry ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) ==42050== ==42049== Syscall param writev(vector[...]) points to 
uninitialised byte(s) ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42049== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42048== by 0x10277656E: MPI_Isend (isend.c:125) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: 
MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42048== Syscall param write(buf) points to uninitialised byte(s) ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) 
==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) ==42048== by 0x10277A1FA: MPI_Send (send.c:127) ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Address 0x104810704 is on thread 1's stack ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== 
Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a stack allocation ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) ==42050== ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) 
==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) ==42050== by 0x10277656E: MPI_Isend (isend.c:125) ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== 
by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== Conditional jump or move depends on uninitialised value(s) ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42048== Conditional jump or move depends on uninitialised value(s) ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== Conditional jump or move depends on uninitialised value(s) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== Uninitialised value was created by a heap allocation ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42048== by 0x100FF9036: PCSetUp (precon.c:982) ==42048== by 0x1010F54EB: KSPSetUp 
(itfunc.c:332) ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== Uninitialised value was created by a heap allocation ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42048== by 0x100001B3C: main (in ./ex19) ==42048== ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42049== by 0x100FF9036: PCSetUp (precon.c:982) ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42049== by 0x100001B3C: main (in ./ex19) ==42049== ==42050== Conditional jump or move depends on uninitialised value(s) ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== Uninitialised value was created by a heap allocation ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) ==42050== by 0x100FF9036: PCSetUp (precon.c:982) ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) ==42050== by 0x100001B3C: main (in ./ex19) ==42050== > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > However, now the program crashes with: > > Invalid ISPEC at line 484 in file get_perm_c.c > > And so on? > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > Mahir > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > From: Xiaoye S. 
Li [mailto:xsli at lbl.gov] > Sent: den 20 juli 2015 18:12 > To: ?lker-Kaustell, Mahir > Cc: Hong; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > Sherry Li > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong: > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 20 juli 2015 17:39 > To: ?lker-Kaustell, Mahir > Cc: petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > Direct solvers consume large amount of memory. Suggest to try followings: > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > Do you get memory crash in the 1st symbolic factorization? > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > 3. Use a machine that gives larger memory. > > Hong > > Dear Petsc-Users, > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > The frequency dependency of the problem requires that the system > > [-omega^2M + K]u = F > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > K is a complex matrix, including material damping. > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > Mahir -------------- next part -------------- An HTML attachment was scrubbed... 
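For the frequency sweep in the original question, Hong's point 2 above (do the symbolic factorization once and reuse it as omega changes) could look roughly like the following. This is a sketch, not code from the thread: names are illustrative, the nonzero pattern of M is assumed to be contained in that of K, omega[0..nomega-1] is assumed to come from the application, and error checking is omitted.

/* keep the same matrix object A across the sweep, so its nonzero pattern --
 * and with a direct solver the symbolic factorization -- is set up only once */
Mat         M, K, A;
Vec         F, u;
KSP         ksp;
PetscInt    i, nomega;
PetscReal  *omega;

/* ... assemble M, K and the load vector F as usual ... */
MatDuplicate(K, MAT_COPY_VALUES, &A);
VecDuplicate(F, &u);
KSPCreate(PETSC_COMM_WORLD, &ksp);
KSPSetFromOptions(ksp);   /* e.g. -ksp_type preonly -pc_type lu
                                  -pc_factor_mat_solver_package superlu_dist */
for (i = 0; i < nomega; i++) {
  MatCopy(K, A, SAME_NONZERO_PATTERN);                        /* A  = K           */
  MatAXPY(A, -omega[i]*omega[i], M, SUBSET_NONZERO_PATTERN);  /* A -= omega^2 * M */
  KSPSetOperators(ksp, A, A);   /* values changed, pattern did not: only the
                                   numeric factorization is redone              */
  KSPSolve(ksp, F, u);
  /* ... store or post-process u for this omega ... */
}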
URL: From jychang48 at gmail.com Fri Aug 7 09:36:30 2015 From: jychang48 at gmail.com (Justin Chang) Date: Fri, 7 Aug 2015 09:36:30 -0500 Subject: [petsc-users] Issues running with intel MPI compiler In-Reply-To: References: Message-ID: That did the trick, thank you very much On Fri, Aug 7, 2015 at 12:04 AM, Satish Balay wrote: > perhaps you'll see the issue with: > > /share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec -n 2 my_program > > In this case - you can retry with: > > /share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec.hydra -n 2 > my_program > > wrt running example with make - you can try equivalent of > > make MPIEXEC=/share/apps/intel/impi/5.0.2.044/intel64/bin/mpiexec.hydra > test > > Satish > > On Thu, 6 Aug 2015, Justin Chang wrote: > > > Hi all, > > > > I configured PETSc using my university's intel compilers. I configured > with > > these options: > > > > ./configure --download-chaco --download-ctetgen --download-exodusii > > --download-fblaslapack --download-hdf5 --download-hypre --download-metis > > --download-netcdf --download-parmetis --download-triangle > > --with-cmake=cmake --with-mpi-dir=/share/apps/intel/impi/ > 5.0.2.044/intel64 > > --with-shared-libraries=1 COPTFLAGS=-O2 CXXOPTFLAGS=-O2 FOPTFLAGS=-O2 > > PETSC_ARCH=arch-linux2-c-opt --with-debugging=0 > > > > when I run any examples via make, i get the following error: > > > > > mpiexec_opuntia.cacds.uh.edu: cannot connect to local mpd > > (/tmp/mpd2.console_jchang23); possible causes: > > > > > 1. no mpd is running on this host > > > > > 2. an mpd is running but was started without a "console" (-n option) > > > > > > However, if I simply run /share/apps/intel/impi/ > > 5.0.2.044/intel64/bin/mpiexec -n 1 my_program, it works fine. Anyone > know > > why this is happening? > > > > Thanks, > > Justin > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Aug 7 11:08:43 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 7 Aug 2015 11:08:43 -0500 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> References: <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> Message-ID: This usually happens if you use the wrong MPIEXEC i.e use the mpiexec from the MPI you built PETSc with. Satish On Fri, 7 Aug 2015, Mahir.Ulker-Kaustell at tyrens.se wrote: > Hong, > > Running example 2 with the command line given below gives me two uniprocessor runs!? 
> > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 6 augusti 2015 16:36 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > > I have been using PETSC_COMM_WORLD. > > What do you get by running a petsc example, e.g., > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > > KSP Object: 2 MPI processes > type: gmres > ... 
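For reference, a minimal sketch of making the same solver selection in code instead of via -pc_type lu -pc_factor_mat_solver_package superlu_dist. The calls are the standard PETSc 3.5/3.6-era KSP/PC API; the function name and the assumption that A and b are already assembled on PETSC_COMM_WORLD are illustrative and not taken from Hong's or Mahir's programs:

#include <petscksp.h>

/* Solve A x = b with a SuperLU_DIST LU factorization; KSPSetFromOptions()
   is kept so -ksp_view and -mat_superlu_dist_* runtime options still apply. */
PetscErrorCode SolveWithSuperLUDist(Mat A, Vec b, Vec x)
{
  KSP            ksp;
  PC             pc;
  PetscErrorCode ierr;

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);               /* pure direct solve */
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  return 0;
}

Calling KSPSetFromOptions() after the programmatic choices means command-line options such as -ksp_view or a different -pc_factor_mat_solver_package can still override the defaults at run time.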
> > Hong > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 5 augusti 2015 17:11 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > As you noticed, you ran the code in serial mode, not parallel. > Check your code on input communicator, e.g., what input communicator do you use in > KSPCreate(comm,&ksp)? > > I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' > in serial mode, this option is ignored with a warning. > > Hong > > Hong, > > If I set parsymbfact: > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec detected that one or more processes exited with non-zero status, thus causing > the job to be terminated. The first process to do so was: > > Process name: [[63679,1],0] > Exit code: 255 > -------------------------------------------------------------------------- > > Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. > > If I do not set it, I get a serial run even if I specify ?n 2: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > ? > KSP Object: 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > total: nonzeros=34223, allocated nonzeros=34223 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 668 nodes, limit used is 5 > > I am running PETSc via Cygwin on a windows machine. > When I installed PETSc the tests with different numbers of processes ran well. > > Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 3 augusti 2015 19:06 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. 
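Following Hong's suggestion above to check the input communicator, a small sketch that prints the size of the communicator the KSP will be created on. A mismatched launcher (the "wrong mpiexec" case Satish describes) shows up immediately as every rank reporting size 1, i.e. independent serial runs. The print statement and its placement are illustrative, not part of Mahir's solver:

#include <petscksp.h>

int main(int argc,char **argv)
{
  PetscErrorCode ierr;
  PetscMPIInt    size;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;
  /* If the launcher does not match the MPI that PETSc was built with,
     each process sees size == 1 and the "parallel" job is really N serial jobs. */
  ierr = MPI_Comm_size(PETSC_COMM_WORLD,&size);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,"PETSC_COMM_WORLD size = %d\n",(int)size);CHKERRQ(ierr);
  /* ... KSPCreate(PETSC_COMM_WORLD,&ksp) and the rest of the solver go here ... */
  ierr = PetscFinalize();
  return ierr;
}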
> > If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. > > Please run it with '-ksp_view' and see what > 'SuperLU_DIST run parameters:' are being used, e.g. > petsc/src/ksp/ksp/examples/tutorials (maint) > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > ... > SuperLU_DIST run parameters: > Process grid nprow 2 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 1 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 2 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > > I do not understand why your code uses matrix input mode = global. > > Hong > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 3 augusti 2015 16:46 > To: Xiaoye S. Li > Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry found the culprit. I can reproduce it: > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > ... > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? > > I'll add an error flag for these use cases. > > Hong > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: > I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). > > That's why you get the following error: > Invalid ISPEC at line 484 in file get_perm_c.c > > You need to use distributed matrix input interface pzgssvx() (without ABglobal) > > Sherry > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong and Sherry, > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 30 juli 2015 02:58 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye Li; PETSc users list > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry fixed several bugs in superlu_dist-v4.1. > The current petsc-release interfaces with superlu_dist-v4.0. > We do not know whether the reported issue (attached below) has been resolved or not. 
If not, can you test it with the latest superlu_dist-v4.1? > > Here is how to do it: > 1. download superlu_dist v4.1 > 2. remove existing PETSC_ARCH directory, then configure petsc with > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > 3. build petsc > > Let us know if the issue remains. > > Hong > > > ---------- Forwarded message ---------- > From: Xiaoye S. Li > > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > This has nothing to do with my bug fix. > ? Shall we ask him to try the new version, or try to get him matrix? > Sherry > ? > > ---------- Forwarded message ---------- > From: Mahir.Ulker-Kaustell at tyrens.se > > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong >, "Xiaoye S. Li" > > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. > Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > col block 3006 ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > /Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 22 juli 2015 21:34 > To: Xiaoye S. 
Li > Cc: ?lker-Kaustell, Mahir; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > In Petsc/superlu_dist interface, we set default > > options.ParSymbFact = NO; > > When user raises the flag "-mat_superlu_dist_parsymbfact", > we set > > options.ParSymbFact = YES; > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ > > We do not change anything else. > > Hong > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: > I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. > > The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. > > I don't understand why you get the following error when you use > ?-mat_superlu_dist_parsymbfact?. > > Invalid ISPEC at line 484 in file get_perm_c.c > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only > ?-mat_superlu_dist_parsymbfact? > ? ? (the default is to use sequential symbolic factorization.) > > > Sherry > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Thank you for your reply. > > As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. 
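The interface behavior Hong describes above can be written out as a sketch against the SuperLU_DIST options structure. The struct and field names (superlu_options_t, ParSymbFact, ColPerm, PARMETIS) are as in the SuperLU_DIST 4.x headers and may differ in other versions; the helper name is illustrative and this is not the PETSc interface source itself:

#include <superlu_ddefs.h>

/* What -mat_superlu_dist_parsymbfact toggles, per Hong's description:
   the default is serial symbolic factorization; raising the flag turns on
   ParSymbFact and forces the PARMETIS column ordering. */
static void set_symbfact_options(superlu_options_t *options, int parsymbfact)
{
  set_default_options_dist(options);
  options->ParSymbFact = NO;           /* PETSc interface default */
  if (parsymbfact) {
    options->ParSymbFact = YES;
    options->ColPerm     = PARMETIS;   /* required with parallel symbolic factorization */
  }
}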
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block 
of size 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 
0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > 
==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) 
> ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist 
(memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > 
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > 
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) 
> > > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > From xzhao99 at gmail.com Fri Aug 7 11:21:40 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Fri, 7 Aug 2015 11:21:40 -0500 Subject: [petsc-users] SLEPc fails with POWER ITERATION method Message-ID: Hi all, I am solving the max eigenvalue of a Shell matrix using SLEPc. the Shell operation is set MATOP_MULT with user-defined function u = M*f. It works with the Krylov-Schur and Arnoldi method, but fails when I use Power Iteration method and several others. This is strange, because some of those are supposed to work with any type of problem. I also wrote a power iteration algorithm by myself, and it works well and obtains the same results with that from Krylov_Schur. I am curious why this doesn't work is SLEPc. 
The following are my code and error messages: ---------------------------------------------------------------------------------- ierr = MatSetFromOptions(M); CHKERRQ(ierr); ierr = MatShellSetOperation(M,MATOP_MULT,(void(*)())_MatMult_Stokes);CHKERRQ(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD,"\n"); printf("--->test: n = %d, N = %d, rank = %d\n",n, N, (int)rank); /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Create the eigensolver and set various options - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ ierr = EPSCreate(PETSC_COMM_WORLD,&eps); CHKERRQ(ierr); ierr = EPSSetOperators(eps,M,NULL); CHKERRQ(ierr); ierr = EPSSetProblemType(eps,EPS_HEP); CHKERRQ(ierr); // EPSKRYLOVSCHUR(Default)/EPSARNOLDI/ // does NOT work: EPSPOWER/EPSLANCZOS/EPSSUBSPACE ierr = EPSSetType(eps,EPSPOWER); CHKERRQ(ierr); /* Select portion of spectrum */ if(option=="smallest") // LOBPCG for smallest eigenvalue problem! { ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr); } else if(option=="largest") { ierr = EPSSetWhichEigenpairs(eps,EPS_LARGEST_REAL); CHKERRQ(ierr); } else { ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); } // end if-else // Set the tolerance and maximum iteration ierr = EPSSetTolerances(eps, tol, maxits); CHKERRQ(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS tol = %f, maxits = %d\n",tol,maxits);CHKERRQ(ierr); /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Solve the eigensystem and get the solution - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS solve starts ...\n");CHKERRQ(ierr); ierr = EPSSolve(eps);CHKERRQ(ierr); ---------------------------------------------------------------------------------- EPS tol = 0.000001, maxits = 100 EPS solve starts ... [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Wrong value of eps->which [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl121.mcs.anl.gov by xzhao Fri Aug 7 10:31:38 2015 [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [0]PETSC ERROR: #1 EPSSetUp_Power() line 64 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/impls/power/power.c [0]PETSC ERROR: #2 EPSSetUp() line 120 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssetup.c [0]PETSC ERROR: #3 EPSSolve() line 88 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssolve.c [0]PETSC ERROR: #4 compute_eigenvalue() line 318 in brownian_system.C -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Aug 7 11:53:11 2015 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 7 Aug 2015 18:53:11 +0200 Subject: [petsc-users] SLEPc fails with POWER ITERATION method In-Reply-To: References: Message-ID: <2C23A670-4E31-4EAE-9FFF-6CD9F8CBE086@dsic.upv.es> The power method only works with which=EPS_LARGEST_MAGNITUDE (or which=EPS_TARGET_MAGNITUDE if doing shift-and-invert). 
The rationale is that the power iteration converges to the dominant eigenvalue (the one with largest absolute value). Jose > El 7/8/2015, a las 18:21, Xujun Zhao escribi?: > > Hi all, > > I am solving the max eigenvalue of a Shell matrix using SLEPc. the Shell operation is set MATOP_MULT with user-defined function u = M*f. It works with the Krylov-Schur and Arnoldi method, but fails when I use Power Iteration method and several others. This is strange, because some of those are supposed to work with any type of problem. > > I also wrote a power iteration algorithm by myself, and it works well and obtains the same results with that from Krylov_Schur. I am curious why this doesn't work is SLEPc. The following are my code and error messages: > > ---------------------------------------------------------------------------------- > ierr = MatSetFromOptions(M); CHKERRQ(ierr); > ierr = MatShellSetOperation(M,MATOP_MULT,(void(*)())_MatMult_Stokes);CHKERRQ(ierr); > > ierr = PetscPrintf(PETSC_COMM_WORLD,"\n"); > printf("--->test: n = %d, N = %d, rank = %d\n",n, N, (int)rank); > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > Create the eigensolver and set various options > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > ierr = EPSCreate(PETSC_COMM_WORLD,&eps); CHKERRQ(ierr); > ierr = EPSSetOperators(eps,M,NULL); CHKERRQ(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP); CHKERRQ(ierr); > > > // EPSKRYLOVSCHUR(Default)/EPSARNOLDI/ > // does NOT work: EPSPOWER/EPSLANCZOS/EPSSUBSPACE > ierr = EPSSetType(eps,EPSPOWER); CHKERRQ(ierr); > > > /* Select portion of spectrum */ > > if(option=="smallest") // LOBPCG for smallest eigenvalue problem! > { ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr); } > else if(option=="largest") > { ierr = EPSSetWhichEigenpairs(eps,EPS_LARGEST_REAL); CHKERRQ(ierr); } > else > { ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); } > // end if-else > > > // Set the tolerance and maximum iteration > ierr = EPSSetTolerances(eps, tol, maxits); CHKERRQ(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS tol = %f, maxits = %d\n",tol,maxits);CHKERRQ(ierr); > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > Solve the eigensystem and get the solution > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS solve starts ...\n");CHKERRQ(ierr); > ierr = EPSSolve(eps);CHKERRQ(ierr); > > ---------------------------------------------------------------------------------- > > > EPS tol = 0.000001, maxits = 100 > > EPS solve starts ... > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Wrong value of eps->which > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl121.mcs.anl.gov by xzhao Fri Aug 7 10:31:38 2015 > > [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 > > [0]PETSC ERROR: #1 EPSSetUp_Power() line 64 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/impls/power/power.c > > [0]PETSC ERROR: #2 EPSSetUp() line 120 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssetup.c > > [0]PETSC ERROR: #3 EPSSolve() line 88 in /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssolve.c > > [0]PETSC ERROR: #4 compute_eigenvalue() line 318 in brownian_system.C > > > > > > > > > > > > > From xzhao99 at gmail.com Fri Aug 7 12:05:47 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Fri, 7 Aug 2015 12:05:47 -0500 Subject: [petsc-users] SLEPc fails with POWER ITERATION method In-Reply-To: <2C23A670-4E31-4EAE-9FFF-6CD9F8CBE086@dsic.upv.es> References: <2C23A670-4E31-4EAE-9FFF-6CD9F8CBE086@dsic.upv.es> Message-ID: Hi Jose, Thank you for your answer. The problem now is solved with setting EPS_LARGEST_MAGNITUDE. Xujun On Fri, Aug 7, 2015 at 11:53 AM, Jose E. Roman wrote: > The power method only works with which=EPS_LARGEST_MAGNITUDE (or > which=EPS_TARGET_MAGNITUDE if doing shift-and-invert). The rationale is > that the power iteration converges to the dominant eigenvalue (the one with > largest absolute value). > > Jose > > > > El 7/8/2015, a las 18:21, Xujun Zhao escribi?: > > > > Hi all, > > > > I am solving the max eigenvalue of a Shell matrix using SLEPc. the Shell > operation is set MATOP_MULT with user-defined function u = M*f. It works > with the Krylov-Schur and Arnoldi method, but fails when I use Power > Iteration method and several others. This is strange, because some of those > are supposed to work with any type of problem. > > > > I also wrote a power iteration algorithm by myself, and it works well > and obtains the same results with that from Krylov_Schur. I am curious why > this doesn't work is SLEPc. The following are my code and error messages: > > > > > ---------------------------------------------------------------------------------- > > ierr = MatSetFromOptions(M); CHKERRQ(ierr); > > ierr = > MatShellSetOperation(M,MATOP_MULT,(void(*)())_MatMult_Stokes);CHKERRQ(ierr); > > > > ierr = PetscPrintf(PETSC_COMM_WORLD,"\n"); > > printf("--->test: n = %d, N = %d, rank = %d\n",n, N, (int)rank); > > > > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > Create the eigensolver and set various options > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > > > ierr = EPSCreate(PETSC_COMM_WORLD,&eps); CHKERRQ(ierr); > > ierr = EPSSetOperators(eps,M,NULL); CHKERRQ(ierr); > > ierr = EPSSetProblemType(eps,EPS_HEP); CHKERRQ(ierr); > > > > > > // EPSKRYLOVSCHUR(Default)/EPSARNOLDI/ > > // does NOT work: EPSPOWER/EPSLANCZOS/EPSSUBSPACE > > ierr = EPSSetType(eps,EPSPOWER); CHKERRQ(ierr); > > > > > > /* Select portion of spectrum */ > > > > if(option=="smallest") // LOBPCG for smallest eigenvalue problem! 
> > { ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL);CHKERRQ(ierr); } > > else if(option=="largest") > > { ierr = EPSSetWhichEigenpairs(eps,EPS_LARGEST_REAL); CHKERRQ(ierr); } > > else > > { ierr = EPSSetFromOptions(eps); CHKERRQ(ierr); } > > // end if-else > > > > > > // Set the tolerance and maximum iteration > > ierr = EPSSetTolerances(eps, tol, maxits); CHKERRQ(ierr); > > ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS tol = %f, maxits = > %d\n",tol,maxits);CHKERRQ(ierr); > > > > > > /* - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - > > Solve the eigensystem and get the solution > > - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - */ > > > > ierr = PetscPrintf(PETSC_COMM_WORLD,"EPS solve starts > ...\n");CHKERRQ(ierr); > > ierr = EPSSolve(eps);CHKERRQ(ierr); > > > > > ---------------------------------------------------------------------------------- > > > > > > EPS tol = 0.000001, maxits = 100 > > > > EPS solve starts ... > > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > > [0]PETSC ERROR: Wrong value of eps->which > > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > > > [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > > > [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl121.mcs.anl.gov by xzhao Fri Aug 7 10:31:38 2015 > > > > [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > > > [0]PETSC ERROR: #1 EPSSetUp_Power() line 64 in > /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/impls/power/power.c > > > > [0]PETSC ERROR: #2 EPSSetUp() line 120 in > /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssetup.c > > > > [0]PETSC ERROR: #3 EPSSolve() line 88 in > /Users/xzhao/software/slepc/slepc-3.5.4/src/eps/interface/epssolve.c > > > > [0]PETSC ERROR: #4 compute_eigenvalue() line 318 in brownian_system.C > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ustc.liu at gmail.com Sat Aug 8 07:52:19 2015 From: ustc.liu at gmail.com (sheng liu) Date: Sat, 8 Aug 2015 20:52:19 +0800 Subject: [petsc-users] Need to update matrix in every loop Message-ID: Hello: I have a large sparse symmetric matrix ( about 1000000x1000000), and I need about 10 eigenvalues near 0. The problem is: I need to run the same program about 1000 times, each time I need to change the diagonal matrix elements ( and they are generated randomly). Is there a fast way to implement this problem? Thank you! -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Aug 8 12:52:05 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 8 Aug 2015 12:52:05 -0500 Subject: [petsc-users] Need to update matrix in every loop In-Reply-To: References: Message-ID: > On Aug 8, 2015, at 7:52 AM, sheng liu wrote: > > Hello: > I have a large sparse symmetric matrix ( about 1000000x1000000), and I need about 10 eigenvalues near 0. 
The problem is: I need to run the same program about 1000 times, each time I need to change the diagonal matrix elements ( and they are generated randomly). Is there a fast way to implement this problem? Thank you! Does each run depend on the previous one or are they all independent? If they are independent I would introduce two levels of parallelism: On the outer level have different MPI communicators compute different random diagonal perturbations and on the inner level use a small amount of parallelism for each eigenvalue solve. The outer level of parallelism is embarrassingly parallel. Of course, for runs of the eigensolve use -log_summary to make sure it is running efficiently and tune the amount of parallelism in the eigensolve for best performance. Barry From mc0710 at gmail.com Sat Aug 8 13:52:00 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sat, 8 Aug 2015 13:52:00 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering Message-ID: Hi, I'm having trouble interfacing petsc to an application which I think is related to the ordering of the nodes. Here's what I'm trying to do: The application uses a structured grid with a global array having dimensions N1 x N2, which is then decomposed into a local array with dimensions NX1 x NX2. I create a Petsc DMDA using DMDACreate2d(MPI_COMM_WORLD, DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, DMDA_STENCIL_BOX, N1, N2, N1/NX1, N2/NX2, 1, nghost, PETSC_NULL, PETSC_NULL, &dmda); and then use this to create a vec: DMCreateGlobalVector(dmda, &vec); Now I copy the local contents of the application array to the petsc array using the following: Let i, j be the application indices and iPetsc and jPetsc be petsc's indices, then: DMDAGetCorners(dmda, &iStart, &jStart, &kStart, &iSize, &jSize, &kSize ); double **arrayPetsc; DMDAVecGetArray(dmda, vec, &arrayPetsc); for (int j=0, jPetsc=jStart; j -------------- next part -------------- A non-text attachment was scrubbed... Name: 1_proc.png Type: image/png Size: 31474 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: 4_proc.png Type: image/png Size: 33689 bytes Desc: not available URL: From knepley at gmail.com Sat Aug 8 14:19:55 2015 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Aug 2015 14:19:55 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > Hi, > > I'm having trouble interfacing petsc to an application which I think is > related to the ordering of the nodes. Here's what I'm trying to do: > > The application uses a structured grid with a global array having > dimensions N1 x N2, which is then decomposed into a local array with > dimensions NX1 x NX2. 
> > I create a Petsc DMDA using > > DMDACreate2d(MPI_COMM_WORLD, > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > DMDA_STENCIL_BOX, > N1, N2, > N1/NX1, N2/NX2, > 1, nghost, PETSC_NULL, PETSC_NULL, > &dmda); > > and then use this to create a vec: > > DMCreateGlobalVector(dmda, &vec); > > Now I copy the local contents of the application array to the petsc array > using the following: > > Let i, j be the application indices and iPetsc and jPetsc be petsc's > indices, then: > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > &iSize, &jSize, &kSize > ); > > > double **arrayPetsc; > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > for (int j=0, jPetsc=jStart; j { > for (int i=0, iPetsc=iStart; i { > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > } > } > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > Now if I VecView(vec, viewer) and look at the data that petsc has, it > looks right when run with 1 proc, but if I use 4 procs it's all messed up > (see attached plots). > > I should probably be using the AO object but its not clear how. Could you > help me out? > It looks like you have the global order of processes reversed, meaning you have 1 3 0 2 and it should be 2 3 0 1 Thanks, Matt > Thanks, > Mani > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mc0710 at gmail.com Sat Aug 8 14:45:45 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sat, 8 Aug 2015 14:45:45 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: Thanks. Any suggestions for a fix? Reorder the indices in arrayApplication? On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley wrote: > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > >> Hi, >> >> I'm having trouble interfacing petsc to an application which I think is >> related to the ordering of the nodes. Here's what I'm trying to do: >> >> The application uses a structured grid with a global array having >> dimensions N1 x N2, which is then decomposed into a local array with >> dimensions NX1 x NX2. >> >> I create a Petsc DMDA using >> >> DMDACreate2d(MPI_COMM_WORLD, >> DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, >> DMDA_STENCIL_BOX, >> N1, N2, >> N1/NX1, N2/NX2, >> 1, nghost, PETSC_NULL, PETSC_NULL, >> &dmda); >> >> and then use this to create a vec: >> >> DMCreateGlobalVector(dmda, &vec); >> >> Now I copy the local contents of the application array to the petsc array >> using the following: >> >> Let i, j be the application indices and iPetsc and jPetsc be petsc's >> indices, then: >> >> DMDAGetCorners(dmda, &iStart, &jStart, &kStart, >> &iSize, &jSize, &kSize >> ); >> >> >> double **arrayPetsc; >> DMDAVecGetArray(dmda, vec, &arrayPetsc); >> >> for (int j=0, jPetsc=jStart; j> { >> for (int i=0, iPetsc=iStart; i> { >> arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; >> } >> } >> >> DMDAVecRestoreArray(dmda, vec, &arrayPetsc); >> >> Now if I VecView(vec, viewer) and look at the data that petsc has, it >> looks right when run with 1 proc, but if I use 4 procs it's all messed up >> (see attached plots). >> >> I should probably be using the AO object but its not clear how. Could you >> help me out? 
>> > > It looks like you have the global order of processes reversed, meaning you > have > > 1 3 > > 0 2 > > and it should be > > 2 3 > > 0 1 > > Thanks, > > Matt > > >> Thanks, >> Mani >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Aug 8 14:48:43 2015 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Aug 2015 14:48:43 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: On Sat, Aug 8, 2015 at 2:45 PM, Mani Chandra wrote: > Thanks. Any suggestions for a fix? > You have to deal with the right part of the domain in your application code. I have no idea how you are handling this, and its not in the code below. Matt > Reorder the indices in arrayApplication? > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley wrote: > >> On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: >> >>> Hi, >>> >>> I'm having trouble interfacing petsc to an application which I think is >>> related to the ordering of the nodes. Here's what I'm trying to do: >>> >>> The application uses a structured grid with a global array having >>> dimensions N1 x N2, which is then decomposed into a local array with >>> dimensions NX1 x NX2. >>> >>> I create a Petsc DMDA using >>> >>> DMDACreate2d(MPI_COMM_WORLD, >>> DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, >>> DMDA_STENCIL_BOX, >>> N1, N2, >>> N1/NX1, N2/NX2, >>> 1, nghost, PETSC_NULL, PETSC_NULL, >>> &dmda); >>> >>> and then use this to create a vec: >>> >>> DMCreateGlobalVector(dmda, &vec); >>> >>> Now I copy the local contents of the application array to the petsc >>> array using the following: >>> >>> Let i, j be the application indices and iPetsc and jPetsc be petsc's >>> indices, then: >>> >>> DMDAGetCorners(dmda, &iStart, &jStart, &kStart, >>> &iSize, &jSize, &kSize >>> ); >>> >>> >>> double **arrayPetsc; >>> DMDAVecGetArray(dmda, vec, &arrayPetsc); >>> >>> for (int j=0, jPetsc=jStart; j>> { >>> for (int i=0, iPetsc=iStart; i>> { >>> arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; >>> } >>> } >>> >>> DMDAVecRestoreArray(dmda, vec, &arrayPetsc); >>> >>> Now if I VecView(vec, viewer) and look at the data that petsc has, it >>> looks right when run with 1 proc, but if I use 4 procs it's all messed up >>> (see attached plots). >>> >>> I should probably be using the AO object but its not clear how. Could >>> you help me out? >>> >> >> It looks like you have the global order of processes reversed, meaning >> you have >> >> 1 3 >> >> 0 2 >> >> and it should be >> >> 2 3 >> >> 0 1 >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Mani >>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Sat Aug 8 15:03:11 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 8 Aug 2015 15:03:11 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: > > Thanks. Any suggestions for a fix? Just flip the meaning of the x indices and the y indices in the PETSc parts of the code? Also run with a very different N1 and N2 (instead of equal size) to better test the code coupling. Barry > > Reorder the indices in arrayApplication? > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley wrote: > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > Hi, > > I'm having trouble interfacing petsc to an application which I think is related to the ordering of the nodes. Here's what I'm trying to do: > > The application uses a structured grid with a global array having dimensions N1 x N2, which is then decomposed into a local array with dimensions NX1 x NX2. > > I create a Petsc DMDA using > > DMDACreate2d(MPI_COMM_WORLD, > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > DMDA_STENCIL_BOX, > N1, N2, > N1/NX1, N2/NX2, > 1, nghost, PETSC_NULL, PETSC_NULL, > &dmda); > > and then use this to create a vec: > > DMCreateGlobalVector(dmda, &vec); > > Now I copy the local contents of the application array to the petsc array using the following: > > Let i, j be the application indices and iPetsc and jPetsc be petsc's indices, then: > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > &iSize, &jSize, &kSize > ); > > > double **arrayPetsc; > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > for (int j=0, jPetsc=jStart; j { > for (int i=0, iPetsc=iStart; i { > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > } > } > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > Now if I VecView(vec, viewer) and look at the data that petsc has, it looks right when run with 1 proc, but if I use 4 procs it's all messed up (see attached plots). > > I should probably be using the AO object but its not clear how. Could you help me out? > > It looks like you have the global order of processes reversed, meaning you have > > 1 3 > > 0 2 > > and it should be > > 2 3 > > 0 1 > > Thanks, > > Matt > > Thanks, > Mani > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From mc0710 at gmail.com Sat Aug 8 15:08:24 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sat, 8 Aug 2015 15:08:24 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: Tried flipping the indices, I get a seg fault. On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith wrote: > > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: > > > > Thanks. Any suggestions for a fix? > > Just flip the meaning of the x indices and the y indices in the PETSc > parts of the code? > > Also run with a very different N1 and N2 (instead of equal size) to > better test the code coupling. > > Barry > > > > > > Reorder the indices in arrayApplication? > > > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley > wrote: > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > > Hi, > > > > I'm having trouble interfacing petsc to an application which I think is > related to the ordering of the nodes. 
Here's what I'm trying to do: > > > > The application uses a structured grid with a global array having > dimensions N1 x N2, which is then decomposed into a local array with > dimensions NX1 x NX2. > > > > I create a Petsc DMDA using > > > > DMDACreate2d(MPI_COMM_WORLD, > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > > DMDA_STENCIL_BOX, > > N1, N2, > > N1/NX1, N2/NX2, > > 1, nghost, PETSC_NULL, PETSC_NULL, > > &dmda); > > > > and then use this to create a vec: > > > > DMCreateGlobalVector(dmda, &vec); > > > > Now I copy the local contents of the application array to the petsc > array using the following: > > > > Let i, j be the application indices and iPetsc and jPetsc be petsc's > indices, then: > > > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > > &iSize, &jSize, &kSize > > ); > > > > > > double **arrayPetsc; > > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > > > for (int j=0, jPetsc=jStart; j > { > > for (int i=0, iPetsc=iStart; i > { > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > > } > > } > > > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > > > Now if I VecView(vec, viewer) and look at the data that petsc has, it > looks right when run with 1 proc, but if I use 4 procs it's all messed up > (see attached plots). > > > > I should probably be using the AO object but its not clear how. Could > you help me out? > > > > It looks like you have the global order of processes reversed, meaning > you have > > > > 1 3 > > > > 0 2 > > > > and it should be > > > > 2 3 > > > > 0 1 > > > > Thanks, > > > > Matt > > > > Thanks, > > Mani > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Aug 8 15:12:43 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 8 Aug 2015 15:12:43 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: Message-ID: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> > On Aug 8, 2015, at 3:08 PM, Mani Chandra wrote: > > Tried flipping the indices, I get a seg fault. You would have to be careful in exactly what you flip. Note that the meaning of N1 and N2 etc would also be reversed between your code and the PETSc DMDA code. I would create a tiny DMDA and put entires like 1 2 3 4 ... into the array so you can track where the values go Barry > > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith wrote: > > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: > > > > Thanks. Any suggestions for a fix? > > Just flip the meaning of the x indices and the y indices in the PETSc parts of the code? > > Also run with a very different N1 and N2 (instead of equal size) to better test the code coupling. > > Barry > > > > > > Reorder the indices in arrayApplication? > > > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley wrote: > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > > Hi, > > > > I'm having trouble interfacing petsc to an application which I think is related to the ordering of the nodes. Here's what I'm trying to do: > > > > The application uses a structured grid with a global array having dimensions N1 x N2, which is then decomposed into a local array with dimensions NX1 x NX2. 
> > > > I create a Petsc DMDA using > > > > DMDACreate2d(MPI_COMM_WORLD, > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > > DMDA_STENCIL_BOX, > > N1, N2, > > N1/NX1, N2/NX2, > > 1, nghost, PETSC_NULL, PETSC_NULL, > > &dmda); > > > > and then use this to create a vec: > > > > DMCreateGlobalVector(dmda, &vec); > > > > Now I copy the local contents of the application array to the petsc array using the following: > > > > Let i, j be the application indices and iPetsc and jPetsc be petsc's indices, then: > > > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > > &iSize, &jSize, &kSize > > ); > > > > > > double **arrayPetsc; > > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > > > for (int j=0, jPetsc=jStart; j > { > > for (int i=0, iPetsc=iStart; i > { > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > > } > > } > > > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > > > Now if I VecView(vec, viewer) and look at the data that petsc has, it looks right when run with 1 proc, but if I use 4 procs it's all messed up (see attached plots). > > > > I should probably be using the AO object but its not clear how. Could you help me out? > > > > It looks like you have the global order of processes reversed, meaning you have > > > > 1 3 > > > > 0 2 > > > > and it should be > > > > 2 3 > > > > 0 1 > > > > Thanks, > > > > Matt > > > > Thanks, > > Mani > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > From mc0710 at gmail.com Sat Aug 8 16:56:02 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sat, 8 Aug 2015 16:56:02 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> References: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> Message-ID: So basically one needs to correctly map iPetsc, jPetsc -> iApplication, jApplication ? Is there is any standard way to do this? Can I get petsc to automatically follow the same parallel topology as the host application? Thanks, Mani On Sat, Aug 8, 2015 at 3:12 PM, Barry Smith wrote: > > > On Aug 8, 2015, at 3:08 PM, Mani Chandra wrote: > > > > Tried flipping the indices, I get a seg fault. > > You would have to be careful in exactly what you flip. Note that the > meaning of N1 and N2 etc would also be reversed between your code and the > PETSc DMDA code. > > I would create a tiny DMDA and put entires like 1 2 3 4 ... into the > array so you can track where the values go > > Barry > > > > > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith wrote: > > > > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: > > > > > > Thanks. Any suggestions for a fix? > > > > Just flip the meaning of the x indices and the y indices in the PETSc > parts of the code? > > > > Also run with a very different N1 and N2 (instead of equal size) to > better test the code coupling. > > > > Barry > > > > > > > > > > Reorder the indices in arrayApplication? > > > > > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley > wrote: > > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra wrote: > > > Hi, > > > > > > I'm having trouble interfacing petsc to an application which I think > is related to the ordering of the nodes. Here's what I'm trying to do: > > > > > > The application uses a structured grid with a global array having > dimensions N1 x N2, which is then decomposed into a local array with > dimensions NX1 x NX2. 
> > > > > > I create a Petsc DMDA using > > > > > > DMDACreate2d(MPI_COMM_WORLD, > > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > > > DMDA_STENCIL_BOX, > > > N1, N2, > > > N1/NX1, N2/NX2, > > > 1, nghost, PETSC_NULL, PETSC_NULL, > > > &dmda); > > > > > > and then use this to create a vec: > > > > > > DMCreateGlobalVector(dmda, &vec); > > > > > > Now I copy the local contents of the application array to the petsc > array using the following: > > > > > > Let i, j be the application indices and iPetsc and jPetsc be petsc's > indices, then: > > > > > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > > > &iSize, &jSize, &kSize > > > ); > > > > > > > > > double **arrayPetsc; > > > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > > > > > for (int j=0, jPetsc=jStart; j > > { > > > for (int i=0, iPetsc=iStart; i iPetsc++) > > > { > > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > > > } > > > } > > > > > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > > > > > Now if I VecView(vec, viewer) and look at the data that petsc has, it > looks right when run with 1 proc, but if I use 4 procs it's all messed up > (see attached plots). > > > > > > I should probably be using the AO object but its not clear how. Could > you help me out? > > > > > > It looks like you have the global order of processes reversed, meaning > you have > > > > > > 1 3 > > > > > > 0 2 > > > > > > and it should be > > > > > > 2 3 > > > > > > 0 1 > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks, > > > Mani > > > -- > > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > > -- Norbert Wiener > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Aug 8 16:58:49 2015 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 8 Aug 2015 16:58:49 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> Message-ID: On Sat, Aug 8, 2015 at 4:56 PM, Mani Chandra wrote: > So basically one needs to correctly map > > iPetsc, jPetsc -> iApplication, jApplication ? > > Is there is any standard way to do this? Can I get petsc to automatically > follow the same parallel topology as the host application? > If you want to use DMDA, there is only one mapping of ranks, namely lexicographic. However, every structured grid code I have ever seen uses that mapping, perhaps with a permutation of the directions {x, y, z}. Thus, the user needs to map the directions in PETSc in the right order for the application. I am not sure how you would automate this seeing as it depends on the application. Thanks, Matt > Thanks, > Mani > > On Sat, Aug 8, 2015 at 3:12 PM, Barry Smith wrote: > >> >> > On Aug 8, 2015, at 3:08 PM, Mani Chandra wrote: >> > >> > Tried flipping the indices, I get a seg fault. >> >> You would have to be careful in exactly what you flip. Note that the >> meaning of N1 and N2 etc would also be reversed between your code and the >> PETSc DMDA code. >> >> I would create a tiny DMDA and put entires like 1 2 3 4 ... into the >> array so you can track where the values go >> >> Barry >> >> > >> > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith wrote: >> > >> > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: >> > > >> > > Thanks. Any suggestions for a fix? >> > >> > Just flip the meaning of the x indices and the y indices in the PETSc >> parts of the code? 
>> > >> > Also run with a very different N1 and N2 (instead of equal size) to >> better test the code coupling. >> > >> > Barry >> > >> > >> > > >> > > Reorder the indices in arrayApplication? >> > > >> > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley >> wrote: >> > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra >> wrote: >> > > Hi, >> > > >> > > I'm having trouble interfacing petsc to an application which I think >> is related to the ordering of the nodes. Here's what I'm trying to do: >> > > >> > > The application uses a structured grid with a global array having >> dimensions N1 x N2, which is then decomposed into a local array with >> dimensions NX1 x NX2. >> > > >> > > I create a Petsc DMDA using >> > > >> > > DMDACreate2d(MPI_COMM_WORLD, >> > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, >> > > DMDA_STENCIL_BOX, >> > > N1, N2, >> > > N1/NX1, N2/NX2, >> > > 1, nghost, PETSC_NULL, PETSC_NULL, >> > > &dmda); >> > > >> > > and then use this to create a vec: >> > > >> > > DMCreateGlobalVector(dmda, &vec); >> > > >> > > Now I copy the local contents of the application array to the petsc >> array using the following: >> > > >> > > Let i, j be the application indices and iPetsc and jPetsc be petsc's >> indices, then: >> > > >> > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, >> > > &iSize, &jSize, &kSize >> > > ); >> > > >> > > >> > > double **arrayPetsc; >> > > DMDAVecGetArray(dmda, vec, &arrayPetsc); >> > > >> > > for (int j=0, jPetsc=jStart; j> jPetsc++) >> > > { >> > > for (int i=0, iPetsc=iStart; i> iPetsc++) >> > > { >> > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; >> > > } >> > > } >> > > >> > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); >> > > >> > > Now if I VecView(vec, viewer) and look at the data that petsc has, it >> looks right when run with 1 proc, but if I use 4 procs it's all messed up >> (see attached plots). >> > > >> > > I should probably be using the AO object but its not clear how. Could >> you help me out? >> > > >> > > It looks like you have the global order of processes reversed, >> meaning you have >> > > >> > > 1 3 >> > > >> > > 0 2 >> > > >> > > and it should be >> > > >> > > 2 3 >> > > >> > > 0 1 >> > > >> > > Thanks, >> > > >> > > Matt >> > > >> > > Thanks, >> > > Mani >> > > -- >> > > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > > -- Norbert Wiener >> > > >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjraskin at mynicejob.com Sat Aug 8 20:38:31 2015 From: jjraskin at mynicejob.com (Jeffery Raskin) Date: Sat, 8 Aug 2015 18:38:31 -0700 Subject: [petsc-users] =?windows-1252?q?Many_Nurse_/_medical_positions_=28?= =?windows-1252?q?Travel=29?= Message-ID: <162612d39c2b682dbab9a507002f99e3@mynicejob.com> ?Hi , Was curious if you or someone you may know was looking for a change.? We have the following position.? Also we have others, in various cities. PICU NURSE - TRAVEL http://www.mynicejob.com/jobdescription.cfm?jobID=14478 ICU (RN) - http://www.mynicejob.com/jobdescription.cfm?jobID=14481 Med/Surg Telemetry (RN) http://www.mynicejob.com/jobdescription.cfm?jobID=14480 Also 400 travel nurse positions. med surg - $75? / ICU $90.? Housing is paid for. 
Contact Jeffery Raskin - jjraskin at mynicejob.com? Title: JobID: Location: Company Info: *NURSING DEPT. MANAGER 14479 VALLEJO, California MyNiceJob *PICU NURSE - TRAVEL 14478 BALDWIN PARK, California MyNiceJob *LABOR & DELIVERY (RN) - PER DIEM 14477 IRVINE, California MyNiceJob *CATH LAB NURSE - PER DIEM 14476 SACRAMENTO, California MyNiceJob *RN-Emergency Department 14463 Uvalde, Travel Position MyNiceJob *Director, Analytics and Insights 14462 indianapolis, Indiana MyNiceJob *Manager of Outpatient Clinics (Nursing) 14460 King City , California MyNiceJob *Pharmacist 14459 King City, California MyNiceJob *Registered Nurse-ICU 14458 King City, California MyNiceJob *Registered Nurse - OB 14457 King City, California MyNiceJob *Physician Assistant 14456 King City, California MyNiceJob *Physical Therapist - FT/PT 14455 Pullman, Washington MyNiceJob *Manager Finance 14454 Baltimore, Maryland MyNiceJob *Director of MSU/ICU - FT 14453 Pullman, Washington MyNiceJob *Utilization Manager - Afterhours Program 14452 Remote , Louisiana MyNiceJob *Director, Service Coordination - RN/LVN 14451 Dallas, Texas MyNiceJob *Registered Nurse/ Licensed Practical Nurse 14450 silver city, New Mexico MyNiceJob *Manager of Surgery 14449 King City, California MyNiceJob *Director of Skilled Nursing Facility 14448 King City, California MyNiceJob *Infection Preventionist 14447 King City, California MyNiceJob *Director Community Services 14446 Palmdale, California MyNiceJob *Physician Practice Director- Uvalde Memorial Hosp 14445 San Antonio, Texas MyNiceJob *DEVELOPMENT MANAGER 14444 Los Angeles, California MyNiceJob *Outpatient Nurse Care Coordinator 14442 King City, California MyNiceJob *Director Quality Improvement 14441 Sunrise, Florida MyNiceJob *Family Nurse Practitioner 14440 King City, California MyNiceJob *Registered Nurse - Surgery 14439 King City, California MyNiceJob To stop receiving emails from our company; please reply to this email with the word remove in the subject line. Thank you. -------------- next part -------------- An HTML attachment was scrubbed... URL: From Fabian.Jakub at physik.uni-muenchen.de Sun Aug 9 14:31:49 2015 From: Fabian.Jakub at physik.uni-muenchen.de (Fabian) Date: Sun, 09 Aug 2015 21:31:49 +0200 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: References: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> Message-ID: <55C7AAA5.3090708@physik.uni-muenchen.de> If the problem is due to the rank-ordering, the following excerpt from the PETSc FAQ section may help: The PETSc DA object decomposes the domain differently than the MPI_Cart_create() command. How can one use them together? The MPI_Cart_create() first divides the mesh along the z direction, then the y, then the x. DMDA divides along the x, then y, then z. Thus, for example, rank 1 of the processes will be in a different part of the mesh for the two schemes. To resolve this you can create a new MPI communicator that you pass to DMDACreate() that renumbers the process ranks so that each physical process shares the same part of the mesh with both the DMDA and the MPI_Cart_create(). The code to determine the new numbering was provided by Rolf Kuiper. 
// the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively // (no parallelization in direction 'dir' means dir_procs = 1) MPI_Comm NewComm; int MPI_Rank, NewRank, x,y,z; // get rank from MPI ordering: MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank); // calculate coordinates of cpus in MPI ordering: x = MPI_rank / (z_procs*y_procs); y = (MPI_rank % (z_procs*y_procs)) / z_procs; z = (MPI_rank % (z_procs*y_procs)) % z_procs; // set new rank according to PETSc ordering: NewRank = z*y_procs*x_procs + y*x_procs + x; // create communicator with new ranks according to PETSc ordering: MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm); // override the default communicator (was MPI_COMM_WORLD as default) PETSC_COMM_WORLD = NewComm; On 08.08.2015 23:58, Matthew Knepley wrote: > On Sat, Aug 8, 2015 at 4:56 PM, Mani Chandra > wrote: > > So basically one needs to correctly map > > iPetsc, jPetsc -> iApplication, jApplication ? > > Is there is any standard way to do this? Can I get petsc to > automatically follow the same parallel topology as the host > application? > > > If you want to use DMDA, there is only one mapping of ranks, namely > lexicographic. However, every structured grid code I have > ever seen uses that mapping, perhaps with a permutation of the > directions {x, y, z}. Thus, the user needs to map the directions > in PETSc in the right order for the application. I am not sure how you > would automate this seeing as it depends on the application. > > Thanks, > > Matt > > Thanks, > Mani > > On Sat, Aug 8, 2015 at 3:12 PM, Barry Smith > wrote: > > > > On Aug 8, 2015, at 3:08 PM, Mani Chandra > wrote: > > > > Tried flipping the indices, I get a seg fault. > > You would have to be careful in exactly what you flip. Note > that the meaning of N1 and N2 etc would also be reversed > between your code and the PETSc DMDA code. > > I would create a tiny DMDA and put entires like 1 2 3 4 ... > into the array so you can track where the values go > > Barry > > > > > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith > > wrote: > > > > > On Aug 8, 2015, at 2:45 PM, Mani Chandra > wrote: > > > > > > Thanks. Any suggestions for a fix? > > > > Just flip the meaning of the x indices and the y indices > in the PETSc parts of the code? > > > > Also run with a very different N1 and N2 (instead of > equal size) to better test the code coupling. > > > > Barry > > > > > > > > > > Reorder the indices in arrayApplication? > > > > > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley > > wrote: > > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra > > wrote: > > > Hi, > > > > > > I'm having trouble interfacing petsc to an application > which I think is related to the ordering of the nodes. Here's > what I'm trying to do: > > > > > > The application uses a structured grid with a global array > having dimensions N1 x N2, which is then decomposed into a > local array with dimensions NX1 x NX2. 
> > > > > > I create a Petsc DMDA using > > > > > > DMDACreate2d(MPI_COMM_WORLD, > > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, > > > DMDA_STENCIL_BOX, > > > N1, N2, > > > N1/NX1, N2/NX2, > > > 1, nghost, PETSC_NULL, PETSC_NULL, > > > &dmda); > > > > > > and then use this to create a vec: > > > > > > DMCreateGlobalVector(dmda, &vec); > > > > > > Now I copy the local contents of the application array to > the petsc array using the following: > > > > > > Let i, j be the application indices and iPetsc and jPetsc > be petsc's indices, then: > > > > > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, > > > &iSize, &jSize, &kSize > > > ); > > > > > > > > > double **arrayPetsc; > > > DMDAVecGetArray(dmda, vec, &arrayPetsc); > > > > > > for (int j=0, jPetsc=jStart; j j++, jPetsc++) > > > { > > > for (int i=0, iPetsc=iStart; i i++, iPetsc++) > > > { > > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; > > > } > > > } > > > > > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); > > > > > > Now if I VecView(vec, viewer) and look at the data that > petsc has, it looks right when run with 1 proc, but if I use 4 > procs it's all messed up (see attached plots). > > > > > > I should probably be using the AO object but its not clear > how. Could you help me out? > > > > > > It looks like you have the global order of processes > reversed, meaning you have > > > > > > 1 3 > > > > > > 0 2 > > > > > > and it should be > > > > > > 2 3 > > > > > > 0 1 > > > > > > Thanks, > > > > > > Matt > > > > > > Thanks, > > > Mani > > > -- > > > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > results to which their experiments lead. > > > -- Norbert Wiener > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mc0710 at gmail.com Sun Aug 9 16:57:15 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Sun, 9 Aug 2015 16:57:15 -0500 Subject: [petsc-users] Mapping between application ordering and Petsc ordering In-Reply-To: <55C7AAA5.3090708@physik.uni-muenchen.de> References: <89239B79-F150-482F-9D62-48B3C0DC9A31@mcs.anl.gov> <55C7AAA5.3090708@physik.uni-muenchen.de> Message-ID: Thank you! This was *exactly* what I was looking it. It fixed the problem. On Sun, Aug 9, 2015 at 2:31 PM, Fabian wrote: > If the problem is due to the rank-ordering, the following excerpt from the > PETSc FAQ section may help: > > > > > The PETSc DA object decomposes the domain differently than the > MPI_Cart_create() command. How can one use them together? > > The MPI_Cart_create() first divides the mesh along the z direction, then > the y, then the x. DMDA divides along the x, then y, then z. Thus, for > example, rank 1 of the processes will be in a different part of the mesh > for the two schemes. To resolve this you can create a new MPI communicator > that you pass to DMDACreate() that renumbers the process ranks so that each > physical process shares the same part of the mesh with both the DMDA and > the MPI_Cart_create(). The code to determine the new numbering was provided > by Rolf Kuiper. 
> > // the numbers of processors per direction are (int) x_procs, y_procs, z_procs respectively > // (no parallelization in direction 'dir' means dir_procs = 1) > > MPI_Comm NewComm; > int MPI_Rank, NewRank, x,y,z; > > // get rank from MPI ordering: > MPI_Comm_rank(MPI_COMM_WORLD, &MPI_Rank); > > // calculate coordinates of cpus in MPI ordering: > x = MPI_rank / (z_procs*y_procs); > y = (MPI_rank % (z_procs*y_procs)) / z_procs; > z = (MPI_rank % (z_procs*y_procs)) % z_procs; > > // set new rank according to PETSc ordering: > NewRank = z*y_procs*x_procs + y*x_procs + x; > > // create communicator with new ranks according to > PETSc ordering: > MPI_Comm_split(PETSC_COMM_WORLD, 1, NewRank, &NewComm); > > // override the default communicator (was > MPI_COMM_WORLD as default) > PETSC_COMM_WORLD = NewComm; > > > > > On 08.08.2015 23:58, Matthew Knepley wrote: > > On Sat, Aug 8, 2015 at 4:56 PM, Mani Chandra wrote: > >> So basically one needs to correctly map >> >> iPetsc, jPetsc -> iApplication, jApplication ? >> >> Is there is any standard way to do this? Can I get petsc to automatically >> follow the same parallel topology as the host application? >> > > If you want to use DMDA, there is only one mapping of ranks, namely > lexicographic. However, every structured grid code I have > ever seen uses that mapping, perhaps with a permutation of the directions > {x, y, z}. Thus, the user needs to map the directions > in PETSc in the right order for the application. I am not sure how you > would automate this seeing as it depends on the application. > > Thanks, > > Matt > > >> Thanks, >> Mani >> >> On Sat, Aug 8, 2015 at 3:12 PM, Barry Smith wrote: >> >>> >>> > On Aug 8, 2015, at 3:08 PM, Mani Chandra wrote: >>> > >>> > Tried flipping the indices, I get a seg fault. >>> >>> You would have to be careful in exactly what you flip. Note that the >>> meaning of N1 and N2 etc would also be reversed between your code and the >>> PETSc DMDA code. >>> >>> I would create a tiny DMDA and put entires like 1 2 3 4 ... into the >>> array so you can track where the values go >>> >>> Barry >>> >>> > >>> > On Sat, Aug 8, 2015 at 3:03 PM, Barry Smith >>> wrote: >>> > >>> > > On Aug 8, 2015, at 2:45 PM, Mani Chandra wrote: >>> > > >>> > > Thanks. Any suggestions for a fix? >>> > >>> > Just flip the meaning of the x indices and the y indices in the >>> PETSc parts of the code? >>> > >>> > Also run with a very different N1 and N2 (instead of equal size) to >>> better test the code coupling. >>> > >>> > Barry >>> > >>> > >>> > > >>> > > Reorder the indices in arrayApplication? >>> > > >>> > > On Sat, Aug 8, 2015 at 2:19 PM, Matthew Knepley >>> wrote: >>> > > On Sat, Aug 8, 2015 at 1:52 PM, Mani Chandra >>> wrote: >>> > > Hi, >>> > > >>> > > I'm having trouble interfacing petsc to an application which I think >>> is related to the ordering of the nodes. Here's what I'm trying to do: >>> > > >>> > > The application uses a structured grid with a global array having >>> dimensions N1 x N2, which is then decomposed into a local array with >>> dimensions NX1 x NX2. 
>>> > > >>> > > I create a Petsc DMDA using >>> > > >>> > > DMDACreate2d(MPI_COMM_WORLD, >>> > > DM_BOUNDARY_PERIODIC, DM_BOUNDARY_PERIODIC, >>> > > DMDA_STENCIL_BOX, >>> > > N1, N2, >>> > > N1/NX1, N2/NX2, >>> > > 1, nghost, PETSC_NULL, PETSC_NULL, >>> > > &dmda); >>> > > >>> > > and then use this to create a vec: >>> > > >>> > > DMCreateGlobalVector(dmda, &vec); >>> > > >>> > > Now I copy the local contents of the application array to the petsc >>> array using the following: >>> > > >>> > > Let i, j be the application indices and iPetsc and jPetsc be petsc's >>> indices, then: >>> > > >>> > > DMDAGetCorners(dmda, &iStart, &jStart, &kStart, >>> > > &iSize, &jSize, &kSize >>> > > ); >>> > > >>> > > >>> > > double **arrayPetsc; >>> > > DMDAVecGetArray(dmda, vec, &arrayPetsc); >>> > > >>> > > for (int j=0, jPetsc=jStart; j>> jPetsc++) >>> > > { >>> > > for (int i=0, iPetsc=iStart; i>> iPetsc++) >>> > > { >>> > > arrayPetsc[jPetsc][iPetsc] = arrayApplication[j][i]; >>> > > } >>> > > } >>> > > >>> > > DMDAVecRestoreArray(dmda, vec, &arrayPetsc); >>> > > >>> > > Now if I VecView(vec, viewer) and look at the data that petsc has, >>> it looks right when run with 1 proc, but if I use 4 procs it's all messed >>> up (see attached plots). >>> > > >>> > > I should probably be using the AO object but its not clear how. >>> Could you help me out? >>> > > >>> > > It looks like you have the global order of processes reversed, >>> meaning you have >>> > > >>> > > 1 3 >>> > > >>> > > 0 2 >>> > > >>> > > and it should be >>> > > >>> > > 2 3 >>> > > >>> > > 0 1 >>> > > >>> > > Thanks, >>> > > >>> > > Matt >>> > > >>> > > Thanks, >>> > > Mani >>> > > -- >>> > > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > > -- Norbert Wiener >>> > > >>> > >>> > >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Mon Aug 10 03:57:52 2015 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Mon, 10 Aug 2015 10:57:52 +0200 Subject: [petsc-users] set the diagonal for zero row Message-ID: <55C86790.9040401@gmail.com> Dear list What is the best way to search for complete zero rows of the matrix and set the diagonal to 1.0? In my thinking, the solution would be: + extract the max absolute values of each row by using MatGetRowMaxAbs + Compare the value with some tolerance and put into the zero row list + Extract the diagonal of the matrix by MatGetDiagonal + modify the vector of diagonal + set the diagonal back by using MatDiagonalSet THis seem to be overly complicated. Is there an all-in-one solution, similar to MatZeroRowsColumns? Best regards Giang From knepley at gmail.com Mon Aug 10 07:02:10 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 10 Aug 2015 07:02:10 -0500 Subject: [petsc-users] set the diagonal for zero row In-Reply-To: <55C86790.9040401@gmail.com> References: <55C86790.9040401@gmail.com> Message-ID: On Mon, Aug 10, 2015 at 3:57 AM, Hoang Giang Bui wrote: > Dear list > > What is the best way to search for complete zero rows of the matrix and > set the diagonal to 1.0? 
In my thinking, the solution would be: > + extract the max absolute values of each row by using MatGetRowMaxAbs > + Compare the value with some tolerance and put into the zero row list > + Extract the diagonal of the matrix by MatGetDiagonal > + modify the vector of diagonal > + set the diagonal back by using MatDiagonalSet > > THis seem to be overly complicated. Is there an all-in-one solution, > similar to MatZeroRowsColumns? > I don't think there is anything better than checking the MaxAbs, and calling MatZeroRows() on the indices with a 0.0 Thanks, Matt > Best regards > Giang > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Aug 10 09:56:59 2015 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 10 Aug 2015 09:56:59 -0500 Subject: [petsc-users] Augmented Lagrangian examples? Message-ID: Hi all, 1) I ran across this paper: http://web.stanford.edu/~egawlik/pdf/GaMuSaWi2012.pdf and was wondering if there are any current TAO examples that do this. also 2) If I integrate this into an FEM (from DMPlex) I will need to assemble an equality jacobian matrix and constraint vector. But the element-wise constraints that I need to compute (e.g., the divergence) needs all degrees of freedom within the element closure including the essential BCs DMPlex removes from the global matrix/vector. So how can I work around this and/or access said removed terms inside a TAO routine? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 10 10:01:07 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 10 Aug 2015 10:01:07 -0500 Subject: [petsc-users] Augmented Lagrangian examples? In-Reply-To: References: Message-ID: On Mon, Aug 10, 2015 at 9:56 AM, Justin Chang wrote: > Hi all, > > 1) I ran across this paper: > > http://web.stanford.edu/~egawlik/pdf/GaMuSaWi2012.pdf > > and was wondering if there are any current TAO examples that do this. > > also > > 2) If I integrate this into an FEM (from DMPlex) I will need to assemble > an equality jacobian matrix and constraint vector. But the element-wise > constraints that I need to compute (e.g., the divergence) needs all degrees > of freedom within the element closure including the essential BCs DMPlex > removes from the global matrix/vector. So how can I work around this and/or > access said removed terms inside a TAO routine? > Those are not really dofs, they are boundary values, so they are not in the global system. The local vectors have the boundary values, so you can calculate the correct constraint eqns to put in the global system. Matt > Thanks, > Justin > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Aug 10 11:05:39 2015 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 10 Aug 2015 11:05:39 -0500 Subject: [petsc-users] Augmented Lagrangian examples? In-Reply-To: References: Message-ID: Matt, So inside these TAO routines, if I wanted to include the boundary values, would I follow the approaches in functions like DMPlexComputeResidual/Jacobian_Internal? 
I assume I need something like: DMGetLocalVector(dm,xlocal); DMPlexInsertBoundaryValues(xlocal,...); ** use xlocal to compute equality constraints/jacobian ** DMRestoreLocalVector(dm,xlocal); The Jacobian and equality constraints that I want to assemble are not the same as the DM matrix use for the entire problem. I am guessing I will need to use a different DS for the DM because, for example the stokes problem with TH elements, I want to assemble an equality jacobian of size (no. of cells) by (no. of velocity dofs), and an equality constraints vector of size (no. of cells). How would I go about doing a problem like this? Thanks, Justin On Mon, Aug 10, 2015 at 10:01 AM, Matthew Knepley wrote: > On Mon, Aug 10, 2015 at 9:56 AM, Justin Chang wrote: > >> Hi all, >> >> 1) I ran across this paper: >> >> http://web.stanford.edu/~egawlik/pdf/GaMuSaWi2012.pdf >> >> and was wondering if there are any current TAO examples that do this. >> >> also >> >> 2) If I integrate this into an FEM (from DMPlex) I will need to assemble >> an equality jacobian matrix and constraint vector. But the element-wise >> constraints that I need to compute (e.g., the divergence) needs all degrees >> of freedom within the element closure including the essential BCs DMPlex >> removes from the global matrix/vector. So how can I work around this and/or >> access said removed terms inside a TAO routine? >> > > Those are not really dofs, they are boundary values, so they are not in > the global system. The local vectors have > the boundary values, so you can calculate the correct constraint eqns to > put in the global system. > > Matt > > >> Thanks, >> Justin >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 10 14:26:15 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 10 Aug 2015 14:26:15 -0500 Subject: [petsc-users] Augmented Lagrangian examples? In-Reply-To: References: Message-ID: On Mon, Aug 10, 2015 at 11:05 AM, Justin Chang wrote: > Matt, > > So inside these TAO routines, if I wanted to include the boundary values, > would I follow the approaches in functions like > DMPlexComputeResidual/Jacobian_Internal? I assume I need something like: > > DMGetLocalVector(dm,xlocal); > DMPlexInsertBoundaryValues(xlocal,...); > ** use xlocal to compute equality constraints/jacobian ** > DMRestoreLocalVector(dm,xlocal); > Yes. > The Jacobian and equality constraints that I want to assemble are not the > same as the DM matrix use for the entire problem. I am guessing I will need > to use a different DS for the DM because, for example the stokes problem > with TH elements, I want to assemble an equality jacobian of size (no. of > cells) by (no. of velocity dofs), and an equality constraints vector of > size (no. of cells). How would I go about doing a problem like this? > Okay, now we get into some choices I made in Plex. The original version I wrote could assemble rectangular matrices. This was a huge complication that no one took advantage of, so I got rid of it. Now I just assemble the entire problem, and if you want pieces of it, I pull them out using MatGetSubmatrix(). I still believe this is the cleanest thing to do. Maybe you could schematically tell me what you want to do. 
Thanks, Matt > Thanks, > Justin > > > On Mon, Aug 10, 2015 at 10:01 AM, Matthew Knepley > wrote: > >> On Mon, Aug 10, 2015 at 9:56 AM, Justin Chang >> wrote: >> >>> Hi all, >>> >>> 1) I ran across this paper: >>> >>> http://web.stanford.edu/~egawlik/pdf/GaMuSaWi2012.pdf >>> >>> and was wondering if there are any current TAO examples that do this. >>> >>> also >>> >>> 2) If I integrate this into an FEM (from DMPlex) I will need to assemble >>> an equality jacobian matrix and constraint vector. But the element-wise >>> constraints that I need to compute (e.g., the divergence) needs all degrees >>> of freedom within the element closure including the essential BCs DMPlex >>> removes from the global matrix/vector. So how can I work around this and/or >>> access said removed terms inside a TAO routine? >>> >> >> Those are not really dofs, they are boundary values, so they are not in >> the global system. The local vectors have >> the boundary values, so you can calculate the correct constraint eqns to >> put in the global system. >> >> Matt >> >> >>> Thanks, >>> Justin >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at email.arizona.edu Mon Aug 10 15:50:50 2015 From: aph at email.arizona.edu (Anthony Haas) Date: Mon, 10 Aug 2015 13:50:50 -0700 Subject: [petsc-users] SIGSEGV in Superlu_dist Message-ID: <55C90EAA.5060702@email.arizona.edu> Hi Sherry, I recently submitted a matrix for which I noticed that Superlu_dist was hanging when running on 4 processors with parallel symbolic factorization. I have been using the latest version of Superlu_dist and the code is not hanging anymore. However, I noticed that when running the same matrix (I have attached the matrix), the code crashes with the following SIGSEGV when running on 10 procs (with or without parallel symbolic factorization). It is probably overkill to run such a 'small' matrix on 10 procs but I thought that it might still be useful to report the problem?? See below for the error obtained when running with gdb and also a code snippet to reproduce the error. Thanks, Anthony 1) ERROR in GDB Program received signal SIGSEGV, Segmentation fault. 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, A=0x14a6a70, info=0x19099f8) at /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 368 colA_start = rstart + ajj[0]; /* the smallest global col index of A */ (gdb) 2) PORTION OF CODE TO REPRODUCE ERROR Subroutine HowBigLUCanBe(rank) IMPLICIT NONE integer(i4b),intent(in) :: rank integer(i4b) :: i,ct real(dp) :: begin,endd complex(dpc) :: sigma PetscErrorCode ierr if (rank==0) call cpu_time(begin) if (rank==0) then write(*,*) write(*,*)'Testing How Big LU Can Be...' write(*,*)'============================' write(*,*) endif !sigma = (1.0d0,0.0d0) !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! 
on exit A = A-sigma*B !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) !.....Write Matrix to ASCII and Binary Format !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) !call MatView(DXX,viewer,ierr) !call PetscViewerDestroy(viewer,ierr) !call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) !call MatView(A,viewer,ierr) !call PetscViewerDestroy(viewer,ierr) !...Load a Matrix in Binary Format call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) call MatSetType(DLOAD,MATAIJ,ierr) call MatLoad(DLOAD,viewer,ierr) call PetscViewerDestroy(viewer,ierr) !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) !.....Create Linear Solver Context call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) !.....Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix. !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here A = A-sigma*B !.....Set Relative and Absolute Tolerances and Uses Default for Divergence Tol tol = 1.e-10 call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) !.....Set the Direct (LU) Solver call KSPSetType(ksp,KSPPREONLY,ierr) call KSPGetPC(ksp,pc,ierr) call PCSetType(pc,PCLU,ierr) call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! MATSOLVERSUPERLU_DIST MATSOLVERMUMPS !.....Create Right-Hand-Side Vector !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) allocate(xwork1(IendA-IstartA)) allocate(loc(IendA-IstartA)) ct=0 do i=IstartA,IendA-1 ct=ct+1 loc(ct)=i xwork1(ct)=(1.0d0,0.0d0) enddo call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) call VecZeroEntries(sol,ierr) deallocate(xwork1,loc) !.....Assemble Vectors call VecAssemblyBegin(frhs,ierr) call VecAssemblyEnd(frhs,ierr) !.....Solve the Linear System call KSPSolve(ksp,frhs,sol,ierr) !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) if (rank==0) then call cpu_time(endd) write(*,*) print '("Total time for HowBigLUCanBe = ",f21.3," seconds.")',endd-begin endif call SlepcFinalize(ierr) STOP end Subroutine HowBigLUCanBe -------------- next part -------------- A non-text attachment was scrubbed... Name: Amat_binary.m Type: text/x-objcsrc Size: 7906356 bytes Desc: not available URL: -------------- next part -------------- -matload_block_size 1 From bsmith at mcs.anl.gov Mon Aug 10 16:27:00 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 10 Aug 2015 16:27:00 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <55C90EAA.5060702@email.arizona.edu> References: <55C90EAA.5060702@email.arizona.edu> Message-ID: <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> Anthony, This crash is in PETSc code before it calls the SuperLU_DIST numeric factorization; likely we have a mistake such as assuming a process has at least one row of the matrix and need to fix it. 
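A tiny illustration of the failure mode being pointed at here (this is not the actual superlu_dist.c source; the names are hypothetical): on a rank that owns zero rows of A, the local aij arrays are empty, so an expression like rstart + ajj[0] reads memory that was never set. Guarding on the local row count is the kind of fix implied:

#include <petscsys.h>

/* Sketch only: return a first-column value without touching ajj when the
   local block is empty.  nlocalrows = rend - rstart from MatGetOwnershipRange(). */
static PetscInt SafeFirstColumn(PetscInt rstart, PetscInt nlocalrows, const PetscInt *ajj)
{
  if (nlocalrows > 0) return rstart + ajj[0]; /* at least one local row: ajj[0] exists */
  return rstart;                              /* empty local block: nothing to dereference */
}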
Barry > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > A=0x14a6a70, info=0x19099f8) > at /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > 368 colA_start = rstart + ajj[0]; /* the smallest global col index of A */ > On Aug 10, 2015, at 3:50 PM, Anthony Haas wrote: > > Hi Sherry, > > I recently submitted a matrix for which I noticed that Superlu_dist was hanging when running on 4 processors with parallel symbolic factorization. I have been using the latest version of Superlu_dist and the code is not hanging anymore. However, I noticed that when running the same matrix (I have attached the matrix), the code crashes with the following SIGSEGV when running on 10 procs (with or without parallel symbolic factorization). It is probably overkill to run such a 'small' matrix on 10 procs but I thought that it might still be useful to report the problem?? See below for the error obtained when running with gdb and also a code snippet to reproduce the error. > > Thanks, > > > Anthony > > > > 1) ERROR in GDB > > Program received signal SIGSEGV, Segmentation fault. > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > A=0x14a6a70, info=0x19099f8) > at /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > 368 colA_start = rstart + ajj[0]; /* the smallest global col index of A */ > (gdb) > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > Subroutine HowBigLUCanBe(rank) > > IMPLICIT NONE > > integer(i4b),intent(in) :: rank > integer(i4b) :: i,ct > real(dp) :: begin,endd > complex(dpc) :: sigma > > PetscErrorCode ierr > > > if (rank==0) call cpu_time(begin) > > if (rank==0) then > write(*,*) > write(*,*)'Testing How Big LU Can Be...' > write(*,*)'============================' > write(*,*) > endif > > !sigma = (1.0d0,0.0d0) > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit A = A-sigma*B > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > !.....Write Matrix to ASCII and Binary Format > !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > !call MatView(DXX,viewer,ierr) > !call PetscViewerDestroy(viewer,ierr) > > !call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > !call MatView(A,viewer,ierr) > !call PetscViewerDestroy(viewer,ierr) > > !...Load a Matrix in Binary Format > call PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > call MatSetType(DLOAD,MATAIJ,ierr) > call MatLoad(DLOAD,viewer,ierr) > call PetscViewerDestroy(viewer,ierr) > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > !.....Create Linear Solver Context > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > !.....Set operators. Here the matrix that defines the linear system also serves as the preconditioning matrix. > !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha commented and replaced by next line > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here A = A-sigma*B > > !.....Set Relative and Absolute Tolerances and Uses Default for Divergence Tol > tol = 1.e-10 > call KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > !.....Set the Direct (LU) Solver > call KSPSetType(ksp,KSPPREONLY,ierr) > call KSPGetPC(ksp,pc,ierr) > call PCSetType(pc,PCLU,ierr) > call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > !.....Create Right-Hand-Side Vector > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > allocate(xwork1(IendA-IstartA)) > allocate(loc(IendA-IstartA)) > > ct=0 > do i=IstartA,IendA-1 > ct=ct+1 > loc(ct)=i > xwork1(ct)=(1.0d0,0.0d0) > enddo > > call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > call VecZeroEntries(sol,ierr) > > deallocate(xwork1,loc) > > !.....Assemble Vectors > call VecAssemblyBegin(frhs,ierr) > call VecAssemblyEnd(frhs,ierr) > > !.....Solve the Linear System > call KSPSolve(ksp,frhs,sol,ierr) > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > if (rank==0) then > call cpu_time(endd) > write(*,*) > print '("Total time for HowBigLUCanBe = ",f21.3," seconds.")',endd-begin > endif > > call SlepcFinalize(ierr) > > STOP > > > end Subroutine HowBigLUCanBe > > From hzhang at mcs.anl.gov Mon Aug 10 16:58:19 2015 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 10 Aug 2015 16:58:19 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> Message-ID: I'll fix this in the release if no one has done it yet. Hong On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith wrote: > > Anthony, > > This crash is in PETSc code before it calls the SuperLU_DIST numeric > factorization; likely we have a mistake such as assuming a process has at > least one row of the matrix and need to fix it. > > Barry > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest global col > index of A */ > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas wrote: > > > > Hi Sherry, > > > > I recently submitted a matrix for which I noticed that Superlu_dist was > hanging when running on 4 processors with parallel symbolic factorization. > I have been using the latest version of Superlu_dist and the code is not > hanging anymore. However, I noticed that when running the same matrix (I > have attached the matrix), the code crashes with the following SIGSEGV when > running on 10 procs (with or without parallel symbolic factorization). It > is probably overkill to run such a 'small' matrix on 10 procs but I thought > that it might still be useful to report the problem?? See below for the > error obtained when running with gdb and also a code snippet to reproduce > the error. > > > > Thanks, > > > > > > Anthony > > > > > > > > 1) ERROR in GDB > > > > Program received signal SIGSEGV, Segmentation fault. 
> > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest global col > index of A */ > > (gdb) > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > Subroutine HowBigLUCanBe(rank) > > > > IMPLICIT NONE > > > > integer(i4b),intent(in) :: rank > > integer(i4b) :: i,ct > > real(dp) :: begin,endd > > complex(dpc) :: sigma > > > > PetscErrorCode ierr > > > > > > if (rank==0) call cpu_time(begin) > > > > if (rank==0) then > > write(*,*) > > write(*,*)'Testing How Big LU Can Be...' > > write(*,*)'============================' > > write(*,*) > > endif > > > > !sigma = (1.0d0,0.0d0) > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit > A = A-sigma*B > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > !.....Write Matrix to ASCII and Binary Format > > !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > !call MatView(DXX,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > !call MatView(A,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !...Load a Matrix in Binary Format > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > call MatSetType(DLOAD,MATAIJ,ierr) > > call MatLoad(DLOAD,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > !.....Create Linear Solver Context > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > !.....Set operators. Here the matrix that defines the linear system also > serves as the preconditioning matrix. > > !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > commented and replaced by next line > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here A = > A-sigma*B > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > Divergence Tol > > tol = 1.e-10 > > call > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > !.....Set the Direct (LU) Solver > > call KSPSetType(ksp,KSPPREONLY,ierr) > > call KSPGetPC(ksp,pc,ierr) > > call PCSetType(pc,PCLU,ierr) > > call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
> MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > !.....Create Right-Hand-Side Vector > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > allocate(xwork1(IendA-IstartA)) > > allocate(loc(IendA-IstartA)) > > > > ct=0 > > do i=IstartA,IendA-1 > > ct=ct+1 > > loc(ct)=i > > xwork1(ct)=(1.0d0,0.0d0) > > enddo > > > > call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > call VecZeroEntries(sol,ierr) > > > > deallocate(xwork1,loc) > > > > !.....Assemble Vectors > > call VecAssemblyBegin(frhs,ierr) > > call VecAssemblyEnd(frhs,ierr) > > > > !.....Solve the Linear System > > call KSPSolve(ksp,frhs,sol,ierr) > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > if (rank==0) then > > call cpu_time(endd) > > write(*,*) > > print '("Total time for HowBigLUCanBe = ",f21.3," > seconds.")',endd-begin > > endif > > > > call SlepcFinalize(ierr) > > > > STOP > > > > > > end Subroutine HowBigLUCanBe > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Tue Aug 11 09:31:59 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Tue, 11 Aug 2015 14:31:59 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> Message-ID: Yes! Doing: $PETSC_DIR/$PETSC_ARCH/bin/mpiexec instead of mpiexec makes the program run as expected. Thank you all for your patience and encouragement. Sherry: I have noticed that you have been involved in some publications related to my current work, i.e. wave propagation in elastic solids. What computation time would you expect using SuperLU to solve one linear system with say 800000 degrees of freedom and 4-8 processes (on a single node) with a finite element discretization? Mahir -----Original Message----- From: Satish Balay [mailto:balay at mcs.anl.gov] Sent: den 7 augusti 2015 18:09 To: ?lker-Kaustell, Mahir Cc: Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem This usually happens if you use the wrong MPIEXEC i.e use the mpiexec from the MPI you built PETSc with. Satish On Fri, 7 Aug 2015, Mahir.Ulker-Kaustell at tyrens.se wrote: > Hong, > > Running example 2 with the command line given below gives me two uniprocessor runs!? 
> > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 6 augusti 2015 16:36 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > > I have been using PETSC_COMM_WORLD. > > What do you get by running a petsc example, e.g., > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > > KSP Object: 2 MPI processes > type: gmres > ... 
> > Hong > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 5 augusti 2015 17:11 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > As you noticed, you ran the code in serial mode, not parallel. > Check your code on input communicator, e.g., what input communicator do you use in > KSPCreate(comm,&ksp)? > > I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' > in serial mode, this option is ignored with a warning. > > Hong > > Hong, > > If I set parsymbfact: > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec detected that one or more processes exited with non-zero status, thus causing > the job to be terminated. The first process to do so was: > > Process name: [[63679,1],0] > Exit code: 255 > -------------------------------------------------------------------------- > > Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. > > If I do not set it, I get a serial run even if I specify ?n 2: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > ? > KSP Object: 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > total: nonzeros=34223, allocated nonzeros=34223 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 668 nodes, limit used is 5 > > I am running PETSc via Cygwin on a windows machine. > When I installed PETSc the tests with different numbers of processes ran well. > > Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 3 augusti 2015 19:06 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. 
> > If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. > > Please run it with '-ksp_view' and see what > 'SuperLU_DIST run parameters:' are being used, e.g. > petsc/src/ksp/ksp/examples/tutorials (maint) > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > ... > SuperLU_DIST run parameters: > Process grid nprow 2 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 1 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 2 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > > I do not understand why your code uses matrix input mode = global. > > Hong > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 3 augusti 2015 16:46 > To: Xiaoye S. Li > Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry found the culprit. I can reproduce it: > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > ... > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? > > I'll add an error flag for these use cases. > > Hong > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li > wrote: > I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). > > That's why you get the following error: > Invalid ISPEC at line 484 in file get_perm_c.c > > You need to use distributed matrix input interface pzgssvx() (without ABglobal) > > Sherry > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Hong and Sherry, > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 30 juli 2015 02:58 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye Li; PETSc users list > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry fixed several bugs in superlu_dist-v4.1. > The current petsc-release interfaces with superlu_dist-v4.0. > We do not know whether the reported issue (attached below) has been resolved or not. 
If not, can you test it with the latest superlu_dist-v4.1? > > Here is how to do it: > 1. download superlu_dist v4.1 > 2. remove existing PETSC_ARCH directory, then configure petsc with > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > 3. build petsc > > Let us know if the issue remains. > > Hong > > > ---------- Forwarded message ---------- > From: Xiaoye S. Li > > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang > > Hong, > I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > This has nothing to do with my bug fix. > ? Shall we ask him to try the new version, or try to get him matrix? > Sherry > ? > > ---------- Forwarded message ---------- > From: Mahir.Ulker-Kaustell at tyrens.se > > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong >, "Xiaoye S. Li" > > Cc: petsc-users > > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. > Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > col block 3006 ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > /Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 22 juli 2015 21:34 > To: Xiaoye S. 
Li > Cc: ?lker-Kaustell, Mahir; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > In Petsc/superlu_dist interface, we set default > > options.ParSymbFact = NO; > > When user raises the flag "-mat_superlu_dist_parsymbfact", > we set > > options.ParSymbFact = YES; > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ > > We do not change anything else. > > Hong > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li > wrote: > I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. > > The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. > > I don't understand why you get the following error when you use > ?-mat_superlu_dist_parsymbfact?. > > Invalid ISPEC at line 484 in file get_perm_c.c > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only > ?-mat_superlu_dist_parsymbfact? > ? ? (the default is to use sequential symbolic factorization.) > > > Sherry > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > Thank you for your reply. > > As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? > > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > ______________________________________________ > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. 
> > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block 
of size 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 
0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > 
==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) 
> ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist 
(memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > 
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > 
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) 
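For reference, a minimal sketch (not taken from this thread) of the PETSc call sequence that routes a direct solve through SuperLU_DIST so that the -mat_superlu_dist_* runtime options discussed here are picked up. The 1-D Laplacian below is only a stand-in for the application matrix, and the calls follow the PETSc 3.5/3.6-era C API referenced elsewhere in the thread:

/* Hedged sketch: selects SuperLU_DIST as the LU back end; the matrix is a
 * placeholder so the program is self-contained. */
#include <petscksp.h>

int main(int argc,char **argv)
{
  Mat            A;
  Vec            b,x;
  KSP            ksp;
  PC             pc;
  PetscInt       i,n = 100,Istart,Iend;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);CHKERRQ(ierr);

  /* Assemble a simple 1-D Laplacian just to have something to factor */
  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart; i<Iend; i++) {
    if (i>0)   {ierr = MatSetValue(A,i,i-1,-1.0,INSERT_VALUES);CHKERRQ(ierr);}
    if (i<n-1) {ierr = MatSetValue(A,i,i+1,-1.0,INSERT_VALUES);CHKERRQ(ierr);}
    ierr = MatSetValue(A,i,i,2.0,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatCreateVecs(A,&x,&b);CHKERRQ(ierr);
  ierr = VecSet(b,1.0);CHKERRQ(ierr);

  /* Direct solve through SuperLU_DIST */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr); /* honors -mat_superlu_dist_* options */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return 0;
}

Running it as, for example, $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 4 ./sketch -mat_superlu_dist_parsymbfact -ksp_view should report the SuperLU_DIST run parameters that were actually used.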
> > > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > From hzhang at mcs.anl.gov Tue Aug 11 11:58:24 2015 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 11 Aug 2015 11:58:24 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> Message-ID: Anthony, I pushed a fix https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf Once it passes our nightly tests, I'll merge it to petsc-maint, then petsc-dev. Thanks for reporting it! Hong On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith wrote: > > Anthony, > > This crash is in PETSc code before it calls the SuperLU_DIST numeric > factorization; likely we have a mistake such as assuming a process has at > least one row of the matrix and need to fix it. 
> > Barry > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest global col > index of A */ > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas wrote: > > > > Hi Sherry, > > > > I recently submitted a matrix for which I noticed that Superlu_dist was > hanging when running on 4 processors with parallel symbolic factorization. > I have been using the latest version of Superlu_dist and the code is not > hanging anymore. However, I noticed that when running the same matrix (I > have attached the matrix), the code crashes with the following SIGSEGV when > running on 10 procs (with or without parallel symbolic factorization). It > is probably overkill to run such a 'small' matrix on 10 procs but I thought > that it might still be useful to report the problem?? See below for the > error obtained when running with gdb and also a code snippet to reproduce > the error. > > > > Thanks, > > > > > > Anthony > > > > > > > > 1) ERROR in GDB > > > > Program received signal SIGSEGV, Segmentation fault. > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest global col > index of A */ > > (gdb) > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > Subroutine HowBigLUCanBe(rank) > > > > IMPLICIT NONE > > > > integer(i4b),intent(in) :: rank > > integer(i4b) :: i,ct > > real(dp) :: begin,endd > > complex(dpc) :: sigma > > > > PetscErrorCode ierr > > > > > > if (rank==0) call cpu_time(begin) > > > > if (rank==0) then > > write(*,*) > > write(*,*)'Testing How Big LU Can Be...' > > write(*,*)'============================' > > write(*,*) > > endif > > > > !sigma = (1.0d0,0.0d0) > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! on exit > A = A-sigma*B > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > !.....Write Matrix to ASCII and Binary Format > > !call PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > !call MatView(DXX,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > !call MatView(A,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !...Load a Matrix in Binary Format > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > call MatSetType(DLOAD,MATAIJ,ierr) > > call MatLoad(DLOAD,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > !.....Create Linear Solver Context > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > !.....Set operators. Here the matrix that defines the linear system also > serves as the preconditioning matrix. > > !call KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > commented and replaced by next line > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = A-sigma*B > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! 
remember: here A = > A-sigma*B > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > Divergence Tol > > tol = 1.e-10 > > call > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > !.....Set the Direct (LU) Solver > > call KSPSetType(ksp,KSPPREONLY,ierr) > > call KSPGetPC(ksp,pc,ierr) > > call PCSetType(pc,PCLU,ierr) > > call PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! > MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > !.....Create Right-Hand-Side Vector > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > allocate(xwork1(IendA-IstartA)) > > allocate(loc(IendA-IstartA)) > > > > ct=0 > > do i=IstartA,IendA-1 > > ct=ct+1 > > loc(ct)=i > > xwork1(ct)=(1.0d0,0.0d0) > > enddo > > > > call VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > call VecZeroEntries(sol,ierr) > > > > deallocate(xwork1,loc) > > > > !.....Assemble Vectors > > call VecAssemblyBegin(frhs,ierr) > > call VecAssemblyEnd(frhs,ierr) > > > > !.....Solve the Linear System > > call KSPSolve(ksp,frhs,sol,ierr) > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > if (rank==0) then > > call cpu_time(endd) > > write(*,*) > > print '("Total time for HowBigLUCanBe = ",f21.3," > seconds.")',endd-begin > > endif > > > > call SlepcFinalize(ierr) > > > > STOP > > > > > > end Subroutine HowBigLUCanBe > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Tue Aug 11 12:36:22 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 13:36:22 -0400 Subject: [petsc-users] checking jacobian Message-ID: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> I?m a bit confused by the following options: Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. What flags do I pass it to get some output to diagnose my Jacobian error? -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 12:39:47 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 12:39:47 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 12:36 PM, Gideon Simpson wrote: > I?m a bit confused by the following options: > > Run with -snes_check_jacobian_view [viewer][:filename][:format] to show > difference of hand-coded and finite difference Jacobian. > > What flags do I pass it to get some output to diagnose my Jacobian error? > I would start with -snes_check_jacobian_view ascii:bug.txt Matt > -gideon > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Tue Aug 11 12:40:40 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 11 Aug 2015 11:40:40 -0600 Subject: [petsc-users] checking jacobian In-Reply-To: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> Message-ID: <87zj1xsoyv.fsf@jedbrown.org> Gideon Simpson writes: > I?m a bit confused by the following options: > > Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. > > What flags do I pass it to get some output to diagnose my Jacobian error? Nothing to display ASCII to the screen. You might use "binary:thematrix" if you want to read it in with Python or MATLAB, for example. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From gideon.simpson at gmail.com Tue Aug 11 12:49:12 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 13:49:12 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: <87zj1xsoyv.fsf@jedbrown.org> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> Message-ID: <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. -gideon > On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: > > Gideon Simpson writes: > >> I?m a bit confused by the following options: >> >> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >> >> What flags do I pass it to get some output to diagnose my Jacobian error? > > Nothing to display ASCII to the screen. You might use > "binary:thematrix" if you want to read it in with Python or MATLAB, for > example. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 12:50:19 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 12:50:19 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson wrote: > Maybe it?s a quirk of the macports installation of petsc, but nothing > seems to be getting generated. > Run with -options_left. Is it reading the option? Matt > -gideon > > On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: > > Gideon Simpson writes: > > I?m a bit confused by the following options: > > Run with -snes_check_jacobian_view [viewer][:filename][:format] to show > difference of hand-coded and finite difference Jacobian. > > What flags do I pass it to get some output to diagnose my Jacobian error? > > > Nothing to display ASCII to the screen. You might use > "binary:thematrix" if you want to read it in with Python or MATLAB, for > example. > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gideon.simpson at gmail.com Tue Aug 11 12:51:19 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 13:51:19 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> Message-ID: <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> #End of PETSc Option Table entries There is one unused database option. It is: Option left: name:-snes_check_jacobian_view (no value) -gideon > On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson > wrote: > Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. > > Run with -options_left. Is it reading the option? > > Matt > > -gideon > >> On Aug 11, 2015, at 1:40 PM, Jed Brown > wrote: >> >> Gideon Simpson > writes: >> >>> I?m a bit confused by the following options: >>> >>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>> >>> What flags do I pass it to get some output to diagnose my Jacobian error? >> >> Nothing to display ASCII to the screen. You might use >> "binary:thematrix" if you want to read it in with Python or MATLAB, for >> example. > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 13:03:05 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 13:03:05 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson wrote: > #End of PETSc Option Table entries > There is one unused database option. It is: > Option left: name:-snes_check_jacobian_view (no value) > This is the option for the newest release. What are you using? Matt > -gideon > > On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson > wrote: > >> Maybe it?s a quirk of the macports installation of petsc, but nothing >> seems to be getting generated. >> > > Run with -options_left. Is it reading the option? > > Matt > > >> -gideon >> >> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >> >> Gideon Simpson writes: >> >> I?m a bit confused by the following options: >> >> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show >> difference of hand-coded and finite difference Jacobian. >> >> What flags do I pass it to get some output to diagnose my Jacobian error? >> >> >> Nothing to display ASCII to the screen. You might use >> "binary:thematrix" if you want to read it in with Python or MATLAB, for >> example. >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Tue Aug 11 13:04:19 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 14:04:19 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> Message-ID: <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Macports installation of 3.5.3. -gideon > On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson > wrote: > #End of PETSc Option Table entries > There is one unused database option. It is: > Option left: name:-snes_check_jacobian_view (no value) > > This is the option for the newest release. What are you using? > > Matt > > -gideon > >> On Aug 11, 2015, at 1:50 PM, Matthew Knepley > wrote: >> >> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson > wrote: >> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. >> >> Run with -options_left. Is it reading the option? >> >> Matt >> >> -gideon >> >>> On Aug 11, 2015, at 1:40 PM, Jed Brown > wrote: >>> >>> Gideon Simpson > writes: >>> >>>> I?m a bit confused by the following options: >>>> >>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>>> >>>> What flags do I pass it to get some output to diagnose my Jacobian error? >>> >>> Nothing to display ASCII to the screen. You might use >>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>> example. >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 13:07:00 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 13:07:00 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson wrote: > Macports installation of 3.5.3. > Use -help to find the option name. Maybe its -snes_test. Thanks, Matt > -gideon > > On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson > wrote: > >> #End of PETSc Option Table entries >> There is one unused database option. It is: >> Option left: name:-snes_check_jacobian_view (no value) >> > > This is the option for the newest release. What are you using? > > Matt > > >> -gideon >> >> On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: >> >> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson < >> gideon.simpson at gmail.com> wrote: >> >>> Maybe it?s a quirk of the macports installation of petsc, but nothing >>> seems to be getting generated. >>> >> >> Run with -options_left. 
Is it reading the option? >> >> Matt >> >> >>> -gideon >>> >>> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >>> >>> Gideon Simpson writes: >>> >>> I?m a bit confused by the following options: >>> >>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show >>> difference of hand-coded and finite difference Jacobian. >>> >>> What flags do I pass it to get some output to diagnose my Jacobian error? >>> >>> >>> Nothing to display ASCII to the screen. You might use >>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>> example. >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at email.arizona.edu Tue Aug 11 13:08:22 2015 From: aph at email.arizona.edu (Anthony Haas) Date: Tue, 11 Aug 2015 11:08:22 -0700 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> Message-ID: <55CA3A16.90206@email.arizona.edu> Hi Hong, Sorry for my late reply and thanks for the fix. Does that mean that I will be able to run that matrix on 10 procs in the future (petsc 3.6.2?)? Thanks Anthony On 08/11/2015 09:58 AM, Hong wrote: > Anthony, > I pushed a fix > https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > > Once it passes our nightly tests, I'll merge it to petsc-maint, then > petsc-dev. > Thanks for reporting it! > > Hong > > On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith > wrote: > > > Anthony, > > This crash is in PETSc code before it calls the SuperLU_DIST > numeric factorization; likely we have a mistake such as assuming a > process has at least one row of the matrix and need to fix it. > > Barry > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest > global col index of A */ > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas > wrote: > > > > Hi Sherry, > > > > I recently submitted a matrix for which I noticed that > Superlu_dist was hanging when running on 4 processors with > parallel symbolic factorization. I have been using the latest > version of Superlu_dist and the code is not hanging anymore. > However, I noticed that when running the same matrix (I have > attached the matrix), the code crashes with the following SIGSEGV > when running on 10 procs (with or without parallel symbolic > factorization). It is probably overkill to run such a 'small' > matrix on 10 procs but I thought that it might still be useful to > report the problem?? See below for the error obtained when running > with gdb and also a code snippet to reproduce the error. > > > > Thanks, > > > > > > Anthony > > > > > > > > 1) ERROR in GDB > > > > Program received signal SIGSEGV, Segmentation fault. 
> > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > A=0x14a6a70, info=0x19099f8) > > at > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > 368 colA_start = rstart + ajj[0]; /* the smallest > global col index of A */ > > (gdb) > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > Subroutine HowBigLUCanBe(rank) > > > > IMPLICIT NONE > > > > integer(i4b),intent(in) :: rank > > integer(i4b) :: i,ct > > real(dp) :: begin,endd > > complex(dpc) :: sigma > > > > PetscErrorCode ierr > > > > > > if (rank==0) call cpu_time(begin) > > > > if (rank==0) then > > write(*,*) > > write(*,*)'Testing How Big LU Can Be...' > > write(*,*)'============================' > > write(*,*) > > endif > > > > !sigma = (1.0d0,0.0d0) > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! > on exit A = A-sigma*B > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > !.....Write Matrix to ASCII and Binary Format > > !call > PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > !call MatView(DXX,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > !call MatView(A,viewer,ierr) > > !call PetscViewerDestroy(viewer,ierr) > > > > !...Load a Matrix in Binary Format > > call > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > call MatSetType(DLOAD,MATAIJ,ierr) > > call MatLoad(DLOAD,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > !.....Create Linear Solver Context > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > !.....Set operators. Here the matrix that defines the linear > system also serves as the preconditioning matrix. > > !call > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > commented and replaced by next line > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = > A-sigma*B > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here > A = A-sigma*B > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > Divergence Tol > > tol = 1.e-10 > > call > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > !.....Set the Direct (LU) Solver > > call KSPSetType(ksp,KSPPREONLY,ierr) > > call KSPGetPC(ksp,pc,ierr) > > call PCSetType(pc,PCLU,ierr) > > call > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
> MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > !.....Create Right-Hand-Side Vector > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > call > MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > allocate(xwork1(IendA-IstartA)) > > allocate(loc(IendA-IstartA)) > > > > ct=0 > > do i=IstartA,IendA-1 > > ct=ct+1 > > loc(ct)=i > > xwork1(ct)=(1.0d0,0.0d0) > > enddo > > > > call > VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > call VecZeroEntries(sol,ierr) > > > > deallocate(xwork1,loc) > > > > !.....Assemble Vectors > > call VecAssemblyBegin(frhs,ierr) > > call VecAssemblyEnd(frhs,ierr) > > > > !.....Solve the Linear System > > call KSPSolve(ksp,frhs,sol,ierr) > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > if (rank==0) then > > call cpu_time(endd) > > write(*,*) > > print '("Total time for HowBigLUCanBe = ",f21.3," > seconds.")',endd-begin > > endif > > > > call SlepcFinalize(ierr) > > > > STOP > > > > > > end Subroutine HowBigLUCanBe > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Tue Aug 11 13:09:12 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 14:09:12 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Message-ID: I don?t see it listed in -help, but I do get ./blowup -xmax 50 -nx 1000 -snes_check_jacobian Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. 63386.1 = ||J - Jfd||//J|| 63386.1 = ||J - Jfd|| -gideon > On Aug 11, 2015, at 2:07 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson > wrote: > Macports installation of 3.5.3. > > Use -help to find the option name. Maybe its -snes_test. > > Thanks, > > Matt > > -gideon > >> On Aug 11, 2015, at 2:03 PM, Matthew Knepley > wrote: >> >> On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson > wrote: >> #End of PETSc Option Table entries >> There is one unused database option. It is: >> Option left: name:-snes_check_jacobian_view (no value) >> >> This is the option for the newest release. What are you using? >> >> Matt >> >> -gideon >> >>> On Aug 11, 2015, at 1:50 PM, Matthew Knepley > wrote: >>> >>> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson > wrote: >>> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. >>> >>> Run with -options_left. Is it reading the option? >>> >>> Matt >>> >>> -gideon >>> >>>> On Aug 11, 2015, at 1:40 PM, Jed Brown > wrote: >>>> >>>> Gideon Simpson > writes: >>>> >>>>> I?m a bit confused by the following options: >>>>> >>>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>>>> >>>>> What flags do I pass it to get some output to diagnose my Jacobian error? >>>> >>>> Nothing to display ASCII to the screen. 
You might use >>>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>>> example. >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 11 13:11:06 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 13:11:06 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Message-ID: On Tue, Aug 11, 2015 at 1:09 PM, Gideon Simpson wrote: > I don?t see it listed in -help, but I do get > > ./blowup -xmax 50 -nx 1000 -snes_check_jacobian > Testing hand-coded Jacobian, if the ratio is O(1.e-8), the > hand-coded Jacobian is probably correct. > Run with -snes_check_jacobian_view [viewer][:filename][:format] to > show difference of hand-coded and finite difference Jacobian. > 63386.1 = ||J - Jfd||//J|| 63386.1 = ||J - Jfd|| > It must be broken in 3.5.3 for some reason. I would use the latest release. Matt > -gideon > > On Aug 11, 2015, at 2:07 PM, Matthew Knepley wrote: > > On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson > wrote: > >> Macports installation of 3.5.3. >> > > Use -help to find the option name. Maybe its -snes_test. > > Thanks, > > Matt > > >> -gideon >> >> On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: >> >> On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson < >> gideon.simpson at gmail.com> wrote: >> >>> #End of PETSc Option Table entries >>> There is one unused database option. It is: >>> Option left: name:-snes_check_jacobian_view (no value) >>> >> >> This is the option for the newest release. What are you using? >> >> Matt >> >> >>> -gideon >>> >>> On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: >>> >>> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson < >>> gideon.simpson at gmail.com> wrote: >>> >>>> Maybe it?s a quirk of the macports installation of petsc, but nothing >>>> seems to be getting generated. >>>> >>> >>> Run with -options_left. Is it reading the option? >>> >>> Matt >>> >>> >>>> -gideon >>>> >>>> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >>>> >>>> Gideon Simpson writes: >>>> >>>> I?m a bit confused by the following options: >>>> >>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show >>>> difference of hand-coded and finite difference Jacobian. >>>> >>>> What flags do I pass it to get some output to diagnose my Jacobian >>>> error? >>>> >>>> >>>> Nothing to display ASCII to the screen. You might use >>>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>>> example. >>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. 
>>> -- Norbert Wiener >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Aug 11 13:33:08 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 11 Aug 2015 13:33:08 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <55CA3A16.90206@email.arizona.edu> References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> <55CA3A16.90206@email.arizona.edu> Message-ID: yes - the patch will be in petsc 3.6.2. However - you can grab the patch right now - and start using it If using a 3.6.1 tarball - you can do download the (raw) patch from the url below and apply with: cd petsc-3.6.1 patch -Np1 < patchfile If using a git clone - you can do: git fetch git checkout ceeba3afeff0c18262ed13ef92e2508ca68b0ecf Satish On Tue, 11 Aug 2015, Anthony Haas wrote: > Hi Hong, > > Sorry for my late reply and thanks for the fix. Does that mean that I will be > able to run that matrix on 10 procs in the future (petsc 3.6.2?)? > > Thanks > > Anthony > > > On 08/11/2015 09:58 AM, Hong wrote: > > Anthony, > > I pushed a fix > > https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > > > > Once it passes our nightly tests, I'll merge it to petsc-maint, then > > petsc-dev. > > Thanks for reporting it! > > > > Hong > > > > On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith > > wrote: > > > > > > Anthony, > > > > This crash is in PETSc code before it calls the SuperLU_DIST > > numeric factorization; likely we have a mistake such as assuming a > > process has at least one row of the matrix and need to fix it. > > > > Barry > > > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > > A=0x14a6a70, info=0x19099f8) > > > at > > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > > 368 colA_start = rstart + ajj[0]; /* the smallest > > global col index of A */ > > > > > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas > > wrote: > > > > > > Hi Sherry, > > > > > > I recently submitted a matrix for which I noticed that > > Superlu_dist was hanging when running on 4 processors with > > parallel symbolic factorization. I have been using the latest > > version of Superlu_dist and the code is not hanging anymore. > > However, I noticed that when running the same matrix (I have > > attached the matrix), the code crashes with the following SIGSEGV > > when running on 10 procs (with or without parallel symbolic > > factorization). It is probably overkill to run such a 'small' > > matrix on 10 procs but I thought that it might still be useful to > > report the problem?? See below for the error obtained when running > > with gdb and also a code snippet to reproduce the error. > > > > > > Thanks, > > > > > > > > > Anthony > > > > > > > > > > > > 1) ERROR in GDB > > > > > > Program received signal SIGSEGV, Segmentation fault. 
> > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST (F=0x1922b50, > > > A=0x14a6a70, info=0x19099f8) > > > at > > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > > 368 colA_start = rstart + ajj[0]; /* the smallest > > global col index of A */ > > > (gdb) > > > > > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > > > Subroutine HowBigLUCanBe(rank) > > > > > > IMPLICIT NONE > > > > > > integer(i4b),intent(in) :: rank > > > integer(i4b) :: i,ct > > > real(dp) :: begin,endd > > > complex(dpc) :: sigma > > > > > > PetscErrorCode ierr > > > > > > > > > if (rank==0) call cpu_time(begin) > > > > > > if (rank==0) then > > > write(*,*) > > > write(*,*)'Testing How Big LU Can Be...' > > > write(*,*)'============================' > > > write(*,*) > > > endif > > > > > > !sigma = (1.0d0,0.0d0) > > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! > > on exit A = A-sigma*B > > > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > !.....Write Matrix to ASCII and Binary Format > > > !call > > PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > > !call MatView(DXX,viewer,ierr) > > > !call PetscViewerDestroy(viewer,ierr) > > > > > > !call > > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > > !call MatView(A,viewer,ierr) > > > !call PetscViewerDestroy(viewer,ierr) > > > > > > !...Load a Matrix in Binary Format > > > call > > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > > call MatSetType(DLOAD,MATAIJ,ierr) > > > call MatLoad(DLOAD,viewer,ierr) > > > call PetscViewerDestroy(viewer,ierr) > > > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > > > > !.....Create Linear Solver Context > > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > > > !.....Set operators. Here the matrix that defines the linear > > system also serves as the preconditioning matrix. > > > !call > > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > > commented and replaced by next line > > > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = > > A-sigma*B > > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here > > A = A-sigma*B > > > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > > Divergence Tol > > > tol = 1.e-10 > > > call > > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > > > !.....Set the Direct (LU) Solver > > > call KSPSetType(ksp,KSPPREONLY,ierr) > > > call KSPGetPC(ksp,pc,ierr) > > > call PCSetType(pc,PCLU,ierr) > > > call > > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
> > MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > > > !.....Create Right-Hand-Side Vector > > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > > > call > > MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > > > allocate(xwork1(IendA-IstartA)) > > > allocate(loc(IendA-IstartA)) > > > > > > ct=0 > > > do i=IstartA,IendA-1 > > > ct=ct+1 > > > loc(ct)=i > > > xwork1(ct)=(1.0d0,0.0d0) > > > enddo > > > > > > call > > VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > > call VecZeroEntries(sol,ierr) > > > > > > deallocate(xwork1,loc) > > > > > > !.....Assemble Vectors > > > call VecAssemblyBegin(frhs,ierr) > > > call VecAssemblyEnd(frhs,ierr) > > > > > > !.....Solve the Linear System > > > call KSPSolve(ksp,frhs,sol,ierr) > > > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > if (rank==0) then > > > call cpu_time(endd) > > > write(*,*) > > > print '("Total time for HowBigLUCanBe = ",f21.3," > > seconds.")',endd-begin > > > endif > > > > > > call SlepcFinalize(ierr) > > > > > > STOP > > > > > > > > > end Subroutine HowBigLUCanBe > > > > > > > > > > > > > From mc0710 at gmail.com Tue Aug 11 14:10:09 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Tue, 11 Aug 2015 14:10:09 -0500 Subject: [petsc-users] Petsc+Chombo example Message-ID: Hi, Is there an example where Petsc's SNES has been used with Chombo, and perhaps with an automatic Jacobian assembly? I'd like to know if Petsc can pick out the number of colors of a Chombo data structure like it can do with a DMDA. Thanks, Mani -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Aug 11 14:18:05 2015 From: jed at jedbrown.org (Jed Brown) Date: Tue, 11 Aug 2015 13:18:05 -0600 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: <87wpx1skgi.fsf@jedbrown.org> Mani Chandra writes: > Is there an example where Petsc's SNES has been used with Chombo, and > perhaps with an automatic Jacobian assembly? I'd like to know if Petsc can > pick out the number of colors of a Chombo data structure like it can do > with a DMDA. You'll have to ask Chombo about this. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Tue Aug 11 14:25:57 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Aug 2015 14:25:57 -0500 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 2:10 PM, Mani Chandra wrote: > Hi, > > Is there an example where Petsc's SNES has been used with Chombo, and > perhaps with an automatic Jacobian assembly? I'd like to know if Petsc can > pick out the number of colors of a Chombo data structure like it can do > with a DMDA. > The specific kinds of colorings for structured grids also assume a colocated discretization which I am not sure Chombo uses. However, the greedy colorings which only use the matrix will work. Thanks, Matt > Thanks, > Mani > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Tue Aug 11 16:40:19 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 11 Aug 2015 16:40:19 -0500 Subject: [petsc-users] checking jacobian In-Reply-To: References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> Message-ID: <97A16F65-5714-4946-B2A1-32AEA315BFA1@mcs.anl.gov> You also have to KEEP the -snes_check_jacobian option Barry > On Aug 11, 2015, at 1:09 PM, Gideon Simpson wrote: > > I don?t see it listed in -help, but I do get > > ./blowup -xmax 50 -nx 1000 -snes_check_jacobian > Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. > Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. > 63386.1 = ||J - Jfd||//J|| 63386.1 = ||J - Jfd|| > > -gideon > >> On Aug 11, 2015, at 2:07 PM, Matthew Knepley wrote: >> >> On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson wrote: >> Macports installation of 3.5.3. >> >> Use -help to find the option name. Maybe its -snes_test. >> >> Thanks, >> >> Matt >> >> -gideon >> >>> On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: >>> >>> On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson wrote: >>> #End of PETSc Option Table entries >>> There is one unused database option. It is: >>> Option left: name:-snes_check_jacobian_view (no value) >>> >>> This is the option for the newest release. What are you using? >>> >>> Matt >>> >>> -gideon >>> >>>> On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: >>>> >>>> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson wrote: >>>> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. >>>> >>>> Run with -options_left. Is it reading the option? >>>> >>>> Matt >>>> >>>> -gideon >>>> >>>>> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >>>>> >>>>> Gideon Simpson writes: >>>>> >>>>>> I?m a bit confused by the following options: >>>>>> >>>>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>>>>> >>>>>> What flags do I pass it to get some output to diagnose my Jacobian error? >>>>> >>>>> Nothing to display ASCII to the screen. You might use >>>>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>>>> example. >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From xsli at lbl.gov Tue Aug 11 18:49:05 2015 From: xsli at lbl.gov (Xiaoye S. Li) Date: Tue, 11 Aug 2015 16:49:05 -0700 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> Message-ID: ?It's hard to say. 
For 3D problems, you may get a fill factor about 30x-50x (can be larger or smaller depending on problem.) The time may be in seconds, or minutes at most. Sherry On Tue, Aug 11, 2015 at 7:31 AM, Mahir.Ulker-Kaustell at tyrens.se < Mahir.Ulker-Kaustell at tyrens.se> wrote: > Yes! Doing: > > $PETSC_DIR/$PETSC_ARCH/bin/mpiexec > > instead of > > mpiexec > > makes the program run as expected. > > Thank you all for your patience and encouragement. > > Sherry: I have noticed that you have been involved in some publications > related to my current work, i.e. wave propagation in elastic solids. What > computation time would you expect using SuperLU to solve one linear system > with say 800000 degrees of freedom and 4-8 processes (on a single node) > with a finite element discretization? > > Mahir > > > > > > -----Original Message----- > From: Satish Balay [mailto:balay at mcs.anl.gov] > Sent: den 7 augusti 2015 18:09 > To: ?lker-Kaustell, Mahir > Cc: Hong; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > This usually happens if you use the wrong MPIEXEC > > i.e use the mpiexec from the MPI you built PETSc with. > > Satish > > On Fri, 7 Aug 2015, Mahir.Ulker-Kaustell at tyrens.se wrote: > > > Hong, > > > > Running example 2 with the command line given below gives me two > uniprocessor runs!? > > > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -ksp_view > > KSP Object: 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=56, cols=56 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=56, cols=56 > > total: nonzeros=250, allocated nonzeros=280 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Norm of error 5.21214e-15 iterations 1 > > KSP Object: 1 MPI processes > > type: gmres > > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > GMRES: happy breakdown tolerance 1e-30 > > maximum iterations=10000, initial guess is zero > > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > > left preconditioning > > using PRECONDITIONED norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > 
Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=56, cols=56 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=56, cols=56 > > total: nonzeros=250, allocated nonzeros=280 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Norm of error 5.21214e-15 iterations 1 > > > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 6 augusti 2015 16:36 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; Xiaoye S. Li; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > > > I have been using PETSC_COMM_WORLD. > > > > What do you get by running a petsc example, e.g., > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -ksp_view > > > > KSP Object: 2 MPI processes > > type: gmres > > ... > > > > Hong > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 5 augusti 2015 17:11 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; Xiaoye S. Li; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > As you noticed, you ran the code in serial mode, not parallel. > > Check your code on input communicator, e.g., what input communicator do > you use in > > KSPCreate(comm,&ksp)? > > > > I have added error flag to superlu_dist interface (released version). > When user uses '-mat_superlu_dist_parsymbfact' > > in serial mode, this option is ignored with a warning. > > > > Hong > > > > Hong, > > > > If I set parsymbfact: > > > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > > -------------------------------------------------------------------------- > > mpiexec detected that one or more processes exited with non-zero status, > thus causing > > the job to be terminated. The first process to do so was: > > > > Process name: [[63679,1],0] > > Exit code: 255 > > > -------------------------------------------------------------------------- > > > > Since the program does not finish the call to KSPSolve(), we do not get > any information about the KSP from ?ksp_view. > > > > If I do not set it, I get a serial run even if I specify ?n 2: > > > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -ksp_view > > ? 
> > KSP Object: 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > PC Object: 1 MPI processes > > type: lu > > LU: out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > matrix ordering: nd > > factor fill ratio given 0, needed 0 > > Factored matrix follows: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > package used to perform factorization: superlu_dist > > total: nonzeros=0, allocated nonzeros=0 > > total number of mallocs used during MatSetValues calls =0 > > SuperLU_DIST run parameters: > > Process grid nprow 1 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 0 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 1 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=954, cols=954 > > total: nonzeros=34223, allocated nonzeros=34223 > > total number of mallocs used during MatSetValues calls =0 > > using I-node routines: found 668 nodes, limit used is 5 > > > > I am running PETSc via Cygwin on a windows machine. > > When I installed PETSc the tests with different numbers of processes ran > well. > > > > Mahir > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 3 augusti 2015 19:06 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; Xiaoye S. Li; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL > for parallel runs. > > > > If I use 2 processors, the program runs if I use > ?mat_superlu_dist_parsymbfact=1: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu > -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput > GLOBAL -mat_superlu_dist_parsymbfact=1 > > > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so > your code runs well without parsymbfact. > > > > Please run it with '-ksp_view' and see what > > 'SuperLU_DIST run parameters:' are being used, e.g. > > petsc/src/ksp/ksp/examples/tutorials (maint) > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > > > ... > > SuperLU_DIST run parameters: > > Process grid nprow 2 x npcol 1 > > Equilibrate matrix TRUE > > Matrix input mode 1 > > Replace tiny pivots TRUE > > Use iterative refinement FALSE > > Processors in row 2 col partition 1 > > Row permutation LargeDiag > > Column permutation METIS_AT_PLUS_A > > Parallel symbolic factorization FALSE > > Repeated factorization SamePattern_SameRowPerm > > > > I do not understand why your code uses matrix input mode = global. > > > > Hong > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 3 augusti 2015 16:46 > > To: Xiaoye S. Li > > Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list > > > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry found the culprit. 
I can reproduce it: > > petsc/src/ksp/ksp/examples/tutorials > > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package > superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > ... > > > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when > using more than one processes. > > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or > set matinput=GLOBAL for parallel run? > > > > I'll add an error flag for these use cases. > > > > Hong > > > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li xsli at lbl.gov>> wrote: > > I think I know the problem. Since zdistribute.c is called, I guess you > are using the global (replicated) matrix input interface, > pzgssvx_ABglobal(). This interface does not allow you to use parallel > symbolic factorization (since matrix is centralized). > > > > That's why you get the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > You need to use distributed matrix input interface pzgssvx() (without > ABglobal) > > > > Sherry > > > > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> Mahir.Ulker-Kaustell at tyrens.se>> wrote: > > Hong and Sherry, > > > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem > remains: > > > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: > Invalid ISPEC at line 484 in file get_perm_c.c > > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the > program crashes with: Calloc fails for SPA dense[]. at line 438 in file > zdistribute.c > > > > Mahir > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 30 juli 2015 02:58 > > To: ?lker-Kaustell, Mahir > > Cc: Xiaoye Li; PETSc users list > > > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > > > Mahir, > > > > Sherry fixed several bugs in superlu_dist-v4.1. > > The current petsc-release interfaces with superlu_dist-v4.0. > > We do not know whether the reported issue (attached below) has been > resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > > > Here is how to do it: > > 1. download superlu_dist v4.1 > > 2. remove existing PETSC_ARCH directory, then configure petsc with > > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > > 3. build petsc > > > > Let us know if the issue remains. > > > > Hong > > > > > > ---------- Forwarded message ---------- > > From: Xiaoye S. Li > > > Date: Wed, Jul 29, 2015 at 2:24 PM > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > To: Hong Zhang > > > Hong, > > I am cleaning the mailbox, and saw this unresolved issue. I am not sure > whether the new fix to parallel symbolic factorization solves the problem. > What bothers be is that he is getting the following error: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > This has nothing to do with my bug fix. > > ? Shall we ask him to try the new version, or try to get him matrix? > > Sherry > > ? 
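
(As a concrete illustration of Hong's rebuild steps above — step 2 in particular. The option list below is only a sketch: keep whatever configure options you used originally and just point --download-superlu_dist at the 4.1 tarball; the path is a placeholder.)

    cd $PETSC_DIR
    rm -rf $PETSC_ARCH        # step 2: remove the existing build directory
    ./configure --with-scalar-type=complex --with-debugging=0 --download-mpich \
        --download-metis --download-parmetis \
        --download-superlu_dist=/path/to/superlu_dist_4.1.tar.gz
    make all                  # step 3
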
> > > > ---------- Forwarded message ---------- > > From: Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> Mahir.Ulker-Kaustell at tyrens.se>> > > Date: Wed, Jul 22, 2015 at 1:32 PM > > Subject: RE: [petsc-users] SuperLU MPI-problem > > To: Hong >, "Xiaoye S. > Li" > > > Cc: petsc-users >> > > The 1000 was just a conservative guess. The number of non-zeros per row > is in the tens in general but certain constraints lead to non-diagonal > streaks in the sparsity-pattern. > > Is it the reordering of the matrix that is killing me here? How can I > set options.ColPerm? > > > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. > > ------------------------------------------------------- > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:23 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat > later) with > > > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > > col block 3006 ------------------------------------------------------- > > Primary job terminated normally, but 1 process returned > > a non-zero exit code.. Per user-direction, the job has been aborted. 
> > ------------------------------------------------------- > > col block 1924 [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by > muk Wed Jul 22 21:59:58 2015 > > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 > PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 > --with-scalar-type=complex --download-fblaspack --download-mpich > --download-scalapack --download-mumps --download-metis --download-parmetis > --download-superlu --download-superlu_dist --download-fftw > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [unset]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > > > > > /Mahir > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > Sent: den 22 juli 2015 21:34 > > To: Xiaoye S. Li > > Cc: ?lker-Kaustell, Mahir; petsc-users > > > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > In Petsc/superlu_dist interface, we set default > > > > options.ParSymbFact = NO; > > > > When user raises the flag "-mat_superlu_dist_parsymbfact", > > we set > > > > options.ParSymbFact = YES; > > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for > ParSymbFact regardless of user ordering setting */ > > > > We do not change anything else. > > > > Hong > > > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li xsli at lbl.gov>> wrote: > > I am trying to understand your problem. You said you are solving Naviers > equation (elastodynamics) in the frequency domain, using finite element > discretization. I wonder why you have about 1000 nonzeros per row. > Usually in many PDE discretized matrices, the number of nonzeros per row is > in the tens (even for 3D problems), not in the thousands. So, your matrix > is quite a bit denser than many sparse matrices we deal with. > > > > The number of nonzeros in the L and U factors is much more than that in > original matrix A -- typically we see 10-20x fill ratio for 2D, or can be > as bad as 50-100x fill ratio for 3D. But since your matrix starts much > denser (i.e., the underlying graph has many connections), it may not lend > to any good ordering strategy to preserve sparsity of L and U; that is, the > L and U fill ratio may be large. > > > > I don't understand why you get the following error when you use > > ?-mat_superlu_dist_parsymbfact?. 
> > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > > > ?Hong -- in order to use parallel symbolic factorization, is it > sufficient to specify only > > ?-mat_superlu_dist_parsymbfact? > > ? ? (the default is to use sequential symbolic factorization.) > > > > > > Sherry > > > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> Mahir.Ulker-Kaustell at tyrens.se>> wrote: > > Thank you for your reply. > > > > As you have probably figured out already, I am not a computational > scientist. I am a researcher in civil engineering (railways for high-speed > traffic), trying to produce some, from my perspective, fairly large > parametric studies based on finite element discretizations. > > > > I am working in a Windows-environment and have installed PETSc through > Cygwin. > > Apparently, there is no support for Valgrind in this OS. > > > > If I have understood you correct, the memory issues are related to > superLU and given my background, there is not much I can do. Is this > correct? > > > > > > Best regards, > > Mahir > > > > ______________________________________________ > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> > > ______________________________________________ > > > > -----Original Message----- > > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > > Sent: den 22 juli 2015 02:57 > > To: ?lker-Kaustell, Mahir > > Cc: Xiaoye S. Li; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > > > Run the program under valgrind > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use > the option -mat_superlu_dist_parsymbfact I get many scary memory problems > some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > > > Note that I consider it unacceptable for running programs to EVER use > uninitialized values; until these are all cleaned up I won't trust any runs > like this. 
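
(For reference, a typical way to produce this kind of valgrind report for an MPI run — the exact command used here is not shown, so the solver options below are only illustrative, and the mpiexec must be the one PETSc was built with:

    $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 3 valgrind -q --tool=memcheck --track-origins=yes \
        ./ex19 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact

The --track-origins=yes flag is what makes valgrind print the "Uninitialised value was created by ..." lines seen in the output below.)
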
> > > > Barry > > > > > > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10155751B: get_perm_c_parmetis > (get_perm_c_parmetis.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > > ==42050== by 0x101557F60: get_perm_c_parmetis > (get_perm_c_parmetis.c:285) > > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10155751B: get_perm_c_parmetis > (get_perm_c_parmetis.c:96) > > ==42050== > > ==42049== Syscall param writev(vector[...]) points to uninitialised > byte(s) > > ==42049== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend > (ch3u_eager.c:556) > > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > > ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== Syscall param writev(vector[...]) 
points to uninitialised > byte(s) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size > 752,720 alloc'd > > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > > ==42048== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend > (ch3u_eager.c:556) > > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== Uninitialised value was created by a heap allocation > > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42049== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x100FF9036: PCSetUp 
(precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size > 752,720 alloc'd > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== Uninitialised value was created by a heap allocation > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > > ==42048== by 0x101557CFC: get_perm_c_parmetis > (get_perm_c_parmetis.c:241) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== > > ==42048== Syscall param write(buf) points to uninitialised byte(s) > > ==42048== at 0x102DA1C22: write (in > /usr/lib/system/libsystem_kernel.dylib) > > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:257) > > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > > ==42048== by 0x10155802F: get_perm_c_parmetis > (get_perm_c_parmetis.c:299) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 
0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== Address 0x104810704 is on thread 1's stack > > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend > (ch3u_eager.c:218) > > ==42048== Uninitialised value was created by a heap allocation > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42048== by 0x101557AB9: get_perm_c_parmetis > (get_perm_c_parmetis.c:185) > > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > 
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > > ==42050== by 0x10150A5C6: ddist_psymbtonum > (pdsymbfact_distdata.c:1275) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a stack allocation > > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > > ==42050== > > ==42050== Syscall param writev(vector[...]) points to uninitialised > byte(s) > > ==42050== at 0x102DA1C3A: writev (in > /usr/lib/system/libsystem_kernel.dylib) > > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend > (ch3u_eager.c:556) > > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > > ==42050== by 0x10277656E: MPI_Isend (isend.c:125) > > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42050== by 
0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size > 131,072 alloc'd > > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a heap allocation > > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== > > ==42048== Conditional jump or move depends on uninitialised value(s) > > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== Uninitialised value was created by a heap allocation > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42048== by 0x10150ABE2: ddist_psymbtonum > (pdsymbfact_distdata.c:1332) > > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU 
(lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== > > ==42049== Conditional jump or move depends on uninitialised value(s) > > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== Uninitialised value was created by a heap allocation > > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42049== by 0x10150ABE2: ddist_psymbtonum > (pdsymbfact_distdata.c:1332) > > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== > > ==42048== Conditional jump or move depends on uninitialised value(s) > > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== Conditional jump or move depends on uninitialised value(s) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== Uninitialised value was created by a heap allocation > > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42048== by 0x10150ABE2: ddist_psymbtonum > (pdsymbfact_distdata.c:1332) > > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > 
==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== Uninitialised value was created by a heap allocation > > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42048== by 0x100001B3C: main (in ./ex19) > > ==42048== > > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42049== by 0x10150ABE2: ddist_psymbtonum > (pdsymbfact_distdata.c:1332) > > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42049== by 0x100001B3C: main (in ./ex19) > > ==42049== > > ==42050== Conditional jump or move depends on uninitialised value(s) > > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== Uninitialised value was created by a heap allocation > > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > > ==42050== by 0x10150B241: ddist_psymbtonum > (pdsymbfact_distdata.c:1389) > > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST > (superlu_dist.c:414) > > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > > ==42050== by 0x100001B3C: main (in ./ex19) > > ==42050== > > > > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> wrote: > > > > > > Ok. So I have been creating the full factorization on each process. > That gives me some hope! > > > > > > I followed your suggestion and tried to use the runtime option > ?-mat_superlu_dist_parsymbfact?. 
> > > However, now the program crashes with: > > > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > > > And so on? > > > > > > From the SuperLU manual; I should give the option either YES or NO, > however -mat_superlu_dist_parsymbfact YES makes the program crash in the > same way as above. > > > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in > the PETSc documentation > > > > > > Mahir > > > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, > Tyr?ns AB > > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se Mahir.Ulker-Kaustell at tyrens.se> > > > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov] > > > Sent: den 20 juli 2015 18:12 > > > To: ?lker-Kaustell, Mahir > > > Cc: Hong; petsc-users > > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > > > The default SuperLU_DIST setting is to serial symbolic factorization. > Therefore, what matters is how much memory do you have per MPI task? > > > > > > The code failed to malloc memory during redistribution of matrix A to > {L\U} data struction (using result of serial symbolic factorization.) > > > > > > You can use parallel symbolic factorization, by runtime option: > '-mat_superlu_dist_parsymbfact' > > > > > > Sherry Li > > > > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se > > wrote: > > > Hong: > > > > > > Previous experiences with this equation have shown that it is very > difficult to solve it iteratively. Hence the use of a direct solver. > > > > > > The large test problem I am trying to solve has slightly less than > 10^6 degrees of freedom. The matrices are derived from finite elements so > they are sparse. > > > The machine I am working on has 128GB ram. I have estimated the memory > needed to less than 20GB, so if the solver needs twice or even three times > as much, it should still work well. Or have I completely misunderstood > something here? > > > > > > Mahir > > > > > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov] > > > Sent: den 20 juli 2015 17:39 > > > To: ?lker-Kaustell, Mahir > > > Cc: petsc-users > > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > > > Mahir: > > > Direct solvers consume large amount of memory. Suggest to try > followings: > > > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too > ill-conditioned. You may test it using the small matrix. > > > > > > 2. Incrementally increase your matrix sizes. Try different matrix > orderings. > > > Do you get memory crash in the 1st symbolic factorization? > > > In your case, matrix data structure stays same when omega changes, so > you only need to do one matrix symbolic factorization and reuse it. > > > > > > 3. Use a machine that gives larger memory. > > > > > > Hong > > > > > > Dear Petsc-Users, > > > > > > I am trying to use PETSc to solve a set of linear equations arising > from Naviers equation (elastodynamics) in the frequency domain. > > > The frequency dependency of the problem requires that the system > > > > > > [-omega^2M + K]u = F > > > > > > where M and K are constant, square, positive definite matrices (mass > and stiffness respectively) is solved for each frequency omega of interest. > > > K is a complex matrix, including material damping. > > > > > > I have written a PETSc program which solves this problem for a small > (1000 degrees of freedom) test problem on one or several processors, but it > keeps crashing when I try it on my full scale (in the order of 10^6 degrees > of freedom) problem. 
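
(A minimal, self-contained sketch of this kind of frequency sweep with PETSc and SuperLU_DIST is given below. It is not the actual program: the matrices are toy stand-ins — a real tridiagonal "K" and a lumped diagonal "M" — the sizes and frequencies are placeholders, error checking with CHKERRQ is omitted, and the complex material damping is left out. Its only purpose is to show the structure discussed above: the sparsity pattern of A = -omega^2 M + K never changes, so one KSP/PC is created once and every frequency reuses it, with the repeated factorizations done as SamePattern_SameRowPerm.)

    /* sweep.c -- sketch only; toy matrices, placeholder sizes, no CHKERRQ */
    #include <petscksp.h>

    int main(int argc,char **argv)
    {
      Mat       K,M,A;
      Vec       F,u;
      KSP       ksp;
      PC        pc;
      PetscInt  n = 1000,i,Istart,Iend,k,nfreq = 10;
      PetscReal omega,omega0 = 1.0,domega = 0.5;

      PetscInitialize(&argc,&argv,NULL,NULL);

      /* Toy stand-ins for the FE matrices: K tridiagonal, M diagonal (lumped). */
      MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n,3,NULL,1,NULL,&K);
      MatCreateAIJ(PETSC_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,n,n,1,NULL,0,NULL,&M);
      MatGetOwnershipRange(K,&Istart,&Iend);
      for (i=Istart; i<Iend; i++) {
        if (i>0)   MatSetValue(K,i,i-1,-1.0,INSERT_VALUES);
        if (i<n-1) MatSetValue(K,i,i+1,-1.0,INSERT_VALUES);
        MatSetValue(K,i,i,2.0,INSERT_VALUES);
        MatSetValue(M,i,i,1.0,INSERT_VALUES);
      }
      MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY); MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);
      MatAssemblyBegin(M,MAT_FINAL_ASSEMBLY); MatAssemblyEnd(M,MAT_FINAL_ASSEMBLY);

      MatDuplicate(K,MAT_COPY_VALUES,&A);   /* A gets K's (fixed) sparsity pattern */
      MatCreateVecs(A,&u,&F);
      VecSet(F,1.0);

      /* One KSP/PC for the whole sweep: direct solve with SuperLU_DIST. */
      KSPCreate(PETSC_COMM_WORLD,&ksp);     /* parallel communicator, not PETSC_COMM_SELF */
      KSPSetType(ksp,KSPPREONLY);
      KSPGetPC(ksp,&pc);
      PCSetType(pc,PCLU);
      PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);
      KSPSetFromOptions(ksp);               /* picks up -mat_superlu_dist_* options */

      for (k=0; k<nfreq; k++) {             /* frequency sweep */
        omega = omega0 + k*domega;
        MatCopy(K,A,SAME_NONZERO_PATTERN);
        MatAXPY(A,-omega*omega,M,SUBSET_NONZERO_PATTERN);  /* A = K - omega^2 M */
        KSPSetOperators(ksp,A,A);           /* values change, pattern does not:   */
        KSPSolve(ksp,F,u);                  /* factorization is redone inside     */
      }                                     /* KSPSetUp/KSPSolve each iteration   */

      KSPDestroy(&ksp); MatDestroy(&A); MatDestroy(&K); MatDestroy(&M);
      VecDestroy(&u); VecDestroy(&F);
      PetscFinalize();
      return 0;
    }

(It would be run like the other examples in this thread, e.g. $PETSC_DIR/$PETSC_ARCH/bin/mpiexec -n 4 ./sweep -ksp_view, where ./sweep is a placeholder name.)
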
> > > > > > The program crashes at KSPSetUp() and from what I can see in the error > messages, it appears as if it consumes too much memory. > > > > > > I would guess that similar problems have occurred in this mail-list, > so I am hoping that someone can push me in the right direction? > > > > > > Mahir > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Tue Aug 11 20:53:47 2015 From: fdkong.jd at gmail.com (Fande Kong) Date: Tue, 11 Aug 2015 20:53:47 -0500 Subject: [petsc-users] petsc_gen_xdmf.py errors Message-ID: Hi all, I tried to use petsc_gen_xdmf.py to generate a xml file for visulaztion using paraview. I got the following errors: ./petsc_gen_xdmf.py sol.h5 Traceback (most recent call last): File "./petsc_gen_xdmf.py", line 236, in generateXdmf(sys.argv[1]) File "./petsc_gen_xdmf.py", line 231, in generateXdmf Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) File "./petsc_gen_xdmf.py", line 190, in write for vf in vfields: self.writeField(fp, len(time), t, cellDim, spaceDim, '/vertex_fields/'+vf[0], vf, 'Node') File "./petsc_gen_xdmf.py", line 164, in writeField self.writeFieldComponents(fp, numSteps, timestep, spaceDim, name, f, domain) File "./petsc_gen_xdmf.py", line 120, in writeFieldComponents dims = '1 %d 1' % (numSteps, dof, bs) TypeError: not all arguments converted during string formatting The hdf5 file is attached. Originally from Matthew. Configuration and make log files are also attached. Fande Kong, Thanks, -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sol.h5 Type: application/octet-stream Size: 246288 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 6236054 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 104776 bytes Desc: not available URL: From gideon.simpson at gmail.com Tue Aug 11 22:40:12 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 11 Aug 2015 23:40:12 -0400 Subject: [petsc-users] checking jacobian In-Reply-To: <97A16F65-5714-4946-B2A1-32AEA315BFA1@mcs.anl.gov> References: <1AED74B9-C28E-4F0E-BC61-B8C6E8E5B1F8@gmail.com> <87zj1xsoyv.fsf@jedbrown.org> <5B0AD314-87B9-4A74-AE92-16CD106D7A52@gmail.com> <18507BF6-BCB7-44B4-918B-7197CF716208@gmail.com> <877E66A4-B4D9-4A4C-8654-FD55C11F12C2@gmail.com> <97A16F65-5714-4946-B2A1-32AEA315BFA1@mcs.anl.gov> Message-ID: <03AE51B1-F431-4534-B04F-B80BEBF2EFE1@gmail.com> Barry?s comment resolved my issue. -gideon > On Aug 11, 2015, at 5:40 PM, Barry Smith wrote: > > You also have to KEEP the -snes_check_jacobian option > > Barry > >> On Aug 11, 2015, at 1:09 PM, Gideon Simpson wrote: >> >> I don?t see it listed in -help, but I do get >> >> ./blowup -xmax 50 -nx 1000 -snes_check_jacobian >> Testing hand-coded Jacobian, if the ratio is O(1.e-8), the hand-coded Jacobian is probably correct. >> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. 
>> 63386.1 = ||J - Jfd||//J|| 63386.1 = ||J - Jfd|| >> >> -gideon >> >>> On Aug 11, 2015, at 2:07 PM, Matthew Knepley wrote: >>> >>> On Tue, Aug 11, 2015 at 1:04 PM, Gideon Simpson wrote: >>> Macports installation of 3.5.3. >>> >>> Use -help to find the option name. Maybe its -snes_test. >>> >>> Thanks, >>> >>> Matt >>> >>> -gideon >>> >>>> On Aug 11, 2015, at 2:03 PM, Matthew Knepley wrote: >>>> >>>> On Tue, Aug 11, 2015 at 12:51 PM, Gideon Simpson wrote: >>>> #End of PETSc Option Table entries >>>> There is one unused database option. It is: >>>> Option left: name:-snes_check_jacobian_view (no value) >>>> >>>> This is the option for the newest release. What are you using? >>>> >>>> Matt >>>> >>>> -gideon >>>> >>>>> On Aug 11, 2015, at 1:50 PM, Matthew Knepley wrote: >>>>> >>>>> On Tue, Aug 11, 2015 at 12:49 PM, Gideon Simpson wrote: >>>>> Maybe it?s a quirk of the macports installation of petsc, but nothing seems to be getting generated. >>>>> >>>>> Run with -options_left. Is it reading the option? >>>>> >>>>> Matt >>>>> >>>>> -gideon >>>>> >>>>>> On Aug 11, 2015, at 1:40 PM, Jed Brown wrote: >>>>>> >>>>>> Gideon Simpson writes: >>>>>> >>>>>>> I?m a bit confused by the following options: >>>>>>> >>>>>>> Run with -snes_check_jacobian_view [viewer][:filename][:format] to show difference of hand-coded and finite difference Jacobian. >>>>>>> >>>>>>> What flags do I pass it to get some output to diagnose my Jacobian error? >>>>>> >>>>>> Nothing to display ASCII to the screen. You might use >>>>>> "binary:thematrix" if you want to read it in with Python or MATLAB, for >>>>>> example. >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Mahir.Ulker-Kaustell at tyrens.se Wed Aug 12 02:02:39 2015 From: Mahir.Ulker-Kaustell at tyrens.se (Mahir.Ulker-Kaustell at tyrens.se) Date: Wed, 12 Aug 2015 07:02:39 +0000 Subject: [petsc-users] SuperLU MPI-problem In-Reply-To: References: <1891ada2f99843b6b6c1d91f0f47f065@STHWS42.tyrens.se> <63c6587a85914931bbbad4660884efed@STHWS42.tyrens.se> <429fe4873a534ab19216a8d2e5fa8213@STHWS42.tyrens.se> , Message-ID: <1439362955895.53919@tyrens.se> Ok, thank you. I have 1-2 minutes in a commercial code and was of course hoping that PETScSuperLU would be at least that fast. Currently PETSc/SuperLU is around 10 times slower, so I have to dig a little deeper, but now I know it will be worthwhile... Mahir ________________________________ Fr?n: Xiaoye S. Li Skickat: den 12 augusti 2015 01:49 Till: ?lker-Kaustell, Mahir Kopia: petsc-users ?mne: Re: [petsc-users] SuperLU MPI-problem ?It's hard to say. For 3D problems, you may get a fill factor about 30x-50x (can be larger or smaller depending on problem.) The time may be in seconds, or minutes at most. Sherry On Tue, Aug 11, 2015 at 7:31 AM, Mahir.Ulker-Kaustell at tyrens.se > wrote: Yes! 
Doing: $PETSC_DIR/$PETSC_ARCH/bin/mpiexec instead of mpiexec makes the program run as expected. Thank you all for your patience and encouragement. Sherry: I have noticed that you have been involved in some publications related to my current work, i.e. wave propagation in elastic solids. What computation time would you expect using SuperLU to solve one linear system with say 800000 degrees of freedom and 4-8 processes (on a single node) with a finite element discretization? Mahir -----Original Message----- From: Satish Balay [mailto:balay at mcs.anl.gov] Sent: den 7 augusti 2015 18:09 To: ?lker-Kaustell, Mahir Cc: Hong; PETSc users list Subject: Re: [petsc-users] SuperLU MPI-problem This usually happens if you use the wrong MPIEXEC i.e use the mpiexec from the MPI you built PETSc with. Satish On Fri, 7 Aug 2015, Mahir.Ulker-Kaustell at tyrens.se wrote: > Hong, > > Running example 2 with the command line given below gives me two uniprocessor runs!? > > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > KSP Object: 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=0.000138889, absolute=1e-50, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > 
Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=56, cols=56 > total: nonzeros=250, allocated nonzeros=280 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Norm of error 5.21214e-15 iterations 1 > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov] > Sent: den 6 augusti 2015 16:36 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > > I have been using PETSC_COMM_WORLD. > > What do you get by running a petsc example, e.g., > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > > KSP Object: 2 MPI processes > type: gmres > ... > > Hong > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 5 augusti 2015 17:11 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir: > As you noticed, you ran the code in serial mode, not parallel. > Check your code on input communicator, e.g., what input communicator do you use in > KSPCreate(comm,&ksp)? > > I have added error flag to superlu_dist interface (released version). When user uses '-mat_superlu_dist_parsymbfact' > in serial mode, this option is ignored with a warning. > > Hong > > Hong, > > If I set parsymbfact: > > $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > -------------------------------------------------------------------------- > mpiexec detected that one or more processes exited with non-zero status, thus causing > the job to be terminated. The first process to do so was: > > Process name: [[63679,1],0] > Exit code: 255 > -------------------------------------------------------------------------- > > Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from ?ksp_view. > > If I do not set it, I get a serial run even if I specify ?n 2: > > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view > ? 
> KSP Object: 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 0, needed 0 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > package used to perform factorization: superlu_dist > total: nonzeros=0, allocated nonzeros=0 > total number of mallocs used during MatSetValues calls =0 > SuperLU_DIST run parameters: > Process grid nprow 1 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 0 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 1 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=954, cols=954 > total: nonzeros=34223, allocated nonzeros=34223 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 668 nodes, limit used is 5 > > I am running PETSc via Cygwin on a windows machine. > When I installed PETSc the tests with different numbers of processes ran well. > > Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 3 augusti 2015 19:06 > To: ?lker-Kaustell, Mahir > Cc: Hong; Xiaoye S. Li; PETSc users list > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > > I have not used ?parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs. > > If I use 2 processors, the program runs if I use ?mat_superlu_dist_parsymbfact=1: > mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1 > > The incorrect option '-mat_superlu_dist_parsymbfact=1' is not taken, so your code runs well without parsymbfact. > > Please run it with '-ksp_view' and see what > 'SuperLU_DIST run parameters:' are being used, e.g. > petsc/src/ksp/ksp/examples/tutorials (maint) > $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view > > ... > SuperLU_DIST run parameters: > Process grid nprow 2 x npcol 1 > Equilibrate matrix TRUE > Matrix input mode 1 > Replace tiny pivots TRUE > Use iterative refinement FALSE > Processors in row 2 col partition 1 > Row permutation LargeDiag > Column permutation METIS_AT_PLUS_A > Parallel symbolic factorization FALSE > Repeated factorization SamePattern_SameRowPerm > > I do not understand why your code uses matrix input mode = global. > > Hong > > > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 3 augusti 2015 16:46 > To: Xiaoye S. Li > Cc: ?lker-Kaustell, Mahir; Hong; PETSc users list > > Subject: Re: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry found the culprit. I can reproduce it: > petsc/src/ksp/ksp/examples/tutorials > mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact > > Invalid ISPEC at line 484 in file get_perm_c.c > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. 
Per user-direction, the job has been aborted. > ------------------------------------------------------- > ... > > PETSc-superlu_dist interface sets matinput=DISTRIBUTED as default when using more than one processes. > Did you either use '-mat_superlu_dist_parsymbfact' for sequential run or set matinput=GLOBAL for parallel run? > > I'll add an error flag for these use cases. > > Hong > > On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li >> wrote: > I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface, pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized). > > That's why you get the following error: > Invalid ISPEC at line 484 in file get_perm_c.c > > You need to use distributed matrix input interface pzgssvx() (without ABglobal) > > Sherry > > > On Mon, Aug 3, 2015 at 5:02 AM, Mahir.Ulker-Kaustell at tyrens.se> >> wrote: > Hong and Sherry, > > I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains: > > If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid ISPEC at line 484 in file get_perm_c.c > If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program crashes with: Calloc fails for SPA dense[]. at line 438 in file zdistribute.c > > Mahir > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 30 juli 2015 02:58 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye Li; PETSc users list > > Subject: Fwd: [petsc-users] SuperLU MPI-problem > > Mahir, > > Sherry fixed several bugs in superlu_dist-v4.1. > The current petsc-release interfaces with superlu_dist-v4.0. > We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1? > > Here is how to do it: > 1. download superlu_dist v4.1 > 2. remove existing PETSC_ARCH directory, then configure petsc with > '--download-superlu_dist=superlu_dist_4.1.tar.gz' > 3. build petsc > > Let us know if the issue remains. > > Hong > > > ---------- Forwarded message ---------- > From: Xiaoye S. Li >> > Date: Wed, Jul 29, 2015 at 2:24 PM > Subject: Fwd: [petsc-users] SuperLU MPI-problem > To: Hong Zhang >> > Hong, > I am cleaning the mailbox, and saw this unresolved issue. I am not sure whether the new fix to parallel symbolic factorization solves the problem. What bothers be is that he is getting the following error: > > Invalid ISPEC at line 484 in file get_perm_c.c > This has nothing to do with my bug fix. > ? Shall we ask him to try the new version, or try to get him matrix? > Sherry > ? > > ---------- Forwarded message ---------- > From: Mahir.Ulker-Kaustell at tyrens.se> >> > Date: Wed, Jul 22, 2015 at 1:32 PM > Subject: RE: [petsc-users] SuperLU MPI-problem > To: Hong >>, "Xiaoye S. Li" >> > Cc: petsc-users >> > The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general but certain constraints lead to non-diagonal streaks in the sparsity-pattern. > Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm? > > If i use -mat_superlu_dist_parsymbfact the program crashes with > > Invalid ISPEC at line 484 in file get_perm_c.c > ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. 
> ------------------------------------------------------- > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with > > Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c > col block 3006 ------------------------------------------------------- > Primary job terminated normally, but 1 process returned > a non-zero exit code.. Per user-direction, the job has been aborted. > ------------------------------------------------------- > col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015 > [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015 > [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [unset]: aborting job: > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > [0]PETSC ERROR: ------------------------------------------------------------------------ > > > /Mahir > > > From: Hong [mailto:hzhang at mcs.anl.gov>] > Sent: den 22 juli 2015 21:34 > To: Xiaoye S. Li > Cc: ?lker-Kaustell, Mahir; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > In Petsc/superlu_dist interface, we set default > > options.ParSymbFact = NO; > > When user raises the flag "-mat_superlu_dist_parsymbfact", > we set > > options.ParSymbFact = YES; > options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */ > > We do not change anything else. > > Hong > > On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li >> wrote: > I am trying to understand your problem. You said you are solving Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with. > > The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for 3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. > > I don't understand why you get the following error when you use > ?-mat_superlu_dist_parsymbfact?. > > Invalid ISPEC at line 484 in file get_perm_c.c > > Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. > > ?Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only > ?-mat_superlu_dist_parsymbfact? > ? ? (the default is to use sequential symbolic factorization.) > > > Sherry > > On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell at tyrens.se> >> wrote: > Thank you for your reply. > > As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations. > > I am working in a Windows-environment and have installed PETSc through Cygwin. > Apparently, there is no support for Valgrind in this OS. > > If I have understood you correct, the memory issues are related to superLU and given my background, there is not much I can do. Is this correct? 
> > > Best regards, > Mahir > > ______________________________________________ > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se> > ______________________________________________ > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov>] > Sent: den 22 juli 2015 02:57 > To: ?lker-Kaustell, Mahir > Cc: Xiaoye S. Li; petsc-users > Subject: Re: [petsc-users] SuperLU MPI-problem > > > Run the program under valgrind http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332) > > Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this. > > Barry > > > > > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651) > ==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903) > ==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944) > ==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107) > ==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285) > ==42050== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96) > ==42050== > ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42049== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) 
> ==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138) > ==42048== by 0x10277656E: MPI_Isend (isend.c:125) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63) > ==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > 
==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42049== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28) > ==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23) > ==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x1020EB90C: gk_malloc (memory.c:147) > ==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24) > ==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519) > ==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225) > ==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151) > ==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34) > ==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42048== Syscall param write(buf) points to uninitialised byte(s) > ==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib) > ==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525) > ==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86) > ==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257) > ==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130) > ==42048== by 0x10277A1FA: MPI_Send (send.c:127) > ==42048== by 0x10155802F: get_perm_c_parmetis 
(get_perm_c_parmetis.c:299) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Address 0x104810704 is on thread 1's stack > ==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185) > ==42048== by 0x101501192: pdgssvx (pdgssvx.c:934) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends 
on uninitialised value(s) > ==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92) > ==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343) > ==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380) > ==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531) > ==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539) > ==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a stack allocation > ==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96) > ==42050== > ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s) > ==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib) > ==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32) > ==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610) > ==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84) > ==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556) > ==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138) > 
==42050== by 0x10277656E: MPI_Isend (isend.c:125) > ==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201) > ==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145) > ==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > 
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42048== Conditional jump or move depends on uninitialised value(s) > ==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== Conditional jump or move depends on uninitialised value(s) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== Uninitialised value was created by a heap allocation > ==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429) > ==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp 
(precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42048== by 0x100FF9036: PCSetUp (precon.c:982) > ==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42048== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42048== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== Uninitialised value was created by a heap allocation > ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42048== by 0x100001B3C: main (in ./ex19) > ==42048== > ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) > ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42049== by 0x100FF9036: PCSetUp (precon.c:982) > ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42049== by 0x100001B3C: main (in ./ex19) > ==42049== > ==42050== Conditional jump or move depends on uninitialised value(s) > ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) > ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== Uninitialised value was created by a heap allocation > ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) > ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) > ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) > ==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) > ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414) > ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) > ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) > ==42050== by 0x100FF9036: PCSetUp (precon.c:982) > ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) > ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) > ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) > ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) > ==42050== by 0x100001B3C: main (in ./ex19) > ==42050== > > > > On Jul 20, 2015, at 12:03 PM, Mahir.Ulker-Kaustell at tyrens.se> wrote: > > > > Ok. So I have been creating the full factorization on each process. That gives me some hope! > > > > I followed your suggestion and tried to use the runtime option ?-mat_superlu_dist_parsymbfact?. > > However, now the program crashes with: > > > > Invalid ISPEC at line 484 in file get_perm_c.c > > > > And so on? > > > > From the SuperLU manual; I should give the option either YES or NO, however -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above. 
> > Also I can?t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation > > > > Mahir > > > > Mahir ?lker-Kaustell, Kompetenssamordnare, Brokonstrukt?r, Tekn. Dr, Tyr?ns AB > > 010 452 30 82, Mahir.Ulker-Kaustell at tyrens.se> > > > > From: Xiaoye S. Li [mailto:xsli at lbl.gov>] > > Sent: den 20 juli 2015 18:12 > > To: ?lker-Kaustell, Mahir > > Cc: Hong; petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > The default SuperLU_DIST setting is to serial symbolic factorization. Therefore, what matters is how much memory do you have per MPI task? > > > > The code failed to malloc memory during redistribution of matrix A to {L\U} data struction (using result of serial symbolic factorization.) > > > > You can use parallel symbolic factorization, by runtime option: '-mat_superlu_dist_parsymbfact' > > > > Sherry Li > > > > > > On Mon, Jul 20, 2015 at 8:59 AM, Mahir.Ulker-Kaustell at tyrens.se> >> wrote: > > Hong: > > > > Previous experiences with this equation have shown that it is very difficult to solve it iteratively. Hence the use of a direct solver. > > > > The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse. > > The machine I am working on has 128GB ram. I have estimated the memory needed to less than 20GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here? > > > > Mahir > > > > > > > > From: Hong [mailto:hzhang at mcs.anl.gov>] > > Sent: den 20 juli 2015 17:39 > > To: ?lker-Kaustell, Mahir > > Cc: petsc-users > > Subject: Re: [petsc-users] SuperLU MPI-problem > > > > Mahir: > > Direct solvers consume large amount of memory. Suggest to try followings: > > > > 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix. > > > > 2. Incrementally increase your matrix sizes. Try different matrix orderings. > > Do you get memory crash in the 1st symbolic factorization? > > In your case, matrix data structure stays same when omega changes, so you only need to do one matrix symbolic factorization and reuse it. > > > > 3. Use a machine that gives larger memory. > > > > Hong > > > > Dear Petsc-Users, > > > > I am trying to use PETSc to solve a set of linear equations arising from Naviers equation (elastodynamics) in the frequency domain. > > The frequency dependency of the problem requires that the system > > > > [-omega^2M + K]u = F > > > > where M and K are constant, square, positive definite matrices (mass and stiffness respectively) is solved for each frequency omega of interest. > > K is a complex matrix, including material damping. > > > > I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem. > > > > The program crashes at KSPSetUp() and from what I can see in the error messages, it appears as if it consumes too much memory. > > > > I would guess that similar problems have occurred in this mail-list, so I am hoping that someone can push me in the right direction? > > > > Mahir > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
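A minimal sketch of the reuse pattern Hong suggests in the thread above: assemble A = K - omega^2 M into one matrix with a fixed nonzero pattern and let the LU preconditioner refactor it numerically for each frequency. It assumes M and K are already-assembled AIJ matrices with compatible nonzero patterns and that b, u, omegas[] and nfreq exist in the calling code; it is an illustration under those assumptions, not the poster's actual program.

-------------
#include <petscksp.h>

/* Sketch only: M, K are assembled AIJ matrices; A inherits K's nonzero pattern. */
Mat            A;
KSP            ksp;
PC             pc;
PetscInt       i;
PetscErrorCode ierr;

ierr = MatDuplicate(K, MAT_COPY_VALUES, &A);CHKERRQ(ierr);
ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);
ierr = PCFactorSetMatSolverPackage(pc, MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

for (i = 0; i < nfreq; ++i) {
  ierr = MatCopy(K, A, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  /* Use SUBSET_NONZERO_PATTERN instead if M's pattern is only a subset of K's. */
  ierr = MatAXPY(A, -omegas[i]*omegas[i], M, SAME_NONZERO_PATTERN);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr); /* same pattern, so the symbolic setup can be reused */
  ierr = KSPSolve(ksp, b, u);CHKERRQ(ierr);
}
ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
ierr = MatDestroy(&A);CHKERRQ(ierr);
-------------

Whether the symbolic factorization is actually reused can be checked with -ksp_view (SuperLU_DIST reports "Repeated factorization SamePattern_SameRowPerm") and -log_summary.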
URL: From torquil at gmail.com Wed Aug 12 09:29:20 2015 From: torquil at gmail.com (=?UTF-8?Q?Torquil_Macdonald_S=c3=b8rensen?=) Date: Wed, 12 Aug 2015 16:29:20 +0200 Subject: [petsc-users] Duplicate options Message-ID: <55CB5840.2050109@gmail.com> Hi! Is it intentional that Petsc prints duplicates of the matrix-related options in the following test program that creates two matrices?: ------------- Mat A; ierr = MatCreate(PETSC_COMM_SELF, &A); CHKERRQ(ierr); ierr = MatSetType(A, MATSEQAIJ); CHKERRQ(ierr); Mat B; ierr = MatCreate(PETSC_COMM_SELF, &B); CHKERRQ(ierr); ierr = MatSetType(B, MATSEQAIJ); CHKERRQ(ierr); -------------- The output when running with -help contains: Options for SEQAIJ matrix ------------------------------------------------- -mat_no_unroll: Do not optimize for inodes (slower) (None) -mat_no_inode: Do not optimize for inodes -slower- (None) -mat_inode_limit <5>: Do not use inodes larger then this value (None) Options for SEQAIJ matrix ------------------------------------------------- -mat_no_unroll: Do not optimize for inodes (slower) (None) -mat_no_inode: Do not optimize for inodes -slower- (None) -mat_inode_limit <5>: Do not use inodes larger then this value (None) The section "Options for SEQAIJ matrix" is repeated. The reason I ask is because I have another Petsc program that prints an enormous amount of duplicate lines when running with -help. I found this old thread from 2006 about the same problem: http://lists.mcs.anl.gov/pipermail/petsc-users/2006-October/000737.html Best regards Torquil S?rensen From jed at jedbrown.org Wed Aug 12 09:39:29 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Aug 2015 08:39:29 -0600 Subject: [petsc-users] Duplicate options In-Reply-To: <55CB5840.2050109@gmail.com> References: <55CB5840.2050109@gmail.com> Message-ID: <87h9o4sh9a.fsf@jedbrown.org> Torquil Macdonald S?rensen writes: > The section "Options for SEQAIJ matrix" is repeated. The reason I ask is > because I have another Petsc program that prints an enormous amount of > duplicate lines when running with -help. I found this old thread from > 2006 about the same problem: > > http://lists.mcs.anl.gov/pipermail/petsc-users/2006-October/000737.html Sadly, this is still a known problem and it got worse when we became more consistent about printing options for things like matrices and vectors (which rarely have prefixes and for which many options used to be hidden). Fixing it is somewhat at odds with our desire to remove global variables whenever possible, but I think it needs to be fixed. I tend to filter -help output with grep, FWIW. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From mfadams at lbl.gov Wed Aug 12 09:56:44 2015 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 12 Aug 2015 10:56:44 -0400 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: On Tue, Aug 11, 2015 at 3:25 PM, Matthew Knepley wrote: > On Tue, Aug 11, 2015 at 2:10 PM, Mani Chandra wrote: > >> Hi, >> >> Is there an example where Petsc's SNES has been used with Chombo, and >> perhaps with an automatic Jacobian assembly? I'd like to know if Petsc can >> pick out the number of colors of a Chombo data structure like it can do >> with a DMDA. >> > > The specific kinds of colorings for structured grids also assume a > colocated discretization which > I am not sure Chombo uses. However, the greedy colorings which only use > the matrix will work. 
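On the coloring point quoted just above: a minimal sketch of driving a finite-difference Jacobian from the nonzero pattern of an already preallocated MPIAIJ matrix J, using a greedy coloring computed from the matrix alone (no DMDA needed). The names snes, J, ctx and FormFunction are assumptions standing in for the application's own objects.

-------------
#include <petscsnes.h>

/* Sketch: snes, ctx, FormFunction and a preallocated MPIAIJ Jacobian J with the
   correct nonzero pattern are assumed to come from the application (Chombo here). */
ISColoring     iscoloring;
MatColoring    mc;
MatFDColoring  fdcoloring;
PetscErrorCode ierr;

ierr = MatColoringCreate(J, &mc);CHKERRQ(ierr);
ierr = MatColoringSetType(mc, MATCOLORINGGREEDY);CHKERRQ(ierr);  /* coloring from the matrix only */
ierr = MatColoringSetFromOptions(mc);CHKERRQ(ierr);
ierr = MatColoringApply(mc, &iscoloring);CHKERRQ(ierr);
ierr = MatColoringDestroy(&mc);CHKERRQ(ierr);

ierr = MatFDColoringCreate(J, iscoloring, &fdcoloring);CHKERRQ(ierr);
ierr = MatFDColoringSetFunction(fdcoloring, (PetscErrorCode (*)(void))FormFunction, ctx);CHKERRQ(ierr);
ierr = MatFDColoringSetFromOptions(fdcoloring);CHKERRQ(ierr);
ierr = MatFDColoringSetUp(J, iscoloring, fdcoloring);CHKERRQ(ierr);
ierr = ISColoringDestroy(&iscoloring);CHKERRQ(ierr);

/* SNES then fills J by finite differences using the coloring. */
ierr = SNESSetJacobian(snes, J, J, SNESComputeJacobianDefaultColor, fdcoloring);CHKERRQ(ierr);
-------------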
> Chombo (me) creates an MPIAIJ matrix. So automatic Jacobian assembly should work. I have put a SNES in a Chombo code, but did not use automatic Jacobian assembly. Mark > > Thanks, > > Matt > > >> Thanks, >> Mani >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Aug 12 10:10:46 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 12 Aug 2015 10:10:46 -0500 Subject: [petsc-users] Duplicate options In-Reply-To: <87h9o4sh9a.fsf@jedbrown.org> References: <55CB5840.2050109@gmail.com> <87h9o4sh9a.fsf@jedbrown.org> Message-ID: On Wed, Aug 12, 2015 at 9:39 AM, Jed Brown wrote: > Torquil Macdonald S?rensen writes: > > The section "Options for SEQAIJ matrix" is repeated. The reason I ask is > > because I have another Petsc program that prints an enormous amount of > > duplicate lines when running with -help. I found this old thread from > > 2006 about the same problem: > > > > http://lists.mcs.anl.gov/pipermail/petsc-users/2006-October/000737.html > > Sadly, this is still a known problem and it got worse when we became > more consistent about printing options for things like matrices and > vectors (which rarely have prefixes and for which many options used to > be hidden). Fixing it is somewhat at odds with our desire to remove > global variables whenever possible, but I think it needs to be fixed. I > tend to filter -help output with grep, FWIW. > What will we use to uniquely identify a block of options? I hate the idea of a random string. Its too easy to mess up. Should we use a class+type_name? Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 12 10:17:56 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Aug 2015 09:17:56 -0600 Subject: [petsc-users] Duplicate options In-Reply-To: References: <55CB5840.2050109@gmail.com> <87h9o4sh9a.fsf@jedbrown.org> Message-ID: <878u9gsfh7.fsf@jedbrown.org> Matthew Knepley writes: > What will we use to uniquely identify a block of options? I hate the idea > of a random string. > Its too easy to mess up. Should we use a class+type_name? Prefix is relevant. We could hash the contents to make the comparison fixed-length. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From knepley at gmail.com Wed Aug 12 10:19:33 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 12 Aug 2015 10:19:33 -0500 Subject: [petsc-users] Duplicate options In-Reply-To: <878u9gsfh7.fsf@jedbrown.org> References: <55CB5840.2050109@gmail.com> <87h9o4sh9a.fsf@jedbrown.org> <878u9gsfh7.fsf@jedbrown.org> Message-ID: On Wed, Aug 12, 2015 at 10:17 AM, Jed Brown wrote: > Matthew Knepley writes: > > What will we use to uniquely identify a block of options? I hate the idea > > of a random string. > > Its too easy to mess up. Should we use a class+type_name? > > Prefix is relevant. We could hash the contents to make the comparison > fixed-length. > So, SHA1(class, type_name, prefix)? I could live with that. 
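A minimal standalone sketch of the de-duplication idea being discussed here: hash the (class, type_name, prefix) triple that identifies a block of -help output and skip blocks already seen. It uses a simple FNV-1a hash and a flat array purely to keep the example self-contained in place of SHA1 and khash; it is an illustration of the idea, not PETSc source code.

-------------
#include <stdio.h>
#include <stdint.h>

/* Illustration only: hash the (class, type_name, prefix) triple. */
static uint64_t HelpBlockHash(const char *cls, const char *type, const char *prefix)
{
  const char *parts[3] = {cls, type, prefix};
  uint64_t    h = 0xcbf29ce484222325ULL;            /* FNV-1a offset basis */
  for (int i = 0; i < 3; ++i) {
    for (const char *p = parts[i] ? parts[i] : ""; *p; ++p) {
      h ^= (unsigned char)*p;
      h *= 0x100000001b3ULL;                        /* FNV-1a prime */
    }
    h ^= 0xffu;                                     /* field separator */
    h *= 0x100000001b3ULL;
  }
  return h;
}

/* Flat array standing in for the khash table mentioned above. */
static uint64_t seen[1024];
static size_t   nseen;

static int AlreadyPrinted(uint64_t h)
{
  for (size_t i = 0; i < nseen; ++i) if (seen[i] == h) return 1;
  if (nseen < sizeof(seen) / sizeof(seen[0])) seen[nseen++] = h;
  return 0;
}

int main(void)
{
  const char *blocks[3][3] = {{"Mat", "seqaij", ""}, {"Mat", "seqaij", ""}, {"Vec", "seq", ""}};
  for (int i = 0; i < 3; ++i) {
    uint64_t h = HelpBlockHash(blocks[i][0], blocks[i][1], blocks[i][2]);
    if (AlreadyPrinted(h)) continue;                /* duplicate block: print once */
    printf("Options for %s %s -------------------------------------------------\n",
           blocks[i][1], blocks[i][0]);
  }
  return 0;
}
-------------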
Then we maintain a khash table of those we have seen while printing. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 12 10:27:48 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Aug 2015 09:27:48 -0600 Subject: [petsc-users] Duplicate options In-Reply-To: References: <55CB5840.2050109@gmail.com> <87h9o4sh9a.fsf@jedbrown.org> <878u9gsfh7.fsf@jedbrown.org> Message-ID: <87614ksf0r.fsf@jedbrown.org> Matthew Knepley writes: > So, SHA1(class, type_name, prefix)? I could live with that. Then we > maintain a khash table of those we have seen while printing. Yeah, ultimately with a reader lock for thread safety. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From ustc.liu at gmail.com Wed Aug 12 10:35:45 2015 From: ustc.liu at gmail.com (sheng liu) Date: Wed, 12 Aug 2015 23:35:45 +0800 Subject: [petsc-users] Need to update matrix in every loop In-Reply-To: References: Message-ID: Thank you very much! I have another question. If I need all the eigenvalues of the sparse matrix, which solver should I use? Thanks! 2015-08-09 1:52 GMT+08:00 Barry Smith : > > > On Aug 8, 2015, at 7:52 AM, sheng liu wrote: > > > > Hello: > > I have a large sparse symmetric matrix ( about 1000000x1000000), and > I need about 10 eigenvalues near 0. The problem is: I need to run the same > program about 1000 times, each time I need to change the diagonal matrix > elements ( and they are generated randomly). Is there a fast way to > implement this problem? Thank you! > > Does each run depend on the previous one or are they all independent? > > If they are independent I would introduce two levels of parallelism: On > the outer level have different MPI communicators compute different random > diagonal perturbations and on the inner level use a small amount of > parallelism for each eigenvalue solve. The outer level of parallelism is > embarrassingly parallel. > > Of course, for runs of the eigensolve use -log_summary to make sure it > is running efficiently and tune the amount of parallelism in the eigensolve > for best performance. > > Barry > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 12 10:46:21 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Aug 2015 09:46:21 -0600 Subject: [petsc-users] Need to update matrix in every loop In-Reply-To: References: Message-ID: <87y4hgqzle.fsf@jedbrown.org> sheng liu writes: > Thank you very much! I have another question. If I need all the eigenvalues > of the sparse matrix, which solver should I use? Thanks! That's O(n^3) with n=1e6. Better find a way to not need all the eigenvalues or to make the system smaller. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From hzhang at mcs.anl.gov Wed Aug 12 10:58:50 2015 From: hzhang at mcs.anl.gov (Hong) Date: Wed, 12 Aug 2015 10:58:50 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> <55CA3A16.90206@email.arizona.edu> Message-ID: Anthony, I just patched petsc-maint branch. Your matrix Amat_binary.m has empty diagonal blocks. Most petsc solvers require matrix diagonal entries to be allocated as 'non-zero', i.e., insert zero values to these zero entries. I would suggest you add zeros to Amat_binary.m during its buildup. This would enable petsc solvers, as well as other packages. Again, thanks for bug reporting. Hong On Tue, Aug 11, 2015 at 1:33 PM, Satish Balay wrote: > yes - the patch will be in petsc 3.6.2. > > However - you can grab the patch right now - and start using it > > If using a 3.6.1 tarball - you can do download the (raw) patch from > the url below and apply with: > > cd petsc-3.6.1 > patch -Np1 < patchfile > > If using a git clone - you can do: > > git fetch > git checkout ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > > Satish > > On Tue, 11 Aug 2015, Anthony Haas wrote: > > > Hi Hong, > > > > Sorry for my late reply and thanks for the fix. Does that mean that I > will be > > able to run that matrix on 10 procs in the future (petsc 3.6.2?)? > > > > Thanks > > > > Anthony > > > > > > On 08/11/2015 09:58 AM, Hong wrote: > > > Anthony, > > > I pushed a fix > > > > https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > > > > > > Once it passes our nightly tests, I'll merge it to petsc-maint, then > > > petsc-dev. > > > Thanks for reporting it! > > > > > > Hong > > > > > > On Mon, Aug 10, 2015 at 4:27 PM, Barry Smith > > > wrote: > > > > > > > > > Anthony, > > > > > > This crash is in PETSc code before it calls the SuperLU_DIST > > > numeric factorization; likely we have a mistake such as assuming a > > > process has at least one row of the matrix and need to fix it. > > > > > > Barry > > > > > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST > (F=0x1922b50, > > > > A=0x14a6a70, info=0x19099f8) > > > > at > > > > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > > > 368 colA_start = rstart + ajj[0]; /* the smallest > > > global col index of A */ > > > > > > > > > > > > > On Aug 10, 2015, at 3:50 PM, Anthony Haas > > > wrote: > > > > > > > > Hi Sherry, > > > > > > > > I recently submitted a matrix for which I noticed that > > > Superlu_dist was hanging when running on 4 processors with > > > parallel symbolic factorization. I have been using the latest > > > version of Superlu_dist and the code is not hanging anymore. > > > However, I noticed that when running the same matrix (I have > > > attached the matrix), the code crashes with the following SIGSEGV > > > when running on 10 procs (with or without parallel symbolic > > > factorization). It is probably overkill to run such a 'small' > > > matrix on 10 procs but I thought that it might still be useful to > > > report the problem?? See below for the error obtained when running > > > with gdb and also a code snippet to reproduce the error. > > > > > > > > Thanks, > > > > > > > > > > > > Anthony > > > > > > > > > > > > > > > > 1) ERROR in GDB > > > > > > > > Program received signal SIGSEGV, Segmentation fault. 
> > > > 0x00007fe6ba609297 in MatLUFactorNumeric_SuperLU_DIST > (F=0x1922b50, > > > > A=0x14a6a70, info=0x19099f8) > > > > at > > > > /home/anthony/LIB/petsc-3.6.1/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c:368 > > > > 368 colA_start = rstart + ajj[0]; /* the smallest > > > global col index of A */ > > > > (gdb) > > > > > > > > > > > > > > > > 2) PORTION OF CODE TO REPRODUCE ERROR > > > > > > > > Subroutine HowBigLUCanBe(rank) > > > > > > > > IMPLICIT NONE > > > > > > > > integer(i4b),intent(in) :: rank > > > > integer(i4b) :: i,ct > > > > real(dp) :: begin,endd > > > > complex(dpc) :: sigma > > > > > > > > PetscErrorCode ierr > > > > > > > > > > > > if (rank==0) call cpu_time(begin) > > > > > > > > if (rank==0) then > > > > write(*,*) > > > > write(*,*)'Testing How Big LU Can Be...' > > > > write(*,*)'============================' > > > > write(*,*) > > > > endif > > > > > > > > !sigma = (1.0d0,0.0d0) > > > > !call MatAXPY(A,-sigma,B,DIFFERENT_NONZERO_PATTERN,ierr) ! > > > on exit A = A-sigma*B > > > > > > > > !call MatView(A,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > > > !.....Write Matrix to ASCII and Binary Format > > > > !call > > > PetscViewerASCIIOpen(PETSC_COMM_WORLD,"Amat.m",viewer,ierr) > > > > !call MatView(DXX,viewer,ierr) > > > > !call PetscViewerDestroy(viewer,ierr) > > > > > > > > !call > > > > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_WRITE,viewer,ierr) > > > > !call MatView(A,viewer,ierr) > > > > !call PetscViewerDestroy(viewer,ierr) > > > > > > > > !...Load a Matrix in Binary Format > > > > call > > > > PetscViewerBinaryOpen(PETSC_COMM_WORLD,"Amat_binary.m",FILE_MODE_READ,viewer,ierr) > > > > call MatCreate(PETSC_COMM_WORLD,DLOAD,ierr) > > > > call MatSetType(DLOAD,MATAIJ,ierr) > > > > call MatLoad(DLOAD,viewer,ierr) > > > > call PetscViewerDestroy(viewer,ierr) > > > > > > > > !call MatView(DLOAD,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > > > > > > > !.....Create Linear Solver Context > > > > call KSPCreate(PETSC_COMM_WORLD,ksp,ierr) > > > > > > > > !.....Set operators. Here the matrix that defines the linear > > > system also serves as the preconditioning matrix. > > > > !call > > > KSPSetOperators(ksp,A,A,DIFFERENT_NONZERO_PATTERN,ierr) !aha > > > commented and replaced by next line > > > > > > > > !call KSPSetOperators(ksp,A,A,ierr) ! remember: here A = > > > A-sigma*B > > > > call KSPSetOperators(ksp,DLOAD,DLOAD,ierr) ! remember: here > > > A = A-sigma*B > > > > > > > > !.....Set Relative and Absolute Tolerances and Uses Default for > > > Divergence Tol > > > > tol = 1.e-10 > > > > call > > > > KSPSetTolerances(ksp,tol,tol,PETSC_DEFAULT_REAL,PETSC_DEFAULT_INTEGER,ierr) > > > > > > > > !.....Set the Direct (LU) Solver > > > > call KSPSetType(ksp,KSPPREONLY,ierr) > > > > call KSPGetPC(ksp,pc,ierr) > > > > call PCSetType(pc,PCLU,ierr) > > > > call > > > PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST,ierr) ! 
> > > MATSOLVERSUPERLU_DIST MATSOLVERMUMPS > > > > > > > > !.....Create Right-Hand-Side Vector > > > > !call MatCreateVecs(A,frhs,PETSC_NULL_OBJECT,ierr) > > > > !call MatCreateVecs(A,sol,PETSC_NULL_OBJECT,ierr) > > > > > > > > call MatCreateVecs(DLOAD,frhs,PETSC_NULL_OBJECT,ierr) > > > > call MatCreateVecs(DLOAD,sol,PETSC_NULL_OBJECT,ierr) > > > > > > > > call > > > MatGetOwnershipRange(DLOAD,IstartA,IendA,ierr)!;CHKERRQ(ierr) > > > > > > > > allocate(xwork1(IendA-IstartA)) > > > > allocate(loc(IendA-IstartA)) > > > > > > > > ct=0 > > > > do i=IstartA,IendA-1 > > > > ct=ct+1 > > > > loc(ct)=i > > > > xwork1(ct)=(1.0d0,0.0d0) > > > > enddo > > > > > > > > call > > > VecSetValues(frhs,IendA-IstartA,loc,xwork1,INSERT_VALUES,ierr) > > > > call VecZeroEntries(sol,ierr) > > > > > > > > deallocate(xwork1,loc) > > > > > > > > !.....Assemble Vectors > > > > call VecAssemblyBegin(frhs,ierr) > > > > call VecAssemblyEnd(frhs,ierr) > > > > > > > > !.....Solve the Linear System > > > > call KSPSolve(ksp,frhs,sol,ierr) > > > > > > > > !call VecView(sol,PETSC_VIEWER_STDOUT_WORLD,ierr) > > > > > > > > if (rank==0) then > > > > call cpu_time(endd) > > > > write(*,*) > > > > print '("Total time for HowBigLUCanBe = ",f21.3," > > > seconds.")',endd-begin > > > > endif > > > > > > > > call SlepcFinalize(ierr) > > > > > > > > STOP > > > > > > > > > > > > end Subroutine HowBigLUCanBe > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mc0710 at gmail.com Wed Aug 12 13:29:32 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Wed, 12 Aug 2015 13:29:32 -0500 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: Hi, > Chombo (me) creates an MPIAIJ matrix. So automatic Jacobian assembly > should work. > > I have put a SNES in a Chombo code, but did not use automatic Jacobian > assembly. > Do you have an example? Thanks, Mani -------------- next part -------------- An HTML attachment was scrubbed... URL: From aph at email.arizona.edu Wed Aug 12 14:01:32 2015 From: aph at email.arizona.edu (Anthony Haas) Date: Wed, 12 Aug 2015 12:01:32 -0700 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> <55CA3A16.90206@email.arizona.edu> Message-ID: <55CB980C.8040507@email.arizona.edu> Hi Hong, I have attached a schematic of my matrices. I solve a generalized EVP in shift-and-invert mode. As you will see, I have indeed a zero diagonal block in the matrices A and B (blocks 4,4). I guess I could just add the zero entries to the diagonal elements of A? Is that strictly necessary when using a direct (LU) solver? Can you please give me a short explanation of why empty diagonal blocks can be problematic? Is the patch still available at: https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf ? Thanks again, Anthony On 08/12/2015 08:58 AM, Hong wrote: > Anthony, > I just patched petsc-maint branch. > > Your matrix Amat_binary.m has empty diagonal blocks. Most petsc > solvers require matrix diagonal entries to be allocated as 'non-zero', > i.e., insert zero values to these zero entries. I would suggest you > add zeros to Amat_binary.m during its buildup. This would enable > petsc solvers, as well as other packages. > > Again, thanks for bug reporting. > > Hong > -------------- next part -------------- A non-text attachment was scrubbed... 
Name: BiGlobal-temporal-A-and-B.pdf Type: application/pdf Size: 316365 bytes Desc: not available URL: From knepley at gmail.com Wed Aug 12 14:17:08 2015 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 12 Aug 2015 14:17:08 -0500 Subject: [petsc-users] SIGSEGV in Superlu_dist In-Reply-To: <55CB980C.8040507@email.arizona.edu> References: <55C90EAA.5060702@email.arizona.edu> <8B989122-F09F-4AA2-8676-D82CA2E58B0C@mcs.anl.gov> <55CA3A16.90206@email.arizona.edu> <55CB980C.8040507@email.arizona.edu> Message-ID: On Wed, Aug 12, 2015 at 2:01 PM, Anthony Haas wrote: > Hi Hong, > > I have attached a schematic of my matrices. I solve a generalized EVP in > shift-and-invert mode. As you will see, I have indeed a zero diagonal block > in the matrices A and B (blocks 4,4). I guess I could just add the zero > entries to the diagonal elements of A? Is that strictly necessary when > using a direct (LU) solver? Can you please give me a short explanation of > why empty diagonal blocks can be problematic? > Yes, the sparse numbering schemes use the location of the diagonal for faster indexing in many places. Matt > Is the patch still available at: > https://bitbucket.org/petsc/petsc/commits/ceeba3afeff0c18262ed13ef92e2508ca68b0ecf > ? > > Thanks again, > > Anthony > > > > > > > On 08/12/2015 08:58 AM, Hong wrote: > >> Anthony, >> I just patched petsc-maint branch. >> >> Your matrix Amat_binary.m has empty diagonal blocks. Most petsc solvers >> require matrix diagonal entries to be allocated as 'non-zero', i.e., insert >> zero values to these zero entries. I would suggest you add zeros to >> Amat_binary.m during its buildup. This would enable petsc solvers, as well >> as other packages. >> >> Again, thanks for bug reporting. >> >> Hong >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Aug 12 14:22:34 2015 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 12 Aug 2015 14:22:34 -0500 Subject: [petsc-users] petsc_gen_xdmf.py errors In-Reply-To: References: Message-ID: Fande, Your sol.h5 is old and outdated. If you generate a more recent sol.h5 with the following SNES example inside your PETSC_DIR: ./src/snes/examples/tutorial/ex12 -run_type test -refinement_limit 0.0 -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 -vec_view hdf5:sol.h5::append ./bin/petsc_gen_xdmf.py sol.h5 the resulting sol.xmf is compatible with Paraview Thanks, Justin On Tue, Aug 11, 2015 at 8:53 PM, Fande Kong wrote: > Hi all, > > I tried to use petsc_gen_xdmf.py to generate a xml file for visulaztion > using paraview. 
I got the following errors: > > ./petsc_gen_xdmf.py sol.h5 > Traceback (most recent call last): > File "./petsc_gen_xdmf.py", line 236, in > generateXdmf(sys.argv[1]) > File "./petsc_gen_xdmf.py", line 231, in generateXdmf > Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, > cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) > File "./petsc_gen_xdmf.py", line 190, in write > for vf in vfields: self.writeField(fp, len(time), t, cellDim, > spaceDim, '/vertex_fields/'+vf[0], vf, 'Node') > File "./petsc_gen_xdmf.py", line 164, in writeField > self.writeFieldComponents(fp, numSteps, timestep, spaceDim, name, f, > domain) > File "./petsc_gen_xdmf.py", line 120, in writeFieldComponents > dims = '1 %d 1' % (numSteps, dof, bs) > TypeError: not all arguments converted during string formatting > > > The hdf5 file is attached. Originally from Matthew. Configuration and make > log files are also attached. > > Fande Kong, > > Thanks, > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Wed Aug 12 14:39:35 2015 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 12 Aug 2015 14:39:35 -0500 Subject: [petsc-users] petsc_gen_xdmf.py errors In-Reply-To: References: Message-ID: Sorry there was a typo. Here's the correct run: ./src/snes/examples/tutorials/ex12 -run_type test -refinement_limit 0.0 -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 -vec_view hdf5:sol.h5::append On Wed, Aug 12, 2015 at 2:22 PM, Justin Chang wrote: > Fande, > > Your sol.h5 is old and outdated. If you generate a more recent sol.h5 with > the following SNES example inside your PETSC_DIR: > > ./src/snes/examples/tutorial/ex12 -run_type test -refinement_limit 0.0 > -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 > -vec_view hdf5:sol.h5::append > ./bin/petsc_gen_xdmf.py sol.h5 > > the resulting sol.xmf is compatible with Paraview > > Thanks, > Justin > > > On Tue, Aug 11, 2015 at 8:53 PM, Fande Kong wrote: > >> Hi all, >> >> I tried to use petsc_gen_xdmf.py to generate a xml file for visulaztion >> using paraview. I got the following errors: >> >> ./petsc_gen_xdmf.py sol.h5 >> Traceback (most recent call last): >> File "./petsc_gen_xdmf.py", line 236, in >> generateXdmf(sys.argv[1]) >> File "./petsc_gen_xdmf.py", line 231, in generateXdmf >> Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, numCorners, >> cellDim, geomPath, numVertices, spaceDim, time, vfields, cfields) >> File "./petsc_gen_xdmf.py", line 190, in write >> for vf in vfields: self.writeField(fp, len(time), t, cellDim, >> spaceDim, '/vertex_fields/'+vf[0], vf, 'Node') >> File "./petsc_gen_xdmf.py", line 164, in writeField >> self.writeFieldComponents(fp, numSteps, timestep, spaceDim, name, f, >> domain) >> File "./petsc_gen_xdmf.py", line 120, in writeFieldComponents >> dims = '1 %d 1' % (numSteps, dof, bs) >> TypeError: not all arguments converted during string formatting >> >> >> The hdf5 file is attached. Originally from Matthew. Configuration and >> make log files are also attached. >> >> Fande Kong, >> >> Thanks, >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Wed Aug 12 17:56:59 2015 From: mfadams at lbl.gov (Mark Adams) Date: Wed, 12 Aug 2015 18:56:59 -0400 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: On Wed, Aug 12, 2015 at 2:29 PM, Mani Chandra wrote: > Hi, > > >> Chombo (me) creates an MPIAIJ matrix. 
So automatic Jacobian assembly >> should work. >> >> I have put a SNES in a Chombo code, but did not use automatic Jacobian >> assembly. >> > > Do you have an example? > If you want to make a SNES solver then you need an "apply" call back function and a way to map Chombo vectors with PETSc vectors. Chombo has a level solver (classes derived from PetscSolver) and an AMR composite matrix constructor class (classes derived from PetscCompGrid) in lib/src/AMRElliptic. These two class each create these maps, providing methods to "putChomboInPetsc", and so forth. lib/src/AMRElliptic/PetscSolverI.H has an apply_mfree() method that is a callback function that you give to PETSc to apply an operator. There are examples in Chombo on how to use/construct these two classes, or two installations of them. Each of these classes has a Poisson and a 2D Viscous Tensor instantiation. You probably want to look at PETSc SNES examples if you are not familiar with SNES to get an idea of what you need to provide. Then, look at the appropriate Chombo class as a place start. I am guessing that you will want to write your own solver and just use these classes to get these mapping methods. Wrapping a Chombo operator (apply) and solver in a SNES is not hard and PetscSolverI.H has examples. These codes only have one user each (and they are both ANAG staff members), so they are pretty immature codes. Mark > Thanks, > Mani > -------------- next part -------------- An HTML attachment was scrubbed... URL: From fdkong.jd at gmail.com Wed Aug 12 19:09:07 2015 From: fdkong.jd at gmail.com (Fande Kong) Date: Wed, 12 Aug 2015 19:09:07 -0500 Subject: [petsc-users] petsc_gen_xdmf.py errors In-Reply-To: References: Message-ID: Thanks, Justin, I can get a correct xml file now. Thanks, Fande Kong, On Wed, Aug 12, 2015 at 2:39 PM, Justin Chang wrote: > Sorry there was a typo. Here's the correct run: > > ./src/snes/examples/tutorials/ex12 -run_type test -refinement_limit 0.0 > -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 > -vec_view hdf5:sol.h5::append > > On Wed, Aug 12, 2015 at 2:22 PM, Justin Chang wrote: > >> Fande, >> >> Your sol.h5 is old and outdated. If you generate a more recent sol.h5 >> with the following SNES example inside your PETSC_DIR: >> >> ./src/snes/examples/tutorial/ex12 -run_type test -refinement_limit 0.0 >> -bc_type dirichlet -interpolate 1 -petscspace_order 1 -dm_view hdf5:sol.h5 >> -vec_view hdf5:sol.h5::append >> ./bin/petsc_gen_xdmf.py sol.h5 >> >> the resulting sol.xmf is compatible with Paraview >> >> Thanks, >> Justin >> >> >> On Tue, Aug 11, 2015 at 8:53 PM, Fande Kong wrote: >> >>> Hi all, >>> >>> I tried to use petsc_gen_xdmf.py to generate a xml file for visulaztion >>> using paraview. 
I got the following errors: >>> >>> ./petsc_gen_xdmf.py sol.h5 >>> Traceback (most recent call last): >>> File "./petsc_gen_xdmf.py", line 236, in >>> generateXdmf(sys.argv[1]) >>> File "./petsc_gen_xdmf.py", line 231, in generateXdmf >>> Xdmf(xdmfFilename).write(hdfFilename, topoPath, numCells, >>> numCorners, cellDim, geomPath, numVertices, spaceDim, time, vfields, >>> cfields) >>> File "./petsc_gen_xdmf.py", line 190, in write >>> for vf in vfields: self.writeField(fp, len(time), t, cellDim, >>> spaceDim, '/vertex_fields/'+vf[0], vf, 'Node') >>> File "./petsc_gen_xdmf.py", line 164, in writeField >>> self.writeFieldComponents(fp, numSteps, timestep, spaceDim, name, f, >>> domain) >>> File "./petsc_gen_xdmf.py", line 120, in writeFieldComponents >>> dims = '1 %d 1' % (numSteps, dof, bs) >>> TypeError: not all arguments converted during string formatting >>> >>> >>> The hdf5 file is attached. Originally from Matthew. Configuration and >>> make log files are also attached. >>> >>> Fande Kong, >>> >>> Thanks, >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mc0710 at gmail.com Wed Aug 12 21:52:56 2015 From: mc0710 at gmail.com (Mani Chandra) Date: Wed, 12 Aug 2015 21:52:56 -0500 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: Thanks for the information. I'm familiar with SNES+DMDA, just not sure it would work with Chombo. But I'll give it a shot. Cheers, Mani On Wed, Aug 12, 2015 at 5:56 PM, Mark Adams wrote: > > > On Wed, Aug 12, 2015 at 2:29 PM, Mani Chandra wrote: > >> Hi, >> >> >>> Chombo (me) creates an MPIAIJ matrix. So automatic Jacobian assembly >>> should work. >>> >>> I have put a SNES in a Chombo code, but did not use automatic Jacobian >>> assembly. >>> >> >> Do you have an example? >> > > If you want to make a SNES solver then you need an "apply" call back > function and a way to map Chombo vectors with PETSc vectors. > > Chombo has a level solver (classes derived from PetscSolver) and an AMR > composite matrix constructor class (classes derived from PetscCompGrid) in > lib/src/AMRElliptic. These two class each create these maps, providing > methods to "putChomboInPetsc", and so forth. > lib/src/AMRElliptic/PetscSolverI.H has an apply_mfree() method that is a > callback function that you give to PETSc to apply an operator. There are > examples in Chombo on how to use/construct these two classes, or two > installations of them. Each of these classes has a Poisson and a 2D > Viscous Tensor instantiation. > > You probably want to look at PETSc SNES examples if you are not familiar > with SNES to get an idea of what you need to provide. Then, look at the > appropriate Chombo class as a place start. I am guessing that you will > want to write your own solver and just use these classes to get these > mapping methods. Wrapping a Chombo operator (apply) and solver in a SNES > is not hard and PetscSolverI.H has examples. > > These codes only have one user each (and they are both ANAG staff > members), so they are pretty immature codes. > > Mark > > >> Thanks, >> Mani >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Aug 13 08:36:41 2015 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 13 Aug 2015 09:36:41 -0400 Subject: [petsc-users] Petsc+Chombo example In-Reply-To: References: Message-ID: On Wed, Aug 12, 2015 at 10:52 PM, Mani Chandra wrote: > Thanks for the information. I'm familiar with SNES+DMDA, > DMDA will not work with Chombo. 
DMDA only works with uniform grids. My two (base) classes do the transformations and linearizations for a 1) level solve and 2) full AMR solve, to a AIJ matrix. I don't think you want to look at DMs. Mark > just not sure it would work with Chombo. But I'll give it a shot. > > Cheers, > Mani > > On Wed, Aug 12, 2015 at 5:56 PM, Mark Adams wrote: > >> >> >> On Wed, Aug 12, 2015 at 2:29 PM, Mani Chandra wrote: >> >>> Hi, >>> >>> >>>> Chombo (me) creates an MPIAIJ matrix. So automatic Jacobian assembly >>>> should work. >>>> >>>> I have put a SNES in a Chombo code, but did not use automatic Jacobian >>>> assembly. >>>> >>> >>> Do you have an example? >>> >> >> If you want to make a SNES solver then you need an "apply" call back >> function and a way to map Chombo vectors with PETSc vectors. >> >> Chombo has a level solver (classes derived from PetscSolver) and an AMR >> composite matrix constructor class (classes derived from PetscCompGrid) in >> lib/src/AMRElliptic. These two class each create these maps, providing >> methods to "putChomboInPetsc", and so forth. >> lib/src/AMRElliptic/PetscSolverI.H has an apply_mfree() method that is a >> callback function that you give to PETSc to apply an operator. There are >> examples in Chombo on how to use/construct these two classes, or two >> installations of them. Each of these classes has a Poisson and a 2D >> Viscous Tensor instantiation. >> >> You probably want to look at PETSc SNES examples if you are not familiar >> with SNES to get an idea of what you need to provide. Then, look at the >> appropriate Chombo class as a place start. I am guessing that you will >> want to write your own solver and just use these classes to get these >> mapping methods. Wrapping a Chombo operator (apply) and solver in a SNES >> is not hard and PetscSolverI.H has examples. >> >> These codes only have one user each (and they are both ANAG staff >> members), so they are pretty immature codes. >> >> Mark >> >> >>> Thanks, >>> Mani >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Thu Aug 13 10:34:44 2015 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 13 Aug 2015 10:34:44 -0500 Subject: [petsc-users] Understanding the memory bandwidth Message-ID: Hi all, According to our University's HPC cluster (Intel Xeon E5-2680v2 ), the online specifications says I should have a maximum BW of 59.7 GB/s. I am guessing this number is computed by 1866 MHz * 8 Bytes * 4 memory channels. Now, when I run the STREAMS Triad benchmark on a single compute node (two sockets, 10 cores each, 64 GB total memory), on up to 20 processes with MPICH, i get the following: $ mpiexec -n 1 ./MPIVersion: Triad: 13448.6701 Rate (MB/s) $ mpiexec -n 2 ./MPIVersion: Triad: 24409.1406 Rate (MB/s) $ mpiexec -n 4 ./MPIVersion Triad: 31914.8087 Rate (MB/s) $ mpiexec -n 6 ./MPIVersion Triad: 33290.2676 Rate (MB/s) $ mpiexec -n 8 ./MPIVersion Triad: 33618.2542 Rate (MB/s) $ mpiexec -n 10 ./MPIVersion Triad: 33730.1662 Rate (MB/s) $ mpiexec -n 12 ./MPIVersion Triad: 40835.9440 Rate (MB/s) $ mpiexec -n 14 ./MPIVersion Triad: 44396.0042 Rate (MB/s) $ mpiexec -n 16 ./MPIVersion Triad: 54647.5214 Rate (MB/s) * $ mpiexec -n 18 ./MPIVersion Triad: 57530.8125 Rate (MB/s) * $ mpiexec -n 20 ./MPIVersion Triad: 42388.0739 Rate (MB/s) * The * numbers fluctuate greatly each time I run this. 
However, if I use hydra's processor binding options: $ mpiexec.hydra -n 2 -bind-to socket ./MPIVersion Triad: 26879.3853 Rate (MB/s) $ mpiexec.hydra -n 4 -bind-to socket ./MPIVersion Triad: 48363.8441 Rate (MB/s) $ mpiexec.hydra -n 8 -bind-to socket ./MPIVersion Triad: 63479.9284 Rate (MB/s) $ mpiexec.hydra -n 10 -bind-to socket ./MPIVersion Triad: 66160.5627 Rate (MB/s) $ mpiexec.hydra -n 16 -bind-to socket ./MPIVersion Triad: 65975.5959 Rate (MB/s) $ mpiexec.hydra -n 20 -bind-to socket ./MPIVersion Triad: 64738.9336 Rate (MB/s) I get similar metrics when i use the binding options "-bind-to hwthread -map-by socket". Now my question is, is 13.5 GB/s on one processor "good"? Because when I compare this to the 59.7 GB/s it seems really inefficient. Is there a way to browse through my system files to confirm this? Also, when I use multiple cores and with proper binding, the streams BW exceeds the reported max BW. Is this expected? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From thronesf at gmail.com Thu Aug 13 10:52:39 2015 From: thronesf at gmail.com (Sharp Stone) Date: Thu, 13 Aug 2015 11:52:39 -0400 Subject: [petsc-users] Multigrid and AMR Message-ID: Hi All, I'm a new who are dealing with linear systems, and want to use multigrid, especially AMR. I found some examples regarding multigrids, but does petsc have any examples of AMR? Thank you in advance! -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 13 11:06:00 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Aug 2015 11:06:00 -0500 Subject: [petsc-users] Multigrid and AMR In-Reply-To: References: Message-ID: On Thu, Aug 13, 2015 at 10:52 AM, Sharp Stone wrote: > Hi All, > > I'm a new who are dealing with linear systems, and want to use multigrid, > especially AMR. I found some examples regarding multigrids, but does petsc > have any examples of AMR? > Do you mean Algebraic Multigrid (AMG)? Matt > Thank you in advance! > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 13 12:51:15 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 13 Aug 2015 12:51:15 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: Message-ID: <0A4EBD35-AA57-4E25-B874-639DA6199218@mcs.anl.gov> > On Aug 13, 2015, at 10:34 AM, Justin Chang wrote: > > Hi all, > > According to our University's HPC cluster (Intel Xeon E5-2680v2), the online specifications says I should have a maximum BW of 59.7 GB/s. I am guessing this number is computed by 1866 MHz * 8 Bytes * 4 memory channels. 
> > Now, when I run the STREAMS Triad benchmark on a single compute node (two sockets, 10 cores each, 64 GB total memory), on up to 20 processes with MPICH, i get the following: > > $ mpiexec -n 1 ./MPIVersion: > Triad: 13448.6701 Rate (MB/s) > > $ mpiexec -n 2 ./MPIVersion: > Triad: 24409.1406 Rate (MB/s) > > $ mpiexec -n 4 ./MPIVersion > Triad: 31914.8087 Rate (MB/s) > $ mpiexec -n 6 ./MPIVersion > Triad: 33290.2676 Rate (MB/s) > > > $ mpiexec -n 8 ./MPIVersion > Triad: 33618.2542 Rate (MB/s) > > $ mpiexec -n 10 ./MPIVersion > Triad: 33730.1662 Rate (MB/s) > > > $ mpiexec -n 12 ./MPIVersion > Triad: 40835.9440 Rate (MB/s) > > > $ mpiexec -n 14 ./MPIVersion > Triad: 44396.0042 Rate (MB/s) > > $ mpiexec -n 16 ./MPIVersion > Triad: 54647.5214 Rate (MB/s) * > > $ mpiexec -n 18 ./MPIVersion > Triad: 57530.8125 Rate (MB/s) * > > $ mpiexec -n 20 ./MPIVersion > Triad: 42388.0739 Rate (MB/s) * > > The * numbers fluctuate greatly each time I run this. Yeah, MPICH's default behavior is super annoying. I think they need better defaults. > However, if I use hydra's processor binding options: > > $ mpiexec.hydra -n 2 -bind-to socket ./MPIVersion > Triad: 26879.3853 Rate (MB/s) > > $ mpiexec.hydra -n 4 -bind-to socket ./MPIVersion > Triad: 48363.8441 Rate (MB/s) > > $ mpiexec.hydra -n 8 -bind-to socket ./MPIVersion > Triad: 63479.9284 Rate (MB/s) > > $ mpiexec.hydra -n 10 -bind-to socket ./MPIVersion > Triad: 66160.5627 Rate (MB/s) > > $ mpiexec.hydra -n 16 -bind-to socket ./MPIVersion > Triad: 65975.5959 Rate (MB/s) > > $ mpiexec.hydra -n 20 -bind-to socket ./MPIVersion > Triad: 64738.9336 Rate (MB/s) > > I get similar metrics when i use the binding options "-bind-to hwthread -map-by socket". > > Now my question is, is 13.5 GB/s on one processor "good"? You mean one core. Yes, that is a good number. These systems are not designed so that a single core can "saturate" (that is use) all the memory bandwidth of the node. Note that after about 8 cores you don't see any more improvement because the 8 cores has saturated the memory bandwidth. What this means is that for PETSc simulations any cores beyond 8 (or so) on the node are just unnecessary eye-candy. > Because when I compare this to the 59.7 GB/s it seems really inefficient. Is there a way to browse through my system files to confirm this? > > Also, when I use multiple cores and with proper binding, the streams BW exceeds the reported max BW. Is this expected? I cannot explain this, look at the exact number of loads and stores needed for the triad benchmark. Perhaps the online docs are out of date. Barry > > Thanks, > Justin > From thronesf at gmail.com Thu Aug 13 12:55:54 2015 From: thronesf at gmail.com (Sharp Stone) Date: Thu, 13 Aug 2015 13:55:54 -0400 Subject: [petsc-users] Multigrid and AMR In-Reply-To: References: Message-ID: No, I mean Adaptive Mesh Refinement (AMR). Is this supported now in Petsc? Thanks! On Thu, Aug 13, 2015 at 12:06 PM, Matthew Knepley wrote: > On Thu, Aug 13, 2015 at 10:52 AM, Sharp Stone wrote: > >> Hi All, >> >> I'm a new who are dealing with linear systems, and want to use multigrid, >> especially AMR. I found some examples regarding multigrids, but does petsc >> have any examples of AMR? >> > > Do you mean Algebraic Multigrid (AMG)? > > Matt > > >> Thank you in advance! >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > -- Best regards, Feng -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 13 12:57:38 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Aug 2015 12:57:38 -0500 Subject: [petsc-users] Multigrid and AMR In-Reply-To: References: Message-ID: On Thu, Aug 13, 2015 at 12:55 PM, Sharp Stone wrote: > No, I mean Adaptive Mesh Refinement (AMR). Is this supported now in Petsc? > No, not at this time. Thanks, Matt > Thanks! > > On Thu, Aug 13, 2015 at 12:06 PM, Matthew Knepley > wrote: > >> On Thu, Aug 13, 2015 at 10:52 AM, Sharp Stone wrote: >> >>> Hi All, >>> >>> I'm a new who are dealing with linear systems, and want to use >>> multigrid, especially AMR. I found some examples regarding multigrids, but >>> does petsc have any examples of AMR? >>> >> >> Do you mean Algebraic Multigrid (AMG)? >> >> Matt >> >> >>> Thank you in advance! >>> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > Best regards, > > Feng > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Thu Aug 13 13:04:27 2015 From: jed at jedbrown.org (Jed Brown) Date: Thu, 13 Aug 2015 12:04:27 -0600 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: Message-ID: <877fozqd3o.fsf@jedbrown.org> Justin Chang writes: > Hi all, > > According to our University's HPC cluster (Intel Xeon E5-2680v2 > ), the > online specifications says I should have a maximum BW of 59.7 GB/s. I am > guessing this number is computed by 1866 MHz * 8 Bytes * 4 memory channels. Yup, per socket. > Now, when I run the STREAMS Triad benchmark on a single compute node (two > sockets, 10 cores each, 64 GB total memory), on up to 20 processes with > MPICH, i get the following: > > $ mpiexec -n 1 ./MPIVersion: > Triad: 13448.6701 Rate (MB/s) > > $ mpiexec -n 2 ./MPIVersion: > Triad: 24409.1406 Rate (MB/s) > > $ mpiexec -n 4 ./MPIVersion > Triad: 31914.8087 Rate (MB/s) > > $ mpiexec -n 6 ./MPIVersion > Triad: 33290.2676 Rate (MB/s) > > $ mpiexec -n 8 ./MPIVersion > Triad: 33618.2542 Rate (MB/s) > > $ mpiexec -n 10 ./MPIVersion > Triad: 33730.1662 Rate (MB/s) > > $ mpiexec -n 12 ./MPIVersion > Triad: 40835.9440 Rate (MB/s) > > $ mpiexec -n 14 ./MPIVersion > Triad: 44396.0042 Rate (MB/s) > > $ mpiexec -n 16 ./MPIVersion > Triad: 54647.5214 Rate (MB/s) * > > $ mpiexec -n 18 ./MPIVersion > Triad: 57530.8125 Rate (MB/s) * > > $ mpiexec -n 20 ./MPIVersion > Triad: 42388.0739 Rate (MB/s) * > > The * numbers fluctuate greatly each time I run this. However, if I use > hydra's processor binding options: > > $ mpiexec.hydra -n 2 -bind-to socket ./MPIVersion > Triad: 26879.3853 Rate (MB/s) > > $ mpiexec.hydra -n 4 -bind-to socket ./MPIVersion > Triad: 48363.8441 Rate (MB/s) > > $ mpiexec.hydra -n 8 -bind-to socket ./MPIVersion > Triad: 63479.9284 Rate (MB/s) It looks like with one core/socket, all your memory sits over one channel. You can play tricks to avoid that or use 4 cores/socket in order to use all memory channels. > $ mpiexec.hydra -n 10 -bind-to socket ./MPIVersion > Triad: 66160.5627 Rate (MB/s) So this is a pretty low fraction (55%) of 59.7*2 = 119.4. 
I suspect your memory or motherboard is at most 1600 MHz, so your peak would be 102.4 GB/s. You can check this as root using "dmidecode --type 17", which should give one entry per channel, looking something like this: Handle 0x002B, DMI type 17, 34 bytes Memory Device Array Handle: 0x002A Error Information Handle: 0x002F Total Width: Unknown Data Width: Unknown Size: 4096 MB Form Factor: DIMM Set: None Locator: DIMM0 Bank Locator: BANK 0 Type: Type Detail: None Speed: Unknown Manufacturer: Not Specified Serial Number: Not Specified Asset Tag: Unknown Part Number: Not Specified Rank: Unknown Configured Clock Speed: 1600 MHz > Now my question is, is 13.5 GB/s on one processor "good"? One memory channel is 1.866 * 8 = 14.9 GB/s. You can get some bonus overlap when adjacent pages are on different busses, but the prefetcher only looks so far ahead, so most of the time you're only pulling from one channel when using one thread. > Because when I compare this to the 59.7 GB/s it seems really > inefficient. Is there a way to browse through my system files to > confirm this? > > Also, when I use multiple cores and with proper binding, the streams BW > exceeds the reported max BW. Is this expected? You're using two sockets. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From jychang48 at gmail.com Thu Aug 13 15:22:42 2015 From: jychang48 at gmail.com (Justin Chang) Date: Thu, 13 Aug 2015 15:22:42 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: <877fozqd3o.fsf@jedbrown.org> References: <877fozqd3o.fsf@jedbrown.org> Message-ID: On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown wrote: > It looks like with one core/socket, all your memory sits over one > channel. You can play tricks to avoid that or use 4 cores/socket in > order to use all memory channels. How do I play these tricks? > So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect > your memory or motherboard is at most 1600 MHz, so your peak would be > 102.4 GB/s. > You can check this as root using "dmidecode --type 17", which should > give one entry per channel, looking something like this: > > Handle 0x002B, DMI type 17, 34 bytes > Memory Device > Array Handle: 0x002A > Error Information Handle: 0x002F > Total Width: Unknown > Data Width: Unknown > Size: 4096 MB > Form Factor: DIMM > Set: None > Locator: DIMM0 > Bank Locator: BANK 0 > Type: > Type Detail: None > Speed: Unknown > Manufacturer: Not Specified > Serial Number: Not Specified > Asset Tag: Unknown > Part Number: Not Specified > Rank: Unknown > Configured Clock Speed: 1600 MHz I have no root access. Is there another way to confirm the clock speed? --- So if I have two sockets per node, then the theoretical peak bandwidth is actually double than what I thought (whether it be 119.4 GB/s or 102.4 GB/s). And if 8 cores really is the optimal number to use for a single compute node, why are there 20 totals to begin with? Or would this depend on the particular application? Also, can someone elaborate on the difference between the words "core", "processor", and "thread"? 
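For reference, the arithmetic behind the numbers in this thread, assuming 8-byte transfers, four memory channels per socket, and DIMMs that actually run at the quoted speed:

    one channel at 1866 MT/s:   1.866e9 * 8 bytes = 14.9 GB/s
    one socket (4 channels):    4 * 14.9          = 59.7 GB/s
    two sockets at 1866 MT/s:   2 * 59.7          = 119.4 GB/s
    two sockets at 1600 MT/s:   2 * 4 * 1.6e9 * 8 = 102.4 GB/s

So the ~13.4 GB/s single-core Triad number is close to one channel, and the ~66 GB/s best two-socket number is roughly 55% of the 119.4 GB/s nominal peak (or about 65% of 102.4 GB/s if the memory is actually clocked at 1600). On the load/store counting Barry mentions: Triad is a[i] = b[i] + q*c[i], which STREAM counts as 24 bytes per index (two loads plus one store); if the store also triggers a write-allocate read, the hardware really moves 32 bytes per index, so measured rates and true memory traffic can differ by about a third.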
From knepley at gmail.com Thu Aug 13 15:30:47 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Aug 2015 15:30:47 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <877fozqd3o.fsf@jedbrown.org> Message-ID: On Thu, Aug 13, 2015 at 3:22 PM, Justin Chang wrote: > On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown wrote: > > It looks like with one core/socket, all your memory sits over one > > channel. You can play tricks to avoid that or use 4 cores/socket in > > order to use all memory channels. > > How do I play these tricks? > > > So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect > > your memory or motherboard is at most 1600 MHz, so your peak would be > > 102.4 GB/s. > > > You can check this as root using "dmidecode --type 17", which should > > give one entry per channel, looking something like this: > > > > Handle 0x002B, DMI type 17, 34 bytes > > Memory Device > > Array Handle: 0x002A > > Error Information Handle: 0x002F > > Total Width: Unknown > > Data Width: Unknown > > Size: 4096 MB > > Form Factor: DIMM > > Set: None > > Locator: DIMM0 > > Bank Locator: BANK 0 > > Type: > > Type Detail: None > > Speed: Unknown > > Manufacturer: Not Specified > > Serial Number: Not Specified > > Asset Tag: Unknown > > Part Number: Not Specified > > Rank: Unknown > > Configured Clock Speed: 1600 MHz > > I have no root access. Is there another way to confirm the clock speed? > > --- > > So if I have two sockets per node, then the theoretical peak bandwidth > is actually double than what I thought (whether it be 119.4 GB/s or > 102.4 GB/s). And if 8 cores really is the optimal number to use for a > single compute node, why are there 20 totals to begin with? Or would > this depend on the particular application? > Kind Answer: Different application have different needs Cynical Answer: Computer companies sell you what they can produce, lots of cores, not what you need, lots of bandwidth. Bandwidth is very expensive and there are technical limits. > Also, can someone elaborate on the difference between the words > "core", "processor", and "thread"? > A core and a processor are hardware terms. I think they are both fuzzy, but I understand a core to be something that can carry a thread of execution, namely a program counter, instruction and data stream, and compute something. A thread is a logical construct for talking about an execution stream. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Aug 13 15:40:55 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 13 Aug 2015 15:40:55 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <877fozqd3o.fsf@jedbrown.org> Message-ID: On Thu, 13 Aug 2015, Matthew Knepley wrote: > > > Also, can someone elaborate on the difference between the words > > "core", "processor", and "thread"? > > > A core and a processor are hardware terms. I think they are both > fuzzy, but I understand a core to be something that can carry a > thread of execution, namely a program counter, instruction and data > stream, and compute something. A thread is a logical construct for > talking about an execution stream. 
Perhaps there are multiple terminologies here - but I think you are asking about the difference between: CPU/processor, core, hardware-thread CPU: a (manufacturing) packaging unit. Or a single chip that can be inserted on the MotherBoard. Core: a CPU can have multiple cores. Each core is equivalent to independent processing unit hardware-thread. Its a virtual mode for a single core (hardware) process multiple streams of instructions simultaneously (aka virtual cores). The core vs hardware threads is a murky territory. H designers can do quiet complex things here - esp between gore/hardware-thread boundaries. Satish From bsmith at mcs.anl.gov Thu Aug 13 15:47:30 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 13 Aug 2015 15:47:30 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <877fozqd3o.fsf@jedbrown.org> Message-ID: <318DDA3E-0B57-432C-A30D-94257EBBF686@mcs.anl.gov> > On Aug 13, 2015, at 3:30 PM, Matthew Knepley wrote: > > On Thu, Aug 13, 2015 at 3:22 PM, Justin Chang wrote: > On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown wrote: > > It looks like with one core/socket, all your memory sits over one > > channel. You can play tricks to avoid that or use 4 cores/socket in > > order to use all memory channels. > > How do I play these tricks? > > > So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect > > your memory or motherboard is at most 1600 MHz, so your peak would be > > 102.4 GB/s. > > > You can check this as root using "dmidecode --type 17", which should > > give one entry per channel, looking something like this: > > > > Handle 0x002B, DMI type 17, 34 bytes > > Memory Device > > Array Handle: 0x002A > > Error Information Handle: 0x002F > > Total Width: Unknown > > Data Width: Unknown > > Size: 4096 MB > > Form Factor: DIMM > > Set: None > > Locator: DIMM0 > > Bank Locator: BANK 0 > > Type: > > Type Detail: None > > Speed: Unknown > > Manufacturer: Not Specified > > Serial Number: Not Specified > > Asset Tag: Unknown > > Part Number: Not Specified > > Rank: Unknown > > Configured Clock Speed: 1600 MHz > > I have no root access. Is there another way to confirm the clock speed? > > --- > > So if I have two sockets per node, then the theoretical peak bandwidth > is actually double than what I thought (whether it be 119.4 GB/s or > 102.4 GB/s). And if 8 cores really is the optimal number to use for a > single compute node, why are there 20 totals to begin with? Or would > this depend on the particular application? > > Kind Answer: Different application have different needs > > Cynical Answer: Computer companies sell you what they can produce, > lots of cores, not what you need, lots of bandwidth. Bandwidth is very > expensive and there are technical limits. Cost of production of a system may not, is not, simply linearly proportional to the number of cores, or number of floating point units or any other particular feature of a system. For example, maybe a 50 core system costs $50,000 and a 100 core system (everything else being equal) costs $70,000 for a company to make, in a sense each additional core (within reason) costs less so it is acceptable to get less performance out it since the incremental cost is lower. Barry > > Also, can someone elaborate on the difference between the words > "core", "processor", and "thread"? > > A core and a processor are hardware terms. 
I think they are both fuzzy, > but I understand a core to be something that can carry a thread of execution, > namely a program counter, instruction and data stream, and compute something. > A thread is a logical construct for talking about an execution stream. > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From jed at jedbrown.org Thu Aug 13 15:50:34 2015 From: jed at jedbrown.org (Jed Brown) Date: Thu, 13 Aug 2015 14:50:34 -0600 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <877fozqd3o.fsf@jedbrown.org> Message-ID: <87r3n6oqud.fsf@jedbrown.org> Justin Chang writes: > On Thu, Aug 13, 2015 at 1:04 PM, Jed Brown wrote: >> It looks like with one core/socket, all your memory sits over one >> channel. You can play tricks to avoid that or use 4 cores/socket in >> order to use all memory channels. > > How do I play these tricks? They generally aren't practical outside of simple benchmarks. Read through this blog series if you want to dive into memory performance. http://sites.utexas.edu/jdm4372/2010/11/11/optimizing-amd-opteron-memory-bandwidth-part-5-single-thread-read-only/ > I have no root access. Is there another way to confirm the clock speed? I don't recall a way to access that information without root. You can benchmark, obviously, but you're looking for an independent information source. You can ask a sysadmin to run this on a compute node. > > --- > > So if I have two sockets per node, then the theoretical peak bandwidth > is actually double than what I thought (whether it be 119.4 GB/s or > 102.4 GB/s). And if 8 cores really is the optimal number to use for a > single compute node, why are there 20 totals to begin with? Or would > this depend on the particular application? "20 totals"? Note that you might have hyperthreading, in which case there are twice as many logical cores as physical cores. > Also, can someone elaborate on the difference between the words > "core", "processor", and "thread"? Processor - typically a unit of manufacturing and sale that goes into a socket. Sometimes it shares a last-level cache and other times it is independent parts stuck together. Sometimes different parts of the processor are connected to different memory channels (implying multiple "NUMA nodes" on a single socket) and sometimes they are multiplexed (so all cores see the same speed to any memory channel on that socket). Core - the physical unit that processes ("integer") instructions. There can be multiple floating point units per core (e.g., anything with dual-issue FMA) or multiple cores per floating point unit (e.g., the AMD processors on Titan). Logical core/hardware thread - the logical unit exposed to the operating system. Often there are 2, 4, or more hardware threads per core. These have their own registers (as far as you can tell; it can be complicated by "register renaming") and are used to cover high-latency operations including waiting on memory and some arithmetic. Usually only one hardware thread issues instructions in any given cycle, so if a single thread has sufficient ILP (instruction-level parallelism) to keep issuing every cycle, there can be no benefit to using multiple hardware threads. This is impossible with some architectures, thus necessitating use of multiple hardware threads per core to reach peak flops, integer instructions, and/or bandwidth. 
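One way to see how these logical CPUs are handed out to a job, and what an option like "-bind-to socket" actually changed, is to have every rank print the set of logical CPUs it is allowed to run on. A small Linux-only sketch (the program name below is made up; sched_getaffinity reports the OS's logical-CPU numbering, i.e. hardware threads):

    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <string.h>
    #include <mpi.h>

    int main(int argc, char **argv)
    {
      int       rank, len, cpu;
      char      name[MPI_MAX_PROCESSOR_NAME], list[1024] = "";
      cpu_set_t mask;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Get_processor_name(name, &len);

      if (!sched_getaffinity(0, sizeof(mask), &mask)) {   /* 0 = this process */
        for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
          if (CPU_ISSET(cpu, &mask)) {
            char buf[16];
            snprintf(buf, sizeof(buf), " %d", cpu);
            strncat(list, buf, sizeof(list) - strlen(list) - 1);
          }
        }
      }
      printf("rank %d on %s may run on logical CPUs:%s\n", rank, name, list);

      MPI_Finalize();
      return 0;
    }

Running it as, say, "mpiexec.hydra -n 20 -bind-to socket ./report_affinity" and again without the binding option shows whether ranks are pinned to one socket's hardware threads or free to migrate, which is usually the explanation for the fluctuating unbound numbers earlier in the thread.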
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From asmund.ervik at ntnu.no Fri Aug 14 03:29:22 2015 From: asmund.ervik at ntnu.no (=?UTF-8?Q?=c3=85smund_Ervik?=) Date: Fri, 14 Aug 2015 10:29:22 +0200 Subject: [petsc-users] Understanding the memory bandwidth Message-ID: <55CDA6E2.5000302@ntnu.no> >> So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect >> your memory or motherboard is at most 1600 MHz, so your peak would be >> 102.4 GB/s. > >> You can check this as root using "dmidecode --type 17", which should >> give one entry per channel, looking something like this: >> >> Handle 0x002B, DMI type 17, 34 bytes >> Memory Device >> Array Handle: 0x002A >> Error Information Handle: 0x002F >> Total Width: Unknown >> Data Width: Unknown >> Size: 4096 MB >> Form Factor: DIMM >> Set: None >> Locator: DIMM0 >> Bank Locator: BANK 0 >> Type: >> Type Detail: None >> Speed: Unknown >> Manufacturer: Not Specified >> Serial Number: Not Specified >> Asset Tag: Unknown >> Part Number: Not Specified >> Rank: Unknown >> Configured Clock Speed: 1600 MHz > >I have no root access. Is there another way to confirm the clock speed? Also note: even in the case where your motherboard, RAM and CPU all say 1866 on the label, if there are more memory DIMMs (chips) per node than channels, say 16 DIMMs on your 8 channels, you will see a performance reduction on the order of 20-30%. This is more likely if you are using nodes in a "high-memory queue" or similar where there's >= 128 GB memory per node. (This will change in the future when/if people start using DDR4 LRDIMMs.) There's a series of in-depth discussions here: http://frankdenneman.nl/2015/02/20/memory-deep-dive/ and there's also lots of interesting memory-stuff on John McCalpin's blog: https://sites.utexas.edu/jdm4372/ Regards, ?smund -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 473 bytes Desc: OpenPGP digital signature URL: From R.Thomas at tudelft.nl Fri Aug 14 09:23:16 2015 From: R.Thomas at tudelft.nl (Romain Thomas) Date: Fri, 14 Aug 2015 14:23:16 +0000 Subject: [petsc-users] petsc KLU Message-ID: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> Dear PETSc users, I would like to know if I can replace the following functions MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) by MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) in my code for the simulation of electrical power systems? (I installed the package SuiteSparse) Thank you, Best regards, Romain -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Fri Aug 14 09:40:38 2015 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 14 Aug 2015 09:40:38 -0500 Subject: [petsc-users] petsc KLU In-Reply-To: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> Message-ID: On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas wrote: > Dear PETSc users, > > I would like to know if I can replace the following functions > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > *info) > MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > by > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) > MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > in my code for the simulation of electrical power systems? (I installed > the package SuiteSparse) > Why would you do that? It already works with the former code. In fact, you should really just use KSPCreate(comm, &ksp) KSPSetOperator(ksp, A, A); KSPSetFromOptions(ksp); KSPSolve(ksp, b, x); and then give the options -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse This is no advantage to using the Factor language since subsequent calls to KSPSolve() will not refactor. Matt > Thank you, > Best regards, > Romain > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From R.Thomas at tudelft.nl Fri Aug 14 10:07:39 2015 From: R.Thomas at tudelft.nl (Romain Thomas) Date: Fri, 14 Aug 2015 15:07:39 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> Message-ID: <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> Hi, Thank you for your answer. My problem is a bit more complex. During the simulation (?real time?), I need to upgrade at each time step the matrix A and the MatassemblyBegin and MatassemblyEnd take time and so, in order to avoid these functions, I don?t use ksp or pc. I prefer to use the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. And so, I want to know if there is similar functions for KLU. (I tried for Cholesky and, iLU and it works well). Best regards, Romain From: Matthew Knepley [mailto:knepley at gmail.com] Sent: vrijdag 14 augustus 2015 16:41 To: Romain Thomas Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc KLU On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas > wrote: Dear PETSc users, I would like to know if I can replace the following functions MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) by MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) in my code for the simulation of electrical power systems? (I installed the package SuiteSparse) Why would you do that? It already works with the former code. 
In fact, you should really just use KSPCreate(comm, &ksp) KSPSetOperator(ksp, A, A); KSPSetFromOptions(ksp); KSPSolve(ksp, b, x); and then give the options -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse This is no advantage to using the Factor language since subsequent calls to KSPSolve() will not refactor. Matt Thank you, Best regards, Romain -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 14 10:30:58 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 14 Aug 2015 10:30:58 -0500 Subject: [petsc-users] petsc KLU In-Reply-To: <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> Message-ID: You should call MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); then call > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo *info) > MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) This routines correctly internally call the appropriate MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU above. There is no reason to (and it won't work) to call > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) > MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) directly. Barry > On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: > > Hi, > Thank you for your answer. > My problem is a bit more complex. During the simulation (?real time?), I need to upgrade at each time step the matrix A and the MatassemblyBegin and MatassemblyEnd take time and so, in order to avoid these functions, I don?t use ksp or pc. I prefer to use the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. And so, I want to know if there is similar functions for KLU. (I tried for Cholesky and, iLU and it works well). > Best regards, > Romain > > > From: Matthew Knepley [mailto:knepley at gmail.com] > Sent: vrijdag 14 augustus 2015 16:41 > To: Romain Thomas > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] petsc KLU > > On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas wrote: > Dear PETSc users, > > I would like to know if I can replace the following functions > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo *info) > MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > by > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo *info) > MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > in my code for the simulation of electrical power systems? (I installed the package SuiteSparse) > > Why would you do that? It already works with the former code. In fact, you should really just use > > KSPCreate(comm, &ksp) > KSPSetOperator(ksp, A, A); > KSPSetFromOptions(ksp); > KSPSolve(ksp, b, x); > > and then give the options > > -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse > > This is no advantage to using the Factor language since subsequent calls to > KSPSolve() will not refactor. 
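For reference, a compact sketch of the calling sequence Barry lays out above, written against the 3.5-era API: obtain the factor matrix with MatGetFactor() and MATSOLVERKLU, then run the generic symbolic and numeric factorization and solve. The function name, the use of the natural ordering, and the assumption that A, b and x already exist as sequential AIJ objects are illustrative, not taken from the original poster's code. When only the numerical values of A change between time steps, the MatGetFactor() and MatLUFactorSymbolic() calls can be done once and only MatLUFactorNumeric() plus MatSolve() repeated.

#include <petscmat.h>

PetscErrorCode SolveWithKLU(Mat A, Vec b, Vec x)
{
  Mat            F;                 /* holds the KLU factors */
  IS             rperm, cperm;      /* row/column orderings */
  MatFactorInfo  info;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatGetFactor(A, MATSOLVERKLU, MAT_FACTOR_LU, &F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A, MATORDERINGNATURAL, &rperm, &cperm);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F, A, rperm, cperm, &info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F, A, &info);CHKERRQ(ierr);   /* repeat this step when A's values change */
  ierr = MatSolve(F, b, x);CHKERRQ(ierr);
  ierr = ISDestroy(&rperm);CHKERRQ(ierr);
  ierr = ISDestroy(&cperm);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}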
> > Matt > > Thank you, > Best regards, > Romain > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From jed at jedbrown.org Fri Aug 14 11:10:02 2015 From: jed at jedbrown.org (Jed Brown) Date: Fri, 14 Aug 2015 10:10:02 -0600 Subject: [petsc-users] Need to update matrix in every loop In-Reply-To: References: <87y4hgqzle.fsf@jedbrown.org> Message-ID: <878u9dn95x.fsf@jedbrown.org> Please always use "reply-all" so that your messages go to the list. This is standard mailing list etiquette. It is important to preserve threading for people who find this discussion later and so that we do not waste our time re-answering the same questions that have already been answered in private side-conversations. You'll likely get an answer faster that way too. sheng liu writes: > Thank you! I have a small question: What does the "degree of freedom" mean > in the DMDA object? If I have a spin-1/2 system, and I discrete the system, > does that mean I have DOF=2? DOF is the number of PetscScalar values at each grid point. > 2015-08-12 23:46 GMT+08:00 Jed Brown : > >> sheng liu writes: >> >> > Thank you very much! I have another question. If I need all the >> eigenvalues >> > of the sparse matrix, which solver should I use? Thanks! >> >> That's O(n^3) with n=1e6. Better find a way to not need all the >> eigenvalues or to make the system smaller. >> -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From R.Thomas at tudelft.nl Mon Aug 17 09:34:37 2015 From: R.Thomas at tudelft.nl (Romain Thomas) Date: Mon, 17 Aug 2015 14:34:37 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> Message-ID: <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> Hi Thank you for your answer. I was asking help because I find LU factorization 2-3 times faster than KLU. According to my problem size (200*200) and type (power system simulation), I should get almost the same computation time. Is it true to think that? Is the difference of time due to the interface between PETSc and SuiteSparse? Thank you, Romain -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: vrijdag 14 augustus 2015 17:31 To: Romain Thomas Cc: Matthew Knepley; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc KLU You should call MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); then call > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) This routines correctly internally call the appropriate MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU above. There is no reason to (and it won't work) to call > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo > *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) directly. Barry > On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: > > Hi, > Thank you for your answer. > My problem is a bit more complex. 
During the simulation (?real time?), I need to upgrade at each time step the matrix A and the MatassemblyBegin and MatassemblyEnd take time and so, in order to avoid these functions, I don?t use ksp or pc. I prefer to use the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. And so, I want to know if there is similar functions for KLU. (I tried for Cholesky and, iLU and it works well). > Best regards, > Romain > > > From: Matthew Knepley [mailto:knepley at gmail.com] > Sent: vrijdag 14 augustus 2015 16:41 > To: Romain Thomas > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] petsc KLU > > On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas wrote: > Dear PETSc users, > > I would like to know if I can replace the following functions > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > by > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo > *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > in my code for the simulation of electrical power systems? (I > installed the package SuiteSparse) > > Why would you do that? It already works with the former code. In fact, > you should really just use > > KSPCreate(comm, &ksp) > KSPSetOperator(ksp, A, A); > KSPSetFromOptions(ksp); > KSPSolve(ksp, b, x); > > and then give the options > > -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse > > This is no advantage to using the Factor language since subsequent > calls to > KSPSolve() will not refactor. > > Matt > > Thank you, > Best regards, > Romain > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From hzhang at mcs.anl.gov Mon Aug 17 10:08:17 2015 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 17 Aug 2015 10:08:17 -0500 Subject: [petsc-users] petsc KLU In-Reply-To: <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> Message-ID: Romain: Do you mean small sparse sequential 200 by 200 matrices? Petsc LU might be better than external LU packages because it implements simple LU algorithm and we took good care on data accesing (I've heard same observations). You may try 'qmd' matrix ordering for power grid simulation. I do not have experience on SuiteSparse. Testing MUMPS is worth it as well. Hong Hi > Thank you for your answer. I was asking help because I find LU > factorization 2-3 times faster than KLU. According to my problem size > (200*200) and type (power system simulation), I should get almost the same > computation time. Is it true to think that? Is the difference of time due > to the interface between PETSc and SuiteSparse? 
> Thank you, > Romain > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: vrijdag 14 augustus 2015 17:31 > To: Romain Thomas > Cc: Matthew Knepley; petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] petsc KLU > > > You should call > > MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); > > then call > > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > > *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > This routines correctly internally call the appropriate > MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU above. > There is no reason to (and it won't work) to call > > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo > > *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > directly. > > Barry > > > On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: > > > > Hi, > > Thank you for your answer. > > My problem is a bit more complex. During the simulation (?real time?), I > need to upgrade at each time step the matrix A and the MatassemblyBegin and > MatassemblyEnd take time and so, in order to avoid these functions, I don?t > use ksp or pc. I prefer to use the functions MatLUFactorNumeric, > MatLUFactorSymbolic and MatLUFactor. And so, I want to know if there is > similar functions for KLU. (I tried for Cholesky and, iLU and it works > well). > > Best regards, > > Romain > > > > > > From: Matthew Knepley [mailto:knepley at gmail.com] > > Sent: vrijdag 14 augustus 2015 16:41 > > To: Romain Thomas > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] petsc KLU > > > > On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas > wrote: > > Dear PETSc users, > > > > I would like to know if I can replace the following functions > > > > MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) > > MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo > > *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > > > by > > > > MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) > > MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo > > *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > > > > in my code for the simulation of electrical power systems? (I > > installed the package SuiteSparse) > > > > Why would you do that? It already works with the former code. In fact, > > you should really just use > > > > KSPCreate(comm, &ksp) > > KSPSetOperator(ksp, A, A); > > KSPSetFromOptions(ksp); > > KSPSolve(ksp, b, x); > > > > and then give the options > > > > -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse > > > > This is no advantage to using the Factor language since subsequent > > calls to > > KSPSolve() will not refactor. > > > > Matt > > > > Thank you, > > Best regards, > > Romain > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From xzhao99 at gmail.com Mon Aug 17 10:49:08 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 17 Aug 2015 10:49:08 -0500 Subject: [petsc-users] Petsc creates a random vector Message-ID: Hi all, I want PETSc to generate random vector using VecSetRandom() following given examples, but failed and showed some "out of memory" error. The following is the code, which goes well until it reaches VecSetRandom(). Can anyone help me figure out the reason? Thanks a lot. XZ -------------------------------------------------------------------------------------------- Vec u; PetscRandom rand_ctx; /* random number generator context */ PetscMPIInt size, rank; PetscInt n, dn; MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); n = N/size + 1; dn = n*size - N; if ( dn>0 && ranktest in petsc_random_vector(): rank = %d, n = %d\n",rank,n); VecCreate(PETSC_COMM_WORLD,&u); VecSetSizes(u,n,N); PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); #if defined(PETSC_HAVE_DRAND48) PetscRandomSetType(rand_ctx,PETSCRAND48); #elif defined(PETSC_HAVE_RAND) PetscRandomSetType(rand_ctx,PETSCRAND); #endif PetscRandomSetFromOptions(rand_ctx); VecSetRandom(u,rand_ctx); PetscRandomDestroy(&rand_ctx); -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 17 10:57:07 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Aug 2015 10:57:07 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: > Hi all, > > I want PETSc to generate random vector using VecSetRandom() following > given examples, but failed and showed some "out of memory" error. The > following is the code, which goes well until it reaches VecSetRandom(). Can > anyone help me figure out the reason? Thanks a lot. > Does src/vec/vec/examples/tests/ex43.c run for you? Thanks, Matt > XZ > > > > -------------------------------------------------------------------------------------------- > Vec u; > PetscRandom rand_ctx; /* random number generator context */ > PetscMPIInt size, rank; > PetscInt n, dn; > > > MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); > MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); > n = N/size + 1; > dn = n*size - N; > if ( dn>0 && rank printf("--->test in petsc_random_vector(): rank = %d, n = %d\n",rank,n); > > > VecCreate(PETSC_COMM_WORLD,&u); > VecSetSizes(u,n,N); > PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); > #if defined(PETSC_HAVE_DRAND48) > PetscRandomSetType(rand_ctx,PETSCRAND48); > #elif defined(PETSC_HAVE_RAND) > PetscRandomSetType(rand_ctx,PETSCRAND); > #endif > PetscRandomSetFromOptions(rand_ctx); > > > VecSetRandom(u,rand_ctx); > PetscRandomDestroy(&rand_ctx); > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Mon Aug 17 11:02:24 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 17 Aug 2015 11:02:24 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: No. 
It gives the following error msg: mpirun -np 2 ex43 [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process (utils/launch/launch.c:75): execvp error on file ex43 (No such file or directory) execvp error on file ex43 (No such file or directory) On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley wrote: > On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: > >> Hi all, >> >> I want PETSc to generate random vector using VecSetRandom() following >> given examples, but failed and showed some "out of memory" error. The >> following is the code, which goes well until it reaches VecSetRandom(). Can >> anyone help me figure out the reason? Thanks a lot. >> > > Does src/vec/vec/examples/tests/ex43.c run for you? > > Thanks, > > Matt > > >> XZ >> >> >> >> -------------------------------------------------------------------------------------------- >> Vec u; >> PetscRandom rand_ctx; /* random number generator context */ >> PetscMPIInt size, rank; >> PetscInt n, dn; >> >> >> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >> n = N/size + 1; >> dn = n*size - N; >> if ( dn>0 && rank> printf("--->test in petsc_random_vector(): rank = %d, n = %d\n",rank,n); >> >> >> VecCreate(PETSC_COMM_WORLD,&u); >> VecSetSizes(u,n,N); >> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >> #if defined(PETSC_HAVE_DRAND48) >> PetscRandomSetType(rand_ctx,PETSCRAND48); >> #elif defined(PETSC_HAVE_RAND) >> PetscRandomSetType(rand_ctx,PETSCRAND); >> #endif >> PetscRandomSetFromOptions(rand_ctx); >> >> >> VecSetRandom(u,rand_ctx); >> PetscRandomDestroy(&rand_ctx); >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From karpeev at mcs.anl.gov Mon Aug 17 11:12:47 2015 From: karpeev at mcs.anl.gov (Dmitry Karpeyev) Date: Mon, 17 Aug 2015 16:12:47 +0000 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: You need to give the path to the executable, for example, ./ex43 etc. Dmitry. On Mon, Aug 17, 2015 at 11:02 AM Xujun Zhao wrote: > No. It gives the following error msg: > > mpirun -np 2 ex43 > > [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] > HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process > (utils/launch/launch.c:75): execvp error on file ex43 (No such file or > directory) > > execvp error on file ex43 (No such file or directory) > > On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley > wrote: > >> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: >> >>> Hi all, >>> >>> I want PETSc to generate random vector using VecSetRandom() following >>> given examples, but failed and showed some "out of memory" error. The >>> following is the code, which goes well until it reaches VecSetRandom(). Can >>> anyone help me figure out the reason? Thanks a lot. >>> >> >> Does src/vec/vec/examples/tests/ex43.c run for you? 
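Stepping back to the VecSetRandom() crash in the original post: as pasted, the fragment never sets a vector type (there is no VecSetType() or VecSetFromOptions() call before VecSetRandom()), and the local-size arithmetic around n and dn appears to have been garbled by the mailer. A minimal self-contained sketch that avoids both issues by letting PETSc pick the layout and the type is below; the global size N = 100 is only illustrative.

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            u;
  PetscRandom    r;
  PetscInt       N = 100;            /* illustrative global size */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ierr = VecCreate(PETSC_COMM_WORLD, &u);CHKERRQ(ierr);
  ierr = VecSetSizes(u, PETSC_DECIDE, N);CHKERRQ(ierr);   /* let PETSc split N across the ranks */
  ierr = VecSetFromOptions(u);CHKERRQ(ierr);              /* sets the type (VECMPI/VECSEQ) */
  ierr = PetscRandomCreate(PETSC_COMM_WORLD, &r);CHKERRQ(ierr);
  ierr = PetscRandomSetFromOptions(r);CHKERRQ(ierr);
  ierr = VecSetRandom(u, r);CHKERRQ(ierr);
  ierr = VecView(u, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
  ierr = PetscRandomDestroy(&r);CHKERRQ(ierr);
  ierr = VecDestroy(&u);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}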
>> >> Thanks, >> >> Matt >> >> >>> XZ >>> >>> >>> >>> -------------------------------------------------------------------------------------------- >>> Vec u; >>> PetscRandom rand_ctx; /* random number generator context */ >>> PetscMPIInt size, rank; >>> PetscInt n, dn; >>> >>> >>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>> n = N/size + 1; >>> dn = n*size - N; >>> if ( dn>0 && rank>> printf("--->test in petsc_random_vector(): rank = %d, n = >>> %d\n",rank,n); >>> >>> >>> VecCreate(PETSC_COMM_WORLD,&u); >>> VecSetSizes(u,n,N); >>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>> #if defined(PETSC_HAVE_DRAND48) >>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>> #elif defined(PETSC_HAVE_RAND) >>> PetscRandomSetType(rand_ctx,PETSCRAND); >>> #endif >>> PetscRandomSetFromOptions(rand_ctx); >>> >>> >>> VecSetRandom(u,rand_ctx); >>> PetscRandomDestroy(&rand_ctx); >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 17 11:13:23 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Aug 2015 11:13:23 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: > No. It gives the following error msg: > Did you build the executable? cd src/vec/vec/examples/tutorials make ex43 Matt > mpirun -np 2 ex43 > > [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] > HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process > (utils/launch/launch.c:75): execvp error on file ex43 (No such file or > directory) > > execvp error on file ex43 (No such file or directory) > > On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley > wrote: > >> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: >> >>> Hi all, >>> >>> I want PETSc to generate random vector using VecSetRandom() following >>> given examples, but failed and showed some "out of memory" error. The >>> following is the code, which goes well until it reaches VecSetRandom(). Can >>> anyone help me figure out the reason? Thanks a lot. >>> >> >> Does src/vec/vec/examples/tests/ex43.c run for you? 
>> >> Thanks, >> >> Matt >> >> >>> XZ >>> >>> >>> >>> -------------------------------------------------------------------------------------------- >>> Vec u; >>> PetscRandom rand_ctx; /* random number generator context */ >>> PetscMPIInt size, rank; >>> PetscInt n, dn; >>> >>> >>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>> n = N/size + 1; >>> dn = n*size - N; >>> if ( dn>0 && rank>> printf("--->test in petsc_random_vector(): rank = %d, n = >>> %d\n",rank,n); >>> >>> >>> VecCreate(PETSC_COMM_WORLD,&u); >>> VecSetSizes(u,n,N); >>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>> #if defined(PETSC_HAVE_DRAND48) >>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>> #elif defined(PETSC_HAVE_RAND) >>> PetscRandomSetType(rand_ctx,PETSCRAND); >>> #endif >>> PetscRandomSetFromOptions(rand_ctx); >>> >>> >>> VecSetRandom(u,rand_ctx); >>> PetscRandomDestroy(&rand_ctx); >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Mon Aug 17 11:15:07 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 17 Aug 2015 11:15:07 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: Ahhhh, I should drink some coffee in the morning. Now it passed the test! On Mon, Aug 17, 2015 at 11:13 AM, Matthew Knepley wrote: > On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: > >> No. It gives the following error msg: >> > > Did you build the executable? > > cd src/vec/vec/examples/tutorials > make ex43 > > Matt > > >> mpirun -np 2 ex43 >> >> [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] >> HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process >> (utils/launch/launch.c:75): execvp error on file ex43 (No such file or >> directory) >> >> execvp error on file ex43 (No such file or directory) >> >> On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley >> wrote: >> >>> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: >>> >>>> Hi all, >>>> >>>> I want PETSc to generate random vector using VecSetRandom() following >>>> given examples, but failed and showed some "out of memory" error. The >>>> following is the code, which goes well until it reaches VecSetRandom(). Can >>>> anyone help me figure out the reason? Thanks a lot. >>>> >>> >>> Does src/vec/vec/examples/tests/ex43.c run for you? 
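Condensing the fix from this exchange into one place: the example has to be built in its source directory first, and then launched with an explicit path so the shell does not go looking for an installed ex43 on $PATH. Roughly, with PETSC_DIR pointing at the PETSc source tree and ex43.c sitting under src/vec/vec/examples/ (in the tests or tutorials subdirectory referenced above):

cd $PETSC_DIR/src/vec/vec/examples/tutorials
make ex43
mpirun -np 2 ./ex43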
>>> >>> Thanks, >>> >>> Matt >>> >>> >>>> XZ >>>> >>>> >>>> >>>> -------------------------------------------------------------------------------------------- >>>> Vec u; >>>> PetscRandom rand_ctx; /* random number generator context */ >>>> PetscMPIInt size, rank; >>>> PetscInt n, dn; >>>> >>>> >>>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>>> n = N/size + 1; >>>> dn = n*size - N; >>>> if ( dn>0 && rank>>> printf("--->test in petsc_random_vector(): rank = %d, n = >>>> %d\n",rank,n); >>>> >>>> >>>> VecCreate(PETSC_COMM_WORLD,&u); >>>> VecSetSizes(u,n,N); >>>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>>> #if defined(PETSC_HAVE_DRAND48) >>>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>>> #elif defined(PETSC_HAVE_RAND) >>>> PetscRandomSetType(rand_ctx,PETSCRAND); >>>> #endif >>>> PetscRandomSetFromOptions(rand_ctx); >>>> >>>> >>>> VecSetRandom(u,rand_ctx); >>>> PetscRandomDestroy(&rand_ctx); >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkarpeev at gmail.com Mon Aug 17 11:16:30 2015 From: dkarpeev at gmail.com (Dmitry Karpeyev) Date: Mon, 17 Aug 2015 16:16:30 +0000 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: Xujun, Regarding your original question: please send the complete error message. Dmitry. On Mon, Aug 17, 2015 at 11:15 AM Xujun Zhao wrote: > Ahhhh, I should drink some coffee in the morning. > Now it passed the test! > > On Mon, Aug 17, 2015 at 11:13 AM, Matthew Knepley > wrote: > >> On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: >> >>> No. It gives the following error msg: >>> >> >> Did you build the executable? >> >> cd src/vec/vec/examples/tutorials >> make ex43 >> >> Matt >> >> >>> mpirun -np 2 ex43 >>> >>> [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] >>> HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process >>> (utils/launch/launch.c:75): execvp error on file ex43 (No such file or >>> directory) >>> >>> execvp error on file ex43 (No such file or directory) >>> >>> On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley >>> wrote: >>> >>>> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao wrote: >>>> >>>>> Hi all, >>>>> >>>>> I want PETSc to generate random vector using VecSetRandom() following >>>>> given examples, but failed and showed some "out of memory" error. The >>>>> following is the code, which goes well until it reaches VecSetRandom(). Can >>>>> anyone help me figure out the reason? Thanks a lot. >>>>> >>>> >>>> Does src/vec/vec/examples/tests/ex43.c run for you? 
>>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> XZ >>>>> >>>>> >>>>> >>>>> -------------------------------------------------------------------------------------------- >>>>> Vec u; >>>>> PetscRandom rand_ctx; /* random number generator context */ >>>>> PetscMPIInt size, rank; >>>>> PetscInt n, dn; >>>>> >>>>> >>>>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>>>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>>>> n = N/size + 1; >>>>> dn = n*size - N; >>>>> if ( dn>0 && rank>>>> printf("--->test in petsc_random_vector(): rank = %d, n = >>>>> %d\n",rank,n); >>>>> >>>>> >>>>> VecCreate(PETSC_COMM_WORLD,&u); >>>>> VecSetSizes(u,n,N); >>>>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>>>> #if defined(PETSC_HAVE_DRAND48) >>>>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>>>> #elif defined(PETSC_HAVE_RAND) >>>>> PetscRandomSetType(rand_ctx,PETSCRAND); >>>>> #endif >>>>> PetscRandomSetFromOptions(rand_ctx); >>>>> >>>>> >>>>> VecSetRandom(u,rand_ctx); >>>>> PetscRandomDestroy(&rand_ctx); >>>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at anl.gov Mon Aug 17 11:21:15 2015 From: abhyshr at anl.gov (Abhyankar, Shrirang G.) Date: Mon, 17 Aug 2015 16:21:15 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> Message-ID: Romain, I added the KLU interface to PETSc last year hearing the hype about KLU?s performance from several power system folks. I must say that I?m terribly disappointed! I did some performance testing of KLU on power grid problems (power flow application) last year and I got a similar performance that you report (PETSc is 2-4 times faster than KLU). I also clocked the time spent in PETSc?s SuiteSparse interface for KLU for operations other than factorization and it was very minimal. The fastest linear solver combination that I found was PETSc?s LU solver + AMD ordering from the SuiteSparse package (-pc_factor_mat_ordering_type amd). Don?t try MUMPS and SuperLU ? they are terribly slow. Shri From: hong zhang Date: Monday, August 17, 2015 at 10:08 AM To: Romain Thomas Cc: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] petsc KLU >Romain: >Do you mean small sparse sequential 200 by 200 matrices? >Petsc LU might be better than external LU packages because it implements >simple LU algorithm and we took good care on data accesing (I've heard >same observations). You may try 'qmd' matrix ordering for power grid >simulation. >I do not have experience on SuiteSparse. Testing MUMPS is worth it as >well. > >Hong > > >Hi >Thank you for your answer. I was asking help because I find LU >factorization 2-3 times faster than KLU. According to my problem size >(200*200) and type (power system simulation), I should get almost the >same computation time. Is it true to think that? Is the > difference of time due to the interface between PETSc and SuiteSparse? 
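To try the combination Shri recommends above without touching any code, both the factorization package and the ordering can be chosen from the options database at run time. The option names below are as spelled in the 3.5.x releases discussed in this thread, ./app stands in for the actual executable, and -log_summary supplies the timings for the comparison:

./app -ksp_type preonly -pc_type lu -pc_factor_mat_ordering_type amd -log_summary
./app -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package klu -log_summary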
>Thank you, >Romain > >-----Original Message----- >From: Barry Smith [mailto:bsmith at mcs.anl.gov] >Sent: vrijdag 14 augustus 2015 17:31 >To: Romain Thomas >Cc: Matthew Knepley; petsc-users at mcs.anl.gov >Subject: Re: [petsc-users] petsc KLU > > > You should call > > MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); > > then call > >> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo >> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > This routines correctly internally call the appropriate >MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU >above. > There is no reason to (and it won't work) to call > >> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > >directly. > > Barry > >> On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: >> >> Hi, >> Thank you for your answer. >> My problem is a bit more complex. During the simulation (?real time?), >>I need to upgrade at each time step the matrix A and the >>MatassemblyBegin and MatassemblyEnd take time and so, in order to avoid >>these functions, I don?t use ksp or pc. I prefer to use > the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. >And so, I want to know if there is similar functions for KLU. (I tried >for Cholesky and, iLU and it works well). >> Best regards, >> Romain >> >> >> From: Matthew Knepley [mailto:knepley at gmail.com] >> Sent: vrijdag 14 augustus 2015 16:41 >> To: Romain Thomas >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] petsc KLU >> >> On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas >>wrote: >> Dear PETSc users, >> >> I would like to know if I can replace the following functions >> >> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const MatFactorInfo >> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >> >> by >> >> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >> >> in my code for the simulation of electrical power systems? (I >> installed the package SuiteSparse) >> >> Why would you do that? It already works with the former code. In fact, >> you should really just use >> >> KSPCreate(comm, &ksp) >> KSPSetOperator(ksp, A, A); >> KSPSetFromOptions(ksp); >> KSPSolve(ksp, b, x); >> >> and then give the options >> >> -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse >> >> This is no advantage to using the Factor language since subsequent >> calls to >> KSPSolve() will not refactor. >> >> Matt >> >> Thank you, >> Best regards, >> Romain >> >> >> >> -- >> What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >> -- Norbert Wiener > > > > > From xzhao99 at gmail.com Mon Aug 17 11:31:42 2015 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 17 Aug 2015 11:31:42 -0500 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: This is run with PETSc opt mode, so the error message looks not very useful, see below: Probably I should use dbg version to see the details. 
[0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [2]PETSC ERROR: [1]PETSC ERROR: to get more information on the crash. to get more information on the crash. [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [3]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [1]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 [1]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [1]PETSC ERROR: #1 User provided function() line 0 in unknown file [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Signal received [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [2]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 [2]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [2]PETSC ERROR: #1 User provided function() line 0 in unknown file [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Signal received [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [3]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 [3]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-parmetis --download-metis --download-triangle --download-chaco --download-elemental --with-debugging=0 [3]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2 [cli_0]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 [cli_1]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 [cli_2]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2 application called MPI_Abort(MPI_COMM_WORLD, 59) - process 3 [cli_3]: aborting job: application called MPI_Abort(MPI_COMM_WORLD, 59) - process 3 On Mon, Aug 17, 2015 at 11:16 AM, Dmitry Karpeyev wrote: > Xujun, > Regarding your original question: please send the complete error message. > Dmitry. > > On Mon, Aug 17, 2015 at 11:15 AM Xujun Zhao wrote: > >> Ahhhh, I should drink some coffee in the morning. >> Now it passed the test! >> >> On Mon, Aug 17, 2015 at 11:13 AM, Matthew Knepley >> wrote: >> >>> On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: >>> >>>> No. It gives the following error msg: >>>> >>> >>> Did you build the executable? 
>>> >>> cd src/vec/vec/examples/tutorials >>> make ex43 >>> >>> Matt >>> >>> >>>> mpirun -np 2 ex43 >>>> >>>> [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] >>>> HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process >>>> (utils/launch/launch.c:75): execvp error on file ex43 (No such file or >>>> directory) >>>> >>>> execvp error on file ex43 (No such file or directory) >>>> >>>> On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley >>>> wrote: >>>> >>>>> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao >>>>> wrote: >>>>> >>>>>> Hi all, >>>>>> >>>>>> I want PETSc to generate random vector using VecSetRandom() following >>>>>> given examples, but failed and showed some "out of memory" error. The >>>>>> following is the code, which goes well until it reaches VecSetRandom(). Can >>>>>> anyone help me figure out the reason? Thanks a lot. >>>>>> >>>>> >>>>> Does src/vec/vec/examples/tests/ex43.c run for you? >>>>> >>>>> Thanks, >>>>> >>>>> Matt >>>>> >>>>> >>>>>> XZ >>>>>> >>>>>> >>>>>> >>>>>> -------------------------------------------------------------------------------------------- >>>>>> Vec u; >>>>>> PetscRandom rand_ctx; /* random number generator context */ >>>>>> PetscMPIInt size, rank; >>>>>> PetscInt n, dn; >>>>>> >>>>>> >>>>>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>>>>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>>>>> n = N/size + 1; >>>>>> dn = n*size - N; >>>>>> if ( dn>0 && rank>>>>> printf("--->test in petsc_random_vector(): rank = %d, n = >>>>>> %d\n",rank,n); >>>>>> >>>>>> >>>>>> VecCreate(PETSC_COMM_WORLD,&u); >>>>>> VecSetSizes(u,n,N); >>>>>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>>>>> #if defined(PETSC_HAVE_DRAND48) >>>>>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>>>>> #elif defined(PETSC_HAVE_RAND) >>>>>> PetscRandomSetType(rand_ctx,PETSCRAND); >>>>>> #endif >>>>>> PetscRandomSetFromOptions(rand_ctx); >>>>>> >>>>>> >>>>>> VecSetRandom(u,rand_ctx); >>>>>> PetscRandomDestroy(&rand_ctx); >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which their >>>>> experiments lead. >>>>> -- Norbert Wiener >>>>> >>>> >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From reza.yaghmaie2 at gmail.com Mon Aug 17 11:46:35 2015 From: reza.yaghmaie2 at gmail.com (Reza Yaghmaie) Date: Mon, 17 Aug 2015 12:46:35 -0400 Subject: [petsc-users] SNESSetFunction Message-ID: Hi, I have problems with passing variables through *SNESSetFunction* in my code. basically I have the following subroutines in the main body of the Fortran code. Could you provide some insight on how to transfer variables into the residual calculation routine (*FormFunction1*)? 
Thanks, Reza ------------------------------------------------------------------------------------------------------------------ *main code* *SNES* snes *Vec* xvec,rvec *external* FormFunction1 *real*8* variable1(10),variable2(20,20),variable3(30),variable4(40,40) call *SNESSetFunction*(snes,rvec,FormFunction1, & PETSC_NULL_OBJECT, & variable1,variable2,variable3,variable4, & ierr) end subroutine *FormFunction1*(snes,XVEC,FVEC, & dummy, & varable1,varable2,varable3,varable4, & ierr) *SNES* snes *Vec* XVEC,FVEC *PetscFortranAddr* dummy *real*8* variable1(10),variable2(20,20),variable3(30),variable4(40,40) return end -------------------------------------------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From dkarpeev at gmail.com Mon Aug 17 11:49:56 2015 From: dkarpeev at gmail.com (Dmitry Karpeyev) Date: Mon, 17 Aug 2015 16:49:56 +0000 Subject: [petsc-users] Petsc creates a random vector In-Reply-To: References: Message-ID: Use a dbg build with a debugger and/or valgrind. Dmitry. On Mon, Aug 17, 2015 at 11:31 AM Xujun Zhao wrote: > This is run with PETSc opt mode, so the error message looks not very > useful, see below: > Probably I should use dbg version to see the details. > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [1]PETSC ERROR: > ------------------------------------------------------------------------ > > [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [1]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [2]PETSC ERROR: > ------------------------------------------------------------------------ > > [2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [2]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [2]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [2]PETSC ERROR: [1]PETSC ERROR: to get more information on the crash. > > to get more information on the crash. 
> > [3]PETSC ERROR: > ------------------------------------------------------------------------ > > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [3]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [3]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [3]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [0]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 > > [0]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Signal received > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [1]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 > > [1]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [2]PETSC ERROR: Signal received > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [2]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 > > [2]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > > [3]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [3]PETSC ERROR: Signal received > > [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > > [3]PETSC ERROR: ./example-opt on a arch-darwin-c-opt named > mcswl164.mcs.anl.gov by xzhao Mon Aug 17 11:20:33 2015 > > [3]PETSC ERROR: Configure options --with-cc=gcc-4.9 --with-cxx=g++-4.9 > --with-fc=gfortran-4.9 --with-cxx-dialect=C++11 --download-mpich > --download-fblaslapack --download-scalapack --download-mumps > --download-superlu_dist --download-hypre --download-ml --download-parmetis > --download-metis --download-triangle --download-chaco --download-elemental > --with-debugging=0 > > [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2 > > [cli_0]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > [cli_1]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 1 > > [cli_2]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 2 > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 3 > > [cli_3]: aborting job: > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 3 > > On Mon, Aug 17, 2015 at 11:16 AM, Dmitry Karpeyev > wrote: > >> Xujun, >> Regarding your original question: please send the complete error message. >> Dmitry. >> >> On Mon, Aug 17, 2015 at 11:15 AM Xujun Zhao wrote: >> >>> Ahhhh, I should drink some coffee in the morning. >>> Now it passed the test! >>> >>> On Mon, Aug 17, 2015 at 11:13 AM, Matthew Knepley >>> wrote: >>> >>>> On Mon, Aug 17, 2015 at 11:02 AM, Xujun Zhao wrote: >>>> >>>>> No. It gives the following error msg: >>>>> >>>> >>>> Did you build the executable? >>>> >>>> cd src/vec/vec/examples/tutorials >>>> make ex43 >>>> >>>> Matt >>>> >>>> >>>>> mpirun -np 2 ex43 >>>>> >>>>> [proxy:0:0 at mcswl164.mcs.anl.gov] [proxy:0:0 at mcswl164.mcs.anl.gov] >>>>> HYDU_create_process (utils/launch/launch.c:75): HYDU_create_process >>>>> (utils/launch/launch.c:75): execvp error on file ex43 (No such file or >>>>> directory) >>>>> >>>>> execvp error on file ex43 (No such file or directory) >>>>> >>>>> On Mon, Aug 17, 2015 at 10:57 AM, Matthew Knepley >>>>> wrote: >>>>> >>>>>> On Mon, Aug 17, 2015 at 10:49 AM, Xujun Zhao >>>>>> wrote: >>>>>> >>>>>>> Hi all, >>>>>>> >>>>>>> I want PETSc to generate random vector using VecSetRandom() >>>>>>> following given examples, but failed and showed some "out of memory" error. >>>>>>> The following is the code, which goes well until it reaches VecSetRandom(). 
>>>>>>> Can anyone help me figure out the reason? Thanks a lot. >>>>>>> >>>>>> >>>>>> Does src/vec/vec/examples/tests/ex43.c run for you? >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Matt >>>>>> >>>>>> >>>>>>> XZ >>>>>>> >>>>>>> >>>>>>> >>>>>>> -------------------------------------------------------------------------------------------- >>>>>>> Vec u; >>>>>>> PetscRandom rand_ctx; /* random number generator context */ >>>>>>> PetscMPIInt size, rank; >>>>>>> PetscInt n, dn; >>>>>>> >>>>>>> >>>>>>> MPI_Comm_rank(PETSC_COMM_WORLD,&rank);//CHKERRQ(ierr); >>>>>>> MPI_Comm_size(PETSC_COMM_WORLD,&size);//CHKERRQ(ierr); >>>>>>> n = N/size + 1; >>>>>>> dn = n*size - N; >>>>>>> if ( dn>0 && rank>>>>>> printf("--->test in petsc_random_vector(): rank = %d, n = >>>>>>> %d\n",rank,n); >>>>>>> >>>>>>> >>>>>>> VecCreate(PETSC_COMM_WORLD,&u); >>>>>>> VecSetSizes(u,n,N); >>>>>>> PetscRandomCreate(PETSC_COMM_WORLD, &rand_ctx); >>>>>>> #if defined(PETSC_HAVE_DRAND48) >>>>>>> PetscRandomSetType(rand_ctx,PETSCRAND48); >>>>>>> #elif defined(PETSC_HAVE_RAND) >>>>>>> PetscRandomSetType(rand_ctx,PETSCRAND); >>>>>>> #endif >>>>>>> PetscRandomSetFromOptions(rand_ctx); >>>>>>> >>>>>>> >>>>>>> VecSetRandom(u,rand_ctx); >>>>>>> PetscRandomDestroy(&rand_ctx); >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. >>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>> >>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 17 12:39:31 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 17 Aug 2015 12:39:31 -0500 Subject: [petsc-users] SNESSetFunction In-Reply-To: References: Message-ID: On Mon, Aug 17, 2015 at 11:46 AM, Reza Yaghmaie wrote: > > Hi, > > I have problems with passing variables through *SNESSetFunction* in my > code. basically I have the following subroutines in the main body of the > Fortran code. Could you provide some insight on how to transfer variables > into the residual calculation routine (*FormFunction1*)? > Extra arguments to your FormFunction are meant to be passed in a context, through the context variable. This is difficult in Fortran, but you can use a PetscObject as a container. You can attach other PetscObjects using PetscObjectCompose() in Fortran. 
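A bare-bones Fortran sketch of the context idea discussed here: the extra arrays are gathered into a user-defined type, that variable is passed as the context argument of SNESSetFunction(), and the same variable arrives untouched in the residual routine. Type and field names are illustrative, the PETSc Fortran include files are left out for brevity, and the ex5f90 tutorial in the PETSc source tree contains a complete working version of this pattern.

      module userdata
        type appctx
          real*8 :: variable1(10)
          real*8 :: variable2(20,20)
          real*8 :: variable3(30)
          real*8 :: variable4(40,40)
        end type appctx
      end module userdata

!     in the main program:
!       use userdata
!       type(appctx) :: user
!       ... fill user%variable1, ..., user%variable4, then:
!       call SNESSetFunction(snes,rvec,FormFunction1,user,ierr)

      subroutine FormFunction1(snes,xvec,fvec,user,ierr)
        use userdata
        SNES snes
        Vec xvec,fvec
        type(appctx) user
        PetscErrorCode ierr
!       ... read user%variable1 etc. and assemble fvec from xvec ...
        ierr = 0
      end subroutine FormFunction1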
Matt > Thanks, > Reza > > ------------------------------------------------------------------------------------------------------------------ > *main code* > > *SNES* snes > *Vec* xvec,rvec > *external* FormFunction1 > *real*8* > variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > call *SNESSetFunction*(snes,rvec,FormFunction1, > & PETSC_NULL_OBJECT, > & variable1,variable2,variable3,variable4, > & ierr) > > end > > subroutine *FormFunction1*(snes,XVEC,FVEC, > & dummy, > & varable1,varable2,varable3,varable4, > & ierr) > > *SNES* snes > *Vec* XVEC,FVEC > *PetscFortranAddr* dummy > *real*8* > variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > return > end > > -------------------------------------------------------------------------------------------------------------- > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jychang48 at gmail.com Mon Aug 17 13:21:51 2015 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 17 Aug 2015 12:21:51 -0600 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: <55CDA6E2.5000302@ntnu.no> References: <55CDA6E2.5000302@ntnu.no> Message-ID: Thanks everyone for your valuable input, a few follow up questions: 1) The specs for my machine says there are 10 cores and 20 threads. Does that mean for each socket, i have 10 cores where each core has 2 threads? Or does it mean that each core can use up to 20 threads? Or something else entirely? 2a) When I do an hwloc-info on a single compute node: $ hwloc-info depth 0: 1 Machine (type #1) depth 1: 2 NUMANode (type #2) depth 2: 2 Socket (type #3) depth 3: 2 L3Cache (type #4) depth 4: 20 L2Cache (type #4) depth 5: 20 L1dCache (type #4) depth 6: 20 L1iCache (type #4) depth 7: 20 Core (type #5) depth 8: 20 PU (type #6) Special depth -3: 5 Bridge (type #9) Special depth -4: 6 PCI Device (type #10) Special depth -5: 6 OS Device (type #11) With this setup, does it mean that if I invoke mpiexec.hydra -np -bind-to hwthread ... the MPI program will bind to the cores? 2b) Our headnode has 40 PU at depth 8, so if I -bind-to hwthread on this node (and get yelled at by the system admins) it's possible that two MPI processes can run on the same core? 3) When I invoke an MPI process via mpiexec.hydra -np ... without any bindings, do we know what exactly is going on? Thanks, Justin On Fri, Aug 14, 2015 at 2:29 AM, ?smund Ervik wrote: >>> So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect >>> your memory or motherboard is at most 1600 MHz, so your peak would be >>> 102.4 GB/s. >> >>> You can check this as root using "dmidecode --type 17", which should >>> give one entry per channel, looking something like this: >>> >>> Handle 0x002B, DMI type 17, 34 bytes >>> Memory Device >>> Array Handle: 0x002A >>> Error Information Handle: 0x002F >>> Total Width: Unknown >>> Data Width: Unknown >>> Size: 4096 MB >>> Form Factor: DIMM >>> Set: None >>> Locator: DIMM0 >>> Bank Locator: BANK 0 >>> Type: >>> Type Detail: None >>> Speed: Unknown >>> Manufacturer: Not Specified >>> Serial Number: Not Specified >>> Asset Tag: Unknown >>> Part Number: Not Specified >>> Rank: Unknown >>> Configured Clock Speed: 1600 MHz >> >>I have no root access. Is there another way to confirm the clock speed? 
> > Also note: even in the case where your motherboard, RAM and CPU all say > 1866 on the label, if there are more memory DIMMs (chips) per node than > channels, say 16 DIMMs on your 8 channels, you will see a performance > reduction on the order of 20-30%. This is more likely if you are using > nodes in a "high-memory queue" or similar where there's >= 128 GB memory > per node. (This will change in the future when/if people start using > DDR4 LRDIMMs.) There's a series of in-depth discussions here: > http://frankdenneman.nl/2015/02/20/memory-deep-dive/ and there's also > lots of interesting memory-stuff on John McCalpin's blog: > https://sites.utexas.edu/jdm4372/ > > Regards, > ?smund > From bsmith at mcs.anl.gov Mon Aug 17 13:25:46 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Aug 2015 13:25:46 -0500 Subject: [petsc-users] SNESSetFunction In-Reply-To: References: Message-ID: Reza, See src/snes/examples/tutorials/ex5f90.F for how this may be easily done using a Fortran user defined type Barry > On Aug 17, 2015, at 12:39 PM, Matthew Knepley wrote: > > On Mon, Aug 17, 2015 at 11:46 AM, Reza Yaghmaie wrote: > > Hi, > > I have problems with passing variables through SNESSetFunction in my code. basically I have the following subroutines in the main body of the Fortran code. Could you provide some insight on how to transfer variables into the residual calculation routine (FormFunction1)? > > Extra arguments to your FormFunction are meant to be passed in a context, through the context variable. > > This is difficult in Fortran, but you can use a PetscObject as a container. You can attach other > PetscObjects using PetscObjectCompose() in Fortran. > > Matt > > Thanks, > Reza > ------------------------------------------------------------------------------------------------------------------ > main code > > SNES snes > Vec xvec,rvec > external FormFunction1 > real*8 variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > call SNESSetFunction(snes,rvec,FormFunction1, > & PETSC_NULL_OBJECT, > & variable1,variable2,variable3,variable4, > & ierr) > > end > > subroutine FormFunction1(snes,XVEC,FVEC, > & dummy, > & varable1,varable2,varable3,varable4, > & ierr) > > SNES snes > Vec XVEC,FVEC > PetscFortranAddr dummy > real*8 variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > return > end > -------------------------------------------------------------------------------------------------------------- > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Mon Aug 17 13:35:31 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Aug 2015 13:35:31 -0500 Subject: [petsc-users] Understanding the memory bandwidth In-Reply-To: References: <55CDA6E2.5000302@ntnu.no> Message-ID: <7FED100B-A227-4C8D-AB1C-DA3FCF9C8F38@mcs.anl.gov> > On Aug 17, 2015, at 1:21 PM, Justin Chang wrote: > > Thanks everyone for your valuable input, a few follow up questions: > > 1) The specs for my machine says there are 10 cores and 20 threads. > Does that mean for each socket, i have 10 cores where each core has 2 > threads? Or does it mean that each core can use up to 20 threads? Or > something else entirely? Some times a single core has support for multiple (like 2) "hardware threads". 
What this means is that the core has "extra" hardware, generally registers, that allow switching between 2 threads on the core without having to save all the registers for one thread and load all the registers from the other thread (essentially it has more registers then it would have without support for "hardware threads". This means that if the core has two threads, they can be switched back and forth very rapidly. The reason hardware designers put this in is so if one thread is waiting for memory loads, it can switch to the other thread and get some work done during the time. This allows memory latency hiding. The term "hardware threads" is not a very accurate term IMHO. You can use any many or as few threads on this system as you want to, the system is just optimized hardware wise to run with 20 threads for latency hiding. Note that PETSc codes are generally memory bandwidth limited, not memory latency limited and so generally with PETSc code it makes sense to use fewer threads than cores (you are not utilizing all the "extra" hardware then but it is faster). Barry > > 2a) When I do an hwloc-info on a single compute node: > > $ hwloc-info > > depth 0: 1 Machine (type #1) > > depth 1: 2 NUMANode (type #2) > > depth 2: 2 Socket (type #3) > > depth 3: 2 L3Cache (type #4) > > depth 4: 20 L2Cache (type #4) > > depth 5: 20 L1dCache (type #4) > > depth 6: 20 L1iCache (type #4) > > depth 7: 20 Core (type #5) > > depth 8: 20 PU (type #6) > > Special depth -3: 5 Bridge (type #9) > > Special depth -4: 6 PCI Device (type #10) > > Special depth -5: 6 OS Device (type #11) > > With this setup, does it mean that if I invoke mpiexec.hydra -np > -bind-to hwthread ... the MPI program will bind to the cores? > > 2b) Our headnode has 40 PU at depth 8, so if I -bind-to hwthread on > this node (and get yelled at by the system admins) it's possible that > two MPI processes can run on the same core? > > 3) When I invoke an MPI process via mpiexec.hydra -np ... > without any bindings, do we know what exactly is going on? > > Thanks, > Justin > > On Fri, Aug 14, 2015 at 2:29 AM, ?smund Ervik wrote: >>>> So this is a pretty low fraction (55%) of 59.7*2 = 119.4. I suspect >>>> your memory or motherboard is at most 1600 MHz, so your peak would be >>>> 102.4 GB/s. >>> >>>> You can check this as root using "dmidecode --type 17", which should >>>> give one entry per channel, looking something like this: >>>> >>>> Handle 0x002B, DMI type 17, 34 bytes >>>> Memory Device >>>> Array Handle: 0x002A >>>> Error Information Handle: 0x002F >>>> Total Width: Unknown >>>> Data Width: Unknown >>>> Size: 4096 MB >>>> Form Factor: DIMM >>>> Set: None >>>> Locator: DIMM0 >>>> Bank Locator: BANK 0 >>>> Type: >>>> Type Detail: None >>>> Speed: Unknown >>>> Manufacturer: Not Specified >>>> Serial Number: Not Specified >>>> Asset Tag: Unknown >>>> Part Number: Not Specified >>>> Rank: Unknown >>>> Configured Clock Speed: 1600 MHz >>> >>> I have no root access. Is there another way to confirm the clock speed? >> >> Also note: even in the case where your motherboard, RAM and CPU all say >> 1866 on the label, if there are more memory DIMMs (chips) per node than >> channels, say 16 DIMMs on your 8 channels, you will see a performance >> reduction on the order of 20-30%. This is more likely if you are using >> nodes in a "high-memory queue" or similar where there's >= 128 GB memory >> per node. (This will change in the future when/if people start using >> DDR4 LRDIMMs.) 
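To make the numbers being discussed concrete (for example the 55%-of-peak fraction quoted above), the usual measurement is a STREAM-style triad; PETSc ships its own version of this benchmark (the "streams" make target). The stand-alone sketch below only illustrates the kernel being timed, it is not the benchmark itself, and a single-threaded run will understate the aggregate bandwidth of a fully populated node.

--------------------------------------------------------------------------------------------

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (10*1000*1000)   /* large enough that the arrays do not fit in cache */

int main(void)
{
  double  *a = malloc(N*sizeof(double));
  double  *b = malloc(N*sizeof(double));
  double  *c = malloc(N*sizeof(double));
  double  scalar = 3.0, sec;
  long    i;
  clock_t t0;

  if (!a || !b || !c) return 1;
  for (i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; }

  t0 = clock();
  for (i = 0; i < N; i++) a[i] = b[i] + scalar*c[i];   /* triad: two loads, one store */
  sec = (double)(clock() - t0)/CLOCKS_PER_SEC;

  /* three arrays of 8-byte doubles each move through memory once */
  printf("a[42]=%g  approx single-thread bandwidth: %.1f GB/s\n",
         a[42], 3.0*N*sizeof(double)/sec/1e9);
  free(a); free(b); free(c);
  return 0;
}

--------------------------------------------------------------------------------------------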
There's a series of in-depth discussions here: >> http://frankdenneman.nl/2015/02/20/memory-deep-dive/ and there's also >> lots of interesting memory-stuff on John McCalpin's blog: >> https://sites.utexas.edu/jdm4372/ >> >> Regards, >> ?smund >> From gideon.simpson at gmail.com Mon Aug 17 15:05:27 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Mon, 17 Aug 2015 16:05:27 -0400 Subject: [petsc-users] superlu_dist output Message-ID: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> Is there a way to suppress this kind of output when running with superlu_dist? .. equilibrated? *equed = B .. LDPERM job 5 time: 0.00 .. anorm 3.541231e+01 .. Use METIS ordering on A'+A .. symbfact(): relax 10, maxsuper 60, fill 4 No of supers 73 Size of G(L) 991 Size of G(U) 722 int 4, short 2, float 4, double 8 SYMBfact (MB): L\U 0.02 total 0.03 expansions 0 .. # L blocks 228 # U blocks 211 MPI tag upper bound = 536870911 === using DAG === * init: 1.779720e-04 seconds .. thresh = s_eps 0.000000e+00 * anorm 3.541231e+01 = 0.000000e+00 .. Buffer size: Lsub 30 Lval 200 Usub 35 Uval 120 LDA 20 NUMfact (MB) all PEs: L\U 0.10 all 0.12 All space (MB): total 0.32 Avg 0.32 Max 0.32 Number of tiny pivots: 0 .. DiagScale = 3 .. LDPERM job 5 time: 0.00 .. DiagScale = 3 .. LDPERM job 5 time: 0.00 .. DiagScale = 3 .. LDPERM job 5 time: 0.00 .. anorm 3.378641e+01 -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Aug 17 15:14:10 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 17 Aug 2015 15:14:10 -0500 Subject: [petsc-users] superlu_dist output In-Reply-To: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> References: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> Message-ID: How can we reproduce this? Is this using debian pkg install of petsc, superlu_dist? [which versions?] >>>>> balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ $PETSC_DIR/bin/petscmpiexec -n 2 ./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist Norm of error 1.87427e-15 iterations 1 balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ <<<<<< You can try installing our latest version using --download-superlu_dist and see if you still have this issue. Satish Satish On Mon, 17 Aug 2015, Gideon Simpson wrote: > Is there a way to suppress this kind of output when running with superlu_dist? > > .. equilibrated? *equed = B > .. LDPERM job 5 time: 0.00 > .. anorm 3.541231e+01 > .. Use METIS ordering on A'+A > .. symbfact(): relax 10, maxsuper 60, fill 4 > No of supers 73 > Size of G(L) 991 > Size of G(U) 722 > int 4, short 2, float 4, double 8 > SYMBfact (MB): L\U 0.02 total 0.03 expansions 0 > .. # L blocks 228 # U blocks 211 > MPI tag upper bound = 536870911 > === using DAG === > * init: 1.779720e-04 seconds > .. thresh = s_eps 0.000000e+00 * anorm 3.541231e+01 = 0.000000e+00 > .. Buffer size: Lsub 30 Lval 200 Usub 35 Uval 120 LDA 20 > NUMfact (MB) all PEs: L\U 0.10 all 0.12 > All space (MB): total 0.32 Avg 0.32 Max 0.32 > Number of tiny pivots: 0 > .. DiagScale = 3 > .. LDPERM job 5 time: 0.00 > .. DiagScale = 3 > .. LDPERM job 5 time: 0.00 > .. DiagScale = 3 > .. LDPERM job 5 time: 0.00 > .. 
anorm 3.378641e+01 > > -gideon > > From gideon.simpson at gmail.com Mon Aug 17 15:18:49 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Mon, 17 Aug 2015 16:18:49 -0400 Subject: [petsc-users] superlu_dist output In-Reply-To: References: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> Message-ID: <36CA333A-7F0A-4A2C-9411-F651740271E8@gmail.com> This is the macports version. For me, this appears when running a problem with a linear solver with the flags -pc_factor_mat_solver_package superlu_dist -pc_type lu Nothing else is required. The answer it gives is correct, I?d just like to suppress the output. -gideon > On Aug 17, 2015, at 4:14 PM, Satish Balay wrote: > > How can we reproduce this? Is this using debian pkg install of petsc, > superlu_dist? [which versions?] > >>>>>> > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ $PETSC_DIR/bin/petscmpiexec -n 2 ./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist > Norm of error 1.87427e-15 iterations 1 > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ > <<<<<< > > You can try installing our latest version using > --download-superlu_dist and see if you still have this issue. > > Satish > > Satish > > On Mon, 17 Aug 2015, Gideon Simpson wrote: > >> Is there a way to suppress this kind of output when running with superlu_dist? >> >> .. equilibrated? *equed = B >> .. LDPERM job 5 time: 0.00 >> .. anorm 3.541231e+01 >> .. Use METIS ordering on A'+A >> .. symbfact(): relax 10, maxsuper 60, fill 4 >> No of supers 73 >> Size of G(L) 991 >> Size of G(U) 722 >> int 4, short 2, float 4, double 8 >> SYMBfact (MB): L\U 0.02 total 0.03 expansions 0 >> .. # L blocks 228 # U blocks 211 >> MPI tag upper bound = 536870911 >> === using DAG === >> * init: 1.779720e-04 seconds >> .. thresh = s_eps 0.000000e+00 * anorm 3.541231e+01 = 0.000000e+00 >> .. Buffer size: Lsub 30 Lval 200 Usub 35 Uval 120 LDA 20 >> NUMfact (MB) all PEs: L\U 0.10 all 0.12 >> All space (MB): total 0.32 Avg 0.32 Max 0.32 >> Number of tiny pivots: 0 >> .. DiagScale = 3 >> .. LDPERM job 5 time: 0.00 >> .. DiagScale = 3 >> .. LDPERM job 5 time: 0.00 >> .. DiagScale = 3 >> .. LDPERM job 5 time: 0.00 >> .. anorm 3.378641e+01 >> >> -gideon >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Aug 17 15:37:49 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 17 Aug 2015 15:37:49 -0500 Subject: [petsc-users] superlu_dist output In-Reply-To: <36CA333A-7F0A-4A2C-9411-F651740271E8@gmail.com> References: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> <36CA333A-7F0A-4A2C-9411-F651740271E8@gmail.com> Message-ID: from superlu_dist README >>>>>>>> o -DPRNTlevel=[0,1,2,...] printing level to show solver's execution details. (default is 0) o -DDEBUGlevel=[0,1,2,...] diagnostic printing level for debugging purpose. (default is 0) <<<<< Presumably macports version of superlu_dist is built with these flags enabled. cc:ing Sean to check if its an issue with macports superlu_dist package. If you need to get rid of this output - you can build petsc from sources [with --download_superlu_dist] Satish On Mon, 17 Aug 2015, Gideon Simpson wrote: > This is the macports version. For me, this appears when running a problem with a linear solver with the flags > > -pc_factor_mat_solver_package superlu_dist -pc_type lu > > Nothing else is required. The answer it gives is correct, I?d just like to suppress the output. 
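For reference, the two flags above can also be set in code; a rough equivalent for an existing KSP (sketch only, using the PETSc 3.5-era call names) is shown below. The chatter itself is printed by SuperLU_DIST rather than by this PETSc-side setup, which is why the resolution further down concerns how the SuperLU_DIST library was compiled.

--------------------------------------------------------------------------------------------

#include <petscksp.h>

/* Mirror -pc_type lu -pc_factor_mat_solver_package superlu_dist for a KSP
   that has already been created and given its operator. */
static PetscErrorCode UseSuperLUDistLU(KSP ksp)
{
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERSUPERLU_DIST);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

--------------------------------------------------------------------------------------------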
> > -gideon > > > On Aug 17, 2015, at 4:14 PM, Satish Balay wrote: > > > > How can we reproduce this? Is this using debian pkg install of petsc, > > superlu_dist? [which versions?] > > > >>>>>> > > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ $PETSC_DIR/bin/petscmpiexec -n 2 ./ex2 -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist > > Norm of error 1.87427e-15 iterations 1 > > balay at es^/scratch/balay/petsc/src/ksp/ksp/examples/tutorials(maint=) $ > > <<<<<< > > > > You can try installing our latest version using > > --download-superlu_dist and see if you still have this issue. > > > > Satish > > > > Satish > > > > On Mon, 17 Aug 2015, Gideon Simpson wrote: > > > >> Is there a way to suppress this kind of output when running with superlu_dist? > >> > >> .. equilibrated? *equed = B > >> .. LDPERM job 5 time: 0.00 > >> .. anorm 3.541231e+01 > >> .. Use METIS ordering on A'+A > >> .. symbfact(): relax 10, maxsuper 60, fill 4 > >> No of supers 73 > >> Size of G(L) 991 > >> Size of G(U) 722 > >> int 4, short 2, float 4, double 8 > >> SYMBfact (MB): L\U 0.02 total 0.03 expansions 0 > >> .. # L blocks 228 # U blocks 211 > >> MPI tag upper bound = 536870911 > >> === using DAG === > >> * init: 1.779720e-04 seconds > >> .. thresh = s_eps 0.000000e+00 * anorm 3.541231e+01 = 0.000000e+00 > >> .. Buffer size: Lsub 30 Lval 200 Usub 35 Uval 120 LDA 20 > >> NUMfact (MB) all PEs: L\U 0.10 all 0.12 > >> All space (MB): total 0.32 Avg 0.32 Max 0.32 > >> Number of tiny pivots: 0 > >> .. DiagScale = 3 > >> .. LDPERM job 5 time: 0.00 > >> .. DiagScale = 3 > >> .. LDPERM job 5 time: 0.00 > >> .. DiagScale = 3 > >> .. LDPERM job 5 time: 0.00 > >> .. anorm 3.378641e+01 > >> > >> -gideon > >> > >> > > > > From sean at farley.io Mon Aug 17 16:13:08 2015 From: sean at farley.io (Sean Farley) Date: Mon, 17 Aug 2015 14:13:08 -0700 Subject: [petsc-users] superlu_dist output In-Reply-To: References: <9808DCD8-4D9D-46E4-82BE-BBB8F0537840@gmail.com> <36CA333A-7F0A-4A2C-9411-F651740271E8@gmail.com> Message-ID: Satish Balay writes: > from superlu_dist README > >>>>>>>>> > o -DPRNTlevel=[0,1,2,...] > printing level to show solver's execution details. (default is 0) > > o -DDEBUGlevel=[0,1,2,...] > diagnostic printing level for debugging purpose. (default is 0) > <<<<< > > Presumably macports version of superlu_dist is built with these flags enabled. > > cc:ing Sean to check if its an issue with macports superlu_dist package. I don't think they're set at all: http://trac.macports.org/browser/trunk/dports/math/superlu_dist/Portfile I'm in the process of upgrading superlu_dist soon, so maybe that will help. From fdkong.jd at gmail.com Mon Aug 17 17:32:35 2015 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 17 Aug 2015 16:32:35 -0600 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? Message-ID: Hi all, I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any special reasons to distinguish them? Thanks, Fande Kong, -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Aug 17 17:49:32 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 17 Aug 2015 17:49:32 -0500 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? In-Reply-To: References: Message-ID: I think some MPI impls didn't provide some of the ops on MPI_COMPLEX datatype. 
So petsc provides these ops for PetscReal i.e MPIU_SUM, MPIU_MAX, MPIU_MIN Satish On Mon, 17 Aug 2015, Fande Kong wrote: > Hi all, > > I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM > meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any special > reasons to distinguish them? > > Thanks, > > Fande Kong, > From bsmith at mcs.anl.gov Mon Aug 17 19:18:52 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Aug 2015 19:18:52 -0500 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? In-Reply-To: References: Message-ID: It is crucial. MPI also doesn't provide sums for __float128 precision. But MPI does always provide sums for 32 and 64 bit integers so no need for MPIU_SUM for PETSC_INT > On Aug 17, 2015, at 5:49 PM, Satish Balay wrote: > > I think some MPI impls didn't provide some of the ops on MPI_COMPLEX > datatype. > > So petsc provides these ops for PetscReal i.e MPIU_SUM, MPIU_MAX, MPIU_MIN > > Satish > > On Mon, 17 Aug 2015, Fande Kong wrote: > >> Hi all, >> >> I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM >> meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any special >> reasons to distinguish them? >> >> Thanks, >> >> Fande Kong, >> > From fdkong.jd at gmail.com Mon Aug 17 21:53:01 2015 From: fdkong.jd at gmail.com (Fande Kong) Date: Mon, 17 Aug 2015 20:53:01 -0600 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? In-Reply-To: References: Message-ID: Thanks, Barry, Satish, But, is it possible to uniform the use of MPI_SUM and MPIU_SUM? For example, we could let a Petsc function just switch to a regular MPI_Reduce or other function when using PetscInt. In other words, we need a wrapper. I always use MPIU_INT in a MPI function when using PetscInt. It is very straightforward to use MPIU_SUM, MPIU_MAX so on, when thinking about we are using MPIU_INT. Thanks, Fande Kong, On Mon, Aug 17, 2015 at 6:18 PM, Barry Smith wrote: > > It is crucial. MPI also doesn't provide sums for __float128 precision. > But MPI does always provide sums for 32 and 64 bit integers so no need for > MPIU_SUM for PETSC_INT > > > > On Aug 17, 2015, at 5:49 PM, Satish Balay wrote: > > > > I think some MPI impls didn't provide some of the ops on MPI_COMPLEX > > datatype. > > > > So petsc provides these ops for PetscReal i.e MPIU_SUM, MPIU_MAX, > MPIU_MIN > > > > Satish > > > > On Mon, 17 Aug 2015, Fande Kong wrote: > > > >> Hi all, > >> > >> I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM > >> meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any > special > >> reasons to distinguish them? > >> > >> Thanks, > >> > >> Fande Kong, > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 17 22:01:17 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 17 Aug 2015 22:01:17 -0500 Subject: [petsc-users] any reasons to distinguish MPIU_SUM from MPI_SUM? In-Reply-To: References: Message-ID: > On Aug 17, 2015, at 9:53 PM, Fande Kong wrote: > > Thanks, Barry, Satish, > > But, is it possible to uniform the use of MPI_SUM and MPIU_SUM? For example, we could let a Petsc function just switch to a regular MPI_Reduce or other function when using PetscInt. In other words, we need a wrapper. I always use MPIU_INT in a MPI function when using PetscInt. It is very straightforward to use MPIU_SUM, MPIU_MAX so on, when thinking about we are using MPIU_INT. 
We could add code to the routine that gets called when one uses MPIU_SUM which is PetscSum_Local() and defined in pinit.c to handle all possible data types then you could always use MPIU_SUM. The reason we don't is that using a user provide reduction such as PetscSum_Local() will ALWAYS be less efficient then using the MPI built in reduction operations. For integers which MPI can always handle we prefer to us the fastest possible which is the built in operation for summing. Now likely the time difference between the user provided one vs the built in one is too small to measure, I agree, but for me it is easy enough just to remember that MPIU_SUM is only needed for floating pointer numbers not integers. Barry > > Thanks, > > Fande Kong, > > On Mon, Aug 17, 2015 at 6:18 PM, Barry Smith wrote: > > It is crucial. MPI also doesn't provide sums for __float128 precision. But MPI does always provide sums for 32 and 64 bit integers so no need for MPIU_SUM for PETSC_INT > > > > On Aug 17, 2015, at 5:49 PM, Satish Balay wrote: > > > > I think some MPI impls didn't provide some of the ops on MPI_COMPLEX > > datatype. > > > > So petsc provides these ops for PetscReal i.e MPIU_SUM, MPIU_MAX, MPIU_MIN > > > > Satish > > > > On Mon, 17 Aug 2015, Fande Kong wrote: > > > >> Hi all, > >> > >> I was wondering why, in Petsc, MPI_Reduce with PetscInt needs MPI_SUM > >> meanwhile MPI_Reduce with PetscReal needs MPIU_SUM? Do we have any special > >> reasons to distinguish them? > >> > >> Thanks, > >> > >> Fande Kong, > >> > > > > From timothee.nicolas at gmail.com Tue Aug 18 03:42:36 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 18 Aug 2015 17:42:36 +0900 Subject: [petsc-users] Wise usage of user contexts Message-ID: Hi all, I am in the process of writing an implicit solver for a set of PDEs (namely MHD equations), in FORTRAN. When setting the non-linear function to solve via Newton-Krylov, I use a "user defined context", namely the thing denoted by "ctx" on the doc page about SNESFunction : http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction In practice ctx is a user defined type which contains everything I need in the local routine which sets the function on the local part of the grid, FormFunctionLocal. That is, some local/global geometrical information on the grid, the physical parameter, and possibly any other thing. In my case it so happens that due to the scheme I have chosen, when I compute my function, I need the full solution of the problem at the last two time steps (which are in Vec format). So my ctx contains two Vec elements. Since I will work in 3D and intend to use a lot of points in the future, I am concerned about memory problems which could occur. Is there a limit to the size occupied by ctx ? Would this be better if instead I was declaring global variables in a module and using this module inside FormFunctionLocal ? Is this allowed ? Best regards Timothee NICOLAS -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Tue Aug 18 03:59:58 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Tue, 18 Aug 2015 10:59:58 +0200 Subject: [petsc-users] Wise usage of user contexts In-Reply-To: References: Message-ID: On 18 August 2015 at 10:42, Timoth?e Nicolas wrote: > Hi all, > > I am in the process of writing an implicit solver for a set of PDEs > (namely MHD equations), in FORTRAN. 
When setting the non-linear function to > solve via Newton-Krylov, I use a "user defined context", namely the thing > denoted by "ctx" on the doc page about SNESFunction : > > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction > > In practice ctx is a user defined type which contains everything I need in > the local routine which sets the function on the local part of the grid, > FormFunctionLocal. That is, some local/global geometrical information on > the grid, the physical parameter, and possibly any other thing. > > In my case it so happens that due to the scheme I have chosen, when I > compute my function, I need the full solution of the problem at the last > two time steps (which are in Vec format). So my ctx contains two Vec > elements. Since I will work in 3D and intend to use a lot of points in the > future, I am concerned about memory problems which could occur. > In the grand scheme of things, the two vectors in your context aren't likely to significantly add to the total memory footprint of your code. A couple of things to note: * If you run in parallel, only the local part of the vector will be stored on each MPI process. * All the KSP methods will allocate auxiliary vectors. Most methods require more than 2 auxiliary vectors. * SNES also requires auxiliary vectors. If you use JFNK, that method will also need some additional temporary vectors. * If you assemble a Jacobian, this matrix will likely require much more memory per MPI process than two vectors > Is there a limit to the size occupied by ctx ? > The only limit is defined by the available memory per MPI process you have on your target machine. > Would this be better if instead I was declaring global variables in a > module and using this module inside FormFunctionLocal ? Is this allowed ? > What would be the difference in doing that - the memory usage will be identical. Cheers Dave > > Best regards > > Timothee NICOLAS > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Tue Aug 18 04:20:35 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 18 Aug 2015 18:20:35 +0900 Subject: [petsc-users] Wise usage of user contexts In-Reply-To: References: Message-ID: Dave, Thx a lot for your very clear answer. My last question about modules could be reformulated like this : Why would I put anything in a ctx while I could simply use modules ? Maybe it has something to do with the fact that PETSc is initially written for C ? Best Timothee 2015-08-18 17:59 GMT+09:00 Dave May : > > > On 18 August 2015 at 10:42, Timoth?e Nicolas > wrote: > >> Hi all, >> >> I am in the process of writing an implicit solver for a set of PDEs >> (namely MHD equations), in FORTRAN. When setting the non-linear function to >> solve via Newton-Krylov, I use a "user defined context", namely the thing >> denoted by "ctx" on the doc page about SNESFunction : >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction >> >> In practice ctx is a user defined type which contains everything I need >> in the local routine which sets the function on the local part of the grid, >> FormFunctionLocal. That is, some local/global geometrical information on >> the grid, the physical parameter, and possibly any other thing. 
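As a concrete illustration of such a context (written in C here rather than Fortran, with every name invented for the sketch), the layout might look like the following; the two Vec members anticipate the time-history storage described in the next paragraph.

--------------------------------------------------------------------------------------------

#include <petscdmda.h>

/* Invented example of a user context for a DMDA-based FormFunctionLocal(). */
typedef struct {
  DM        da;        /* grid/geometry information */
  PetscReal eta, nu;   /* physical parameters (placeholders) */
  Vec       sol_nm1;   /* solution at time step n-1 */
  Vec       sol_nm2;   /* solution at time step n-2 */
} MHDCtx;

/* After each accepted step the history is rotated, e.g.
     ierr = VecCopy(ctx->sol_nm1,ctx->sol_nm2);CHKERRQ(ierr);
     ierr = VecCopy(x,ctx->sol_nm1);CHKERRQ(ierr);
   Keeping the two Vecs in the context costs two extra (distributed) solution
   vectors, which, as noted in the reply, is small next to the Krylov work
   vectors and an assembled Jacobian. */

--------------------------------------------------------------------------------------------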
>> >> In my case it so happens that due to the scheme I have chosen, when I >> compute my function, I need the full solution of the problem at the last >> two time steps (which are in Vec format). So my ctx contains two Vec >> elements. Since I will work in 3D and intend to use a lot of points in the >> future, I am concerned about memory problems which could occur. >> > > In the grand scheme of things, the two vectors in your context aren't > likely to significantly add to the total memory footprint of your code. A > couple of things to note: > * If you run in parallel, only the local part of the vector will be stored > on each MPI process. > * All the KSP methods will allocate auxiliary vectors. Most methods > require more than 2 auxiliary vectors. > * SNES also requires auxiliary vectors. If you use JFNK, that method will > also need some additional temporary vectors. > * If you assemble a Jacobian, this matrix will likely require much more > memory per MPI process than two vectors > > > >> Is there a limit to the size occupied by ctx ? >> > > The only limit is defined by the available memory per MPI process you have > on your target machine. > > >> Would this be better if instead I was declaring global variables in a >> module and using this module inside FormFunctionLocal ? Is this allowed ? >> > > What would be the difference in doing that - the memory usage will be > identical. > > Cheers > Dave > > >> >> Best regards >> >> Timothee NICOLAS >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From D.J.P.Lahaye at tudelft.nl Tue Aug 18 06:34:14 2015 From: D.J.P.Lahaye at tudelft.nl (Domenico Lahaye - EWI) Date: Tue, 18 Aug 2015 11:34:14 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: <6F0087987AC5484D8D593B639648295A2F3B3521@SRV384.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> , <6F0087987AC5484D8D593B639648295A2F3B3521@SRV384.tudelft.net> Message-ID: <71B4204D92F7884494460446CAD0F04B5B7CDBF0@SRV364.tudelft.net> Dear all, Have the disappointing results of KLU been reported somewhere? Earlier claims made might reinforce claims that we want to make. Sincere thanks, Domenico. ________________________________________ From: Romain Thomas Sent: Tuesday, August 18, 2015 1:10 PM To: Domenico Lahaye - EWI Subject: FW: [petsc-users] petsc KLU Hi, You can find below the message from Shri. Best regards, Romain -----Original Message----- From: Abhyankar, Shrirang G. [mailto:abhyshr at anl.gov] Sent: maandag 17 augustus 2015 18:21 To: Romain Thomas; Zhang, Hong Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc KLU Romain, I added the KLU interface to PETSc last year hearing the hype about KLU?s performance from several power system folks. I must say that I?m terribly disappointed! I did some performance testing of KLU on power grid problems (power flow application) last year and I got a similar performance that you report (PETSc is 2-4 times faster than KLU). I also clocked the time spent in PETSc?s SuiteSparse interface for KLU for operations other than factorization and it was very minimal. The fastest linear solver combination that I found was PETSc?s LU solver + AMD ordering from the SuiteSparse package (-pc_factor_mat_ordering_type amd). Don?t try MUMPS and SuperLU ? they are terribly slow. 
Shri From: hong zhang Date: Monday, August 17, 2015 at 10:08 AM To: Romain Thomas Cc: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] petsc KLU >Romain: >Do you mean small sparse sequential 200 by 200 matrices? >Petsc LU might be better than external LU packages because it >implements simple LU algorithm and we took good care on data accesing >(I've heard same observations). You may try 'qmd' matrix ordering for >power grid simulation. >I do not have experience on SuiteSparse. Testing MUMPS is worth it as >well. > >Hong > > >Hi >Thank you for your answer. I was asking help because I find LU >factorization 2-3 times faster than KLU. According to my problem size >(200*200) and type (power system simulation), I should get almost the >same computation time. Is it true to think that? Is the difference of >time due to the interface between PETSc and SuiteSparse? >Thank you, >Romain > >-----Original Message----- >From: Barry Smith [mailto:bsmith at mcs.anl.gov] >Sent: vrijdag 14 augustus 2015 17:31 >To: Romain Thomas >Cc: Matthew Knepley; petsc-users at mcs.anl.gov >Subject: Re: [petsc-users] petsc KLU > > > You should call > > MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); > > then call > >> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >> MatFactorInfo >> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) > > This routines correctly internally call the appropriate >MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU >above. > There is no reason to (and it won't work) to call > >> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) > >directly. > > Barry > >> On Aug 14, 2015, at 10:07 AM, Romain Thomas wrote: >> >> Hi, >> Thank you for your answer. >> My problem is a bit more complex. During the simulation (?real >>time?), I need to upgrade at each time step the matrix A and the >>MatassemblyBegin and MatassemblyEnd take time and so, in order to >>avoid these functions, I don?t use ksp or pc. I prefer to use > the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. >And so, I want to know if there is similar functions for KLU. (I tried >for Cholesky and, iLU and it works well). >> Best regards, >> Romain >> >> >> From: Matthew Knepley [mailto:knepley at gmail.com] >> Sent: vrijdag 14 augustus 2015 16:41 >> To: Romain Thomas >> Cc: petsc-users at mcs.anl.gov >> Subject: Re: [petsc-users] petsc KLU >> >> On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas >>wrote: >> Dear PETSc users, >> >> I would like to know if I can replace the following functions >> >> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >> MatFactorInfo >> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >> >> by >> >> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >> >> in my code for the simulation of electrical power systems? (I >> installed the package SuiteSparse) >> >> Why would you do that? It already works with the former code. 
In >> fact, you should really just use >> >> KSPCreate(comm, &ksp) >> KSPSetOperator(ksp, A, A); >> KSPSetFromOptions(ksp); >> KSPSolve(ksp, b, x); >> >> and then give the options >> >> -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse >> >> This is no advantage to using the Factor language since subsequent >> calls to >> KSPSolve() will not refactor. >> >> Matt >> >> Thank you, >> Best regards, >> Romain >> >> >> >> -- >> What most experimenters take for granted before they begin their >>experiments is infinitely more interesting than any results to which >>their experiments lead. >> -- Norbert Wiener > > > > > From knepley at gmail.com Tue Aug 18 06:47:40 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Aug 2015 06:47:40 -0500 Subject: [petsc-users] Wise usage of user contexts In-Reply-To: References: Message-ID: On Tue, Aug 18, 2015 at 4:20 AM, Timoth?e Nicolas < timothee.nicolas at gmail.com> wrote: > Dave, > > Thx a lot for your very clear answer. My last question about modules could > be reformulated like this : > > Why would I put anything in a ctx while I could simply use modules ? Maybe > it has something to do with the fact that PETSc is initially written for C ? > We think it makes the code more modular and easier to understand, especially if many pieces are composed together. Global variables have to be tracked down by someone looking at your code. Thanks, Matt > Best > > Timothee > > 2015-08-18 17:59 GMT+09:00 Dave May : > >> >> >> On 18 August 2015 at 10:42, Timoth?e Nicolas >> wrote: >> >>> Hi all, >>> >>> I am in the process of writing an implicit solver for a set of PDEs >>> (namely MHD equations), in FORTRAN. When setting the non-linear function to >>> solve via Newton-Krylov, I use a "user defined context", namely the thing >>> denoted by "ctx" on the doc page about SNESFunction : >>> >>> >>> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction >>> >>> In practice ctx is a user defined type which contains everything I need >>> in the local routine which sets the function on the local part of the grid, >>> FormFunctionLocal. That is, some local/global geometrical information on >>> the grid, the physical parameter, and possibly any other thing. >>> >>> In my case it so happens that due to the scheme I have chosen, when I >>> compute my function, I need the full solution of the problem at the last >>> two time steps (which are in Vec format). So my ctx contains two Vec >>> elements. Since I will work in 3D and intend to use a lot of points in the >>> future, I am concerned about memory problems which could occur. >>> >> >> In the grand scheme of things, the two vectors in your context aren't >> likely to significantly add to the total memory footprint of your code. A >> couple of things to note: >> * If you run in parallel, only the local part of the vector will be >> stored on each MPI process. >> * All the KSP methods will allocate auxiliary vectors. Most methods >> require more than 2 auxiliary vectors. >> * SNES also requires auxiliary vectors. If you use JFNK, that method will >> also need some additional temporary vectors. >> * If you assemble a Jacobian, this matrix will likely require much more >> memory per MPI process than two vectors >> >> >> >>> Is there a limit to the size occupied by ctx ? >>> >> >> The only limit is defined by the available memory per MPI process you >> have on your target machine. 
>> >> >>> Would this be better if instead I was declaring global variables in a >>> module and using this module inside FormFunctionLocal ? Is this allowed ? >>> >> >> What would be the difference in doing that - the memory usage will be >> identical. >> >> Cheers >> Dave >> >> >>> >>> Best regards >>> >>> Timothee NICOLAS >>> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From abhyshr at anl.gov Tue Aug 18 11:45:18 2015 From: abhyshr at anl.gov (Abhyankar, Shrirang G.) Date: Tue, 18 Aug 2015 16:45:18 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: <71B4204D92F7884494460446CAD0F04B5B7CDBF0@SRV364.tudelft.net> References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3B3521@SRV384.tudelft.net> <71B4204D92F7884494460446CAD0F04B5B7CDBF0@SRV364.tudelft.net> Message-ID: Domenico, Here are the results for a power flow application.I don?t remember the size of the system. Package + Ordering MatSolve Sym. Fact Num. Fact Ordering KSPSolve Numeric ratio Linear solve ratio PETSc + QMD 1.60E-02 2.40E-02 9.99E-02 0.13 2.76E-01 1.14 1.90 PETSc + ND 3.20E-02 5.60E-02 9.40E-01 0.02 1.06E+00 10.68 7.31 PETSc + AMD 2.40E-02 2.00E-02 8.80E-02 0.01 1.45E-01 1.00 1.00 KLU + AMD 2.80E-02 2.80E-02 2.40E-01 0.01 3.08E-01 2.73 2.12 KLU + COLAMD 5.60E-02 4.00E-02 3.90E-01 0.01 5.00E-01 4.43 3.45 KLU + QMD 2.80E-02 1.20E-02 2.67E-01 0.13 4.40E-01 3.03 3.03 The numeric and linear solve ratios are the ratios w.r.t. to using PETSc + AMD. You can test the performance of KLU on the power flow example application $PETSC_DIR/src/snes/examples/tutorial/network/pflow/pf.c Shri -----Original Message----- From: Domenico Lahaye - EWI Date: Tuesday, August 18, 2015 at 6:34 AM To: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] petsc KLU >Dear all, > > Have the disappointing results of KLU been reported somewhere? >Earlier claims made might reinforce claims that we want to make. > > Sincere thanks, Domenico. > >________________________________________ >From: Romain Thomas >Sent: Tuesday, August 18, 2015 1:10 PM >To: Domenico Lahaye - EWI >Subject: FW: [petsc-users] petsc KLU > >Hi, >You can find below the message from Shri. >Best regards, >Romain > >-----Original Message----- >From: Abhyankar, Shrirang G. [mailto:abhyshr at anl.gov] >Sent: maandag 17 augustus 2015 18:21 >To: Romain Thomas; Zhang, Hong >Cc: petsc-users at mcs.anl.gov >Subject: Re: [petsc-users] petsc KLU > >Romain, > I added the KLU interface to PETSc last year hearing the hype about >KLU?s performance from several power system folks. I must say that I?m >terribly disappointed! I did some performance testing of KLU on power >grid problems (power flow application) last year and I got a similar >performance that you report (PETSc is 2-4 times faster than KLU). I also >clocked the time spent in PETSc?s SuiteSparse interface for KLU for >operations other than factorization and it was very minimal. The fastest >linear solver combination that I found was PETSc?s LU solver + AMD >ordering from the SuiteSparse package (-pc_factor_mat_ordering_type amd). >Don?t try MUMPS and SuperLU ? they are terribly slow. 
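The combination recommended here (PETSc's own LU with the SuiteSparse AMD ordering) can be selected in code as well as with -pc_factor_mat_ordering_type amd; a sketch for an existing KSP follows, assuming PETSc was configured with SuiteSparse so that the "amd" ordering is available.

--------------------------------------------------------------------------------------------

#include <petscksp.h>

/* Direct solve with PETSc's LU and the AMD ordering, as suggested above. */
static PetscErrorCode UsePetscLUWithAMD(KSP ksp)
{
  PC             pc;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCLU);CHKERRQ(ierr);
  ierr = PCFactorSetMatOrderingType(pc,"amd");CHKERRQ(ierr);  /* -pc_factor_mat_ordering_type amd */
  PetscFunctionReturn(0);
}

--------------------------------------------------------------------------------------------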
> >Shri > > >From: hong zhang >Date: Monday, August 17, 2015 at 10:08 AM >To: Romain Thomas >Cc: "petsc-users at mcs.anl.gov" >Subject: Re: [petsc-users] petsc KLU > > >>Romain: >>Do you mean small sparse sequential 200 by 200 matrices? >>Petsc LU might be better than external LU packages because it >>implements simple LU algorithm and we took good care on data accesing >>(I've heard same observations). You may try 'qmd' matrix ordering for >>power grid simulation. >>I do not have experience on SuiteSparse. Testing MUMPS is worth it as >>well. >> >>Hong >> >> >>Hi >>Thank you for your answer. I was asking help because I find LU >>factorization 2-3 times faster than KLU. According to my problem size >>(200*200) and type (power system simulation), I should get almost the >>same computation time. Is it true to think that? Is the difference of >>time due to the interface between PETSc and SuiteSparse? >>Thank you, >>Romain >> >>-----Original Message----- >>From: Barry Smith [mailto:bsmith at mcs.anl.gov] >>Sent: vrijdag 14 augustus 2015 17:31 >>To: Romain Thomas >>Cc: Matthew Knepley; petsc-users at mcs.anl.gov >>Subject: Re: [petsc-users] petsc KLU >> >> >> You should call >> >> MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); >> >> then call >> >>> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >>> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >>> MatFactorInfo >>> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >> >> This routines correctly internally call the appropriate >>MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU >>above. >> There is no reason to (and it won't work) to call >> >>> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >>> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >>> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >> >>directly. >> >> Barry >> >>> On Aug 14, 2015, at 10:07 AM, Romain Thomas >>>wrote: >>> >>> Hi, >>> Thank you for your answer. >>> My problem is a bit more complex. During the simulation (?real >>>time?), I need to upgrade at each time step the matrix A and the >>>MatassemblyBegin and MatassemblyEnd take time and so, in order to >>>avoid these functions, I don?t use ksp or pc. I prefer to use >> the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. >>And so, I want to know if there is similar functions for KLU. (I tried >>for Cholesky and, iLU and it works well). >>> Best regards, >>> Romain >>> >>> >>> From: Matthew Knepley [mailto:knepley at gmail.com] >>> Sent: vrijdag 14 augustus 2015 16:41 >>> To: Romain Thomas >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] petsc KLU >>> >>> On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas >>>wrote: >>> Dear PETSc users, >>> >>> I would like to know if I can replace the following functions >>> >>> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >>> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >>> MatFactorInfo >>> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >>> >>> by >>> >>> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >>> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >>> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >>> >>> in my code for the simulation of electrical power systems? (I >>> installed the package SuiteSparse) >>> >>> Why would you do that? It already works with the former code. 
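Putting the MatGetFactor()/MatLUFactorSymbolic()/MatLUFactorNumeric() recipe quoted above into one place, a sketch of the KLU path looks like this (error handling trimmed to the essentials; A, b and x are assumed to exist, and the "amd" ordering requires the SuiteSparse install):

--------------------------------------------------------------------------------------------

#include <petscmat.h>

/* Factor A once with KLU, then solve; redo MatLUFactorNumeric() whenever the
   numerical values of A change but its nonzero pattern does not. */
static PetscErrorCode SolveWithKLU(Mat A,Vec b,Vec x)
{
  Mat            F;
  IS             rowperm,colperm;
  MatFactorInfo  info;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetFactor(A,MATSOLVERKLU,MAT_FACTOR_LU,&F);CHKERRQ(ierr);
  ierr = MatGetOrdering(A,"amd",&rowperm,&colperm);CHKERRQ(ierr);
  ierr = MatFactorInfoInitialize(&info);CHKERRQ(ierr);
  ierr = MatLUFactorSymbolic(F,A,rowperm,colperm,&info);CHKERRQ(ierr);
  ierr = MatLUFactorNumeric(F,A,&info);CHKERRQ(ierr);
  ierr = MatSolve(F,b,x);CHKERRQ(ierr);
  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  ierr = MatDestroy(&F);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

--------------------------------------------------------------------------------------------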
In >>> fact, you should really just use >>> >>> KSPCreate(comm, &ksp) >>> KSPSetOperator(ksp, A, A); >>> KSPSetFromOptions(ksp); >>> KSPSolve(ksp, b, x); >>> >>> and then give the options >>> >>> -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse >>> >>> This is no advantage to using the Factor language since subsequent >>> calls to >>> KSPSolve() will not refactor. >>> >>> Matt >>> >>> Thank you, >>> Best regards, >>> Romain >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>>experiments is infinitely more interesting than any results to which >>>their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> > From D.J.P.Lahaye at tudelft.nl Tue Aug 18 12:38:04 2015 From: D.J.P.Lahaye at tudelft.nl (Domenico Lahaye - EWI) Date: Tue, 18 Aug 2015 17:38:04 +0000 Subject: [petsc-users] petsc KLU In-Reply-To: References: <6F0087987AC5484D8D593B639648295A2F3A8EB0@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AC1DE@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3AF250@SRV383.tudelft.net> <6F0087987AC5484D8D593B639648295A2F3B3521@SRV384.tudelft.net> <71B4204D92F7884494460446CAD0F04B5B7CDBF0@SRV364.tudelft.net>, Message-ID: <71B4204D92F7884494460446CAD0F04B5B7CDCA4@SRV364.tudelft.net> Dear Shri, I am by no means putting your arguments in doubt. I apologize if I gave that impression. I am however looking for a reference that we can cite when making claims on the performance of KLU. Did you publish your results somewhere? Thank you. Domenico. ________________________________________ From: Abhyankar, Shrirang G. [abhyshr at anl.gov] Sent: Tuesday, August 18, 2015 6:45 PM To: Domenico Lahaye - EWI; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc KLU Domenico, Here are the results for a power flow application.I don?t remember the size of the system. Package + Ordering MatSolve Sym. Fact Num. Fact Ordering KSPSolve Numeric ratio Linear solve ratio PETSc + QMD 1.60E-02 2.40E-02 9.99E-02 0.13 2.76E-01 1.14 1.90 PETSc + ND 3.20E-02 5.60E-02 9.40E-01 0.02 1.06E+00 10.68 7.31 PETSc + AMD 2.40E-02 2.00E-02 8.80E-02 0.01 1.45E-01 1.00 1.00 KLU + AMD 2.80E-02 2.80E-02 2.40E-01 0.01 3.08E-01 2.73 2.12 KLU + COLAMD 5.60E-02 4.00E-02 3.90E-01 0.01 5.00E-01 4.43 3.45 KLU + QMD 2.80E-02 1.20E-02 2.67E-01 0.13 4.40E-01 3.03 3.03 The numeric and linear solve ratios are the ratios w.r.t. to using PETSc + AMD. You can test the performance of KLU on the power flow example application $PETSC_DIR/src/snes/examples/tutorial/network/pflow/pf.c Shri -----Original Message----- From: Domenico Lahaye - EWI Date: Tuesday, August 18, 2015 at 6:34 AM To: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] petsc KLU >Dear all, > > Have the disappointing results of KLU been reported somewhere? >Earlier claims made might reinforce claims that we want to make. > > Sincere thanks, Domenico. > >________________________________________ >From: Romain Thomas >Sent: Tuesday, August 18, 2015 1:10 PM >To: Domenico Lahaye - EWI >Subject: FW: [petsc-users] petsc KLU > >Hi, >You can find below the message from Shri. >Best regards, >Romain > >-----Original Message----- >From: Abhyankar, Shrirang G. [mailto:abhyshr at anl.gov] >Sent: maandag 17 augustus 2015 18:21 >To: Romain Thomas; Zhang, Hong >Cc: petsc-users at mcs.anl.gov >Subject: Re: [petsc-users] petsc KLU > >Romain, > I added the KLU interface to PETSc last year hearing the hype about >KLU?s performance from several power system folks. I must say that I?m >terribly disappointed! 
I did some performance testing of KLU on power >grid problems (power flow application) last year and I got a similar >performance that you report (PETSc is 2-4 times faster than KLU). I also >clocked the time spent in PETSc?s SuiteSparse interface for KLU for >operations other than factorization and it was very minimal. The fastest >linear solver combination that I found was PETSc?s LU solver + AMD >ordering from the SuiteSparse package (-pc_factor_mat_ordering_type amd). >Don?t try MUMPS and SuperLU ? they are terribly slow. > >Shri > > >From: hong zhang >Date: Monday, August 17, 2015 at 10:08 AM >To: Romain Thomas >Cc: "petsc-users at mcs.anl.gov" >Subject: Re: [petsc-users] petsc KLU > > >>Romain: >>Do you mean small sparse sequential 200 by 200 matrices? >>Petsc LU might be better than external LU packages because it >>implements simple LU algorithm and we took good care on data accesing >>(I've heard same observations). You may try 'qmd' matrix ordering for >>power grid simulation. >>I do not have experience on SuiteSparse. Testing MUMPS is worth it as >>well. >> >>Hong >> >> >>Hi >>Thank you for your answer. I was asking help because I find LU >>factorization 2-3 times faster than KLU. According to my problem size >>(200*200) and type (power system simulation), I should get almost the >>same computation time. Is it true to think that? Is the difference of >>time due to the interface between PETSc and SuiteSparse? >>Thank you, >>Romain >> >>-----Original Message----- >>From: Barry Smith [mailto:bsmith at mcs.anl.gov] >>Sent: vrijdag 14 augustus 2015 17:31 >>To: Romain Thomas >>Cc: Matthew Knepley; petsc-users at mcs.anl.gov >>Subject: Re: [petsc-users] petsc KLU >> >> >> You should call >> >> MatGetFactor(mat,MATSOLVERKLU,MAT_FACTOR_LU,&fact); >> >> then call >> >>> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >>> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >>> MatFactorInfo >>> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >> >> This routines correctly internally call the appropriate >>MatLUFactorNumeric_KLU() etc for you because you passed MATSOLVERKLU >>above. >> There is no reason to (and it won't work) to call >> >>> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >>> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >>> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >> >>directly. >> >> Barry >> >>> On Aug 14, 2015, at 10:07 AM, Romain Thomas >>>wrote: >>> >>> Hi, >>> Thank you for your answer. >>> My problem is a bit more complex. During the simulation (?real >>>time?), I need to upgrade at each time step the matrix A and the >>>MatassemblyBegin and MatassemblyEnd take time and so, in order to >>>avoid these functions, I don?t use ksp or pc. I prefer to use >> the functions MatLUFactorNumeric, MatLUFactorSymbolic and MatLUFactor. >>And so, I want to know if there is similar functions for KLU. (I tried >>for Cholesky and, iLU and it works well). 
>>> Best regards, >>> Romain >>> >>> >>> From: Matthew Knepley [mailto:knepley at gmail.com] >>> Sent: vrijdag 14 augustus 2015 16:41 >>> To: Romain Thomas >>> Cc: petsc-users at mcs.anl.gov >>> Subject: Re: [petsc-users] petsc KLU >>> >>> On Fri, Aug 14, 2015 at 9:23 AM, Romain Thomas >>>wrote: >>> Dear PETSc users, >>> >>> I would like to know if I can replace the following functions >>> >>> MatLUFactorNumeric(Mat fact,Mat mat,const MatFactorInfo *info) >>> MatLUFactorSymbolic(Mat fact,Mat mat,IS row,IS col,const >>> MatFactorInfo >>> *info) MatLUFactor(Mat mat,IS row,IS col,const MatFactorInfo *info) >>> >>> by >>> >>> MatLUFactorNumeric_KLU(Mat F,Mat A,const MatFactorInfo *info) >>> MatLUFactorSymbolic_KLU(Mat F,Mat A,IS r,IS c,const MatFactorInfo >>> *info) MatGetFactor_seqaij_klu(Mat A,MatFactorType ftype,Mat *F) >>> >>> in my code for the simulation of electrical power systems? (I >>> installed the package SuiteSparse) >>> >>> Why would you do that? It already works with the former code. In >>> fact, you should really just use >>> >>> KSPCreate(comm, &ksp) >>> KSPSetOperator(ksp, A, A); >>> KSPSetFromOptions(ksp); >>> KSPSolve(ksp, b, x); >>> >>> and then give the options >>> >>> -ksp_type preonly -pc_type lu -pc_mat_factor_package suitesparse >>> >>> This is no advantage to using the Factor language since subsequent >>> calls to >>> KSPSolve() will not refactor. >>> >>> Matt >>> >>> Thank you, >>> Best regards, >>> Romain >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>>experiments is infinitely more interesting than any results to which >>>their experiments lead. >>> -- Norbert Wiener >> >> >> >> >> > From bsmith at mcs.anl.gov Tue Aug 18 13:20:51 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Aug 2015 13:20:51 -0500 Subject: [petsc-users] Wise usage of user contexts In-Reply-To: References: Message-ID: > On Aug 18, 2015, at 4:20 AM, Timoth?e Nicolas wrote: > > Dave, > > Thx a lot for your very clear answer. My last question about modules could be reformulated like this : > > Why would I put anything in a ctx while I could simply use modules ? You could. The disadvantage of modules, which may not matter in your case, is that there can only ever be ONE data structure that contains all this information and can be used in your form functions. If you use a derived data type you can have multiple different ones of these data structures in the same code. Say for example, you had two different sets of physical parameters and you wanted to solve in your program both problems, you would just a derived type designed to hold this information and put each set of parameters into a different derived type object. With a module there could only be one set of parameters so you would have to do some horrible thing like constantly be changing the values in the model back and forth between the two sets. > Maybe it has something to do with the fact that PETSc is initially written for C ? Modules are kind of like singletons in object oriented programming. They are occasionally useful but limiting yourself to ONLY having singletons in object oriented programs is nuts. Barry > > Best > > Timothee > > 2015-08-18 17:59 GMT+09:00 Dave May : > > > On 18 August 2015 at 10:42, Timoth?e Nicolas wrote: > Hi all, > > I am in the process of writing an implicit solver for a set of PDEs (namely MHD equations), in FORTRAN. 
When setting the non-linear function to solve via Newton-Krylov, I use a "user defined context", namely the thing denoted by "ctx" on the doc page about SNESFunction : > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNESFunction.html#SNESFunction > > In practice ctx is a user defined type which contains everything I need in the local routine which sets the function on the local part of the grid, FormFunctionLocal. That is, some local/global geometrical information on the grid, the physical parameter, and possibly any other thing. > > In my case it so happens that due to the scheme I have chosen, when I compute my function, I need the full solution of the problem at the last two time steps (which are in Vec format). So my ctx contains two Vec elements. Since I will work in 3D and intend to use a lot of points in the future, I am concerned about memory problems which could occur. > > In the grand scheme of things, the two vectors in your context aren't likely to significantly add to the total memory footprint of your code. A couple of things to note: > * If you run in parallel, only the local part of the vector will be stored on each MPI process. > * All the KSP methods will allocate auxiliary vectors. Most methods require more than 2 auxiliary vectors. > * SNES also requires auxiliary vectors. If you use JFNK, that method will also need some additional temporary vectors. > * If you assemble a Jacobian, this matrix will likely require much more memory per MPI process than two vectors > > > > Is there a limit to the size occupied by ctx ? > > The only limit is defined by the available memory per MPI process you have on your target machine. > > Would this be better if instead I was declaring global variables in a module and using this module inside FormFunctionLocal ? Is this allowed ? > > What would be the difference in doing that - the memory usage will be identical. > > Cheers > Dave > > > Best regards > > Timothee NICOLAS > > From reza.yaghmaie2 at gmail.com Tue Aug 18 19:25:42 2015 From: reza.yaghmaie2 at gmail.com (Reza Yaghmaie) Date: Tue, 18 Aug 2015 20:25:42 -0400 Subject: [petsc-users] SNESSetFunction In-Reply-To: References: Message-ID: Thank you very much for the insight. It helped. I am trying to solve the system using *snes *routines. Let's say the I execute the below command in Fortran call *SNESSolve*(snes,PETSC_NULL_OBJECT,xvec,ierr) In the residual calculation and Jacobian update routines I need to finalize the vectors and matrix assemblies using the commands as following otherwise *SNESSolve* will crash: call *VecAssemblyBegin *(FVEC, ierr) call *VecAssemblyEnd *(FVEC, ierr) call *MatAssemblyBegin*(jac_prec,MAT_FINAL_ASSEMBLY,ierr) call *MatAssemblyEnd*(jac_prec,MAT_FINAL_ASSEMBLY,ierr) I face the issue that my debugger crashes at the locations of theses final vector and matrix assemblies. It worked for the sequential version of the code but for the parallel version it stops there. I am sure all processors in the mpi framework reach to these pointers simultaneously. Any insights? Thanks, Reza On Mon, Aug 17, 2015 at 2:25 PM, Barry Smith wrote: > > Reza, > > See src/snes/examples/tutorials/ex5f90.F for how this may be easily > done using a Fortran user defined type > > Barry > > > On Aug 17, 2015, at 12:39 PM, Matthew Knepley wrote: > > > > On Mon, Aug 17, 2015 at 11:46 AM, Reza Yaghmaie < > reza.yaghmaie2 at gmail.com> wrote: > > > > Hi, > > > > I have problems with passing variables through SNESSetFunction in my > code. 
basically I have the following subroutines in the main body of the > Fortran code. Could you provide some insight on how to transfer variables > into the residual calculation routine (FormFunction1)? > > > > Extra arguments to your FormFunction are meant to be passed in a > context, through the context variable. > > > > This is difficult in Fortran, but you can use a PetscObject as a > container. You can attach other > > PetscObjects using PetscObjectCompose() in Fortran. > > > > Matt > > > > Thanks, > > Reza > > > ------------------------------------------------------------------------------------------------------------------ > > main code > > > > SNES snes > > Vec xvec,rvec > > external FormFunction1 > > real*8 > variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > > > > call SNESSetFunction(snes,rvec,FormFunction1, > > & PETSC_NULL_OBJECT, > > & variable1,variable2,variable3,variable4, > > & ierr) > > > > end > > > > subroutine FormFunction1(snes,XVEC,FVEC, > > & dummy, > > & varable1,varable2,varable3,varable4, > > & ierr) > > > > SNES snes > > Vec XVEC,FVEC > > PetscFortranAddr dummy > > real*8 > variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > > > > return > > end > > > -------------------------------------------------------------------------------------------------------------- > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 18 20:29:17 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Aug 2015 20:29:17 -0500 Subject: [petsc-users] SNESSetFunction In-Reply-To: References: Message-ID: We would have to have full details of what happens in the debugger during those crashes. Barry > On Aug 18, 2015, at 7:25 PM, Reza Yaghmaie wrote: > > > Thank you very much for the insight. It helped. > > I am trying to solve the system using snes routines. Let's say the I execute the below command in Fortran > > call SNESSolve(snes,PETSC_NULL_OBJECT,xvec,ierr) > > In the residual calculation and Jacobian update routines I need to finalize the vectors and matrix assemblies using the commands as following otherwise SNESSolve will crash: > > call VecAssemblyBegin (FVEC, ierr) > call VecAssemblyEnd (FVEC, ierr) > > call MatAssemblyBegin(jac_prec,MAT_FINAL_ASSEMBLY,ierr) > call MatAssemblyEnd(jac_prec,MAT_FINAL_ASSEMBLY,ierr) > > I face the issue that my debugger crashes at the locations of theses final vector and matrix assemblies. It worked for the sequential version of the code but for the parallel version it stops there. I am sure all processors in the mpi framework reach to these pointers simultaneously. Any insights? > > Thanks, > Reza > > > > On Mon, Aug 17, 2015 at 2:25 PM, Barry Smith wrote: > > Reza, > > See src/snes/examples/tutorials/ex5f90.F for how this may be easily done using a Fortran user defined type > > Barry > > > On Aug 17, 2015, at 12:39 PM, Matthew Knepley wrote: > > > > On Mon, Aug 17, 2015 at 11:46 AM, Reza Yaghmaie wrote: > > > > Hi, > > > > I have problems with passing variables through SNESSetFunction in my code. basically I have the following subroutines in the main body of the Fortran code. Could you provide some insight on how to transfer variables into the residual calculation routine (FormFunction1)? 
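For reference, the context-passing pattern described in the replies above looks like this in C; the struct fields and names are invented for illustration (in Fortran the same role is played by a user-defined derived type, as in the ex5f90 example Barry points to).

  #include <petscsnes.h>

  typedef struct {            /* holds whatever the residual evaluation needs */
    PetscReal param;
    Vec       u_prev;
  } AppCtx;

  PetscErrorCode FormFunction1(SNES snes, Vec X, Vec F, void *ptr)
  {
    AppCtx *user = (AppCtx*)ptr;   /* all extra data arrives through this one pointer */
    /* ... evaluate F(X) using user->param and user->u_prev ... */
    return 0;
  }

  /* in main():
       AppCtx user;
       SNESSetFunction(snes, r, FormFunction1, &user);
     Two different parameter sets can simply live in two AppCtx instances,
     which is one advantage of a context over global (module) storage. */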
> > > > Extra arguments to your FormFunction are meant to be passed in a context, through the context variable. > > > > This is difficult in Fortran, but you can use a PetscObject as a container. You can attach other > > PetscObjects using PetscObjectCompose() in Fortran. > > > > Matt > > > > Thanks, > > Reza > > ------------------------------------------------------------------------------------------------------------------ > > main code > > > > SNES snes > > Vec xvec,rvec > > external FormFunction1 > > real*8 variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > > > > call SNESSetFunction(snes,rvec,FormFunction1, > > & PETSC_NULL_OBJECT, > > & variable1,variable2,variable3,variable4, > > & ierr) > > > > end > > > > subroutine FormFunction1(snes,XVEC,FVEC, > > & dummy, > > & varable1,varable2,varable3,varable4, > > & ierr) > > > > SNES snes > > Vec XVEC,FVEC > > PetscFortranAddr dummy > > real*8 variable1(10),variable2(20,20),variable3(30),variable4(40,40) > > > > > > return > > end > > -------------------------------------------------------------------------------------------------------------- > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > From zonexo at gmail.com Tue Aug 18 20:38:03 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 09:38:03 +0800 Subject: [petsc-users] difference between local and global vectors Message-ID: <55D3DDFB.30302@gmail.com> Hi, I am using DA. For e.g. DM da_u call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) call DMCreateGlobalVector(da_u,u_global,ierr) call DMCreateLocalVector(da_u,u_local,ierr) To update the ghost values, I use: call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) It seems that I don't need to use global vector at all. So what's the difference between local and global vector? When will I need to use?: call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr) call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr) -- Thank you Yours sincerely, TAY wee-beng From dave.mayhem23 at gmail.com Wed Aug 19 00:17:58 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 07:17:58 +0200 Subject: [petsc-users] difference between local and global vectors In-Reply-To: <55D3DDFB.30302@gmail.com> References: <55D3DDFB.30302@gmail.com> Message-ID: On 19 August 2015 at 03:38, TAY wee-beng wrote: > Hi, > > I am using DA. For e.g. > > DM da_u > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMCreateGlobalVector(da_u,u_global,ierr) > > call DMCreateLocalVector(da_u,u_local,ierr) > > To update the ghost values, I use: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > This is incorrect. The manpage for DMLocalToLocal clearly says "Maps from a local vector (including ghost points that contain irrelevant values) to another local vector where the ghost points in the second are set correctly." 
To update ghost values from a global vector (e.g. to perform the scatter) you need to use DMGlobalToLocalBegin() , DMGlobalToLocalEnd(). > > It seems that I don't need to use global vector at all. > > So what's the difference between local and global vector? > * Local vectors contain ghost values from any neighbouring MPI processes. They are always defined over PETSC_COMM_SELF. * Global vectors store the DOFs assigned to each sub-domain. These will parallel vectors defined over the same communicator as your DM Thus, you use local vectors to compute things like the sub-domain contribution to (i) a non-linear residual evaluation or (ii) a sparse-matric vector product. You use global vectors together with linear and non-linear solvers as these vectors. If your stencil width was zero (in your DMDACreate3d() function call), then the would be no ghost values to communicate between neighbouring MPI processes. Hence, the entries in the following two arrays LA_u_local[], LA_u[] would be identical VecGetArrayRead(u_local,&LA_u_local); and VecGetArrayRead(u,&LA_u); That said, u_local would still be of type VECSEQ, where as u would be of type VECMPI. > > When will I need to use?: > > call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr) > > call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr) > See points (i) and (ii) above from common use cases. Thanks, Dave > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 19 03:20:01 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 16:20:01 +0800 Subject: [petsc-users] difference between local and global vectors In-Reply-To: References: <55D3DDFB.30302@gmail.com> Message-ID: <55D43C31.4070902@gmail.com> On 19/8/2015 1:17 PM, Dave May wrote: > > > On 19 August 2015 at 03:38, TAY wee-beng > wrote: > > Hi, > > I am using DA. For e.g. > > DM da_u > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMCreateGlobalVector(da_u,u_global,ierr) > > call DMCreateLocalVector(da_u,u_local,ierr) > > To update the ghost values, I use: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > > > > This is incorrect. > The manpage for DMLocalToLocal clearly says "Maps from a local vector > (including ghost points that contain irrelevant values) to another > local vector where the ghost points in the second are set correctly." > To update ghost values from a global vector (e.g. to perform the > scatter) you need to use DMGlobalToLocalBegin() , DMGlobalToLocalEnd(). Hi Dave, Thanks for the clarification although I'm still confused. Supposed I have a 1D vector da_u, It has size 8, so it's like da_u_array(8), with stencil width 1 So for 2 procs, there will be 2 da_u_array - da_u_array(1:5) and da_u_array(4:8) After performing some operations on each procs's da_u_array, I need to update 1st procs's da_u_array(5) and 2nd procs's da_u_array(4) from the 2nd and 1st procs respectively. I simply call: call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) and it seems to be enough. I check the ghost values and they have been updated. 
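For the case where the current data sits in a global vector (for instance coming back from a solver), the scatter Dave describes looks roughly like this in C; the names are illustrative and error checking is omitted.

  Vec u_loc;

  DMGetLocalVector(da_u, &u_loc);
  DMGlobalToLocalBegin(da_u, u_global, INSERT_VALUES, u_loc);
  DMGlobalToLocalEnd(da_u, u_global, INSERT_VALUES, u_loc);
  /* u_loc now holds the owned entries plus up-to-date ghost values,
     ready for a residual evaluation or stencil computation */
  DMRestoreLocalVector(da_u, &u_loc);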
So if I am not using the linear solvers, I do not need the global vector,is that so? > > > > It seems that I don't need to use global vector at all. > > So what's the difference between local and global vector? > > > > * Local vectors contain ghost values from any neighbouring MPI > processes. They are always defined over PETSC_COMM_SELF. > * Global vectors store the DOFs assigned to each sub-domain. These > will parallel vectors defined over the same communicator as your DM > > Thus, you use local vectors to compute things like the sub-domain > contribution to (i) a non-linear residual evaluation or (ii) a > sparse-matric vector product. > You use global vectors together with linear and non-linear solvers as > these vectors. > > If your stencil width was zero (in your DMDACreate3d() function call), > then the would be no ghost values to communicate between neighbouring > MPI processes. Hence, the entries in the following two arrays > LA_u_local[], LA_u[] would be identical > VecGetArrayRead(u_local,&LA_u_local); > and > VecGetArrayRead(u,&LA_u); > > That said, u_local would still be of type VECSEQ, where as u would be > of type VECMPI. > > > > When will I need to use?: > > call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr) > > call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr) > > > See points (i) and (ii) above from common use cases. > > Thanks, > Dave > > > > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 19 03:26:13 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 10:26:13 +0200 Subject: [petsc-users] difference between local and global vectors In-Reply-To: <55D43C31.4070902@gmail.com> References: <55D3DDFB.30302@gmail.com> <55D43C31.4070902@gmail.com> Message-ID: On 19 August 2015 at 10:20, TAY wee-beng wrote: > > On 19/8/2015 1:17 PM, Dave May wrote: > > > > On 19 August 2015 at 03:38, TAY wee-beng wrote: > >> Hi, >> >> I am using DA. For e.g. >> >> DM da_u >> >> call >> DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& >> >> >> size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) >> >> call DMCreateGlobalVector(da_u,u_global,ierr) >> >> call DMCreateLocalVector(da_u,u_local,ierr) >> >> To update the ghost values, I use: >> >> call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) >> >> call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) >> > > > This is incorrect. > The manpage for DMLocalToLocal clearly says "Maps from a local vector > (including ghost points that contain irrelevant values) to another local > vector where the ghost points in the second are set correctly." > To update ghost values from a global vector (e.g. to perform the scatter) > you need to use DMGlobalToLocalBegin() , DMGlobalToLocalEnd(). > > I must apologize (and should have read my own email :D) - I misunderstood what DMLocalToLocalBegin/End does. Indeed it will give produce the correct / updated ghost values. > Hi Dave, > > Thanks for the clarification although I'm still confused. 
Supposed I have > a 1D vector da_u, It has size 8, so it's like da_u_array(8), with stencil > width 1 > > So for 2 procs, > > there will be 2 da_u_array - da_u_array(1:5) and da_u_array(4:8) > > After performing some operations on each procs's da_u_array, I need to > update 1st procs's da_u_array(5) and 2nd procs's da_u_array(4) from the 2nd > and 1st procs respectively. I simply call: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > > and it seems to be enough. I check the ghost values and they have been > updated. > Yeah, this is correct. Sorry about my mistake in the previous email regarding what DMLocalToLocal actually does. > So if I am not using the linear solvers, I do not need the global > vector,is that so? > I guess in the end it is application specific whether you need a global vector or not. I would have thought you always would want a global vector. What is your application where you don't require a global vector? Cheers, Dave -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 19 03:28:25 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 16:28:25 +0800 Subject: [petsc-users] difference between local and global vectors In-Reply-To: References: <55D3DDFB.30302@gmail.com> <55D43C31.4070902@gmail.com> Message-ID: <55D43E29.9020400@gmail.com> On 19/8/2015 4:26 PM, Dave May wrote: > > > On 19 August 2015 at 10:20, TAY wee-beng > wrote: > > > On 19/8/2015 1:17 PM, Dave May wrote: >> >> >> On 19 August 2015 at 03:38, TAY wee-beng > > wrote: >> >> Hi, >> >> I am using DA. For e.g. >> >> DM da_u >> >> call >> DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& >> >> size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) >> >> call DMCreateGlobalVector(da_u,u_global,ierr) >> >> call DMCreateLocalVector(da_u,u_local,ierr) >> >> To update the ghost values, I use: >> >> call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) >> >> call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) >> >> >> >> This is incorrect. >> The manpage for DMLocalToLocal clearly says "Maps from a local >> vector (including ghost points that contain irrelevant values) to >> another local vector where the ghost points in the second are set >> correctly." >> To update ghost values from a global vector (e.g. to perform the >> scatter) you need to use DMGlobalToLocalBegin() , >> DMGlobalToLocalEnd(). > > > I must apologize (and should have read my own email :D) > - I misunderstood what DMLocalToLocalBegin/End does. > Indeed it will give produce the correct / updated ghost values. > > Hi Dave, > > Thanks for the clarification although I'm still confused. Supposed > I have a 1D vector da_u, It has size 8, so it's like > da_u_array(8), with stencil width 1 > > So for 2 procs, > > there will be 2 da_u_array - da_u_array(1:5) and da_u_array(4:8) > > After performing some operations on each procs's da_u_array, I > need to update 1st procs's da_u_array(5) and 2nd procs's > da_u_array(4) from the 2nd and 1st procs respectively. I simply call: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > > and it seems to be enough. I check the ghost values and they have > been updated. > > > Yeah, this is correct. 
> Sorry about my mistake in the previous email regarding what > DMLocalToLocal actually does. > > So if I am not using the linear solvers, I do not need the global > vector,is that so? > > > I guess in the end it is application specific whether you need a > global vector or not. > I would have thought you always would want a global vector. > > What is your application where you don't require a global vector? Well, I mean when I don't need to solve the linear eqn. But of course, later on in the code, when I need to, I will require the global vector. Thanks > > Cheers, > Dave > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 19 03:54:29 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 16:54:29 +0800 Subject: [petsc-users] How to view petsc array? In-Reply-To: References: <55ADE837.8080705@gmail.com> Message-ID: <55D44445.8000503@gmail.com> On 21/7/2015 7:28 PM, Matthew Knepley wrote: > On Tue, Jul 21, 2015 at 1:35 AM, TAY wee-beng > wrote: > > Hi, > > I need to check the contents of the array which was declared using: > > PetscScalar,pointer :: > u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) > > I tried to use : > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) > > call VecView(p_array,viewer,ierr) > > or > > call MatView(p_array,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > but I got segmentation error. So is there a PETSc routine I can use? > > > No. Those routines work only for Vec objects. You could > > a) Declare a DMDA of the same size > > b) Use DMDAVecGetArrayF90() to get out the multidimensional array > > c) Use that in your code > > d) Use VecView() on the original vector Hi, Supposed I need to check the contents of the u_array which was declared using: PetscScalar,pointer :: u_array(:,:,:) call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) call VecView(array,viewer,ierr) call PetscViewerDestroy(viewer,ierr) Is this the correct way? > > Matt > > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 19 03:58:58 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 10:58:58 +0200 Subject: [petsc-users] How to view petsc array? In-Reply-To: <55D44445.8000503@gmail.com> References: <55ADE837.8080705@gmail.com> <55D44445.8000503@gmail.com> Message-ID: On 19 August 2015 at 10:54, TAY wee-beng wrote: > > On 21/7/2015 7:28 PM, Matthew Knepley wrote: > > On Tue, Jul 21, 2015 at 1:35 AM, TAY wee-beng wrote: > >> Hi, >> >> I need to check the contents of the array which was declared using: >> >> PetscScalar,pointer :: >> u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) >> >> I tried to use : >> >> call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) >> >> call VecView(p_array,viewer,ierr) >> >> or >> >> call MatView(p_array,viewer,ierr) >> >> call PetscViewerDestroy(viewer,ierr) >> >> but I got segmentation error. 
So is there a PETSc routine I can use? > > > No. Those routines work only for Vec objects. You could > > a) Declare a DMDA of the same size > > b) Use DMDAVecGetArrayF90() to get out the multidimensional array > > c) Use that in your code > > d) Use VecView() on the original vector > > > Hi, > > Supposed I need to check the contents of the u_array which was declared > using: > > PetscScalar,pointer :: u_array(:,:,:) > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) > > call VecView(array,viewer,ierr) > The first argument of VecView must be of type Vec (as Matt noted). It looks you are passing in an array of PetscScalar's. > > call PetscViewerDestroy(viewer,ierr) > > Is this the correct way? > > > Matt > > >> >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 19 04:08:16 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 19 Aug 2015 17:08:16 +0800 Subject: [petsc-users] How to view petsc array? In-Reply-To: References: <55ADE837.8080705@gmail.com> <55D44445.8000503@gmail.com> Message-ID: <55D44780.3000500@gmail.com> On 19/8/2015 4:58 PM, Dave May wrote: > > > On 19 August 2015 at 10:54, TAY wee-beng > wrote: > > > On 21/7/2015 7:28 PM, Matthew Knepley wrote: >> On Tue, Jul 21, 2015 at 1:35 AM, TAY wee-beng > > wrote: >> >> Hi, >> >> I need to check the contents of the array which was declared >> using: >> >> PetscScalar,pointer :: >> u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) >> >> I tried to use : >> >> call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) >> >> call VecView(p_array,viewer,ierr) >> >> or >> >> call MatView(p_array,viewer,ierr) >> >> call PetscViewerDestroy(viewer,ierr) >> >> but I got segmentation error. So is there a PETSc routine I >> can use? >> >> >> No. Those routines work only for Vec objects. You could >> >> a) Declare a DMDA of the same size >> >> b) Use DMDAVecGetArrayF90() to get out the multidimensional array >> >> c) Use that in your code >> >> d) Use VecView() on the original vector > > Hi, > > Supposed I need to check the contents of the u_array which was > declared using: > > PetscScalar,pointer :: u_array(:,:,:) > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) > > call VecView(array,viewer,ierr) > > > > The first argument of VecView must be of type Vec (as Matt noted). > It looks you are passing in an array of PetscScalar's. 
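A compact C sketch of the sequence Matt outlines above: work on the array, restore it, then view the Vec itself. The Fortran analogue uses DMDAVecGetArrayF90/DMDAVecRestoreArrayF90, and the viewer chosen here is only for illustration.

  PetscScalar ***u;

  DMDAVecGetArray(da_u, u_local, &u);
  /* ... inspect or modify u[k][j][i] ... */
  DMDAVecRestoreArray(da_u, u_local, &u);
  VecView(u_local, PETSC_VIEWER_STDOUT_SELF);  /* VecView takes the Vec, not the raw array */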
Oh so should it be: Vec u_global,u_local call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) call VecView(u_local,viewer,ierr) call PetscViewerDestroy(viewer,ierr) > > > call PetscViewerDestroy(viewer,ierr) > > Is this the correct way? >> >> Matt >> >> >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 19 04:56:35 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 11:56:35 +0200 Subject: [petsc-users] How to view petsc array? In-Reply-To: <55D44780.3000500@gmail.com> References: <55ADE837.8080705@gmail.com> <55D44445.8000503@gmail.com> <55D44780.3000500@gmail.com> Message-ID: On 19 August 2015 at 11:08, TAY wee-beng wrote: > > On 19/8/2015 4:58 PM, Dave May wrote: > > > > On 19 August 2015 at 10:54, TAY wee-beng wrote: > >> >> On 21/7/2015 7:28 PM, Matthew Knepley wrote: >> >> On Tue, Jul 21, 2015 at 1:35 AM, TAY wee-beng wrote: >> >>> Hi, >>> >>> I need to check the contents of the array which was declared using: >>> >>> PetscScalar,pointer :: >>> u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) >>> >>> I tried to use : >>> >>> call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) >>> >>> call VecView(p_array,viewer,ierr) >>> >>> or >>> >>> call MatView(p_array,viewer,ierr) >>> >>> call PetscViewerDestroy(viewer,ierr) >>> >>> but I got segmentation error. So is there a PETSc routine I can use? >> >> >> No. Those routines work only for Vec objects. You could >> >> a) Declare a DMDA of the same size >> >> b) Use DMDAVecGetArrayF90() to get out the multidimensional array >> >> c) Use that in your code >> >> d) Use VecView() on the original vector >> >> >> Hi, >> >> Supposed I need to check the contents of the u_array which was declared >> using: >> >> PetscScalar,pointer :: u_array(:,:,:) >> >> call >> DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& >> >> >> size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) >> >> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >> >> call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) >> >> call VecView(array,viewer,ierr) >> > > > The first argument of VecView must be of type Vec (as Matt noted). > It looks you are passing in an array of PetscScalar's. > > > Oh so should it be: > > Vec u_global,u_local > > call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"u.txt",viewer,ierr) > > call VecView(u_local,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > Yes, the arguments types now match. 
However, if you run this in parallel there will be two issues: (1) You have a different local vector per process, thus you will need to use a unique file name, e.g. "u-rankXXX.txt" to avoid overwriting the data from each process (2) You need to make sure that the communicator used for the viewer and the vector are the same. To implement (1) and (2) you could do something like this: MPI_Comm comm; PetscMPIInt rank; char filename[PETSC_MAX_PATH_LEN]; ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); ierr = PetscSNPrintf(filename,PETSC_MAX_PATH_LEN-1,"u-%d.txt",rank);CHKERRQ(ierr); ierr = PetscObjectGetComm(((PetscObject)u_local,&comm);CHKERRQ(ierr); ierr = PetscViewerASCIIOpen(comm,filename,viewer);CHKERRQ(ierr); > > > > >> >> call PetscViewerDestroy(viewer,ierr) >> >> Is this the correct way? >> >> >> Matt >> >> >>> >>> -- >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Wed Aug 19 04:57:55 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 19 Aug 2015 11:57:55 +0200 Subject: [petsc-users] How to view petsc array? In-Reply-To: References: <55ADE837.8080705@gmail.com> <55D44445.8000503@gmail.com> <55D44780.3000500@gmail.com> Message-ID: > > MPI_Comm comm; > PetscMPIInt rank; > char filename[PETSC_MAX_PATH_LEN]; > > ierr = MPI_Comm_rank(PETSC_COMM_WORLD,&rank);CHKERRQ(ierr); > ierr = > PetscSNPrintf(filename,PETSC_MAX_PATH_LEN-1,"u-%d.txt",rank);CHKERRQ(ierr); > ierr = PetscObjectGetComm(((PetscObject)u_local,&comm);CHKERRQ(ierr); > ierr = PetscViewerASCIIOpen(comm,filename,viewer);CHKERRQ(ierr); > The last line should be ierr = PetscViewerASCIIOpen(comm,filename,*&viewer*);CHKERRQ(ierr); -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Aug 19 09:41:28 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 19 Aug 2015 09:41:28 -0500 Subject: [petsc-users] How to view petsc array? In-Reply-To: <55ADE837.8080705@gmail.com> References: <55ADE837.8080705@gmail.com> Message-ID: check PetscScalarView() Satish On Tue, 21 Jul 2015, TAY wee-beng wrote: > Hi, > > I need to check the contents of the array which was declared using: > > PetscScalar,pointer :: > u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) > > I tried to use : > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) > > call VecView(p_array,viewer,ierr) > > or > > call MatView(p_array,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > but I got segmentation error. So is there a PETSc routine I can use? > > From honglianglu87 at gmail.com Wed Aug 19 10:51:07 2015 From: honglianglu87 at gmail.com (Hongliang Lu) Date: Wed, 19 Aug 2015 23:51:07 +0800 Subject: [petsc-users] on the data size problem Message-ID: Dear all, I am trying to implement a BFS algorithm using Petsc, and I have tested my code on a graph of 5 nodes, but when I tested on a larger graph, which size is 5000 nodes, the program went wrong, and ca not finished, could some on help me out? thank you very much!!!!! I tried to run the following code in a cluster with 10 nodes. 
int main(int argc,char **args)
{
  Vec curNodes,tmp;
  Mat oriGraph;
  PetscInt rows, cols;
  PetscScalar one=1;
  PetscScalar nodeVecSum=1;
  char filein[PETSC_MAX_PATH_LEN],fileout[PETSC_MAX_PATH_LEN],buf[PETSC_MAX_PATH_LEN];
  PetscViewer fd;
  PetscInitialize(&argc,&args,(char *)0,help);
  PetscOptionsGetString(PETSC_NULL,"-fin",filein,PETSC_MAX_PATH_LEN-1,PETSC_NULL);
  PetscViewerBinaryOpen(PETSC_COMM_WORLD,filein,FILE_MODE_READ,&fd);
  MatCreate(PETSC_COMM_WORLD,&oriGraph);
  MatLoad(oriGraph,fd);
  MatGetSize(oriGraph,&rows,&cols);
  MatSetOption(oriGraph,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);
  MatSetUp(oriGraph);
  VecCreate(PETSC_COMM_WORLD,&curNodes);
  VecSetSizes(curNodes,PETSC_DECIDE,rows);
  VecSetFromOptions(curNodes);
  VecCreate(PETSC_COMM_WORLD,&tmp);
  VecSetSizes(tmp,PETSC_DECIDE,rows);
  VecSetFromOptions(tmp);
  VecZeroEntries(tmp);
  srand(time(0));
  PetscInt node=rand()%rows;
  PetscPrintf(PETSC_COMM_SELF,"The node ID is: %d \n",node);
  VecSetValues(curNodes,1,&node,&one,INSERT_VALUES);
  VecAssemblyBegin(curNodes);
  VecAssemblyEnd(curNodes);
  PetscViewerDestroy(&fd);
  const PetscInt *colsv;
  const PetscScalar *valsv;
  PetscInt ncols,i,zero=0;
  PetscInt iter=0;
  nodeVecSum=1;
  for(;iter<10;iter++)
  {
    VecAssemblyBegin(curNodes);
    VecAssemblyEnd(curNodes);
    MatMult(oriGraph,curNodes,tmp);
    VecAssemblyBegin(tmp);
    VecAssemblyEnd(tmp);
    VecSum(tmp,&nodeVecSum);
    PetscPrintf(PETSC_COMM_SELF,"There are neighbors: %d \n",(int)nodeVecSum);
    VecSum(curNodes,&nodeVecSum);
    if(nodeVecSum<1)
      break;
    PetscScalar y;
    PetscInt indices;
    PetscInt n,m,rstart,rend;
    IS isrow;
    Mat curMat;
    MatGetLocalSize(oriGraph,&n,&m);
    MatGetOwnershipRange(oriGraph,&rstart,&rend);
    ISCreateStride(PETSC_COMM_SELF,n,rstart,1,&isrow);
    MatGetSubMatrix(oriGraph,isrow,NULL,MAT_INITIAL_MATRIX,&curMat);
    MatGetSize(curMat,&n,&m);
    for(i=rstart;i0){
      MatGetRow(oriGraph,indices,&ncols,&colsv,&valsv);
      PetscScalar *v,zero=0;
      PetscMalloc1(cols,&v);
      for(int j=0;j
From bsmith at mcs.anl.gov Tue Aug 18 20:44:11 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 18 Aug 2015 20:44:11 -0500 Subject: [petsc-users] difference between local and global vectors In-Reply-To: <55D3DDFB.30302@gmail.com> References: <55D3DDFB.30302@gmail.com> Message-ID: The global vectors are what the "algebraic solvers" TS/SNES/KSP see, while the local vectors are what you use to perform function evaluations and Jacobian evaluations needed by KSP, SNES, and TS, for example with SNESSetFunction(). Barry > On Aug 18, 2015, at 8:38 PM, TAY wee-beng wrote: > > Hi, > > I am using DA. For e.g. > > DM da_u > > call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,& > > size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_u,ierr) > > call DMCreateGlobalVector(da_u,u_global,ierr) > > call DMCreateLocalVector(da_u,u_local,ierr) > > To update the ghost values, I use: > > call DMLocalToLocalBegin(da_u,u_local,INSERT_VALUES,u_local,ierr) > > call DMLocalToLocalEnd(da_u,u_local,INSERT_VALUES,u_local,ierr) > > It seems that I don't need to use global vector at all. > > So what's the difference between local and global vector?
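As a small illustrative sketch of the hand-off Barry describes (all names are placeholders): once a sub-domain computation in a ghosted local vector is finished, its owned part is copied back into the global vector that the algebraic solvers actually operate on.

  DMLocalToGlobalBegin(da_u, r_local, INSERT_VALUES, r_global);
  DMLocalToGlobalEnd(da_u, r_local, INSERT_VALUES, r_global);
  /* r_global is what gets handed to KSP/SNES/TS */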
> > When will I need to use?: > > call DMGlobalToLocalBegin(da_u,u_global,INSERT_VALUES,u_local,ierr) > > call DMGlobalToLocalEnd(da_u,u_global,INSERT_VALUES,u_local,ierr) > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > From zonexo at gmail.com Wed Aug 19 11:06:59 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Thu, 20 Aug 2015 00:06:59 +0800 Subject: [petsc-users] How to view petsc array? In-Reply-To: References: <55ADE837.8080705@gmail.com> Message-ID: <1440000424021-ac869639-5f486918-9750e761@gmail.com> Hi, So I can use PetscScalar view directly to view the u_array? Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.7&pv=5.0.2] On Wed, Aug 19, 2015 at 10:41 PM, Satish Balay < balay at mcs.anl.gov [balay at mcs.anl.gov] > wrote: check PetscScalarView() Satish On Tue, 21 Jul 2015, TAY wee-beng wrote: > Hi, > > I need to check the contents of the array which was declared using: > > PetscScalar,pointer :: > u_array(:,:,:),v_array(:,:,:),w_array(:,:,:),p_array(:,:,:) > > I tried to use : > > call PetscViewerASCIIOpen(MPI_COMM_WORLD,"pres.txt",viewer,ierr) > > call VecView(p_array,viewer,ierr) > > or > > call MatView(p_array,viewer,ierr) > > call PetscViewerDestroy(viewer,ierr) > > but I got segmentation error. So is there a PETSc routine I can use? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eugenio.aulisa at ttu.edu Wed Aug 19 21:26:25 2015 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 20 Aug 2015 02:26:25 +0000 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother Message-ID: Hi, I am solving an iteration of GMRES -> PCMG -> PCASM where I build my particular ASM domain decomposition. In setting the PCMG I would like at each level to use the same pre- and post-smoother and for this reason I am using ... PCMGGetSmoother ( pcMG, level , &subksp ); to extract and set at each level the ksp object. In setting PCASM then I use ... KSPGetPC ( subksp, &subpc ); PCSetType ( subpc, PCASM ); ... and then set my own decomposition ... PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); ... Now everything compiles, and runs with no memory leakage, but I do not get the expected convergence. When I checked the output of -ksp_view, I saw something that puzzled me: at each level >0, while in the MG pre-smoother the ASM domain decomposition is the one that I set, for example with 4 processes I get >>>>>>>>>>>>>>>>>>> ... Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1 using preconditioner applied to right hand side for initial guess tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (level-2) 4 MPI processes type: asm Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 Additive Schwarz: restriction/interpolation type - RESTRICT [0] number of local blocks = 52 [1] number of local blocks = 48 [2] number of local blocks = 48 [3] number of local blocks = 50 Local solve info for each block is in the following KSP and PC objects: - - - - - - - - - - - - - - - - - - ... >>>>>>>>>>> in the post-smoother I have the default ASM decomposition with overlapping 1: >>>>>>>>>>> ... 
Up solver (post-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=2 tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (level-2) 4 MPI processes type: asm Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (level-2sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning ... >>>>>>>>>>>>> %%%%%%%%%%%%%%%%%%%%%%%% So it seams that by using PCMGGetSmoother ( pcMG, level , &subksp ); I was capable to set both the pre- and post- smoothers to be PCASM but everything I did after that applied only to the pre-smoother, while the post-smoother got the default PCASM options. I know that I can use PCMGGetSmootherDown and PCMGGetSmootherUp, but that would probably double the memory allocation and the computational time in the ASM. Is there any way I can just use PCMGGetSmoother and use the same PCASM in the pre- and post- smoother? I hope I was clear enough. Thanks a lot for your help, Eugenio From zonexo at gmail.com Wed Aug 19 22:28:56 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 20 Aug 2015 11:28:56 +0800 Subject: [petsc-users] Debugging KSP output error Message-ID: <55D54978.8030003@gmail.com> Hi, I run my code on 1, 2 and 3 procs. KSP is used to solve the Poisson eqn. Using MatView and VecView, I found that my LHS matrix and RHS vec are the same for 1,2 and 3 procs. However, my pressure (ans) output is the almost the same (due to truncation err) for 1,2 procs. But for 3 procs, the output is the same as for the 1,2 procs for all values except: 1. the last few values for procs 0 2. the first and last few values for procs 1 and 2. Shouldn't the output be the same when the LHS matrix and RHS vec are the same? How can I debug to find the err? -- Thank you Yours sincerely, TAY wee-beng From dave.mayhem23 at gmail.com Thu Aug 20 02:29:16 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 20 Aug 2015 09:29:16 +0200 Subject: [petsc-users] Debugging KSP output error In-Reply-To: <55D54978.8030003@gmail.com> References: <55D54978.8030003@gmail.com> Message-ID: On 20 August 2015 at 05:28, TAY wee-beng wrote: > Hi, > > I run my code on 1, 2 and 3 procs. KSP is used to solve the Poisson eqn. > > Using MatView and VecView, I found that my LHS matrix and RHS vec are the > same for 1,2 and 3 procs. > > However, my pressure (ans) output is the almost the same (due to > truncation err) for 1,2 procs. > > But for 3 procs, the output is the same as for the 1,2 procs for all > values except: > > 1. the last few values for procs 0 > > 2. the first and last few values for procs 1 and 2. > > Shouldn't the output be the same when the LHS matrix and RHS vec are the > same? How can I debug to find the err? > > It's a bit hard to say much without knowing exactly what solver configuration you actually ran and without seeing the difference in the solution you are referring too. Some preconditioners have different behaviour in serial and parallel. 
Thus, the convergence of the solver and the residual history (and thus the answer) can look slightly different. This difference will become smaller as you solve the system more accurately. Do you solve the system accurately? e.g. something like -ksp_rtol 1.0e-10 To avoid the problem mentioned above, try using -pc_type jacobi. This PC is the same in serial and parallel. Thus, if your A and b are identical on 1,2,3 procs, then the residuals and solution will also be identical on 1,2,3 procs (upto machine precision). Thanks, Dave > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 20 02:37:13 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 20 Aug 2015 02:37:13 -0500 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: References: Message-ID: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> What you describe is not the expected behavior. I expected exactly the result that you expected. Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? Barry > On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio wrote: > > Hi, > > I am solving an iteration of > > GMRES -> PCMG -> PCASM > > where I build my particular ASM domain decomposition. > > In setting the PCMG I would like at each level > to use the same pre- and post-smoother > and for this reason I am using > ... > PCMGGetSmoother ( pcMG, level , &subksp ); > > to extract and set at each level the ksp object. > > In setting PCASM then I use > ... > KSPGetPC ( subksp, &subpc ); > PCSetType ( subpc, PCASM ); > ... > and then set my own decomposition > ... > PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); > ... > > Now everything compiles, and runs with no memory leakage, > but I do not get the expected convergence. > > When I checked the output of -ksp_view, I saw something that puzzled me: > at each level >0, while in the MG pre-smoother the ASM domain decomposition > is the one that I set, for example with 4 processes I get > >>>>>>>>>>>>>>>>>>>> > ... > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1 > using preconditioner applied to right hand side for initial guess > tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (level-2) 4 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 > Additive Schwarz: restriction/interpolation type - RESTRICT > [0] number of local blocks = 52 > [1] number of local blocks = 48 > [2] number of local blocks = 48 > [3] number of local blocks = 50 > Local solve info for each block is in the following KSP and PC objects: > - - - - - - - - - - - - - - - - - - > ... >>>>>>>>>>>> > > > in the post-smoother I have the default ASM decomposition with overlapping 1: > > >>>>>>>>>>>> > ... 
> Up solver (post-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=2 > tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (level-2) 4 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (level-2sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > ... >>>>>>>>>>>>>> > %%%%%%%%%%%%%%%%%%%%%%%% > > So it seams that by using > > PCMGGetSmoother ( pcMG, level , &subksp ); > > I was capable to set both the pre- and post- smoothers to be PCASM > but everything I did after that applied only to the > pre-smoother, while the post-smoother got the default PCASM options. > > I know that I can use > PCMGGetSmootherDown and PCMGGetSmootherUp, but that would > probably double the memory allocation and the computational time in the ASM. > > Is there any way I can just use PCMGGetSmoother > and use the same PCASM in the pre- and post- smoother? > > I hope I was clear enough. > > Thanks a lot for your help, > Eugenio > > > From zonexo at gmail.com Thu Aug 20 04:01:33 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 20 Aug 2015 17:01:33 +0800 Subject: [petsc-users] Debugging KSP output error In-Reply-To: References: <55D54978.8030003@gmail.com> Message-ID: <55D5976D.8060509@gmail.com> On 20/8/2015 3:29 PM, Dave May wrote: > > > On 20 August 2015 at 05:28, TAY wee-beng > wrote: > > Hi, > > I run my code on 1, 2 and 3 procs. KSP is used to solve the > Poisson eqn. > > Using MatView and VecView, I found that my LHS matrix and RHS vec > are the same for 1,2 and 3 procs. > > However, my pressure (ans) output is the almost the same (due to > truncation err) for 1,2 procs. > > But for 3 procs, the output is the same as for the 1,2 procs for > all values except: > > 1. the last few values for procs 0 > > 2. the first and last few values for procs 1 and 2. > > Shouldn't the output be the same when the LHS matrix and RHS vec > are the same? How can I debug to find the err? > > > It's a bit hard to say much without knowing exactly what solver > configuration you actually ran and without seeing the difference in > the solution you are referring too. > > Some preconditioners have different behaviour in serial and parallel. > Thus, the convergence of the solver and the residual history (and thus > the answer) can look slightly different. This difference will become > smaller as you solve the system more accurately. > Do you solve the system accurately? e.g. something like -ksp_rtol 1.0e-10 > > To avoid the problem mentioned above, try using -pc_type jacobi. This > PC is the same in serial and parallel. Thus, if your A and b are > identical on 1,2,3 procs, then the residuals and solution will also > be identical on 1,2,3 procs (upto machine precision). > Hi Dave, I tried using jacobi and it's the same result. I found out that the error is actually due to mismatched size between DMDACreate3d and MatGetOwnershipRange. 
Using

call DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,&
     size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_w,ierr)

call DMDAGetCorners(da_u,start_ijk(1),start_ijk(2),start_ijk(3),width_ijk(1),width_ijk(2),width_ijk(3),ierr)

and

call MatCreateAIJ(MPI_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,size_x*size_y*size_z,size_x*size_y*size_z,7,PETSC_NULL_INTEGER,7,PETSC_NULL_INTEGER,A_mat,ierr)

call MatGetOwnershipRange(A_mat,ijk_sta_p,ijk_end_p,ierr)

Is this possible? Or is there an error somewhere? It happens when using 3 procs, instead of 1 or 2. For my size_x,size_y,size_z = 4,8,10, it was partitioned along z direction with 1->4, 5->7, 8->10 using 3 procs with DMDACreate3d which should give ownership (with Fortran index + 1) of:

myid,ijk_sta_p,ijk_end_p 1 129 192
myid,ijk_sta_p,ijk_end_p 0 1 128
myid,ijk_sta_p,ijk_end_p 2 193 320

But with MatGetOwnershipRange, I got

myid,ijk_sta_p,ijk_end_p 1 108 214
myid,ijk_sta_p,ijk_end_p 0 1 107
myid,ijk_sta_p,ijk_end_p 2 215 320

> Thanks, > Dave > > > > -- > Thank you > > Yours sincerely, > > TAY wee-beng > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dave.mayhem23 at gmail.com Thu Aug 20 04:13:55 2015 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 20 Aug 2015 11:13:55 +0200 Subject: [petsc-users] Debugging KSP output error In-Reply-To: <55D5976D.8060509@gmail.com> References: <55D54978.8030003@gmail.com> <55D5976D.8060509@gmail.com> Message-ID: On 20 August 2015 at 11:01, TAY wee-beng wrote: > > On 20/8/2015 3:29 PM, Dave May wrote: > > > > On 20 August 2015 at 05:28, TAY wee-beng wrote: > >> Hi, >> >> I run my code on 1, 2 and 3 procs. KSP is used to solve the Poisson eqn. >> >> Using MatView and VecView, I found that my LHS matrix and RHS vec are the >> same for 1,2 and 3 procs. >> >> However, my pressure (ans) output is the almost the same (due to >> truncation err) for 1,2 procs. >> >> But for 3 procs, the output is the same as for the 1,2 procs for all >> values except: >> >> 1. the last few values for procs 0 >> >> 2. the first and last few values for procs 1 and 2. >> >> Shouldn't the output be the same when the LHS matrix and RHS vec are the >> same? How can I debug to find the err? >> >> > It's a bit hard to say much without knowing exactly what solver > configuration you actually ran and without seeing the difference in the > solution you are referring too. > > Some preconditioners have different behaviour in serial and parallel. > Thus, the convergence of the solver and the residual history (and thus the > answer) can look slightly different. This difference will become smaller as > you solve the system more accurately. > Do you solve the system accurately? e.g. something like -ksp_rtol 1.0e-10 > > To avoid the problem mentioned above, try using -pc_type jacobi. This PC > is the same in serial and parallel. Thus, if your A and b are identical on > 1,2,3 procs, then the residuals and solution will also be identical on > 1,2,3 procs (upto machine precision). > > Hi Dave, > > I tried using jacobi and it's the same result. I found out that the error > is actually due to mismatched size between DMDACreate3d and > MatGetOwnershipRange.
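(The continuation of this reply below points to the fix; as a one-line C sketch, reusing the names from the snippet above purely for illustration:)

  DMCreateMatrix(da_u, &A_mat);  /* the matrix then inherits the DMDA's parallel layout and preallocation */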
> > Using > > *call > DMDACreate3d(MPI_COMM_WORLD,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DM_BOUNDARY_NONE,DMDA_STENCIL_STAR,size_x,size_y,&* > > > *size_z,1,PETSC_DECIDE,PETSC_DECIDE,1,stencil_width,lx,PETSC_NULL_INTEGER,PETSC_NULL_INTEGER,da_w,ierr)* > > *call > DMDAGetCorners(da_u,start_ijk(1),start_ijk(2),start_ijk(3),width_ijk(1),width_ijk(2),width_ijk(3),ierr)* > > and > > *call > MatCreateAIJ(MPI_COMM_WORLD,PETSC_DECIDE,PETSC_DECIDE,size_x*size_y*size_z,size_x*size_y*size_z,7,PETSC_NULL_INTEGER,7,PETSC_NULL_INTEGER,A_mat,ierr)* > > *call MatGetOwnershipRange(A_mat,ijk_sta_p,ijk_end_p,ierr)* > > Is this possible? Or is there an error somewhere? It happens when using 3 > procs, instead of 1 or 2. > > Sure it is possible you get a mismatch in the local sizes if you create the matrix this way as the matrix created knows nothing about the DMDA, and specifically, it does not know how it has been spatially decomposed. If you want to ensure consistency between the DMDA and the matrix, you should always use DMCreateMatrix() to create the matrix. Any subsequent calls to MatGetOwnershipRange() will then be consistent with the DMDA parallel layout. > For my size_x,size_y,size_z = 4,8,10, it was partitioned along z direction > with 1->4, 5->7, 8->10 using 3 procs with DMDACreate3d which should give > ownership (with Fortran index + 1) of: > > myid,ijk_sta_p,ijk_end_p 1 129 192 > myid,ijk_sta_p,ijk_end_p 0 1 128 > myid,ijk_sta_p,ijk_end_p 2 193 320 > > But with MatGetOwnershipRange, I got > > myid,ijk_sta_p,ijk_end_p 1 108 214 > myid,ijk_sta_p,ijk_end_p 0 1 107 > myid,ijk_sta_p,ijk_end_p 2 215 320 > > Thanks, > Dave > > > >> -- >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lorenzoalessiobotti at gmail.com Thu Aug 20 05:29:10 2015 From: lorenzoalessiobotti at gmail.com (Lorenzo Alessio Botti) Date: Thu, 20 Aug 2015 12:29:10 +0200 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother Message-ID: <73CC58DF-71E5-4A32-B714-45BAA8247D0A@gmail.com> I tried to achieve this behaviour getting all the smothers and setting the same preconditioner to the down and up smoother on the same level. smoothers.resize(nLevels+1); smoothers_up.resize(nLevels); for (PetscInt i = 0; i < nLevels; i++) { PCMGGetSmootherDown(M_pc,nLevels-i,&(smoothers[i])); KSPSetInitialGuessNonzero(smoothers[i],PETSC_TRUE); // for full and wCicle PCMGGetSmootherUp(M_pc,nLevels-i,&(smoothers_up[i])); } PCMGSetNumberSmoothDown(M_pc,1); PCMGSetNumberSmoothUp(M_pc,1); ? set coarse solver options here for (PetscInt i = 0; i < nLevels; i++) { PC pc; KSPSetType(smoothers[i], KSPGMRES); KSPGetPC(smoothers[i], &pc); KSPSetPCSide(smoothers[i], PC_RIGHT); PCSetType(pc, PCASM); PCFactorSetPivotInBlocks(pc, PETSC_TRUE); PCFactorSetAllowDiagonalFill(pc); PCFactorSetReuseFill(pc, PETSC_TRUE); PCFactorSetReuseOrdering(pc, PETSC_TRUE); KSPSetType(smoothers_up[i], KSPGMRES); KSPSetPC(smoothers_up[i], pc); KSPSetPCSide(smoothers_up[i], PC_RIGHT); KSPSetConvergenceTest(smoothers[i],KSPConvergedSkip,NULL,NULL); KSPSetConvergenceTest(smoothers_up[i],KSPConvergedSkip,NULL,NULL); KSPSetNormType(smoothers[i],KSP_NORM_NONE); KSPSetNormType(smoothers_up[i],KSP_NORM_NONE); } Is this correct? Note moreover that for Full Multigrid and W cicles to work as expected I need to add the KSPSetInitialGuessNonZero option. 
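For comparison, much of this per-level smoother setup can also be driven from the command line, in which case the options apply to the level smoothers PCMG creates. A sketch, assuming the default option prefixes; the executable name and level count are placeholders, and custom subdomains such as those set with PCASMSetLocalSubdomains() still have to be installed in code.

  ./myapp -ksp_type gmres -pc_type mg -pc_mg_levels 3 \
          -mg_levels_ksp_type gmres -mg_levels_ksp_max_it 1 \
          -mg_levels_pc_type asm -mg_levels_pc_asm_overlap 0 \
          -mg_coarse_ksp_type preonly -mg_coarse_pc_type lu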
Bests Lorenzo > Message: 4 > Date: Thu, 20 Aug 2015 02:37:13 -0500 > From: Barry Smith > > To: "Aulisa, Eugenio" > > Cc: "petsc-users at mcs.anl.gov " > > Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother > Message-ID: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC at mcs.anl.gov > > Content-Type: text/plain; charset="us-ascii" > > > What you describe is not the expected behavior. I expected exactly the result that you expected. > > Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? > > Barry > >> On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio > wrote: >> >> Hi, >> >> I am solving an iteration of >> >> GMRES -> PCMG -> PCASM >> >> where I build my particular ASM domain decomposition. >> >> In setting the PCMG I would like at each level >> to use the same pre- and post-smoother >> and for this reason I am using >> ... >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> to extract and set at each level the ksp object. >> >> In setting PCASM then I use >> ... >> KSPGetPC ( subksp, &subpc ); >> PCSetType ( subpc, PCASM ); >> ... >> and then set my own decomposition >> ... >> PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); >> ... >> >> Now everything compiles, and runs with no memory leakage, >> but I do not get the expected convergence. >> >> When I checked the output of -ksp_view, I saw something that puzzled me: >> at each level >0, while in the MG pre-smoother the ASM domain decomposition >> is the one that I set, for example with 4 processes I get >> >>>>>>>>>>>>>>>>>>>>> >> ... >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1 >> using preconditioner applied to right hand side for initial guess >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 52 >> [1] number of local blocks = 48 >> [2] number of local blocks = 48 >> [3] number of local blocks = 50 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> ... >>>>>>>>>>>>> >> >> >> in the post-smoother I have the default ASM decomposition with overlapping 1: >> >> >>>>>>>>>>>>> >> ... 
>> Up solver (post-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=2 >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object: (level-2sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> ... >>>>>>>>>>>>>>> >> %%%%%%%%%%%%%%%%%%%%%%%% >> >> So it seams that by using >> >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> I was capable to set both the pre- and post- smoothers to be PCASM >> but everything I did after that applied only to the >> pre-smoother, while the post-smoother got the default PCASM options. >> >> I know that I can use >> PCMGGetSmootherDown and PCMGGetSmootherUp, but that would >> probably double the memory allocation and the computational time in the ASM. >> >> Is there any way I can just use PCMGGetSmoother >> and use the same PCASM in the pre- and post- smoother? >> >> I hope I was clear enough. >> >> Thanks a lot for your help, >> Eugenio >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Thu Aug 20 06:30:56 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Thu, 20 Aug 2015 12:30:56 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: References: Message-ID: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> Hello. I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing. As for the machine, I am using mpirun to run jobs in a 8 node cluster. I modified the makefile on the steams folder so it would run using my hostfile. The output is attached to this email. It seems reasonable for a cluster with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 socket. Cheers, Nelson Em 2015-07-24 16:50, Barry Smith escreveu: > It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 > ... processes with the option -log_summary and send (as attachments) > the log summary information. > > Also on the same machine run the streams benchmark; with recent > releases of PETSc you only need to do > > cd $PETSC_DIR > make streams NPMAX=16 (or whatever your largest process count is) > > and send the output. > > I suspect that you are doing everything fine and it is more an issue > with the configuration of your machine. Also read the information at > http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on > "binding" > > Barry > >> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva >> wrote: >> >> Hello, >> >> I have been using PETSc for a few months now, and it truly is >> fantastic piece of software. >> >> In my particular example I am working with a large, sparse >> distributed (MPI AIJ) matrix we can refer as 'G'. 
>> G is a horizontal - retangular matrix (for example, 1,1 Million rows >> per 2,1 Million columns). This matrix is commonly very sparse and not >> diagonal 'heavy' (for example 5,2 Million nnz in which ~50% are on the >> diagonal block of MPI AIJ representation). >> To work with this matrix, I also have a few parallel vectors >> (created using MatCreate Vec), we can refer as 'm' and 'k'. >> I am trying to parallelize an iterative algorithm in which the most >> computational heavy operations are: >> >> ->Matrix-Vector Multiplication, more precisely G * m + k = b >> (MatMultAdd). From what I have been reading, to achive a good speedup >> in this operation, G should be as much diagonal as possible, due to >> overlapping communication and computation. But even when using a G >> matrix in which the diagonal block has ~95% of the nnz, I cannot get a >> decent speedup. Most of the times, the performance even gets worse. >> >> ->Matrix-Matrix Multiplication, in this case I need to perform G * >> G' = A, where A is later used on the linear solver and G' is transpose >> of G. The speedup in this operation is not worse, although is not very >> good. >> >> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" >> from the last two operations. I tried to apply a RCM permutation to A >> to make it more diagonal, for better performance. However, the problem >> I faced was that, the permutation is performed locally in each >> processor and thus, the final result is different with different >> number of processors. I assume this was intended to reduce >> communication. The solution I found was >> 1-calculate A >> 2-calculate, localy to 1 machine, the RCM permutation IS using A >> 3-apply this permutation to the lines of G. >> This works well, and A is generated as if RCM permuted. It is fine >> to do this operation in one machine because it is only done once while >> reading the input. The nnz of G become more spread and less diagonal, >> causing problems when calculating G * m + k = b. >> >> These 3 operations (except the permutation) are performed in each >> iteration of my algorithm. >> >> So, my questions are. >> -What are the characteristics of G that lead to a good speedup in >> the operations I described? Am I missing something and too much >> obsessed with the diagonal block? >> >> -Is there a better way to permute A without permute G and still get >> the same result using 1 or N machines? >> >> >> I have been avoiding asking for help for a while. I'm very sorry for >> the long email. >> Thank you very much for your time. >> Best Regards, >> Nelson -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: streams.output URL: From eugenio.aulisa at ttu.edu Thu Aug 20 06:51:17 2015 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 20 Aug 2015 11:51:17 +0000 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> References: , <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> Message-ID: Hi Barry, Thanks for your answer. I run my applications with no command line, and I do not think I changed any PETSC_OPTIONS, at least not voluntarily. For the source it is available on https://github.com/NumPDEClassTTU/femus but it is part of a much larger library and I do not think any of you want to install and run it just to find what I messed up. 
In any case, if you just want to look at the source code where I set up the level smoother it is in https://github.com/NumPDEClassTTU/femus/blob/master/src/algebra/AsmPetscLinearEquationSolver.cpp line 400 void AsmPetscLinearEquationSolver::MGsetLevels ( LinearEquationSolver *LinSolver, const unsigned &level, const unsigned &levelMax, const vector &variable_to_be_solved, SparseMatrix* PP, SparseMatrix* RR ){ Be aware, that even if it seams that this takes care of the coarse level it is not. The coarse level smoother is set some where else. Thanks, Eugenio ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Thursday, August 20, 2015 2:37 AM To: Aulisa, Eugenio Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother What you describe is not the expected behavior. I expected exactly the result that you expected. Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? Barry > On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio wrote: > > Hi, > > I am solving an iteration of > > GMRES -> PCMG -> PCASM > > where I build my particular ASM domain decomposition. > > In setting the PCMG I would like at each level > to use the same pre- and post-smoother > and for this reason I am using > ... > PCMGGetSmoother ( pcMG, level , &subksp ); > > to extract and set at each level the ksp object. > > In setting PCASM then I use > ... > KSPGetPC ( subksp, &subpc ); > PCSetType ( subpc, PCASM ); > ... > and then set my own decomposition > ... > PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); > ... > > Now everything compiles, and runs with no memory leakage, > but I do not get the expected convergence. > > When I checked the output of -ksp_view, I saw something that puzzled me: > at each level >0, while in the MG pre-smoother the ASM domain decomposition > is the one that I set, for example with 4 processes I get > >>>>>>>>>>>>>>>>>>>> > ... > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=1 > using preconditioner applied to right hand side for initial guess > tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (level-2) 4 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 > Additive Schwarz: restriction/interpolation type - RESTRICT > [0] number of local blocks = 52 > [1] number of local blocks = 48 > [2] number of local blocks = 48 > [3] number of local blocks = 50 > Local solve info for each block is in the following KSP and PC objects: > - - - - - - - - - - - - - - - - - - > ... >>>>>>>>>>>> > > > in the post-smoother I have the default ASM decomposition with overlapping 1: > > >>>>>>>>>>>> > ... 
> Up solver (post-smoother) on level 2 ------------------------------- > KSP Object: (level-2) 4 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=2 > tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 > left preconditioning > using nonzero initial guess > using NONE norm type for convergence test > PC Object: (level-2) 4 MPI processes > type: asm > Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 > Additive Schwarz: restriction/interpolation type - RESTRICT > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (level-2sub_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > ... >>>>>>>>>>>>>> > %%%%%%%%%%%%%%%%%%%%%%%% > > So it seams that by using > > PCMGGetSmoother ( pcMG, level , &subksp ); > > I was capable to set both the pre- and post- smoothers to be PCASM > but everything I did after that applied only to the > pre-smoother, while the post-smoother got the default PCASM options. > > I know that I can use > PCMGGetSmootherDown and PCMGGetSmootherUp, but that would > probably double the memory allocation and the computational time in the ASM. > > Is there any way I can just use PCMGGetSmoother > and use the same PCASM in the pre- and post- smoother? > > I hope I was clear enough. > > Thanks a lot for your help, > Eugenio > > > From eugenio.aulisa at ttu.edu Thu Aug 20 07:00:16 2015 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Thu, 20 Aug 2015 12:00:16 +0000 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: <73CC58DF-71E5-4A32-B714-45BAA8247D0A@gmail.com> References: <73CC58DF-71E5-4A32-B714-45BAA8247D0A@gmail.com> Message-ID: Thanks Lorenzo If I well understood what you say is Set up PCASM for the smoother_down[i] and then feed it up to smoother_up[i] Does this assure that only one memory allocation is done for PCASM? Eugenio ________________________________ From: Lorenzo Alessio Botti [lorenzoalessiobotti at gmail.com] Sent: Thursday, August 20, 2015 5:29 AM To: petsc-users at mcs.anl.gov Cc: Aulisa, Eugenio Subject: GMRES -> PCMG -> PCASM pre- post- smoother I tried to achieve this behaviour getting all the smothers and setting the same preconditioner to the down and up smoother on the same level. smoothers.resize(nLevels+1); smoothers_up.resize(nLevels); for (PetscInt i = 0; i < nLevels; i++) { PCMGGetSmootherDown(M_pc,nLevels-i,&(smoothers[i])); KSPSetInitialGuessNonzero(smoothers[i],PETSC_TRUE); // for full and wCicle PCMGGetSmootherUp(M_pc,nLevels-i,&(smoothers_up[i])); } PCMGSetNumberSmoothDown(M_pc,1); PCMGSetNumberSmoothUp(M_pc,1); ? 
set coarse solver options here for (PetscInt i = 0; i < nLevels; i++) { PC pc; KSPSetType(smoothers[i], KSPGMRES); KSPGetPC(smoothers[i], &pc); KSPSetPCSide(smoothers[i], PC_RIGHT); PCSetType(pc, PCASM); PCFactorSetPivotInBlocks(pc, PETSC_TRUE); PCFactorSetAllowDiagonalFill(pc); PCFactorSetReuseFill(pc, PETSC_TRUE); PCFactorSetReuseOrdering(pc, PETSC_TRUE); KSPSetType(smoothers_up[i], KSPGMRES); KSPSetPC(smoothers_up[i], pc); KSPSetPCSide(smoothers_up[i], PC_RIGHT); KSPSetConvergenceTest(smoothers[i],KSPConvergedSkip,NULL,NULL); KSPSetConvergenceTest(smoothers_up[i],KSPConvergedSkip,NULL,NULL); KSPSetNormType(smoothers[i],KSP_NORM_NONE); KSPSetNormType(smoothers_up[i],KSP_NORM_NONE); } Is this correct? Note moreover that for Full Multigrid and W cicles to work as expected I need to add the KSPSetInitialGuessNonZero option. Bests Lorenzo Message: 4 Date: Thu, 20 Aug 2015 02:37:13 -0500 From: Barry Smith > To: "Aulisa, Eugenio" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother Message-ID: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC at mcs.anl.gov> Content-Type: text/plain; charset="us-ascii" What you describe is not the expected behavior. I expected exactly the result that you expected. Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? Barry On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio > wrote: Hi, I am solving an iteration of GMRES -> PCMG -> PCASM where I build my particular ASM domain decomposition. In setting the PCMG I would like at each level to use the same pre- and post-smoother and for this reason I am using ... PCMGGetSmoother ( pcMG, level , &subksp ); to extract and set at each level the ksp object. In setting PCASM then I use ... KSPGetPC ( subksp, &subpc ); PCSetType ( subpc, PCASM ); ... and then set my own decomposition ... PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); ... Now everything compiles, and runs with no memory leakage, but I do not get the expected convergence. When I checked the output of -ksp_view, I saw something that puzzled me: at each level >0, while in the MG pre-smoother the ASM domain decomposition is the one that I set, for example with 4 processes I get ... Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=1 using preconditioner applied to right hand side for initial guess tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (level-2) 4 MPI processes type: asm Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 Additive Schwarz: restriction/interpolation type - RESTRICT [0] number of local blocks = 52 [1] number of local blocks = 48 [2] number of local blocks = 48 [3] number of local blocks = 50 Local solve info for each block is in the following KSP and PC objects: - - - - - - - - - - - - - - - - - - ... in the post-smoother I have the default ASM decomposition with overlapping 1: ... 
Up solver (post-smoother) on level 2 ------------------------------- KSP Object: (level-2) 4 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=2 tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (level-2) 4 MPI processes type: asm Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 Additive Schwarz: restriction/interpolation type - RESTRICT Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (level-2sub_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning ... %%%%%%%%%%%%%%%%%%%%%%%% So it seams that by using PCMGGetSmoother ( pcMG, level , &subksp ); I was capable to set both the pre- and post- smoothers to be PCASM but everything I did after that applied only to the pre-smoother, while the post-smoother got the default PCASM options. I know that I can use PCMGGetSmootherDown and PCMGGetSmootherUp, but that would probably double the memory allocation and the computational time in the ASM. Is there any way I can just use PCMGGetSmoother and use the same PCASM in the pre- and post- smoother? I hope I was clear enough. Thanks a lot for your help, Eugenio -------------- next part -------------- An HTML attachment was scrubbed... URL: From lorenzoalessiobotti at gmail.com Thu Aug 20 07:57:38 2015 From: lorenzoalessiobotti at gmail.com (Lorenzo Alessio Botti) Date: Thu, 20 Aug 2015 14:57:38 +0200 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: References: <73CC58DF-71E5-4A32-B714-45BAA8247D0A@gmail.com> Message-ID: <170951A3-97AA-4639-9E80-3E235C4A5244@gmail.com> I guess that only one memory allocation is done, but basically this is what I?m asking as well. Is this correct? I?d be more confident if one of the developers confirmed this. Lorenzo > On 20 Aug 2015, at 14:00, Aulisa, Eugenio wrote: > > Thanks Lorenzo > > If I well understood what you say is > > Set up PCASM for the smoother_down[i] and then > feed it up to smoother_up[i] > > Does this assure that only one > memory allocation is done for PCASM? > > > Eugenio > From: Lorenzo Alessio Botti [lorenzoalessiobotti at gmail.com] > Sent: Thursday, August 20, 2015 5:29 AM > To: petsc-users at mcs.anl.gov > Cc: Aulisa, Eugenio > Subject: GMRES -> PCMG -> PCASM pre- post- smoother > > I tried to achieve this behaviour getting all the smothers and setting the same preconditioner to the down and up smoother on the same level. > > > smoothers.resize(nLevels+1); > smoothers_up.resize(nLevels); > for (PetscInt i = 0; i < nLevels; i++) > { > PCMGGetSmootherDown(M_pc,nLevels-i,&(smoothers[i])); > KSPSetInitialGuessNonzero(smoothers[i],PETSC_TRUE); // for full and wCicle > PCMGGetSmootherUp(M_pc,nLevels-i,&(smoothers_up[i])); > } > PCMGSetNumberSmoothDown(M_pc,1); > PCMGSetNumberSmoothUp(M_pc,1); > > ? 
set coarse solver options here > > for (PetscInt i = 0; i < nLevels; i++) > { > PC pc; > KSPSetType(smoothers[i], KSPGMRES); > KSPGetPC(smoothers[i], &pc); > KSPSetPCSide(smoothers[i], PC_RIGHT); > PCSetType(pc, PCASM); > PCFactorSetPivotInBlocks(pc, PETSC_TRUE); > PCFactorSetAllowDiagonalFill(pc); > PCFactorSetReuseFill(pc, PETSC_TRUE); > PCFactorSetReuseOrdering(pc, PETSC_TRUE); > KSPSetType(smoothers_up[i], KSPGMRES); > KSPSetPC(smoothers_up[i], pc); > KSPSetPCSide(smoothers_up[i], PC_RIGHT); > KSPSetConvergenceTest(smoothers[i],KSPConvergedSkip,NULL,NULL); > KSPSetConvergenceTest(smoothers_up[i],KSPConvergedSkip,NULL,NULL); > KSPSetNormType(smoothers[i],KSP_NORM_NONE); > KSPSetNormType(smoothers_up[i],KSP_NORM_NONE); > } > > Is this correct? > Note moreover that for Full Multigrid and W cicles to work as expected I need to add the KSPSetInitialGuessNonZero option. > > Bests > Lorenzo > >> Message: 4 >> Date: Thu, 20 Aug 2015 02:37:13 -0500 >> From: Barry Smith > >> To: "Aulisa, Eugenio" > >> Cc: "petsc-users at mcs.anl.gov " > >> Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother >> Message-ID: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC at mcs.anl.gov > >> Content-Type: text/plain; charset="us-ascii" >> >> >> What you describe is not the expected behavior. I expected exactly the result that you expected. >> >> Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? >> >> Barry >> >>> On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio > wrote: >>> >>> Hi, >>> >>> I am solving an iteration of >>> >>> GMRES -> PCMG -> PCASM >>> >>> where I build my particular ASM domain decomposition. >>> >>> In setting the PCMG I would like at each level >>> to use the same pre- and post-smoother >>> and for this reason I am using >>> ... >>> PCMGGetSmoother ( pcMG, level , &subksp ); >>> >>> to extract and set at each level the ksp object. >>> >>> In setting PCASM then I use >>> ... >>> KSPGetPC ( subksp, &subpc ); >>> PCSetType ( subpc, PCASM ); >>> ... >>> and then set my own decomposition >>> ... >>> PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); >>> ... >>> >>> Now everything compiles, and runs with no memory leakage, >>> but I do not get the expected convergence. >>> >>> When I checked the output of -ksp_view, I saw something that puzzled me: >>> at each level >0, while in the MG pre-smoother the ASM domain decomposition >>> is the one that I set, for example with 4 processes I get >>> >>>>>>>>>>>>>>>>>>>>>> >>> ... 
>>> Down solver (pre-smoother) on level 2 ------------------------------- >>> KSP Object: (level-2) 4 MPI processes >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=1 >>> using preconditioner applied to right hand side for initial guess >>> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >>> left preconditioning >>> using nonzero initial guess >>> using NONE norm type for convergence test >>> PC Object: (level-2) 4 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 >>> Additive Schwarz: restriction/interpolation type - RESTRICT >>> [0] number of local blocks = 52 >>> [1] number of local blocks = 48 >>> [2] number of local blocks = 48 >>> [3] number of local blocks = 50 >>> Local solve info for each block is in the following KSP and PC objects: >>> - - - - - - - - - - - - - - - - - - >>> ... >>>>>>>>>>>>>> >>> >>> >>> in the post-smoother I have the default ASM decomposition with overlapping 1: >>> >>> >>>>>>>>>>>>>> >>> ... >>> Up solver (post-smoother) on level 2 ------------------------------- >>> KSP Object: (level-2) 4 MPI processes >>> type: gmres >>> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >>> GMRES: happy breakdown tolerance 1e-30 >>> maximum iterations=2 >>> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >>> left preconditioning >>> using nonzero initial guess >>> using NONE norm type for convergence test >>> PC Object: (level-2) 4 MPI processes >>> type: asm >>> Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 >>> Additive Schwarz: restriction/interpolation type - RESTRICT >>> Local solve is same for all blocks, in the following KSP and PC objects: >>> KSP Object: (level-2sub_) 1 MPI processes >>> type: preonly >>> maximum iterations=10000, initial guess is zero >>> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >>> left preconditioning >>> ... >>>>>>>>>>>>>>>> >>> %%%%%%%%%%%%%%%%%%%%%%%% >>> >>> So it seams that by using >>> >>> PCMGGetSmoother ( pcMG, level , &subksp ); >>> >>> I was capable to set both the pre- and post- smoothers to be PCASM >>> but everything I did after that applied only to the >>> pre-smoother, while the post-smoother got the default PCASM options. >>> >>> I know that I can use >>> PCMGGetSmootherDown and PCMGGetSmootherUp, but that would >>> probably double the memory allocation and the computational time in the ASM. >>> >>> Is there any way I can just use PCMGGetSmoother >>> and use the same PCASM in the pre- and post- smoother? >>> >>> I hope I was clear enough. >>> >>> Thanks a lot for your help, >>> Eugenio -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 20 10:17:26 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 20 Aug 2015 10:17:26 -0500 Subject: [petsc-users] Scalability issue In-Reply-To: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> Message-ID: On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva < nelsonflsilva at ist.utl.pt> wrote: > Hello. > > I am sorry for the long time without response. I decided to rewrite my > application in a different way and will send the log_summary output when > done reimplementing. 
> > As for the machine, I am using mpirun to run jobs in a 8 node cluster. I > modified the makefile on the steams folder so it would run using my > hostfile. > The output is attached to this email. It seems reasonable for a cluster > with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 socket. > 1) You launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1. Thanks, Matt > Cheers, > Nelson > > > Em 2015-07-24 16:50, Barry Smith escreveu: > >> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >> ... processes with the option -log_summary and send (as attachments) >> the log summary information. >> >> Also on the same machine run the streams benchmark; with recent >> releases of PETSc you only need to do >> >> cd $PETSC_DIR >> make streams NPMAX=16 (or whatever your largest process count is) >> >> and send the output. >> >> I suspect that you are doing everything fine and it is more an issue >> with the configuration of your machine. Also read the information at >> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >> "binding" >> >> Barry >> >> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva < >>> nelsonflsilva at ist.utl.pt> wrote: >>> >>> Hello, >>> >>> I have been using PETSc for a few months now, and it truly is fantastic >>> piece of software. >>> >>> In my particular example I am working with a large, sparse distributed >>> (MPI AIJ) matrix we can refer as 'G'. >>> G is a horizontal - retangular matrix (for example, 1,1 Million rows per >>> 2,1 Million columns). This matrix is commonly very sparse and not diagonal >>> 'heavy' (for example 5,2 Million nnz in which ~50% are on the diagonal >>> block of MPI AIJ representation). >>> To work with this matrix, I also have a few parallel vectors (created >>> using MatCreate Vec), we can refer as 'm' and 'k'. >>> I am trying to parallelize an iterative algorithm in which the most >>> computational heavy operations are: >>> >>> ->Matrix-Vector Multiplication, more precisely G * m + k = b >>> (MatMultAdd). From what I have been reading, to achive a good speedup in >>> this operation, G should be as much diagonal as possible, due to >>> overlapping communication and computation. But even when using a G matrix >>> in which the diagonal block has ~95% of the nnz, I cannot get a decent >>> speedup. Most of the times, the performance even gets worse. >>> >>> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' = >>> A, where A is later used on the linear solver and G' is transpose of G. The >>> speedup in this operation is not worse, although is not very good. >>> >>> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" >>> from the last two operations. I tried to apply a RCM permutation to A to >>> make it more diagonal, for better performance. However, the problem I faced >>> was that, the permutation is performed locally in each processor and thus, >>> the final result is different with different number of processors. I assume >>> this was intended to reduce communication. The solution I found was >>> 1-calculate A >>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>> 3-apply this permutation to the lines of G. >>> This works well, and A is generated as if RCM permuted. It is fine to do >>> this operation in one machine because it is only done once while reading >>> the input. 
The nnz of G become more spread and less diagonal, causing >>> problems when calculating G * m + k = b. >>> >>> These 3 operations (except the permutation) are performed in each >>> iteration of my algorithm. >>> >>> So, my questions are. >>> -What are the characteristics of G that lead to a good speedup in the >>> operations I described? Am I missing something and too much obsessed with >>> the diagonal block? >>> >>> -Is there a better way to permute A without permute G and still get the >>> same result using 1 or N machines? >>> >>> >>> I have been avoiding asking for help for a while. I'm very sorry for the >>> long email. >>> Thank you very much for your time. >>> Best Regards, >>> Nelson >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 20 23:54:58 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 20 Aug 2015 23:54:58 -0500 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: References: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> Message-ID: <625658D1-0B72-493C-955B-CA6582F63967@mcs.anl.gov> Ahhh, void AsmPetscLinearEquationSolver::MGsolve ( const bool ksp_clean , const unsigned &npre, const unsigned &npost ) { if ( ksp_clean ) { PetscMatrix* KKp = static_cast< PetscMatrix* > ( _KK ); Mat KK = KKp->mat(); KSPSetOperators ( _ksp, KK, _Pmat ); KSPSetTolerances ( _ksp, _rtol, _abstol, _dtol, _maxits ); KSPSetFromOptions ( _ksp ); PC pcMG; KSPGetPC(_ksp, &pcMG); PCMGSetNumberSmoothDown(pcMG, npre); PCMGSetNumberSmoothUp(pcMG, npost); } PetscErrorCode PCMGSetNumberSmoothDown(PC pc,PetscInt n) { PC_MG *mg = (PC_MG*)pc->data; PC_MG_Levels **mglevels = mg->levels; PetscErrorCode ierr; PetscInt i,levels; PetscFunctionBegin; PetscValidHeaderSpecific(pc,PC_CLASSID,1); if (!mglevels) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_WRONGSTATE,"Must set MG levels before calling"); PetscValidLogicalCollectiveInt(pc,n,2); levels = mglevels[0]->levels; for (i=1; ismoothd,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT,n);CHKERRQ(ierr); mg->default_smoothd = n; } PetscFunctionReturn(0); } PetscErrorCode PCMGGetSmootherUp(PC pc,PetscInt l,KSP *ksp) { PC_MG *mg = (PC_MG*)pc->data; PC_MG_Levels **mglevels = mg->levels; PetscErrorCode ierr; const char *prefix; MPI_Comm comm; PetscFunctionBegin; PetscValidHeaderSpecific(pc,PC_CLASSID,1); /* This is called only if user wants a different pre-smoother from post. Thus we check if a different one has already been allocated, if not we allocate it. 
*/ if (!l) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_OUTOFRANGE,"There is no such thing as a up smoother on the coarse grid"); if (mglevels[l]->smoothu == mglevels[l]->smoothd) { KSPType ksptype; PCType pctype; PC ipc; PetscReal rtol,abstol,dtol; PetscInt maxits; KSPNormType normtype; ierr = PetscObjectGetComm((PetscObject)mglevels[l]->smoothd,&comm);CHKERRQ(ierr); ierr = KSPGetOptionsPrefix(mglevels[l]->smoothd,&prefix);CHKERRQ(ierr); ierr = KSPGetTolerances(mglevels[l]->smoothd,&rtol,&abstol,&dtol,&maxits);CHKERRQ(ierr); ierr = KSPGetType(mglevels[l]->smoothd,&ksptype);CHKERRQ(ierr); ierr = KSPGetNormType(mglevels[l]->smoothd,&normtype);CHKERRQ(ierr); ierr = KSPGetPC(mglevels[l]->smoothd,&ipc);CHKERRQ(ierr); ierr = PCGetType(ipc,&pctype);CHKERRQ(ierr); ierr = KSPCreate(comm,&mglevels[l]->smoothu);CHKERRQ(ierr); ierr = KSPSetErrorIfNotConverged(mglevels[l]->smoothu,pc->erroriffailure);CHKERRQ(ierr); ierr = PetscObjectIncrementTabLevel((PetscObject)mglevels[l]->smoothu,(PetscObject)pc,mglevels[0]->levels-l);CHKERRQ(ierr); ierr = KSPSetOptionsPrefix(mglevels[l]->smoothu,prefix);CHKERRQ(ierr); ierr = KSPSetTolerances(mglevels[l]->smoothu,rtol,abstol,dtol,maxits);CHKERRQ(ierr); ierr = KSPSetType(mglevels[l]->smoothu,ksptype);CHKERRQ(ierr); ierr = KSPSetNormType(mglevels[l]->smoothu,normtype);CHKERRQ(ierr); ierr = KSPSetConvergenceTest(mglevels[l]->smoothu,KSPConvergedSkip,NULL,NULL);CHKERRQ(ierr); ierr = KSPGetPC(mglevels[l]->smoothu,&ipc);CHKERRQ(ierr); ierr = PCSetType(ipc,pctype);CHKERRQ(ierr); ierr = PetscLogObjectParent((PetscObject)pc,(PetscObject)mglevels[l]->smoothu);CHKERRQ(ierr); } if (ksp) *ksp = mglevels[l]->smoothu; PetscFunctionReturn(0); } As soon as you set both the up and down number of iterations it causes a duplication of the current smoother with some options preserved but others not (we don't have a KSPDuplicate() that duplicates everything). So if you are fine with the number of pre and post smooths the same just don't set both PCMGSetNumberSmoothDown(pcMG, npre); PCMGSetNumberSmoothUp(pcMG, npost); if you want them to be different you can share the same PC between the two (which has the overlapping matrices in it) but you cannot share the same KSP. I can tell you how to do that but suggest it is simpler just to have the same number of pre and post smooths Barry > On Aug 20, 2015, at 6:51 AM, Aulisa, Eugenio wrote: > > Hi Barry, > > Thanks for your answer. > > I run my applications with no command line, and I do not think I changed any PETSC_OPTIONS, > at least not voluntarily. > > For the source it is available on > https://github.com/NumPDEClassTTU/femus > but it is part of a much larger library and > I do not think any of you want to install and run it > just to find what I messed up. > > In any case, if you just want to look at the source code > where I set up the level smoother it is in > > https://github.com/NumPDEClassTTU/femus/blob/master/src/algebra/AsmPetscLinearEquationSolver.cpp > > line 400 > > void AsmPetscLinearEquationSolver::MGsetLevels ( > LinearEquationSolver *LinSolver, const unsigned &level, const unsigned &levelMax, > const vector &variable_to_be_solved, SparseMatrix* PP, SparseMatrix* RR ){ > > Be aware, that even if it seams that this takes care of the coarse level it is not. > The coarse level smoother is set some where else. 
> > Thanks, > Eugenio > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Thursday, August 20, 2015 2:37 AM > To: Aulisa, Eugenio > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother > > What you describe is not the expected behavior. I expected exactly the result that you expected. > > Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? > > Barry > >> On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio wrote: >> >> Hi, >> >> I am solving an iteration of >> >> GMRES -> PCMG -> PCASM >> >> where I build my particular ASM domain decomposition. >> >> In setting the PCMG I would like at each level >> to use the same pre- and post-smoother >> and for this reason I am using >> ... >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> to extract and set at each level the ksp object. >> >> In setting PCASM then I use >> ... >> KSPGetPC ( subksp, &subpc ); >> PCSetType ( subpc, PCASM ); >> ... >> and then set my own decomposition >> ... >> PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); >> ... >> >> Now everything compiles, and runs with no memory leakage, >> but I do not get the expected convergence. >> >> When I checked the output of -ksp_view, I saw something that puzzled me: >> at each level >0, while in the MG pre-smoother the ASM domain decomposition >> is the one that I set, for example with 4 processes I get >> >>>>>>>>>>>>>>>>>>>>> >> ... >> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1 >> using preconditioner applied to right hand side for initial guess >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 52 >> [1] number of local blocks = 48 >> [2] number of local blocks = 48 >> [3] number of local blocks = 50 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> ... >>>>>>>>>>>>> >> >> >> in the post-smoother I have the default ASM decomposition with overlapping 1: >> >> >>>>>>>>>>>>> >> ... 
>> Up solver (post-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=2 >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object: (level-2sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> ... >>>>>>>>>>>>>>> >> %%%%%%%%%%%%%%%%%%%%%%%% >> >> So it seams that by using >> >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> I was capable to set both the pre- and post- smoothers to be PCASM >> but everything I did after that applied only to the >> pre-smoother, while the post-smoother got the default PCASM options. >> >> I know that I can use >> PCMGGetSmootherDown and PCMGGetSmootherUp, but that would >> probably double the memory allocation and the computational time in the ASM. >> >> Is there any way I can just use PCMGGetSmoother >> and use the same PCASM in the pre- and post- smoother? >> >> I hope I was clear enough. >> >> Thanks a lot for your help, >> Eugenio >> >> >> > From eugenio.aulisa at ttu.edu Fri Aug 21 19:38:27 2015 From: eugenio.aulisa at ttu.edu (Aulisa, Eugenio) Date: Sat, 22 Aug 2015 00:38:27 +0000 Subject: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother In-Reply-To: <625658D1-0B72-493C-955B-CA6582F63967@mcs.anl.gov> References: <1CF3ABE1-214C-4BBC-B8FF-93416EC26EFC@mcs.anl.gov> , <625658D1-0B72-493C-955B-CA6582F63967@mcs.anl.gov> Message-ID: Thanks Barry. Yes that was my problem now if I run with the same down and up number of iterations I see the in -ksp_view output that smoother up is the same as smoother down. I think I figure it out how to set up different smoothers up and down but use the same ASM Preconditioner, which is more or less what Lorenzo suggested. 
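A rough C sketch of that arrangement, for one level: distinct down/up KSPs sharing a single PCASM, so the overlapping subdomain matrices are only built once. Here pcMG, level, n_sub, is_ovl, is_loc, npre and npost stand in for the application's own objects, and the call sequence is an illustration of Barry's and Lorenzo's suggestions rather than code from either of them.

    KSP            down, up;
    PC             asmpc;
    PetscErrorCode ierr;

    ierr = PCMGGetSmootherDown(pcMG, level, &down);CHKERRQ(ierr);
    ierr = PCMGGetSmootherUp(pcMG, level, &up);CHKERRQ(ierr);    /* allocates a separate up KSP */
    ierr = KSPGetPC(down, &asmpc);CHKERRQ(ierr);
    ierr = PCSetType(asmpc, PCASM);CHKERRQ(ierr);
    ierr = PCASMSetLocalSubdomains(asmpc, n_sub, is_ovl, is_loc);CHKERRQ(ierr);
    ierr = KSPSetPC(up, asmpc);CHKERRQ(ierr);                    /* up smoother reuses the same PC */

    /* different pre/post sweep counts, following the pattern PCMGSetNumberSmooth* uses */
    ierr = KSPSetTolerances(down, PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, npre);CHKERRQ(ierr);
    ierr = KSPSetTolerances(up,   PETSC_DEFAULT, PETSC_DEFAULT, PETSC_DEFAULT, npost);CHKERRQ(ierr);

Running with -ksp_view is the quickest way to confirm that both smoothers end up reporting the intended ASM decomposition.
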
Thanks again Eugenio ________________________________________ From: Barry Smith [bsmith at mcs.anl.gov] Sent: Thursday, August 20, 2015 11:54 PM To: Aulisa, Eugenio Cc: PETSc list Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother Ahhh, void AsmPetscLinearEquationSolver::MGsolve ( const bool ksp_clean , const unsigned &npre, const unsigned &npost ) { if ( ksp_clean ) { PetscMatrix* KKp = static_cast< PetscMatrix* > ( _KK ); Mat KK = KKp->mat(); KSPSetOperators ( _ksp, KK, _Pmat ); KSPSetTolerances ( _ksp, _rtol, _abstol, _dtol, _maxits ); KSPSetFromOptions ( _ksp ); PC pcMG; KSPGetPC(_ksp, &pcMG); PCMGSetNumberSmoothDown(pcMG, npre); PCMGSetNumberSmoothUp(pcMG, npost); } PetscErrorCode PCMGSetNumberSmoothDown(PC pc,PetscInt n) { PC_MG *mg = (PC_MG*)pc->data; PC_MG_Levels **mglevels = mg->levels; PetscErrorCode ierr; PetscInt i,levels; PetscFunctionBegin; PetscValidHeaderSpecific(pc,PC_CLASSID,1); if (!mglevels) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_WRONGSTATE,"Must set MG levels before calling"); PetscValidLogicalCollectiveInt(pc,n,2); levels = mglevels[0]->levels; for (i=1; ismoothd,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT,n);CHKERRQ(ierr); mg->default_smoothd = n; } PetscFunctionReturn(0); } PetscErrorCode PCMGGetSmootherUp(PC pc,PetscInt l,KSP *ksp) { PC_MG *mg = (PC_MG*)pc->data; PC_MG_Levels **mglevels = mg->levels; PetscErrorCode ierr; const char *prefix; MPI_Comm comm; PetscFunctionBegin; PetscValidHeaderSpecific(pc,PC_CLASSID,1); /* This is called only if user wants a different pre-smoother from post. Thus we check if a different one has already been allocated, if not we allocate it. */ if (!l) SETERRQ(PetscObjectComm((PetscObject)pc),PETSC_ERR_ARG_OUTOFRANGE,"There is no such thing as a up smoother on the coarse grid"); if (mglevels[l]->smoothu == mglevels[l]->smoothd) { KSPType ksptype; PCType pctype; PC ipc; PetscReal rtol,abstol,dtol; PetscInt maxits; KSPNormType normtype; ierr = PetscObjectGetComm((PetscObject)mglevels[l]->smoothd,&comm);CHKERRQ(ierr); ierr = KSPGetOptionsPrefix(mglevels[l]->smoothd,&prefix);CHKERRQ(ierr); ierr = KSPGetTolerances(mglevels[l]->smoothd,&rtol,&abstol,&dtol,&maxits);CHKERRQ(ierr); ierr = KSPGetType(mglevels[l]->smoothd,&ksptype);CHKERRQ(ierr); ierr = KSPGetNormType(mglevels[l]->smoothd,&normtype);CHKERRQ(ierr); ierr = KSPGetPC(mglevels[l]->smoothd,&ipc);CHKERRQ(ierr); ierr = PCGetType(ipc,&pctype);CHKERRQ(ierr); ierr = KSPCreate(comm,&mglevels[l]->smoothu);CHKERRQ(ierr); ierr = KSPSetErrorIfNotConverged(mglevels[l]->smoothu,pc->erroriffailure);CHKERRQ(ierr); ierr = PetscObjectIncrementTabLevel((PetscObject)mglevels[l]->smoothu,(PetscObject)pc,mglevels[0]->levels-l);CHKERRQ(ierr); ierr = KSPSetOptionsPrefix(mglevels[l]->smoothu,prefix);CHKERRQ(ierr); ierr = KSPSetTolerances(mglevels[l]->smoothu,rtol,abstol,dtol,maxits);CHKERRQ(ierr); ierr = KSPSetType(mglevels[l]->smoothu,ksptype);CHKERRQ(ierr); ierr = KSPSetNormType(mglevels[l]->smoothu,normtype);CHKERRQ(ierr); ierr = KSPSetConvergenceTest(mglevels[l]->smoothu,KSPConvergedSkip,NULL,NULL);CHKERRQ(ierr); ierr = KSPGetPC(mglevels[l]->smoothu,&ipc);CHKERRQ(ierr); ierr = PCSetType(ipc,pctype);CHKERRQ(ierr); ierr = PetscLogObjectParent((PetscObject)pc,(PetscObject)mglevels[l]->smoothu);CHKERRQ(ierr); } if (ksp) *ksp = mglevels[l]->smoothu; PetscFunctionReturn(0); } As soon as you set both the up and down number of iterations it causes a duplication of the current smoother with some options preserved but others not (we don't have a KSPDuplicate() that duplicates 
everything). So if you are fine with the number of pre and post smooths the same just don't set both PCMGSetNumberSmoothDown(pcMG, npre); PCMGSetNumberSmoothUp(pcMG, npost); if you want them to be different you can share the same PC between the two (which has the overlapping matrices in it) but you cannot share the same KSP. I can tell you how to do that but suggest it is simpler just to have the same number of pre and post smooths Barry > On Aug 20, 2015, at 6:51 AM, Aulisa, Eugenio wrote: > > Hi Barry, > > Thanks for your answer. > > I run my applications with no command line, and I do not think I changed any PETSC_OPTIONS, > at least not voluntarily. > > For the source it is available on > https://github.com/NumPDEClassTTU/femus > but it is part of a much larger library and > I do not think any of you want to install and run it > just to find what I messed up. > > In any case, if you just want to look at the source code > where I set up the level smoother it is in > > https://github.com/NumPDEClassTTU/femus/blob/master/src/algebra/AsmPetscLinearEquationSolver.cpp > > line 400 > > void AsmPetscLinearEquationSolver::MGsetLevels ( > LinearEquationSolver *LinSolver, const unsigned &level, const unsigned &levelMax, > const vector &variable_to_be_solved, SparseMatrix* PP, SparseMatrix* RR ){ > > Be aware, that even if it seams that this takes care of the coarse level it is not. > The coarse level smoother is set some where else. > > Thanks, > Eugenio > > ________________________________________ > From: Barry Smith [bsmith at mcs.anl.gov] > Sent: Thursday, August 20, 2015 2:37 AM > To: Aulisa, Eugenio > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] GMRES -> PCMG -> PCASM pre- post- smoother > > What you describe is not the expected behavior. I expected exactly the result that you expected. > > Do you perhaps have some PETSc options around that may be changing the post-smoother? On the command line or in the file petscrc or in the environmental variable PETSC_OPTIONS? Can you send us some code that we could run that reproduces the problem? > > Barry > >> On Aug 19, 2015, at 9:26 PM, Aulisa, Eugenio wrote: >> >> Hi, >> >> I am solving an iteration of >> >> GMRES -> PCMG -> PCASM >> >> where I build my particular ASM domain decomposition. >> >> In setting the PCMG I would like at each level >> to use the same pre- and post-smoother >> and for this reason I am using >> ... >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> to extract and set at each level the ksp object. >> >> In setting PCASM then I use >> ... >> KSPGetPC ( subksp, &subpc ); >> PCSetType ( subpc, PCASM ); >> ... >> and then set my own decomposition >> ... >> PCASMSetLocalSubdomains(subpc,_is_loc_idx.size(),&_is_ovl[0],&_is_loc[0]); >> ... >> >> Now everything compiles, and runs with no memory leakage, >> but I do not get the expected convergence. >> >> When I checked the output of -ksp_view, I saw something that puzzled me: >> at each level >0, while in the MG pre-smoother the ASM domain decomposition >> is the one that I set, for example with 4 processes I get >> >>>>>>>>>>>>>>>>>>>>> >> ... 
>> Down solver (pre-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=1 >> using preconditioner applied to right hand side for initial guess >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 198, amount of overlap = 0 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> [0] number of local blocks = 52 >> [1] number of local blocks = 48 >> [2] number of local blocks = 48 >> [3] number of local blocks = 50 >> Local solve info for each block is in the following KSP and PC objects: >> - - - - - - - - - - - - - - - - - - >> ... >>>>>>>>>>>>> >> >> >> in the post-smoother I have the default ASM decomposition with overlapping 1: >> >> >>>>>>>>>>>>> >> ... >> Up solver (post-smoother) on level 2 ------------------------------- >> KSP Object: (level-2) 4 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=2 >> tolerances: relative=1e-12, absolute=1e-20, divergence=1e+50 >> left preconditioning >> using nonzero initial guess >> using NONE norm type for convergence test >> PC Object: (level-2) 4 MPI processes >> type: asm >> Additive Schwarz: total subdomain blocks = 4, amount of overlap = 1 >> Additive Schwarz: restriction/interpolation type - RESTRICT >> Local solve is same for all blocks, in the following KSP and PC objects: >> KSP Object: (level-2sub_) 1 MPI processes >> type: preonly >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> left preconditioning >> ... >>>>>>>>>>>>>>> >> %%%%%%%%%%%%%%%%%%%%%%%% >> >> So it seams that by using >> >> PCMGGetSmoother ( pcMG, level , &subksp ); >> >> I was capable to set both the pre- and post- smoothers to be PCASM >> but everything I did after that applied only to the >> pre-smoother, while the post-smoother got the default PCASM options. >> >> I know that I can use >> PCMGGetSmootherDown and PCMGGetSmootherUp, but that would >> probably double the memory allocation and the computational time in the ASM. >> >> Is there any way I can just use PCMGGetSmoother >> and use the same PCASM in the pre- and post- smoother? >> >> I hope I was clear enough. >> >> Thanks a lot for your help, >> Eugenio >> >> >> > From david.knezevic at akselos.com Sat Aug 22 06:59:33 2015 From: david.knezevic at akselos.com (David Knezevic) Date: Sat, 22 Aug 2015 07:59:33 -0400 Subject: [petsc-users] Variatonal inequalities Message-ID: Hi all, I see from Section 5.7 of the manual that SNES supports box constraints on variables, which is great. However, I was also hoping to also be able to consider general linear inequality constraints, so I was wondering if anyone has any suggestions on how (or if) that could be done with PETSc? Thanks, David -------------- next part -------------- An HTML attachment was scrubbed... 
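For the box-constraint part of the question, the SNES VI solvers are the relevant machinery; a minimal C sketch is below, where snes and x are the already-configured solver and solution vector, and the bound values and vector names xl, xu are placeholders. The general linear inequality constraints are exactly the open part of the question and are not covered by this.

    Vec            xl, xu;
    PetscErrorCode ierr;

    ierr = VecDuplicate(x, &xl);CHKERRQ(ierr);
    ierr = VecDuplicate(x, &xu);CHKERRQ(ierr);
    ierr = VecSet(xl, 0.0);CHKERRQ(ierr);                       /* example lower bound     */
    ierr = VecSet(xu, PETSC_INFINITY);CHKERRQ(ierr);            /* no upper bound          */
    ierr = SNESSetType(snes, SNESVINEWTONRSLS);CHKERRQ(ierr);   /* reduced-space VI Newton */
    ierr = SNESVISetVariableBounds(snes, xl, xu);CHKERRQ(ierr);
    ierr = SNESSolve(snes, NULL, x);CHKERRQ(ierr);
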
URL: From simpson at math.drexel.edu Sat Aug 22 16:02:32 2015 From: simpson at math.drexel.edu (Gideon Simpson) Date: Sat, 22 Aug 2015 17:02:32 -0400 Subject: [petsc-users] two issues with sparse direct solvers Message-ID: <94C1CB4B-C560-4E12-A3B8-2341061548EC@math.drexel.edu> I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. 2. When running with SuperLU dist, I got the following error, with no further information: MPI_ABORT was invoked on rank 36 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Sat Aug 22 16:04:18 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Sat, 22 Aug 2015 17:04:18 -0400 Subject: [petsc-users] issues with sparse direct solvers Message-ID: I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. 2. When running with SuperLU dist, I got the following error, with no further information: [3]PETSC ERROR: ------------------------------------------------------------------------ [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [3]PETSC ERROR: likely location of problem given in stack below [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [3]PETSC ERROR: INSTEAD the line number of the start of the function [3]PETSC ERROR: is given. [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [3]PETSC ERROR: Signal received [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [3]PETSC ERROR: #1 User provided function() line 0 in unknown file -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD with errorcode 59. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages [6]PETSC ERROR: ------------------------------------------------------------------------ [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [6]PETSC ERROR: likely location of problem given in stack below [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [6]PETSC ERROR: INSTEAD the line number of the start of the function [6]PETSC ERROR: is given. [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [6]PETSC ERROR: Signal received [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [6]PETSC ERROR: #1 User provided function() line 0 in unknown file [7]PETSC ERROR: ------------------------------------------------------------------------ [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [7]PETSC ERROR: likely location of problem given in stack below [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [7]PETSC ERROR: INSTEAD the line number of the start of the function [7]PETSC ERROR: is given. [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [7]PETSC ERROR: Signal received [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [7]PETSC ERROR: #1 User provided function() line 0 in unknown file [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [0]PETSC ERROR: #1 User provided function() line 0 in unknown file [1]PETSC ERROR: ------------------------------------------------------------------------ [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [1]PETSC ERROR: likely location of problem given in stack below [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [1]PETSC ERROR: INSTEAD the line number of the start of the function [1]PETSC ERROR: is given. [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Signal received [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [1]PETSC ERROR: #1 User provided function() line 0 in unknown file [2]PETSC ERROR: ------------------------------------------------------------------------ [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [2]PETSC ERROR: likely location of problem given in stack below [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [2]PETSC ERROR: INSTEAD the line number of the start of the function [2]PETSC ERROR: is given. [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [2]PETSC ERROR: Signal received [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [2]PETSC ERROR: #1 User provided function() line 0 in unknown file [4]PETSC ERROR: ------------------------------------------------------------------------ [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [4]PETSC ERROR: likely location of problem given in stack below [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [4]PETSC ERROR: INSTEAD the line number of the start of the function [4]PETSC ERROR: is given. [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [4]PETSC ERROR: Signal received [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [4]PETSC ERROR: #1 User provided function() line 0 in unknown file [5]PETSC ERROR: ------------------------------------------------------------------------ [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [5]PETSC ERROR: likely location of problem given in stack below [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [5]PETSC ERROR: INSTEAD the line number of the start of the function [5]PETSC ERROR: is given. [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [5]PETSC ERROR: Signal received [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes [5]PETSC ERROR: #1 User provided function() line 0 in unknown file -gideon -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Aug 22 16:12:53 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Aug 2015 16:12:53 -0500 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: References: Message-ID: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> > On Aug 22, 2015, at 4:04 PM, Gideon Simpson wrote: > > I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: > > 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. 
There?s no error message, it just sits there and doesn?t do anything. You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. > > 2. When running with SuperLU dist, I got the following error, with no further information: The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes Barry > > [3]PETSC ERROR: ------------------------------------------------------------------------ > [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [3]PETSC ERROR: likely location of problem given in stack below > [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [3]PETSC ERROR: INSTEAD the line number of the start of the function > [3]PETSC ERROR: is given. > [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [3]PETSC ERROR: Signal received > [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD > with errorcode 59. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. 
> -------------------------------------------------------------------------- > [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort > [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > [6]PETSC ERROR: ------------------------------------------------------------------------ > [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [6]PETSC ERROR: likely location of problem given in stack below > [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [6]PETSC ERROR: INSTEAD the line number of the start of the function > [6]PETSC ERROR: is given. > [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [6]PETSC ERROR: Signal received > [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [6]PETSC ERROR: #1 User provided function() line 0 in unknown file > [7]PETSC ERROR: ------------------------------------------------------------------------ > [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [7]PETSC ERROR: likely location of problem given in stack below > [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [7]PETSC ERROR: INSTEAD the line number of the start of the function > [7]PETSC ERROR: is given. 
> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [7]PETSC ERROR: Signal received > [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [7]PETSC ERROR: #1 User provided function() line 0 in unknown file > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > [1]PETSC ERROR: ------------------------------------------------------------------------ > [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [1]PETSC ERROR: likely location of problem given in stack below > [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [1]PETSC ERROR: INSTEAD the line number of the start of the function > [1]PETSC ERROR: is given. > [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Signal received > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > [2]PETSC ERROR: ------------------------------------------------------------------------ > [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [2]PETSC ERROR: likely location of problem given in stack below > [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [2]PETSC ERROR: INSTEAD the line number of the start of the function > [2]PETSC ERROR: is given. > [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [2]PETSC ERROR: Signal received > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > [4]PETSC ERROR: ------------------------------------------------------------------------ > [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [4]PETSC ERROR: likely location of problem given in stack below > [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [4]PETSC ERROR: INSTEAD the line number of the start of the function > [4]PETSC ERROR: is given. > [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [4]PETSC ERROR: Signal received > [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015
> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015
> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes
> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file
> [5]PETSC ERROR: ------------------------------------------------------------------------
> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [5]PETSC ERROR: likely location of problem given in stack below
> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------
> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> [5]PETSC ERROR: INSTEAD the line number of the start of the function
> [5]PETSC ERROR: is given.
> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c
> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c
> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c
> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h
> [5]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [5]PETSC ERROR: Signal received
> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015
> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015
> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes
> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file
>
> -gideon
>

From gideon.simpson at gmail.com Sat Aug 22 16:16:01 2015
From: gideon.simpson at gmail.com (Gideon Simpson)
Date: Sat, 22 Aug 2015 17:16:01 -0400
Subject: [petsc-users] issues with sparse direct solvers
In-Reply-To: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov>
References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov>
Message-ID: <5CAA6DE5-E31F-4868-BF98-B17B1D5CED44@gmail.com>

Thanks Barry, I'll take a look at debugging. I'm also going to try petsc 3.6, since that has a newer MUMPS build.

Regarding the SuperLU bugs, are they bad enough that I should distrust output even when errors were not generated?
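
In the meantime, one sanity check I can run regardless of which direct solver is underneath is to compute the residual of the solve myself. A rough sketch of what I have in mind (the names A, b, x are just placeholders for the operator, right-hand side, and solution in my code; error handling abbreviated):

    Vec            r;
    PetscReal      rnorm, bnorm;
    PetscErrorCode ierr;

    ierr = VecDuplicate(b, &r);CHKERRQ(ierr);
    ierr = MatMult(A, x, r);CHKERRQ(ierr);        /* r = A*x     */
    ierr = VecAYPX(r, -1.0, b);CHKERRQ(ierr);     /* r = b - A*x */
    ierr = VecNorm(r, NORM_2, &rnorm);CHKERRQ(ierr);
    ierr = VecNorm(b, NORM_2, &bnorm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "||b - A*x|| / ||b|| = %g\n", (double)(rnorm/bnorm));CHKERRQ(ierr);
    ierr = VecDestroy(&r);CHKERRQ(ierr);

If that relative residual comes back near machine precision, I would be more comfortable trusting a run that finished without errors.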
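
For the MUMPS hang, my plan is to reproduce it on a small process count and look at where each rank is stuck. Something along these lines (assuming gdb is available on the compute nodes; the process count and executable name are just from my own setup):

    # launch every rank under gdb in the current terminal
    mpirun -n 4 ./blowup_batch2 -start_in_debugger noxterm

    # or let it hang as usual, then attach to one of the stuck ranks
    gdb -p <pid of a hung blowup_batch2 process>

and then use 'bt' on a few ranks to see whether they are inside the MUMPS factorization or blocked in an MPI call.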
-gideon > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: > > >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson wrote: >> >> I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: >> >> 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. > > You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. >> >> 2. When running with SuperLU dist, I got the following error, with no further information: > > The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes > > Barry > > >> >> [3]PETSC ERROR: ------------------------------------------------------------------------ >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [3]PETSC ERROR: likely location of problem given in stack below >> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [3]PETSC ERROR: INSTEAD the line number of the start of the function >> [3]PETSC ERROR: is given. >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [3]PETSC ERROR: Signal received >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >> with errorcode 59. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. 
>> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. >> -------------------------------------------------------------------------- >> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >> [6]PETSC ERROR: ------------------------------------------------------------------------ >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [6]PETSC ERROR: likely location of problem given in stack below >> [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [6]PETSC ERROR: INSTEAD the line number of the start of the function >> [6]PETSC ERROR: is given. >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [6]PETSC ERROR: Signal received >> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [7]PETSC ERROR: ------------------------------------------------------------------------ >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [7]PETSC ERROR: likely location of problem given in stack below >> [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [7]PETSC ERROR: INSTEAD the line number of the start of the function >> [7]PETSC ERROR: is given. 
>> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [7]PETSC ERROR: Signal received >> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [0]PETSC ERROR: ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Signal received >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [1]PETSC ERROR: ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [1]PETSC ERROR: likely location of problem given in stack below >> [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [1]PETSC ERROR: INSTEAD the line number of the start of the function >> [1]PETSC ERROR: is given. >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Signal received >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [2]PETSC ERROR: ------------------------------------------------------------------------ >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [2]PETSC ERROR: likely location of problem given in stack below >> [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [2]PETSC ERROR: INSTEAD the line number of the start of the function >> [2]PETSC ERROR: is given. >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [2]PETSC ERROR: Signal received >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [4]PETSC ERROR: ------------------------------------------------------------------------ >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [4]PETSC ERROR: likely location of problem given in stack below >> [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [4]PETSC ERROR: INSTEAD the line number of the start of the function >> [4]PETSC ERROR: is given. >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [4]PETSC ERROR: Signal received >> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [5]PETSC ERROR: ------------------------------------------------------------------------ >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [5]PETSC ERROR: likely location of problem given in stack below >> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [5]PETSC ERROR: INSTEAD the line number of the start of the function >> [5]PETSC ERROR: is given. >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [5]PETSC ERROR: Signal received >> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> -gideon >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Sat Aug 22 16:17:21 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Sat, 22 Aug 2015 22:17:21 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> Message-ID: <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> Hi. I managed to finish the re-implementation. I ran the program with 1,2,3,4,5,6 machines and saved the summary. I send each of them in this email. 
In these executions, the program performs Matrix-Vector (MatMult, MatMultAdd) products and Vector-Vector operations. From what I understand while reading the logs, the program spends most of its time in "VecScatterEnd". In this example, the matrix taking part in the Matrix-Vector products is not very "diagonal heavy". The following numbers are the percentages of nnz values on the matrix diagonal block for each machine, together with each execution time.

NMachines  %NNZ on diagonal block (per machine)                                                          ExecTime
1          machine0 100%                                                                                 16min08sec
2          machine0 91.1%; machine1 69.2%                                                                24min58sec
3          machine0 90.9%; machine1 82.8%; machine2 51.6%                                                25min42sec
4          machine0 91.9%; machine1 82.4%; machine2 73.1%; machine3 39.9%                                26min27sec
5          machine0 93.2%; machine1 82.8%; machine2 74.4%; machine3 64.6%; machine4 31.6%                39min23sec
6          machine0 94.2%; machine1 82.6%; machine2 73.1%; machine3 65.2%; machine4 55.9%; machine5 25.4%  54min54sec

In this implementation I'm using MatCreate and VecCreate. I'm also leaving the partition sizes as PETSC_DECIDE. Finally, to run the application, I'm using mpirun.hydra from MPICH, downloaded by the PETSc configure script. I'm checking the process assignment as suggested in the last email.
Am I missing anything?

Regards,
Nelson

On 2015-08-20 16:17, Matthew Knepley wrote:

> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva wrote:
>
>> Hello.
>>
>> I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing.
>>
>> As for the machine, I am using mpirun to run jobs in an 8-node cluster. I modified the makefile in the streams folder so it would run using my hostfile.
>> The output is attached to this email. It seems reasonable for a cluster with 8 machines. From "lscpu", each machine's CPU has 4 cores and 1 socket.
>
> 1) Your launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes.
> 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1.
>
> Thanks,
> Matt
>
>> Cheers,
>> Nelson
>>
>> On 2015-07-24 16:50, Barry Smith wrote:
>>
>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 ... processes with the option -log_summary and send (as attachments) the log summary information.
>>>
>>> Also on the same machine run the streams benchmark; with recent releases of PETSc you only need to do
>>>
>>> cd $PETSC_DIR
>>> make streams NPMAX=16 (or whatever your largest process count is)
>>>
>>> and send the output.
>>>
>>> I suspect that you are doing everything fine and it is more an issue with the configuration of your machine. Also read the information at http://www.mcs.anl.gov/petsc/documentation/faq.html#computers [2] on "binding".
>>>
>>> Barry
>>>
>>>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva wrote:
>>>>
>>>> Hello,
>>>>
>>>> I have been using PETSc for a few months now, and it truly is a fantastic piece of software.
>>>>
>>>> In my particular example I am working with a large, sparse, distributed (MPI AIJ) matrix we can refer to as 'G'.
>>>> G is a horizontal, rectangular matrix (for example, 1.1 million rows by 2.1 million columns). This matrix is commonly very sparse and not diagonal "heavy" (for example, 5.2 million nnz of which ~50% are on the diagonal block of the MPI AIJ representation).
>>>> To work with this matrix, I also have a few parallel vectors (created using MatCreate Vec), which we can refer to as 'm' and 'k'.
>>>> I am trying to parallelize an iterative algorithm in which the most computational heavy operations are: >>>> >>>> ->Matrix-Vector Multiplication, more precisely G * m + k = b (MatMultAdd). From what I have been reading, to achive a good speedup in this operation, G should be as much diagonal as possible, due to overlapping communication and computation. But even when using a G matrix in which the diagonal block has ~95% of the nnz, I cannot get a decent speedup. Most of the times, the performance even gets worse. >>>> >>>> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' = A, where A is later used on the linear solver and G' is transpose of G. The speedup in this operation is not worse, although is not very good. >>>> >>>> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" from the last two operations. I tried to apply a RCM permutation to A to make it more diagonal, for better performance. However, the problem I faced was that, the permutation is performed locally in each processor and thus, the final result is different with different number of processors. I assume this was intended to reduce communication. The solution I found was >>>> 1-calculate A >>>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>>> 3-apply this permutation to the lines of G. >>>> This works well, and A is generated as if RCM permuted. It is fine to do this operation in one machine because it is only done once while reading the input. The nnz of G become more spread and less diagonal, causing problems when calculating G * m + k = b. >>>> >>>> These 3 operations (except the permutation) are performed in each iteration of my algorithm. >>>> >>>> So, my questions are. >>>> -What are the characteristics of G that lead to a good speedup in the operations I described? Am I missing something and too much obsessed with the diagonal block? >>>> >>>> -Is there a better way to permute A without permute G and still get the same result using 1 or N machines? >>>> >>>> I have been avoiding asking for help for a while. I'm very sorry for the long email. >>>> Thank you very much for your time. >>>> Best Regards, >>>> Nelson > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener Links: ------ [1] mailto:nelsonflsilva at ist.utl.pt [2] http://www.mcs.anl.gov/petsc/documentation/faq.html#computers [3] mailto:nelsonflsilva at ist.utl.pt -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log01P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log02P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log03P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log04P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log05P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... 
Name: Log06P
URL: 
From bsmith at mcs.anl.gov  Sat Aug 22 16:22:33 2015
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Sat, 22 Aug 2015 16:22:33 -0500
Subject: [petsc-users] issues with sparse direct solvers
In-Reply-To: <5CAA6DE5-E31F-4868-BF98-B17B1D5CED44@gmail.com>
References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <5CAA6DE5-E31F-4868-BF98-B17B1D5CED44@gmail.com>
Message-ID: <15C59F64-2281-47D9-8040-193AA04E603E@mcs.anl.gov>

> On Aug 22, 2015, at 4:16 PM, Gideon Simpson wrote:
> 
> Thanks Barry, I'll take a look at debugging. I'm also going to try petsc 3.6, since that has a newer MUMPS build.
> 
> Regarding the SuperLU bugs, are they bad enough that I should distrust output even when errors were not generated?

   In my experience and what I've seen, no. It either crashes or runs correctly. But you can always check the residual after the solution.

> 
> 
> -gideon
> 
>> On Aug 22, 2015, at 5:12 PM, Barry Smith wrote:
>> 
>> 
>>> On Aug 22, 2015, at 4:04 PM, Gideon Simpson wrote:
>>> 
>>> I'm having issues with both SuperLU dist and MUMPS, as compiled by PETSc, in the following sense:
>>> 
>>> 1. For large enough systems, which seems to vary depending on which computer I'm on, MUMPS seems to just die and never start, when it's used as the linear solver within a SNES. There's no error message, it just sits there and doesn't do anything.
>> 
>>   You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this.
>>> 
>>> 2. When running with SuperLU dist, I got the following error, with no further information:
>> 
>>   The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist  If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes
>> 
>>  Barry
>> 
>> 
>>> 
>>> [3]PETSC ERROR: ------------------------------------------------------------------------
>>> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
>>> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>>> [3]PETSC ERROR: likely location of problem given in stack below
>>> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------
>>> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
>>> [3]PETSC ERROR: INSTEAD the line number of the start of the function
>>> is given.
>>> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [3]PETSC ERROR: Signal received >>> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> -------------------------------------------------------------------------- >>> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >>> with errorcode 59. >>> >>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>> You may or may not see output from other processes, depending on >>> exactly when Open MPI kills them. >>> -------------------------------------------------------------------------- >>> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort >>> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >>> [6]PETSC ERROR: ------------------------------------------------------------------------ >>> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [6]PETSC ERROR: likely location of problem given in stack below >>> [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [6]PETSC ERROR: INSTEAD the line number of the start of the function >>> [6]PETSC ERROR: is given. 
>>> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [6]PETSC ERROR: Signal received >>> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [7]PETSC ERROR: ------------------------------------------------------------------------ >>> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [7]PETSC ERROR: likely location of problem given in stack below >>> [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [7]PETSC ERROR: INSTEAD the line number of the start of the function >>> [7]PETSC ERROR: is given. >>> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [7]PETSC ERROR: Signal received >>> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [0]PETSC ERROR: likely location of problem given in stack below >>> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>> [0]PETSC ERROR: is given. >>> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Signal received >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [1]PETSC ERROR: ------------------------------------------------------------------------ >>> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [1]PETSC ERROR: likely location of problem given in stack below >>> [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>> [1]PETSC ERROR: is given. >>> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [1]PETSC ERROR: Signal received >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [2]PETSC ERROR: ------------------------------------------------------------------------ >>> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [2]PETSC ERROR: likely location of problem given in stack below >>> [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [2]PETSC ERROR: INSTEAD the line number of the start of the function >>> [2]PETSC ERROR: is given. >>> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [2]PETSC ERROR: Signal received >>> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [4]PETSC ERROR: ------------------------------------------------------------------------ >>> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [4]PETSC ERROR: likely location of problem given in stack below >>> [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [4]PETSC ERROR: INSTEAD the line number of the start of the function >>> [4]PETSC ERROR: is given. >>> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [4]PETSC ERROR: Signal received >>> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> [5]PETSC ERROR: ------------------------------------------------------------------------ >>> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >>> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >>> [5]PETSC ERROR: likely location of problem given in stack below >>> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >>> [5]PETSC ERROR: INSTEAD the line number of the start of the function >>> [5]PETSC ERROR: is given. >>> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >>> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >>> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >>> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >>> [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [5]PETSC ERROR: Signal received >>> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >>> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >>> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >>> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> >>> -gideon >>> >>> >> > From bsmith at mcs.anl.gov Sat Aug 22 16:49:30 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Aug 2015 16:49:30 -0500 Subject: [petsc-users] Scalability issue In-Reply-To: <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> Message-ID: <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> > On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva wrote: > > Hi. > > > I managed to finish the re-implementation. I ran the program with 1,2,3,4,5,6 machines and saved the summary. 
I send each of them in this email.
> In these executions, the program performs Matrix-Vector (MatMult, MatMultAdd) products and Vector-Vector operations. From what I understand while reading the logs, the program takes most of the time in "VecScatterEnd".
> In this example, the matrix taking part on the Matrix-Vector products is not "much diagonal heavy".
> The following numbers are the percentages of nnz values on the matrix diagonal block for each machine, and each execution time.
> NMachines %NNZ ExecTime
> 1 machine0 100%; 16min08sec
>
> 2 machine0 91.1%; 24min58sec
> machine1 69.2%;
>
> 3 machine0 90.9% 25min42sec
> machine1 82.8%
> machine2 51.6%
>
> 4 machine0 91.9% 26min27sec
> machine1 82.4%
> machine2 73.1%
> machine3 39.9%
>
> 5 machine0 93.2% 39min23sec
> machine1 82.8%
> machine2 74.4%
> machine3 64.6%
> machine4 31.6%
>
> 6 machine0 94.2% 54min54sec
> machine1 82.6%
> machine2 73.1%
> machine3 65.2%
> machine4 55.9%
> machine5 25.4%

   Based on this I am guessing the last rows of the matrix have a lot of nonzeros away from the diagonal?

   There is a big load imbalance in something: for example with 2 processes you have

VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0
VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0
MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83
MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69

   the 5th column has the imbalance between slowest and fastest process. It is 4.2 for VecMax, 1.4 for MatMult and 2.3 for MatMultAdd; to get good speedups these need to be much closer to 1.

   How many nonzeros in the matrix are there per process? Is it very different for different processes? You really need to have each process have a similar number of matrix nonzeros. Do you have a picture of the nonzero structure of the matrix? Where does the matrix come from, why does it have this structure?

   Also likely there are just too many vector entries that need to be scattered to the last process for the matmults.
>
> In this implementation I'm using MatCreate and VecCreate. I'm also leaving the partition sizes in PETSC_DECIDE.
>
> Finally, to run the application, I'm using mpirun.hydra from mpich, downloaded by PETSc configure script.
> I'm checking the process assignment as suggested on the last email.
>
> Am I missing anything?

   Your network is very poor; likely ethernet. It is hard to get much speedup with such slow reductions and sends and receives.

Average time to get PetscTime(): 1.19209e-07
Average time for MPI_Barrier(): 0.000215769
Average time for zero size MPI_Send(): 5.94854e-05

   I think you are seeing such bad results due to an unkind matrix nonzero structure giving poor load balance and too much communication and a very poor computer network that just makes all the needed communication totally dominate.

>
> Regards,
> Nelson
>
> Em 2015-08-20 16:17, Matthew Knepley escreveu:
>
>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva wrote:
>> Hello.
>>
>> I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing.
>>
>> As for the machine, I am using mpirun to run jobs in a 8 node cluster. I modified the makefile on the steams folder so it would run using my hostfile.
>> The output is attached to this email. It seems reasonable for a cluster with 8 machines.
From "lscpu", each machine cpu has 4 cores and 1 socket. >> 1) You launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes >> 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1. >> Thanks, >> Matt >> >> Cheers, >> Nelson >> >> >> Em 2015-07-24 16:50, Barry Smith escreveu: >> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >> ... processes with the option -log_summary and send (as attachments) >> the log summary information. >> >> Also on the same machine run the streams benchmark; with recent >> releases of PETSc you only need to do >> >> cd $PETSC_DIR >> make streams NPMAX=16 (or whatever your largest process count is) >> >> and send the output. >> >> I suspect that you are doing everything fine and it is more an issue >> with the configuration of your machine. Also read the information at >> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >> "binding" >> >> Barry >> >> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva wrote: >> >> Hello, >> >> I have been using PETSc for a few months now, and it truly is fantastic piece of software. >> >> In my particular example I am working with a large, sparse distributed (MPI AIJ) matrix we can refer as 'G'. >> G is a horizontal - retangular matrix (for example, 1,1 Million rows per 2,1 Million columns). This matrix is commonly very sparse and not diagonal 'heavy' (for example 5,2 Million nnz in which ~50% are on the diagonal block of MPI AIJ representation). >> To work with this matrix, I also have a few parallel vectors (created using MatCreate Vec), we can refer as 'm' and 'k'. >> I am trying to parallelize an iterative algorithm in which the most computational heavy operations are: >> >> ->Matrix-Vector Multiplication, more precisely G * m + k = b (MatMultAdd). From what I have been reading, to achive a good speedup in this operation, G should be as much diagonal as possible, due to overlapping communication and computation. But even when using a G matrix in which the diagonal block has ~95% of the nnz, I cannot get a decent speedup. Most of the times, the performance even gets worse. >> >> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' = A, where A is later used on the linear solver and G' is transpose of G. The speedup in this operation is not worse, although is not very good. >> >> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" from the last two operations. I tried to apply a RCM permutation to A to make it more diagonal, for better performance. However, the problem I faced was that, the permutation is performed locally in each processor and thus, the final result is different with different number of processors. I assume this was intended to reduce communication. The solution I found was >> 1-calculate A >> 2-calculate, localy to 1 machine, the RCM permutation IS using A >> 3-apply this permutation to the lines of G. >> This works well, and A is generated as if RCM permuted. It is fine to do this operation in one machine because it is only done once while reading the input. The nnz of G become more spread and less diagonal, causing problems when calculating G * m + k = b. >> >> These 3 operations (except the permutation) are performed in each iteration of my algorithm. >> >> So, my questions are. >> -What are the characteristics of G that lead to a good speedup in the operations I described? Am I missing something and too much obsessed with the diagonal block? 
>> >> -Is there a better way to permute A without permute G and still get the same result using 1 or N machines? >> >> >> I have been avoiding asking for help for a while. I'm very sorry for the long email. >> Thank you very much for your time. >> Best Regards, >> Nelson >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > > From bsmith at mcs.anl.gov Sat Aug 22 21:29:25 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 22 Aug 2015 21:29:25 -0500 Subject: [petsc-users] Variatonal inequalities In-Reply-To: References: Message-ID: <31849CD7-9C14-487E-8133-5460E40FA6D3@mcs.anl.gov> David, Currently the only way to do this without adding a lot of additional PETSc code is to add additional variables such that only box constraints appear in the final problem. For example say you have constraints c <= Ax <= d then introduce new variables y = Ax and then you have the larger problem of unknowns (x,y) and box constrains on y with -infinity and +infinity constraints on x. Barry > On Aug 22, 2015, at 6:59 AM, David Knezevic wrote: > > Hi all, > > I see from Section 5.7 of the manual that SNES supports box constraints on variables, which is great. However, I was also hoping to also be able to consider general linear inequality constraints, so I was wondering if anyone has any suggestions on how (or if) that could be done with PETSc? > > Thanks, > David > From dongluo at pku.edu.cn Sat Aug 22 23:38:09 2015 From: dongluo at pku.edu.cn (=?UTF-8?B?572X5Lic?=) Date: Sun, 23 Aug 2015 12:38:09 +0800 (GMT+08:00) Subject: [petsc-users] [petsc-user] intall problem when make test Message-ID: <6fa66142.d93f.14f58d9711b.Coremail.dongluo@pku.edu.cn> Dear all, I'm Luo Dong from Peking University in Beijing, China. I'm trying to intall the Petsc on my account on Tianhe supercomputer. But I met some problem. I have successfully make the build, but when I use make test, there is some error: [taojj at ln3%tianhe 3.6.1]$ make test Running test examples to verify correct installation Using PETSC_DIR=/vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1 and PETSC_ARCH=linux-dbg *mpiexec not found*. Please run src/snes/examples/tutorials/ex19 manually *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1/src/snes/examples/tutorials ex5f ********************************************************* I don't know why thers is 'mpiexec not found'. I have use compilers mpicc, mpicxx, mpif90. On my account, I should use 'yhrun -n 1 -t 30 -p TH_NET' to submit a running work. I don't know if I should set some thing about this and where I should do that. I have attached the uname-a output below: [taojj at ln3%tianhe tutorials]$ uname -a Linux ln3 2.6.32-358.11.1.2.ky3.1.x86_64 #1 SMP Mon Jul 8 13:05:58 CST 2013 x86_64 x86_64 x86_64 GNU/Linux Please give me some help. Thanks for your help in advance. Best, Dong -------------- next part -------------- An HTML attachment was scrubbed... 
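
For the variational inequality question in the thread above (Barry's suggestion of introducing y = Ax so that only box constraints remain), one way the bounds might be set up is sketched below. It assumes the augmented unknown z stacks the n entries of x first and the m entries of y last, and that the constraint bounds c and d happen to be available on every process; the residual and Jacobian of the augmented system (which also enforce Ax - y = 0) belong to the application and are not shown. The helper name and the choice of SNESVINEWTONRSLS are illustrative only.

#include <petscsnes.h>

PetscErrorCode SetAugmentedBounds(SNES snes, Vec z, PetscInt n, PetscInt m,
                                  const PetscScalar *c, const PetscScalar *d)
{
  Vec            zl, zu;
  PetscInt       i, lo, hi;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecDuplicate(z, &zl);CHKERRQ(ierr);
  ierr = VecDuplicate(z, &zu);CHKERRQ(ierr);
  ierr = VecGetOwnershipRange(z, &lo, &hi);CHKERRQ(ierr);
  for (i = lo; i < hi; i++) {
    if (i < n) { /* x entries: unconstrained */
      ierr = VecSetValue(zl, i, PETSC_NINFINITY, INSERT_VALUES);CHKERRQ(ierr);
      ierr = VecSetValue(zu, i, PETSC_INFINITY,  INSERT_VALUES);CHKERRQ(ierr);
    } else {     /* y entries: c <= y <= d enforces c <= A x <= d */
      ierr = VecSetValue(zl, i, c[i-n], INSERT_VALUES);CHKERRQ(ierr);
      ierr = VecSetValue(zu, i, d[i-n], INSERT_VALUES);CHKERRQ(ierr);
    }
  }
  ierr = VecAssemblyBegin(zl);CHKERRQ(ierr); ierr = VecAssemblyEnd(zl);CHKERRQ(ierr);
  ierr = VecAssemblyBegin(zu);CHKERRQ(ierr); ierr = VecAssemblyEnd(zu);CHKERRQ(ierr);

  ierr = SNESSetType(snes, SNESVINEWTONRSLS);CHKERRQ(ierr); /* one possible VI solver */
  ierr = SNESVISetVariableBounds(snes, zl, zu);CHKERRQ(ierr);
  ierr = VecDestroy(&zl);CHKERRQ(ierr);
  ierr = VecDestroy(&zu);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
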
URL: 
From balay at mcs.anl.gov  Sun Aug 23 00:47:42 2015
From: balay at mcs.anl.gov (Satish Balay)
Date: Sun, 23 Aug 2015 00:47:42 -0500
Subject: [petsc-users] [petsc-user] intall problem when make test
In-Reply-To: <6fa66142.d93f.14f58d9711b.Coremail.dongluo@pku.edu.cn>
References: <6fa66142.d93f.14f58d9711b.Coremail.dongluo@pku.edu.cn>
Message-ID: 

As the message says - try running the examples manually. i.e.

cd src/snes/examples/tutorials
make PETSC_DIR=/vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1 PETSC_ARCH=linux-dbg ex19 ex5f

yhrun -n 1 -t 30 -p TH_NET ./ex19
yhrun -n 1 -t 30 -p TH_NET ./ex5f

Satish

On Sat, 22 Aug 2015, 罗东 (Luo Dong) wrote:

> Dear all,
> 
> 
> I'm Luo Dong from Peking University in Beijing, China. I'm trying to intall the Petsc on my account on Tianhe supercomputer. But I met some problem.
I have successfully make the build, but when I use make test, there is some error: > > > > > > [taojj at ln3%tianhe 3.6.1]$ make test > > Running test examples to verify correct installation > > Using PETSC_DIR=/vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1 and PETSC_ARCH=linux-dbg > > *mpiexec not found*. Please run src/snes/examples/tutorials/ex19 manually > > *******************Error detected during compile or link!******************* > > See http://www.mcs.anl.gov/petsc/documentation/faq.html > > /vol-th/home/taojj/luodong/sfwchombo/petsc/3.6.1/src/snes/examples/tutorials ex5f > > ********************************************************* > > > > > > > > > > I don't know why thers is 'mpiexec not found'. I have use compilers mpicc, mpicxx, mpif90. On my account, I should use 'yhrun -n 1 -t 30 -p TH_NET' to submit a running work. I don't know if I should set some thing about this and where I should do that. > > > > > > I have attached the uname-a output below: > > > > > > [taojj at ln3%tianhe tutorials]$ uname -a > > Linux ln3 2.6.32-358.11.1.2.ky3.1.x86_64 #1 SMP Mon Jul 8 13:05:58 CST 2013 x86_64 x86_64 x86_64 GNU/Linux > > > > > > Please give me some help. > > > > > > Thanks for your help in advance. > > > > > > Best, > > > > > > Dong From david.knezevic at akselos.com Sun Aug 23 09:29:01 2015 From: david.knezevic at akselos.com (David Knezevic) Date: Sun, 23 Aug 2015 10:29:01 -0400 Subject: [petsc-users] Variatonal inequalities In-Reply-To: <31849CD7-9C14-487E-8133-5460E40FA6D3@mcs.anl.gov> References: <31849CD7-9C14-487E-8133-5460E40FA6D3@mcs.anl.gov> Message-ID: On Sat, Aug 22, 2015 at 10:29 PM, Barry Smith wrote: > > David, > > Currently the only way to do this without adding a lot of additional > PETSc code is to add additional variables such that only box constraints > appear in the final problem. For example say you have constraints c <= Ax > <= d then introduce new variables y = Ax and then you have the larger > problem of unknowns (x,y) and box constrains on y with -infinity and > +infinity constraints on x. > OK, that makes sense, thanks for the info! David -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Sun Aug 23 10:12:22 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Sun, 23 Aug 2015 16:12:22 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> Message-ID: Thank you for the fast response! Yes. The last rows of the matrix are indeed more dense, compared with the remaining ones. For this example, concerning load balance between machines, the last process had 46% of the matrix nonzero entries. A few weeks ago I suspected of this problem and wrote a little function that could permute the matrix rows based on their number of nonzeros. However, the matrix would become less pleasant regarding "diagonal block weight", and I stop using it as i thought I was becoming worse. Also, due to this problem, I thought I could have a complete vector copy in each processor, instead of a distributed vector. I tried to implement this idea, but had no luck with the results. However, even if this solution would work, the communication for vector update was inevitable once each iteration of my algorithm. 
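
The "complete vector copy in each processor" idea mentioned above can be done with a scatter-to-all; a minimal sketch follows (the function name is made up, and, as noted above, the gather still has to be repeated every time the parallel vector changes, so it does not remove the per-iteration communication).

#include <petscvec.h>

/* Gather the distributed vector v into a full-length sequential copy on every
   process. The caller must initialize *ctx = NULL and *vfull = NULL before the
   first call, and destroy both once at the end. */
PetscErrorCode GatherFullCopy(Vec v, VecScatter *ctx, Vec *vfull)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  if (!*ctx) { ierr = VecScatterCreateToAll(v, ctx, vfull);CHKERRQ(ierr); }
  ierr = VecScatterBegin(*ctx, v, *vfull, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  ierr = VecScatterEnd(*ctx, v, *vfull, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
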
Since this is a rectangular matrix, I cannot apply RCM or such permutations, however I can permute rows and columns though. More specifically, the problem I'm trying to solve is one of balance the best guess and uncertainty estimates of a set of Input-Output subject to linear constraints and ancillary informations. The matrix is called an aggregation matrix, and each entry can be 1, 0 or -1. I don't know the cause of its nonzero structure. I'm addressing this problem using a weighted least-squares algorithm. I ran the code with a different, more friendly problem topology, logging the load of nonzero entries and the "diagonal load" per processor. I'm sending images of both matrices nonzero structure. The last email example used matrix1, the example in this email uses matrix2. Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns and 5.171.901 nnz. The matrix2 (this email example) is 800.000 rows x 8.800.000 columns and 16.800.000 nnz. With 1,2,3 machines, I have these distributions of nonzeros (using matrix2). I'm sending the logs in this email. 1 machine [0] Matrix diagonal_nnz:16800000 (100.00 %) [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) ExecTime: 4min47sec 2 machines [0] Matrix diagonal_nnz:4400000 (52.38 %) [1] Matrix diagonal_nnz:4000000 (47.62 %) [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) ExecTime: 13min23sec 3 machines [0] Matrix diagonal_nnz:2933334 (52.38 %) [1] Matrix diagonal_nnz:533327 (9.52 %) [2] Matrix diagonal_nnz:2399999 (42.86 %) [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %) ExecTime: 20min26sec As for the network, I ran the make streams NPMAX=3 again. I'm also sending it in this email. I too think that these bad results are caused by a combination of bad matrix structure, especially the "diagonal weight", and maybe network. I really should find a way to permute these matrices to a more friendly structure. Thank you very much for the help. Nelson Em 2015-08-22 22:49, Barry Smith escreveu: >> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva >> wrote: >> >> Hi. >> >> >> I managed to finish the re-implementation. I ran the program with >> 1,2,3,4,5,6 machines and saved the summary. I send each of them in >> this email. >> In these executions, the program performs Matrix-Vector (MatMult, >> MatMultAdd) products and Vector-Vector operations. From what I >> understand while reading the logs, the program takes most of the time >> in "VecScatterEnd". >> In this example, the matrix taking part on the Matrix-Vector >> products is not "much diagonal heavy". >> The following numbers are the percentages of nnz values on the >> matrix diagonal block for each machine, and each execution time. 
>> NMachines %NNZ ExecTime >> 1 machine0 100%; 16min08sec >> >> 2 machine0 91.1%; 24min58sec >> machine1 69.2%; >> >> 3 machine0 90.9% 25min42sec >> machine1 82.8% >> machine2 51.6% >> >> 4 machine0 91.9% 26min27sec >> machine1 82.4% >> machine2 73.1% >> machine3 39.9% >> >> 5 machine0 93.2% 39min23sec >> machine1 82.8% >> machine2 74.4% >> machine3 64.6% >> machine4 31.6% >> >> 6 machine0 94.2% 54min54sec >> machine1 82.6% >> machine2 73.1% >> machine3 65.2% >> machine4 55.9% >> machine5 25.4% > > Based on this I am guessing the last rows of the matrix have a lot > of nonzeros away from the diagonal? > > There is a big load imbalance in something: for example with 2 > processes you have > > VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 > VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 > MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 > 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 > MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 > 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 > > the 5th column has the imbalance between slowest and fastest > process. It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, to > get good speed ups these need to be much closer to 1. > > How many nonzeros in the matrix are there per process? Is it very > different for difference processes? You really need to have each > process have similar number of matrix nonzeros. Do you have a > picture of the nonzero structure of the matrix? Where does the > matrix > come from, why does it have this structure? > > Also likely there are just to many vector entries that need to be > scattered to the last process for the matmults. >> >> In this implementation I'm using MatCreate and VecCreate. I'm also >> leaving the partition sizes in PETSC_DECIDE. >> >> Finally, to run the application, I'm using mpirun.hydra from mpich, >> downloaded by PETSc configure script. >> I'm checking the process assignment as suggested on the last email. >> >> Am I missing anything? > > Your network is very poor; likely ethernet. It is had to get much > speedup with such slow reductions and sends and receives. > > Average time to get PetscTime(): 1.19209e-07 > Average time for MPI_Barrier(): 0.000215769 > Average time for zero size MPI_Send(): 5.94854e-05 > > I think you are seeing such bad results due to an unkind matrix > nonzero structure giving per load balance and too much communication > and a very poor computer network that just makes all the needed > communication totally dominate. > > >> >> Regards, >> Nelson >> >> Em 2015-08-20 16:17, Matthew Knepley escreveu: >> >>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva >>> wrote: >>> Hello. >>> >>> I am sorry for the long time without response. I decided to rewrite >>> my application in a different way and will send the log_summary >>> output when done reimplementing. >>> >>> As for the machine, I am using mpirun to run jobs in a 8 node >>> cluster. I modified the makefile on the steams folder so it would run >>> using my hostfile. >>> The output is attached to this email. It seems reasonable for a >>> cluster with 8 machines. From "lscpu", each machine cpu has 4 cores >>> and 1 socket. >>> 1) You launcher is placing processes haphazardly. I would figure >>> out how to assign them to certain nodes >>> 2) Each node has enough bandwidth for 1 core, so it does not make >>> much sense to use more than 1. 
>>> Thanks, >>> Matt >>> >>> Cheers, >>> Nelson >>> >>> >>> Em 2015-07-24 16:50, Barry Smith escreveu: >>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >>> ... processes with the option -log_summary and send (as >>> attachments) >>> the log summary information. >>> >>> Also on the same machine run the streams benchmark; with recent >>> releases of PETSc you only need to do >>> >>> cd $PETSC_DIR >>> make streams NPMAX=16 (or whatever your largest process count is) >>> >>> and send the output. >>> >>> I suspect that you are doing everything fine and it is more an >>> issue >>> with the configuration of your machine. Also read the information >>> at >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >>> "binding" >>> >>> Barry >>> >>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva >>> wrote: >>> >>> Hello, >>> >>> I have been using PETSc for a few months now, and it truly is >>> fantastic piece of software. >>> >>> In my particular example I am working with a large, sparse >>> distributed (MPI AIJ) matrix we can refer as 'G'. >>> G is a horizontal - retangular matrix (for example, 1,1 Million >>> rows per 2,1 Million columns). This matrix is commonly very sparse >>> and not diagonal 'heavy' (for example 5,2 Million nnz in which ~50% >>> are on the diagonal block of MPI AIJ representation). >>> To work with this matrix, I also have a few parallel vectors >>> (created using MatCreate Vec), we can refer as 'm' and 'k'. >>> I am trying to parallelize an iterative algorithm in which the most >>> computational heavy operations are: >>> >>> ->Matrix-Vector Multiplication, more precisely G * m + k = b >>> (MatMultAdd). From what I have been reading, to achive a good speedup >>> in this operation, G should be as much diagonal as possible, due to >>> overlapping communication and computation. But even when using a G >>> matrix in which the diagonal block has ~95% of the nnz, I cannot get >>> a decent speedup. Most of the times, the performance even gets worse. >>> >>> ->Matrix-Matrix Multiplication, in this case I need to perform G * >>> G' = A, where A is later used on the linear solver and G' is >>> transpose of G. The speedup in this operation is not worse, although >>> is not very good. >>> >>> ->Linear problem solving. Lastly, In this operation I compute >>> "Ax=b" from the last two operations. I tried to apply a RCM >>> permutation to A to make it more diagonal, for better performance. >>> However, the problem I faced was that, the permutation is performed >>> locally in each processor and thus, the final result is different >>> with different number of processors. I assume this was intended to >>> reduce communication. The solution I found was >>> 1-calculate A >>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>> 3-apply this permutation to the lines of G. >>> This works well, and A is generated as if RCM permuted. It is fine >>> to do this operation in one machine because it is only done once >>> while reading the input. The nnz of G become more spread and less >>> diagonal, causing problems when calculating G * m + k = b. >>> >>> These 3 operations (except the permutation) are performed in each >>> iteration of my algorithm. >>> >>> So, my questions are. >>> -What are the characteristics of G that lead to a good speedup in >>> the operations I described? Am I missing something and too much >>> obsessed with the diagonal block? 
>>> >>> -Is there a better way to permute A without permute G and still get >>> the same result using 1 or N machines? >>> >>> >>> I have been avoiding asking for help for a while. I'm very sorry >>> for the long email. >>> Thank you very much for your time. >>> Best Regards, >>> Nelson >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >> >> >> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log01P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log02P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log03P URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix1.png Type: application/octet-stream Size: 1936 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix2.png Type: application/octet-stream Size: 2058 bytes Desc: not available URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: streams.out URL: From bsmith at mcs.anl.gov Sun Aug 23 14:19:52 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 23 Aug 2015 14:19:52 -0500 Subject: [petsc-users] Scalability issue In-Reply-To: References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> Message-ID: <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> A suggestion: take your second ordering and now interlace the second half of the rows with the first half of the rows (keeping the some column ordering) That is, order the rows 0, n/2, 1, n/2+1, 2, n/2+2 etc this will take the two separate "diagonal" bands and form a single "diagonal band". This will increase the "diagonal block weight" to be pretty high and the only scatters will need to be for the final rows of the input vector that all processes need to do their part of the multiply. Generate the image to make sure what I suggest make sense and then run this ordering with 1, 2, and 3 processes. Send the logs. Barry > On Aug 23, 2015, at 10:12 AM, Nelson Filipe Lopes da Silva wrote: > > Thank you for the fast response! > > Yes. The last rows of the matrix are indeed more dense, compared with the remaining ones. > For this example, concerning load balance between machines, the last process had 46% of the matrix nonzero entries. A few weeks ago I suspected of this problem and wrote a little function that could permute the matrix rows based on their number of nonzeros. However, the matrix would become less pleasant regarding "diagonal block weight", and I stop using it as i thought I was becoming worse. > > Also, due to this problem, I thought I could have a complete vector copy in each processor, instead of a distributed vector. I tried to implement this idea, but had no luck with the results. However, even if this solution would work, the communication for vector update was inevitable once each iteration of my algorithm. > Since this is a rectangular matrix, I cannot apply RCM or such permutations, however I can permute rows and columns though. 
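
Per-rank numbers like the "local nnz" and "diagonal_nnz" figures quoted in this thread, which is also a quick way to check how a row reordering such as the one suggested above changes the diagonal-block weight, can be read off the assembled matrix directly. A small sketch, assuming an MPIAIJ matrix and a hypothetical helper name:

#include <petscmat.h>

PetscErrorCode ReportDiagonalWeight(Mat A)
{
  Mat             Ad, Ao;          /* local diagonal and off-diagonal blocks */
  const PetscInt *garray;
  MatInfo         info_d, info_o;
  double          nzd, nzo;
  PetscMPIInt     rank;
  MPI_Comm        comm;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  ierr = PetscObjectGetComm((PetscObject)A, &comm);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(comm, &rank);CHKERRQ(ierr);
  ierr = MatMPIAIJGetSeqAIJ(A, &Ad, &Ao, &garray);CHKERRQ(ierr);   /* MPIAIJ only */
  ierr = MatGetInfo(Ad, MAT_LOCAL, &info_d);CHKERRQ(ierr);
  ierr = MatGetInfo(Ao, MAT_LOCAL, &info_o);CHKERRQ(ierr);
  nzd  = (double)info_d.nz_used;
  nzo  = (double)info_o.nz_used;
  ierr = PetscSynchronizedPrintf(comm,
           "[%d] local nnz %.0f, diagonal-block nnz %.0f (%.2f %%)\n",
           rank, nzd + nzo, nzd, 100.0*nzd/(nzd + nzo));CHKERRQ(ierr);
  ierr = PetscSynchronizedFlush(comm, PETSC_STDOUT);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
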
> > More specifically, the problem I'm trying to solve is one of balance the best guess and uncertainty estimates of a set of Input-Output subject to linear constraints and ancillary informations. The matrix is called an aggregation matrix, and each entry can be 1, 0 or -1. I don't know the cause of its nonzero structure. I'm addressing this problem using a weighted least-squares algorithm. > > I ran the code with a different, more friendly problem topology, logging the load of nonzero entries and the "diagonal load" per processor. > I'm sending images of both matrices nonzero structure. The last email example used matrix1, the example in this email uses matrix2. > Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns and 5.171.901 nnz. > The matrix2 (this email example) is 800.000 rows x 8.800.000 columns and 16.800.000 nnz. > > > With 1,2,3 machines, I have these distributions of nonzeros (using matrix2). I'm sending the logs in this email. > 1 machine > [0] Matrix diagonal_nnz:16800000 (100.00 %) > [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) > ExecTime: 4min47sec > > 2 machines > [0] Matrix diagonal_nnz:4400000 (52.38 %) > [1] Matrix diagonal_nnz:4000000 (47.62 %) > > [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) > [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) > ExecTime: 13min23sec > > 3 machines > [0] Matrix diagonal_nnz:2933334 (52.38 %) > [1] Matrix diagonal_nnz:533327 (9.52 %) > [2] Matrix diagonal_nnz:2399999 (42.86 %) > > [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) > [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) > [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %) > ExecTime: 20min26sec > > As for the network, I ran the make streams NPMAX=3 again. I'm also sending it in this email. > > I too think that these bad results are caused by a combination of bad matrix structure, especially the "diagonal weight", and maybe network. > > I really should find a way to permute these matrices to a more friendly structure. > > Thank you very much for the help. > Nelson > > Em 2015-08-22 22:49, Barry Smith escreveu: >>> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva wrote: >>> >>> Hi. >>> >>> >>> I managed to finish the re-implementation. I ran the program with 1,2,3,4,5,6 machines and saved the summary. I send each of them in this email. >>> In these executions, the program performs Matrix-Vector (MatMult, MatMultAdd) products and Vector-Vector operations. From what I understand while reading the logs, the program takes most of the time in "VecScatterEnd". >>> In this example, the matrix taking part on the Matrix-Vector products is not "much diagonal heavy". >>> The following numbers are the percentages of nnz values on the matrix diagonal block for each machine, and each execution time. >>> NMachines %NNZ ExecTime >>> 1 machine0 100%; 16min08sec >>> >>> 2 machine0 91.1%; 24min58sec >>> machine1 69.2%; >>> >>> 3 machine0 90.9% 25min42sec >>> machine1 82.8% >>> machine2 51.6% >>> >>> 4 machine0 91.9% 26min27sec >>> machine1 82.4% >>> machine2 73.1% >>> machine3 39.9% >>> >>> 5 machine0 93.2% 39min23sec >>> machine1 82.8% >>> machine2 74.4% >>> machine3 64.6% >>> machine4 31.6% >>> >>> 6 machine0 94.2% 54min54sec >>> machine1 82.6% >>> machine2 73.1% >>> machine3 65.2% >>> machine4 55.9% >>> machine5 25.4% >> >> Based on this I am guessing the last rows of the matrix have a lot >> of nonzeros away from the diagonal? 
>> >> There is a big load imbalance in something: for example with 2 >> processes you have >> >> VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 >> VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 >> MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 >> 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 >> MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 >> 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 >> >> the 5th column has the imbalance between slowest and fastest >> process. It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, to >> get good speed ups these need to be much closer to 1. >> >> How many nonzeros in the matrix are there per process? Is it very >> different for difference processes? You really need to have each >> process have similar number of matrix nonzeros. Do you have a >> picture of the nonzero structure of the matrix? Where does the matrix >> come from, why does it have this structure? >> >> Also likely there are just to many vector entries that need to be >> scattered to the last process for the matmults. >>> >>> In this implementation I'm using MatCreate and VecCreate. I'm also leaving the partition sizes in PETSC_DECIDE. >>> >>> Finally, to run the application, I'm using mpirun.hydra from mpich, downloaded by PETSc configure script. >>> I'm checking the process assignment as suggested on the last email. >>> >>> Am I missing anything? >> >> Your network is very poor; likely ethernet. It is had to get much >> speedup with such slow reductions and sends and receives. >> >> Average time to get PetscTime(): 1.19209e-07 >> Average time for MPI_Barrier(): 0.000215769 >> Average time for zero size MPI_Send(): 5.94854e-05 >> >> I think you are seeing such bad results due to an unkind matrix >> nonzero structure giving per load balance and too much communication >> and a very poor computer network that just makes all the needed >> communication totally dominate. >> >> >>> >>> Regards, >>> Nelson >>> >>> Em 2015-08-20 16:17, Matthew Knepley escreveu: >>> >>>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva wrote: >>>> Hello. >>>> >>>> I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing. >>>> >>>> As for the machine, I am using mpirun to run jobs in a 8 node cluster. I modified the makefile on the steams folder so it would run using my hostfile. >>>> The output is attached to this email. It seems reasonable for a cluster with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 socket. >>>> 1) You launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes >>>> 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1. >>>> Thanks, >>>> Matt >>>> >>>> Cheers, >>>> Nelson >>>> >>>> >>>> Em 2015-07-24 16:50, Barry Smith escreveu: >>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >>>> ... processes with the option -log_summary and send (as attachments) >>>> the log summary information. >>>> >>>> Also on the same machine run the streams benchmark; with recent >>>> releases of PETSc you only need to do >>>> >>>> cd $PETSC_DIR >>>> make streams NPMAX=16 (or whatever your largest process count is) >>>> >>>> and send the output. 
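Acting on the nonzero-balance point above means giving MatSetSizes an explicit local row count instead of PETSC_DECIDE, chosen so that nonzeros rather than rows are split evenly. A sketch under the assumption that the per-rank row count has already been computed from the row nnz counts (all names are illustrative):

#include <petscmat.h>

/* Create G with a caller-chosen local row count; d_nnz/o_nnz are the usual
   per-local-row preallocation arrays. */
static PetscErrorCode CreateBalancedG(MPI_Comm comm, PetscInt my_local_rows, PetscInt Ncols,
                                      const PetscInt d_nnz[], const PetscInt o_nnz[], Mat *G)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatCreate(comm, G);CHKERRQ(ierr);
  ierr = MatSetSizes(*G, my_local_rows, PETSC_DECIDE, PETSC_DETERMINE, Ncols);CHKERRQ(ierr);
  ierr = MatSetType(*G, MATMPIAIJ);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(*G, 0, d_nnz, 0, o_nnz);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The vectors used in the MatMult/MatMultAdd calls then need matching explicit local sizes (VecSetSizes) so their layouts line up with G's row and column maps.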
>>>> >>>> I suspect that you are doing everything fine and it is more an issue >>>> with the configuration of your machine. Also read the information at >>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >>>> "binding" >>>> >>>> Barry >>>> >>>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva wrote: >>>> >>>> Hello, >>>> >>>> I have been using PETSc for a few months now, and it truly is fantastic piece of software. >>>> >>>> In my particular example I am working with a large, sparse distributed (MPI AIJ) matrix we can refer as 'G'. >>>> G is a horizontal - retangular matrix (for example, 1,1 Million rows per 2,1 Million columns). This matrix is commonly very sparse and not diagonal 'heavy' (for example 5,2 Million nnz in which ~50% are on the diagonal block of MPI AIJ representation). >>>> To work with this matrix, I also have a few parallel vectors (created using MatCreate Vec), we can refer as 'm' and 'k'. >>>> I am trying to parallelize an iterative algorithm in which the most computational heavy operations are: >>>> >>>> ->Matrix-Vector Multiplication, more precisely G * m + k = b (MatMultAdd). From what I have been reading, to achive a good speedup in this operation, G should be as much diagonal as possible, due to overlapping communication and computation. But even when using a G matrix in which the diagonal block has ~95% of the nnz, I cannot get a decent speedup. Most of the times, the performance even gets worse. >>>> >>>> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' = A, where A is later used on the linear solver and G' is transpose of G. The speedup in this operation is not worse, although is not very good. >>>> >>>> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" from the last two operations. I tried to apply a RCM permutation to A to make it more diagonal, for better performance. However, the problem I faced was that, the permutation is performed locally in each processor and thus, the final result is different with different number of processors. I assume this was intended to reduce communication. The solution I found was >>>> 1-calculate A >>>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>>> 3-apply this permutation to the lines of G. >>>> This works well, and A is generated as if RCM permuted. It is fine to do this operation in one machine because it is only done once while reading the input. The nnz of G become more spread and less diagonal, causing problems when calculating G * m + k = b. >>>> >>>> These 3 operations (except the permutation) are performed in each iteration of my algorithm. >>>> >>>> So, my questions are. >>>> -What are the characteristics of G that lead to a good speedup in the operations I described? Am I missing something and too much obsessed with the diagonal block? >>>> >>>> -Is there a better way to permute A without permute G and still get the same result using 1 or N machines? >>>> >>>> >>>> I have been avoiding asking for help for a while. I'm very sorry for the long email. >>>> Thank you very much for your time. >>>> Best Regards, >>>> Nelson >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>> >>> > From zonexo at gmail.com Mon Aug 24 04:09:53 2015 From: zonexo at gmail.com (Wee-Beng Tay) Date: Mon, 24 Aug 2015 17:09:53 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal Message-ID: Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? Which is a simpler/better option? Is there an example in Fortran for MatSetValuesStencil? Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Mon Aug 24 04:54:54 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Mon, 24 Aug 2015 18:54:54 +0900 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: Hi, ex5 of snes can give you an example of the two routines. The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version ex5f90.F uses MatSetValuesLocal. However, I use MatSetValuesStencil also in Fortran, there is no problem, and no need to mess around with DMDAGetAO, I think. To input values in the matrix, you need to do the following : ! Declare the matstencils for matrix columns and rows MatStencil :: row(4,1),col(4,n) ! Declare the quantity which will store the actual matrix elements PetscScalar :: v(8) The first dimension in row and col is 4 to allow for 3 spatial dimensions (even if you use only 2) plus one degree of freedom if you have several fields in your DMDA. The second dimension is 1 for row (you input one row at a time) and n for col, where n is the number of columns that you input. For instance, if at node (1,i,j) (1 is the index of the degree of freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set n=6 Then you define the row number by naturally doing the following, inside a local loop : row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 the -1 are here because FORTRAN indexing is different from the native C indexing. I put them on the right to make this more apparent. Then the column information. For instance to declare the coupling with node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will have to write (still within the same local loop on i and j) col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = whatever_it_is col(MatStencil_i,2) = i-1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = whatever_it_is col(MatStencil_i,3) = i -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 2 -1 v(3) = whatever_it_is ... ... .. ... ... ... Note that the index of the degree of freedom (or what field you are coupling to), is indicated by MatStencil_c Finally use MatSetValuesStencil ione = 1 isix = 6 call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) If it is not clear don't hesitate to ask more details. For me it worked that way, I succesfully computed a Jacobian that way. It is very sensitive. 
If you slightly depart from the right jacobian, you will see a huge difference compared to using matrix free with -snes_mf, so you can hardly make a mistake because you would see it. That's how I finally got it to work. Best Timothee 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay : > Hi, > > I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI > along 2 directions (y,z) > > Previously I was using MatSetValues with global indices. However, now I'm > using DM and global indices is much more difficult. > > I come across MatSetValuesStencil or MatSetValuesLocal. > > So what's the difference bet the one since they both seem to work locally? > > Which is a simpler/better option? > > Is there an example in Fortran for MatSetValuesStencil? > > Do I also need to use DMDAGetAO together with MatSetValuesStencil or > MatSetValuesLocal? > > Thanks! > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Mon Aug 24 04:56:43 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Mon, 24 Aug 2015 18:56:43 +0900 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: Small erratum, in the declaration for v, it should be PetscScalar :: v(n) where n is the same as for col (6 in the example, not 8 which I copied from my particular case) 2015-08-24 18:54 GMT+09:00 Timoth?e Nicolas : > Hi, > > ex5 of snes can give you an example of the two routines. > > The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version > ex5f90.F uses MatSetValuesLocal. > > However, I use MatSetValuesStencil also in Fortran, there is no problem, > and no need to mess around with DMDAGetAO, I think. > > To input values in the matrix, you need to do the following : > > ! Declare the matstencils for matrix columns and rows > MatStencil :: row(4,1),col(4,n) > ! Declare the quantity which will store the actual matrix elements > PetscScalar :: v(8) > > The first dimension in row and col is 4 to allow for 3 spatial dimensions > (even if you use only 2) plus one degree of freedom if you have several > fields in your DMDA. The second dimension is 1 for row (you input one row > at a time) and n for col, where n is the number of columns that you input. > For instance, if at node (1,i,j) (1 is the index of the degree of > freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), > (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set > n=6 > > Then you define the row number by naturally doing the following, inside a > local loop : > > row(MatStencil_i,1) = i -1 > row(MatStencil_j,1) = j -1 > row(MatStencil_c,1) = 1 -1 > > the -1 are here because FORTRAN indexing is different from the native C > indexing. I put them on the right to make this more apparent. > > Then the column information. For instance to declare the coupling with > node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will > have to write (still within the same local loop on i and j) > > col(MatStencil_i,1) = i -1 > col(MatStencil_j,1) = j -1 > col(MatStencil_c,1) = 1 -1 > v(1) = whatever_it_is > > col(MatStencil_i,2) = i-1 -1 > col(MatStencil_j,2) = j -1 > col(MatStencil_c,2) = 1 -1 > v(2) = whatever_it_is > > col(MatStencil_i,3) = i -1 > col(MatStencil_j,3) = j -1 > col(MatStencil_c,3) = 2 -1 > v(3) = whatever_it_is > > ... > ... > .. > > ... > ... > ... 
> > Note that the index of the degree of freedom (or what field you are > coupling to), is indicated by MatStencil_c > > > Finally use MatSetValuesStencil > > ione = 1 > isix = 6 > call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) > > If it is not clear don't hesitate to ask more details. For me it worked > that way, I succesfully computed a Jacobian that way. It is very sensitive. > If you slightly depart from the right jacobian, you will see a huge > difference compared to using matrix free with -snes_mf, so you can hardly > make a mistake because you would see it. That's how I finally got it to > work. > > Best > > Timothee > > > 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay : > >> Hi, >> >> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >> along 2 directions (y,z) >> >> Previously I was using MatSetValues with global indices. However, now I'm >> using DM and global indices is much more difficult. >> >> I come across MatSetValuesStencil or MatSetValuesLocal. >> >> So what's the difference bet the one since they both seem to work locally? >> >> Which is a simpler/better option? >> >> Is there an example in Fortran for MatSetValuesStencil? >> >> Do I also need to use DMDAGetAO together with MatSetValuesStencil or >> MatSetValuesLocal? >> >> Thanks! >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 24 05:21:21 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Aug 2015 05:21:21 -0500 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay wrote: > Hi, > > I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI > along 2 directions (y,z) > > Previously I was using MatSetValues with global indices. However, now I'm > using DM and global indices is much more difficult. > > I come across MatSetValuesStencil or MatSetValuesLocal. > > So what's the difference bet the one since they both seem to work locally? > No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes global vertex numbers. > Which is a simpler/better option? > MatSetValuesStencil() > Is there an example in Fortran for MatSetValuesStencil? > Timoth?e Nicolas shows one in his reply. Do I also need to use DMDAGetAO together with MatSetValuesStencil or > MatSetValuesLocal? > No. Thanks, Matt > Thanks! > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Mon Aug 24 09:24:27 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Mon, 24 Aug 2015 15:24:27 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> Message-ID: <0c09f050aa7e81b476b7f2245a60f184@mail.ist.utl.pt> Hello. Thank you very much for your time. I understood the idea, it works very well. I also noticed that my algorithm performs a different number of iterations with different number of machines. The stop conditions are calculated using PETSc "matmultadd". 
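For reference, a hedged sketch of what such a MatMultAdd-based stop test can look like when it uses a relative tolerance well above machine precision (the actual criterion in the code may differ; w is a work vector with the same layout as b, and all names are illustrative):

#include <petscmat.h>

/* Converged when || (G*m + k) - b || <= rtol * ||b||. */
static PetscErrorCode CheckConverged(Mat G, Vec m, Vec k, Vec b, Vec w,
                                     PetscReal rtol, PetscBool *done)
{
  PetscReal      rnorm, bnorm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr  = MatMultAdd(G, m, k, w);CHKERRQ(ierr);    /* w = G*m + k     */
  ierr  = VecAXPY(w, -1.0, b);CHKERRQ(ierr);       /* w = G*m + k - b */
  ierr  = VecNorm(w, NORM_2, &rnorm);CHKERRQ(ierr);
  ierr  = VecNorm(b, NORM_2, &bnorm);CHKERRQ(ierr);
  *done = (PetscBool)(rnorm <= rtol*bnorm);
  PetscFunctionReturn(0);
}

With a tolerance down at round-off level, the iteration count becomes sensitive to the order in which parallel reductions sum their contributions, which is one way runs on different process counts end up doing different numbers of iterations.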
I'm very positive that there may be a program bug in my code, or could it be something with PETSc? I also need to figure out why those vecmax ratio are so high. The vecset is understandable as I'm distributing the initial information from the root machine in sequencial. These are the new values: 1 machine [0] Matrix diagonal_nnz:16800000 (100.00 %) [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) ExecTime: 4min47sec Iterations: 236 2 machines [0] Matrix diagonal_nnz:8000000 (95.24 %) [1] Matrix diagonal_nnz:7600000 (90.48 %) [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) ExecTime: 5min26sec Iterations: 330 3 machines [0] Matrix diagonal_nnz:5333340 (95.24 %) [1] Matrix diagonal_nnz:4800012 (85.71 %) [2] Matrix diagonal_nnz:4533332 (80.95 %) [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %)) ExecTime: 5min25sec Iterations: 346 The suggested permutation worked very well in comparison with the original matrix structure. The no-speedup may be related with the different number of iterations. Once again, thank you very much for the time. Cheers, Nelson Em 2015-08-23 20:19, Barry Smith escreveu: > A suggestion: take your second ordering and now interlace the second > half of the rows with the first half of the rows (keeping the some > column ordering) That is, order the rows 0, n/2, 1, n/2+1, 2, n/2+2 > etc this will take the two separate "diagonal" bands and form a > single "diagonal band". This will increase the "diagonal block > weight" to be pretty high and the only scatters will need to be for > the final rows of the input vector that all processes need to do > their > part of the multiply. Generate the image to make sure what I suggest > make sense and then run this ordering with 1, 2, and 3 processes. > Send > the logs. > > Barry > >> On Aug 23, 2015, at 10:12 AM, Nelson Filipe Lopes da Silva >> wrote: >> >> Thank you for the fast response! >> >> Yes. The last rows of the matrix are indeed more dense, compared >> with the remaining ones. >> For this example, concerning load balance between machines, the last >> process had 46% of the matrix nonzero entries. A few weeks ago I >> suspected of this problem and wrote a little function that could >> permute the matrix rows based on their number of nonzeros. However, >> the matrix would become less pleasant regarding "diagonal block >> weight", and I stop using it as i thought I was becoming worse. >> >> Also, due to this problem, I thought I could have a complete vector >> copy in each processor, instead of a distributed vector. I tried to >> implement this idea, but had no luck with the results. However, even >> if this solution would work, the communication for vector update was >> inevitable once each iteration of my algorithm. >> Since this is a rectangular matrix, I cannot apply RCM or such >> permutations, however I can permute rows and columns though. >> >> More specifically, the problem I'm trying to solve is one of balance >> the best guess and uncertainty estimates of a set of Input-Output >> subject to linear constraints and ancillary informations. The matrix >> is called an aggregation matrix, and each entry can be 1, 0 or -1. I >> don't know the cause of its nonzero structure. I'm addressing this >> problem using a weighted least-squares algorithm. 
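Since RCM cannot be applied to the rectangular G directly, the workaround described earlier in the thread (build A = G*G', compute an RCM ordering of A on one rank, then renumber the rows of G accordingly) can be sketched roughly as below; G_seq is assumed to be a sequential AIJ copy of G held on a single process, and the PETSC_DEFAULT fill is an illustrative choice:

#include <petscmat.h>

/* Hypothetical helper for the one-rank preprocessing step: returns an IS
   listing the rows of A (and hence of G) in their RCM order. */
static PetscErrorCode RCMRowOrderingFromA(Mat G_seq, IS *rowperm)
{
  Mat            Gt, A;
  IS             colperm;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatTranspose(G_seq, MAT_INITIAL_MATRIX, &Gt);CHKERRQ(ierr);
  ierr = MatMatMult(G_seq, Gt, MAT_INITIAL_MATRIX, PETSC_DEFAULT, &A);CHKERRQ(ierr); /* A = G*G' */
  ierr = MatGetOrdering(A, MATORDERINGRCM, rowperm, &colperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);  /* A is symmetric, so the column ordering is not needed */
  ierr = MatDestroy(&Gt);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  /* the rows of G would then be renumbered at read time following *rowperm */
  PetscFunctionReturn(0);
}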
>> >> I ran the code with a different, more friendly problem topology, >> logging the load of nonzero entries and the "diagonal load" per >> processor. >> I'm sending images of both matrices nonzero structure. The last >> email example used matrix1, the example in this email uses matrix2. >> Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns >> and 5.171.901 nnz. >> The matrix2 (this email example) is 800.000 rows x 8.800.000 columns >> and 16.800.000 nnz. >> >> >> With 1,2,3 machines, I have these distributions of nonzeros (using >> matrix2). I'm sending the logs in this email. >> 1 machine >> [0] Matrix diagonal_nnz:16800000 (100.00 %) >> [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 >> (100.00 %) >> ExecTime: 4min47sec >> >> 2 machines >> [0] Matrix diagonal_nnz:4400000 (52.38 %) >> [1] Matrix diagonal_nnz:4000000 (47.62 %) >> >> [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 >> %) >> [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 >> %) >> ExecTime: 13min23sec >> >> 3 machines >> [0] Matrix diagonal_nnz:2933334 (52.38 %) >> [1] Matrix diagonal_nnz:533327 (9.52 %) >> [2] Matrix diagonal_nnz:2399999 (42.86 %) >> >> [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 >> %) >> [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 >> %) >> [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 >> %) >> ExecTime: 20min26sec >> >> As for the network, I ran the make streams NPMAX=3 again. I'm also >> sending it in this email. >> >> I too think that these bad results are caused by a combination of >> bad matrix structure, especially the "diagonal weight", and maybe >> network. >> >> I really should find a way to permute these matrices to a more >> friendly structure. >> >> Thank you very much for the help. >> Nelson >> >> Em 2015-08-22 22:49, Barry Smith escreveu: >>>> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva >>>> wrote: >>>> >>>> Hi. >>>> >>>> >>>> I managed to finish the re-implementation. I ran the program with >>>> 1,2,3,4,5,6 machines and saved the summary. I send each of them in >>>> this email. >>>> In these executions, the program performs Matrix-Vector (MatMult, >>>> MatMultAdd) products and Vector-Vector operations. From what I >>>> understand while reading the logs, the program takes most of the >>>> time in "VecScatterEnd". >>>> In this example, the matrix taking part on the Matrix-Vector >>>> products is not "much diagonal heavy". >>>> The following numbers are the percentages of nnz values on the >>>> matrix diagonal block for each machine, and each execution time. >>>> NMachines %NNZ ExecTime >>>> 1 machine0 100%; 16min08sec >>>> >>>> 2 machine0 91.1%; 24min58sec >>>> machine1 69.2%; >>>> >>>> 3 machine0 90.9% 25min42sec >>>> machine1 82.8% >>>> machine2 51.6% >>>> >>>> 4 machine0 91.9% 26min27sec >>>> machine1 82.4% >>>> machine2 73.1% >>>> machine3 39.9% >>>> >>>> 5 machine0 93.2% 39min23sec >>>> machine1 82.8% >>>> machine2 74.4% >>>> machine3 64.6% >>>> machine4 31.6% >>>> >>>> 6 machine0 94.2% 54min54sec >>>> machine1 82.6% >>>> machine2 73.1% >>>> machine3 65.2% >>>> machine4 55.9% >>>> machine5 25.4% >>> >>> Based on this I am guessing the last rows of the matrix have a >>> lot >>> of nonzeros away from the diagonal? 
>>> >>> There is a big load imbalance in something: for example with 2 >>> processes you have >>> >>> VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 >>> VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 >>> 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 >>> MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 >>> 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 >>> MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 >>> 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 >>> >>> the 5th column has the imbalance between slowest and fastest >>> process. It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, >>> to >>> get good speed ups these need to be much closer to 1. >>> >>> How many nonzeros in the matrix are there per process? Is it very >>> different for difference processes? You really need to have each >>> process have similar number of matrix nonzeros. Do you have a >>> picture of the nonzero structure of the matrix? Where does the >>> matrix >>> come from, why does it have this structure? >>> >>> Also likely there are just to many vector entries that need to be >>> scattered to the last process for the matmults. >>>> >>>> In this implementation I'm using MatCreate and VecCreate. I'm also >>>> leaving the partition sizes in PETSC_DECIDE. >>>> >>>> Finally, to run the application, I'm using mpirun.hydra from >>>> mpich, downloaded by PETSc configure script. >>>> I'm checking the process assignment as suggested on the last >>>> email. >>>> >>>> Am I missing anything? >>> >>> Your network is very poor; likely ethernet. It is had to get much >>> speedup with such slow reductions and sends and receives. >>> >>> Average time to get PetscTime(): 1.19209e-07 >>> Average time for MPI_Barrier(): 0.000215769 >>> Average time for zero size MPI_Send(): 5.94854e-05 >>> >>> I think you are seeing such bad results due to an unkind matrix >>> nonzero structure giving per load balance and too much >>> communication >>> and a very poor computer network that just makes all the needed >>> communication totally dominate. >>> >>> >>>> >>>> Regards, >>>> Nelson >>>> >>>> Em 2015-08-20 16:17, Matthew Knepley escreveu: >>>> >>>>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva >>>>> wrote: >>>>> Hello. >>>>> >>>>> I am sorry for the long time without response. I decided to >>>>> rewrite my application in a different way and will send the >>>>> log_summary output when done reimplementing. >>>>> >>>>> As for the machine, I am using mpirun to run jobs in a 8 node >>>>> cluster. I modified the makefile on the steams folder so it would >>>>> run using my hostfile. >>>>> The output is attached to this email. It seems reasonable for a >>>>> cluster with 8 machines. From "lscpu", each machine cpu has 4 cores >>>>> and 1 socket. >>>>> 1) You launcher is placing processes haphazardly. I would figure >>>>> out how to assign them to certain nodes >>>>> 2) Each node has enough bandwidth for 1 core, so it does not make >>>>> much sense to use more than 1. >>>>> Thanks, >>>>> Matt >>>>> >>>>> Cheers, >>>>> Nelson >>>>> >>>>> >>>>> Em 2015-07-24 16:50, Barry Smith escreveu: >>>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, >>>>> 16 >>>>> ... processes with the option -log_summary and send (as >>>>> attachments) >>>>> the log summary information. 
>>>>> >>>>> Also on the same machine run the streams benchmark; with recent >>>>> releases of PETSc you only need to do >>>>> >>>>> cd $PETSC_DIR >>>>> make streams NPMAX=16 (or whatever your largest process count is) >>>>> >>>>> and send the output. >>>>> >>>>> I suspect that you are doing everything fine and it is more an >>>>> issue >>>>> with the configuration of your machine. Also read the information >>>>> at >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >>>>> "binding" >>>>> >>>>> Barry >>>>> >>>>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva >>>>> wrote: >>>>> >>>>> Hello, >>>>> >>>>> I have been using PETSc for a few months now, and it truly is >>>>> fantastic piece of software. >>>>> >>>>> In my particular example I am working with a large, sparse >>>>> distributed (MPI AIJ) matrix we can refer as 'G'. >>>>> G is a horizontal - retangular matrix (for example, 1,1 Million >>>>> rows per 2,1 Million columns). This matrix is commonly very sparse >>>>> and not diagonal 'heavy' (for example 5,2 Million nnz in which ~50% >>>>> are on the diagonal block of MPI AIJ representation). >>>>> To work with this matrix, I also have a few parallel vectors >>>>> (created using MatCreate Vec), we can refer as 'm' and 'k'. >>>>> I am trying to parallelize an iterative algorithm in which the >>>>> most computational heavy operations are: >>>>> >>>>> ->Matrix-Vector Multiplication, more precisely G * m + k = b >>>>> (MatMultAdd). From what I have been reading, to achive a good >>>>> speedup in this operation, G should be as much diagonal as >>>>> possible, due to overlapping communication and computation. But >>>>> even when using a G matrix in which the diagonal block has ~95% of >>>>> the nnz, I cannot get a decent speedup. Most of the times, the >>>>> performance even gets worse. >>>>> >>>>> ->Matrix-Matrix Multiplication, in this case I need to perform G >>>>> * G' = A, where A is later used on the linear solver and G' is >>>>> transpose of G. The speedup in this operation is not worse, >>>>> although is not very good. >>>>> >>>>> ->Linear problem solving. Lastly, In this operation I compute >>>>> "Ax=b" from the last two operations. I tried to apply a RCM >>>>> permutation to A to make it more diagonal, for better performance. >>>>> However, the problem I faced was that, the permutation is performed >>>>> locally in each processor and thus, the final result is different >>>>> with different number of processors. I assume this was intended to >>>>> reduce communication. The solution I found was >>>>> 1-calculate A >>>>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>>>> 3-apply this permutation to the lines of G. >>>>> This works well, and A is generated as if RCM permuted. It is >>>>> fine to do this operation in one machine because it is only done >>>>> once while reading the input. The nnz of G become more spread and >>>>> less diagonal, causing problems when calculating G * m + k = b. >>>>> >>>>> These 3 operations (except the permutation) are performed in each >>>>> iteration of my algorithm. >>>>> >>>>> So, my questions are. >>>>> -What are the characteristics of G that lead to a good speedup in >>>>> the operations I described? Am I missing something and too much >>>>> obsessed with the diagonal block? >>>>> >>>>> -Is there a better way to permute A without permute G and still >>>>> get the same result using 1 or N machines? >>>>> >>>>> >>>>> I have been avoiding asking for help for a while. I'm very sorry >>>>> for the long email. 
>>>>> Thank you very much for your time. >>>>> Best Regards, >>>>> Nelson >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to >>>>> which their experiments lead. >>>>> -- Norbert Wiener >>>> >>>> >>>> >> >> -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log01P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log02P URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: Log03P URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix-after.png Type: application/octet-stream Size: 1818 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix-before.png Type: application/octet-stream Size: 2058 bytes Desc: not available URL: From knepley at gmail.com Mon Aug 24 09:28:06 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Aug 2015 09:28:06 -0500 Subject: [petsc-users] Scalability issue In-Reply-To: <0c09f050aa7e81b476b7f2245a60f184@mail.ist.utl.pt> References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> <0c09f050aa7e81b476b7f2245a60f184@mail.ist.utl.pt> Message-ID: On Mon, Aug 24, 2015 at 9:24 AM, Nelson Filipe Lopes da Silva < nelsonflsilva at ist.utl.pt> wrote: > Hello. Thank you very much for your time. > > I understood the idea, it works very well. > I also noticed that my algorithm performs a different number of iterations > with different number of machines. The stop conditions are calculated using > PETSc "matmultadd". I'm very positive that there may be a program bug in my > code, or could it be something with PETSc? > In parallel, a total order on summation is not guaranteed, and thus you will have jitter in the result. However, your iteration seems extremely sensitive to this (10s of iterations difference). Thus it seems that either your iterative tolerance is down around round error, which is usually oversolving, or you have an incredibly ill-conditioned system. Thanks, Matt > I also need to figure out why those vecmax ratio are so high. The vecset > is understandable as I'm distributing the initial information from the root > machine in sequencial. > > These are the new values: > 1 machine > [0] Matrix diagonal_nnz:16800000 (100.00 %) > [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) > ExecTime: 4min47sec > Iterations: 236 > > 2 machines > [0] Matrix diagonal_nnz:8000000 (95.24 %) > [1] Matrix diagonal_nnz:7600000 (90.48 %) > > [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) > [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) > ExecTime: 5min26sec > Iterations: 330 > > 3 machines > [0] Matrix diagonal_nnz:5333340 (95.24 %) > [1] Matrix diagonal_nnz:4800012 (85.71 %) > [2] Matrix diagonal_nnz:4533332 (80.95 %) > > [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) > [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) > [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %)) > ExecTime: 5min25sec > Iterations: 346 > > The suggested permutation worked very well in comparison with the original > matrix structure. 
The no-speedup may be related with the different number > of iterations. > > Once again, thank you very much for the time. > Cheers, > Nelson > > > Em 2015-08-23 20:19, Barry Smith escreveu: > >> A suggestion: take your second ordering and now interlace the second >> half of the rows with the first half of the rows (keeping the some >> column ordering) That is, order the rows 0, n/2, 1, n/2+1, 2, n/2+2 >> etc this will take the two separate "diagonal" bands and form a >> single "diagonal band". This will increase the "diagonal block >> weight" to be pretty high and the only scatters will need to be for >> the final rows of the input vector that all processes need to do their >> part of the multiply. Generate the image to make sure what I suggest >> make sense and then run this ordering with 1, 2, and 3 processes. Send >> the logs. >> >> Barry >> >> On Aug 23, 2015, at 10:12 AM, Nelson Filipe Lopes da Silva < >>> nelsonflsilva at ist.utl.pt> wrote: >>> >>> Thank you for the fast response! >>> >>> Yes. The last rows of the matrix are indeed more dense, compared with >>> the remaining ones. >>> For this example, concerning load balance between machines, the last >>> process had 46% of the matrix nonzero entries. A few weeks ago I suspected >>> of this problem and wrote a little function that could permute the matrix >>> rows based on their number of nonzeros. However, the matrix would become >>> less pleasant regarding "diagonal block weight", and I stop using it as i >>> thought I was becoming worse. >>> >>> Also, due to this problem, I thought I could have a complete vector copy >>> in each processor, instead of a distributed vector. I tried to implement >>> this idea, but had no luck with the results. However, even if this solution >>> would work, the communication for vector update was inevitable once each >>> iteration of my algorithm. >>> Since this is a rectangular matrix, I cannot apply RCM or such >>> permutations, however I can permute rows and columns though. >>> >>> More specifically, the problem I'm trying to solve is one of balance the >>> best guess and uncertainty estimates of a set of Input-Output subject to >>> linear constraints and ancillary informations. The matrix is called an >>> aggregation matrix, and each entry can be 1, 0 or -1. I don't know the >>> cause of its nonzero structure. I'm addressing this problem using a >>> weighted least-squares algorithm. >>> >>> I ran the code with a different, more friendly problem topology, logging >>> the load of nonzero entries and the "diagonal load" per processor. >>> I'm sending images of both matrices nonzero structure. The last email >>> example used matrix1, the example in this email uses matrix2. >>> Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns and >>> 5.171.901 nnz. >>> The matrix2 (this email example) is 800.000 rows x 8.800.000 columns and >>> 16.800.000 nnz. >>> >>> >>> With 1,2,3 machines, I have these distributions of nonzeros (using >>> matrix2). I'm sending the logs in this email. 
>>> 1 machine >>> [0] Matrix diagonal_nnz:16800000 (100.00 %) >>> [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) >>> ExecTime: 4min47sec >>> >>> 2 machines >>> [0] Matrix diagonal_nnz:4400000 (52.38 %) >>> [1] Matrix diagonal_nnz:4000000 (47.62 %) >>> >>> [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >>> [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >>> ExecTime: 13min23sec >>> >>> 3 machines >>> [0] Matrix diagonal_nnz:2933334 (52.38 %) >>> [1] Matrix diagonal_nnz:533327 (9.52 %) >>> [2] Matrix diagonal_nnz:2399999 (42.86 %) >>> >>> [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >>> [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >>> [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %) >>> ExecTime: 20min26sec >>> >>> As for the network, I ran the make streams NPMAX=3 again. I'm also >>> sending it in this email. >>> >>> I too think that these bad results are caused by a combination of bad >>> matrix structure, especially the "diagonal weight", and maybe network. >>> >>> I really should find a way to permute these matrices to a more friendly >>> structure. >>> >>> Thank you very much for the help. >>> Nelson >>> >>> Em 2015-08-22 22:49, Barry Smith escreveu: >>> >>>> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva < >>>>> nelsonflsilva at ist.utl.pt> wrote: >>>>> >>>>> Hi. >>>>> >>>>> >>>>> I managed to finish the re-implementation. I ran the program with >>>>> 1,2,3,4,5,6 machines and saved the summary. I send each of them in this >>>>> email. >>>>> In these executions, the program performs Matrix-Vector (MatMult, >>>>> MatMultAdd) products and Vector-Vector operations. From what I understand >>>>> while reading the logs, the program takes most of the time in >>>>> "VecScatterEnd". >>>>> In this example, the matrix taking part on the Matrix-Vector products >>>>> is not "much diagonal heavy". >>>>> The following numbers are the percentages of nnz values on the matrix >>>>> diagonal block for each machine, and each execution time. >>>>> NMachines %NNZ ExecTime >>>>> 1 machine0 100%; 16min08sec >>>>> >>>>> 2 machine0 91.1%; 24min58sec >>>>> machine1 69.2%; >>>>> >>>>> 3 machine0 90.9% 25min42sec >>>>> machine1 82.8% >>>>> machine2 51.6% >>>>> >>>>> 4 machine0 91.9% 26min27sec >>>>> machine1 82.4% >>>>> machine2 73.1% >>>>> machine3 39.9% >>>>> >>>>> 5 machine0 93.2% 39min23sec >>>>> machine1 82.8% >>>>> machine2 74.4% >>>>> machine3 64.6% >>>>> machine4 31.6% >>>>> >>>>> 6 machine0 94.2% 54min54sec >>>>> machine1 82.6% >>>>> machine2 73.1% >>>>> machine3 65.2% >>>>> machine4 55.9% >>>>> machine5 25.4% >>>>> >>>> >>>> Based on this I am guessing the last rows of the matrix have a lot >>>> of nonzeros away from the diagonal? >>>> >>>> There is a big load imbalance in something: for example with 2 >>>> processes you have >>>> >>>> VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 >>>> VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 >>>> 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 >>>> MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 >>>> 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 >>>> MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 >>>> 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 >>>> >>>> the 5th column has the imbalance between slowest and fastest >>>> process. 
It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, to >>>> get good speed ups these need to be much closer to 1. >>>> >>>> How many nonzeros in the matrix are there per process? Is it very >>>> different for difference processes? You really need to have each >>>> process have similar number of matrix nonzeros. Do you have a >>>> picture of the nonzero structure of the matrix? Where does the matrix >>>> come from, why does it have this structure? >>>> >>>> Also likely there are just to many vector entries that need to be >>>> scattered to the last process for the matmults. >>>> >>>>> >>>>> In this implementation I'm using MatCreate and VecCreate. I'm also >>>>> leaving the partition sizes in PETSC_DECIDE. >>>>> >>>>> Finally, to run the application, I'm using mpirun.hydra from mpich, >>>>> downloaded by PETSc configure script. >>>>> I'm checking the process assignment as suggested on the last email. >>>>> >>>>> Am I missing anything? >>>>> >>>> >>>> Your network is very poor; likely ethernet. It is had to get much >>>> speedup with such slow reductions and sends and receives. >>>> >>>> Average time to get PetscTime(): 1.19209e-07 >>>> Average time for MPI_Barrier(): 0.000215769 >>>> Average time for zero size MPI_Send(): 5.94854e-05 >>>> >>>> I think you are seeing such bad results due to an unkind matrix >>>> nonzero structure giving per load balance and too much communication >>>> and a very poor computer network that just makes all the needed >>>> communication totally dominate. >>>> >>>> >>>> >>>>> Regards, >>>>> Nelson >>>>> >>>>> Em 2015-08-20 16:17, Matthew Knepley escreveu: >>>>> >>>>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva < >>>>>> nelsonflsilva at ist.utl.pt> wrote: >>>>>> Hello. >>>>>> >>>>>> I am sorry for the long time without response. I decided to rewrite >>>>>> my application in a different way and will send the log_summary output when >>>>>> done reimplementing. >>>>>> >>>>>> As for the machine, I am using mpirun to run jobs in a 8 node >>>>>> cluster. I modified the makefile on the steams folder so it would run using >>>>>> my hostfile. >>>>>> The output is attached to this email. It seems reasonable for a >>>>>> cluster with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 >>>>>> socket. >>>>>> 1) You launcher is placing processes haphazardly. I would figure out >>>>>> how to assign them to certain nodes >>>>>> 2) Each node has enough bandwidth for 1 core, so it does not make >>>>>> much sense to use more than 1. >>>>>> Thanks, >>>>>> Matt >>>>>> >>>>>> Cheers, >>>>>> Nelson >>>>>> >>>>>> >>>>>> Em 2015-07-24 16:50, Barry Smith escreveu: >>>>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >>>>>> ... processes with the option -log_summary and send (as attachments) >>>>>> the log summary information. >>>>>> >>>>>> Also on the same machine run the streams benchmark; with recent >>>>>> releases of PETSc you only need to do >>>>>> >>>>>> cd $PETSC_DIR >>>>>> make streams NPMAX=16 (or whatever your largest process count is) >>>>>> >>>>>> and send the output. >>>>>> >>>>>> I suspect that you are doing everything fine and it is more an issue >>>>>> with the configuration of your machine. 
Also read the information at >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers on >>>>>> "binding" >>>>>> >>>>>> Barry >>>>>> >>>>>> On Jul 24, 2015, at 10:41 AM, Nelson Filipe Lopes da Silva < >>>>>> nelsonflsilva at ist.utl.pt> wrote: >>>>>> >>>>>> Hello, >>>>>> >>>>>> I have been using PETSc for a few months now, and it truly is >>>>>> fantastic piece of software. >>>>>> >>>>>> In my particular example I am working with a large, sparse >>>>>> distributed (MPI AIJ) matrix we can refer as 'G'. >>>>>> G is a horizontal - retangular matrix (for example, 1,1 Million rows >>>>>> per 2,1 Million columns). This matrix is commonly very sparse and not >>>>>> diagonal 'heavy' (for example 5,2 Million nnz in which ~50% are on the >>>>>> diagonal block of MPI AIJ representation). >>>>>> To work with this matrix, I also have a few parallel vectors (created >>>>>> using MatCreate Vec), we can refer as 'm' and 'k'. >>>>>> I am trying to parallelize an iterative algorithm in which the most >>>>>> computational heavy operations are: >>>>>> >>>>>> ->Matrix-Vector Multiplication, more precisely G * m + k = b >>>>>> (MatMultAdd). From what I have been reading, to achive a good speedup in >>>>>> this operation, G should be as much diagonal as possible, due to >>>>>> overlapping communication and computation. But even when using a G matrix >>>>>> in which the diagonal block has ~95% of the nnz, I cannot get a decent >>>>>> speedup. Most of the times, the performance even gets worse. >>>>>> >>>>>> ->Matrix-Matrix Multiplication, in this case I need to perform G * G' >>>>>> = A, where A is later used on the linear solver and G' is transpose of G. >>>>>> The speedup in this operation is not worse, although is not very good. >>>>>> >>>>>> ->Linear problem solving. Lastly, In this operation I compute "Ax=b" >>>>>> from the last two operations. I tried to apply a RCM permutation to A to >>>>>> make it more diagonal, for better performance. However, the problem I faced >>>>>> was that, the permutation is performed locally in each processor and thus, >>>>>> the final result is different with different number of processors. I assume >>>>>> this was intended to reduce communication. The solution I found was >>>>>> 1-calculate A >>>>>> 2-calculate, localy to 1 machine, the RCM permutation IS using A >>>>>> 3-apply this permutation to the lines of G. >>>>>> This works well, and A is generated as if RCM permuted. It is fine to >>>>>> do this operation in one machine because it is only done once while reading >>>>>> the input. The nnz of G become more spread and less diagonal, causing >>>>>> problems when calculating G * m + k = b. >>>>>> >>>>>> These 3 operations (except the permutation) are performed in each >>>>>> iteration of my algorithm. >>>>>> >>>>>> So, my questions are. >>>>>> -What are the characteristics of G that lead to a good speedup in the >>>>>> operations I described? Am I missing something and too much obsessed with >>>>>> the diagonal block? >>>>>> >>>>>> -Is there a better way to permute A without permute G and still get >>>>>> the same result using 1 or N machines? >>>>>> >>>>>> >>>>>> I have been avoiding asking for help for a while. I'm very sorry for >>>>>> the long email. >>>>>> Thank you very much for your time. >>>>>> Best Regards, >>>>>> Nelson >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their >>>>>> experiments is infinitely more interesting than any results to which their >>>>>> experiments lead. 
>>>>>> -- Norbert Wiener >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>> >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From nelsonflsilva at ist.utl.pt Mon Aug 24 11:08:48 2015 From: nelsonflsilva at ist.utl.pt (Nelson Filipe Lopes da Silva) Date: Mon, 24 Aug 2015 17:08:48 +0100 Subject: [petsc-users] Scalability issue In-Reply-To: References: <6f0a267caafdec17d9e34595c9528b7c@mail.ist.utl.pt> <11b2afc65a711f7e58bd7092b392c7cb@mail.ist.utl.pt> <69E8353F-2DD6-4BFE-BEA2-D6A049E9912A@mcs.anl.gov> <53E1D16B-1E0D-4F91-B802-E4934A797267@mcs.anl.gov> <0c09f050aa7e81b476b7f2245a60f184@mail.ist.utl.pt> Message-ID: <7562512c261da763c0c7b1807640c637@mail.ist.utl.pt> I understand. That was indeed the case. I have been experimenting with different values and thresholds. The program was indeed oversolving due to severe low threshold values. Now all executions run for the same number of iterations. The computational part of the program seems to be showing some speedup! The program was indeed suffering from the poor matrix structure and the problem was solved with the suggested permutation. I'll keep experimenting with different matrices to figure out the best permutations for each case. Thank you very much for your time! Best regards, Nelson Em 2015-08-24 15:28, Matthew Knepley escreveu: > On Mon, Aug 24, 2015 at 9:24 AM, Nelson Filipe Lopes da Silva wrote: > >> Hello. Thank you very much for your time. >> >> I understood the idea, it works very well. >> I also noticed that my algorithm performs a different number of iterations with different number of machines. The stop conditions are calculated using PETSc "matmultadd". I'm very positive that there may be a program bug in my code, or could it be something with PETSc? > > In parallel, a total order on summation is not guaranteed, and thus you will have jitter in the result. However, your > iteration seems extremely sensitive to this (10s of iterations difference). Thus it seems that either your iterative > tolerance is down around round error, which is usually oversolving, or you have an incredibly ill-conditioned system. > Thanks, > Matt > >> I also need to figure out why those vecmax ratio are so high. The vecset is understandable as I'm distributing the initial information from the root machine in sequencial. >> >> These are the new values: >> 1 machine >> [0] Matrix diagonal_nnz:16800000 (100.00 %) >> [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) >> ExecTime: 4min47sec >> Iterations: 236 >> >> 2 machines >> [0] Matrix diagonal_nnz:8000000 (95.24 %) >> [1] Matrix diagonal_nnz:7600000 (90.48 %) >> >> [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >> [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >> ExecTime: 5min26sec >> Iterations: 330 >> >> 3 machines >> [0] Matrix diagonal_nnz:5333340 (95.24 %) >> [1] Matrix diagonal_nnz:4800012 (85.71 %) >> [2] Matrix diagonal_nnz:4533332 (80.95 %) >> >> [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >> [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >> [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %)) >> ExecTime: 5min25sec >> Iterations: 346 >> >> The suggested permutation worked very well in comparison with the original matrix structure. 
The no-speedup may be related with the different number of iterations. >> >> Once again, thank you very much for the time. >> Cheers, >> Nelson >> >> Em 2015-08-23 20:19, Barry Smith escreveu: >> >>> A suggestion: take your second ordering and now interlace the second >>> half of the rows with the first half of the rows (keeping the some >>> column ordering) That is, order the rows 0, n/2, 1, n/2+1, 2, n/2+2 >>> etc this will take the two separate "diagonal" bands and form a >>> single "diagonal band". This will increase the "diagonal block >>> weight" to be pretty high and the only scatters will need to be for >>> the final rows of the input vector that all processes need to do their >>> part of the multiply. Generate the image to make sure what I suggest >>> make sense and then run this ordering with 1, 2, and 3 processes. Send >>> the logs. >>> >>> Barry >>> >>>> On Aug 23, 2015, at 10:12 AM, Nelson Filipe Lopes da Silva wrote: >>>> >>>> Thank you for the fast response! >>>> >>>> Yes. The last rows of the matrix are indeed more dense, compared with the remaining ones. >>>> For this example, concerning load balance between machines, the last process had 46% of the matrix nonzero entries. A few weeks ago I suspected of this problem and wrote a little function that could permute the matrix rows based on their number of nonzeros. However, the matrix would become less pleasant regarding "diagonal block weight", and I stop using it as i thought I was becoming worse. >>>> >>>> Also, due to this problem, I thought I could have a complete vector copy in each processor, instead of a distributed vector. I tried to implement this idea, but had no luck with the results. However, even if this solution would work, the communication for vector update was inevitable once each iteration of my algorithm. >>>> Since this is a rectangular matrix, I cannot apply RCM or such permutations, however I can permute rows and columns though. >>>> >>>> More specifically, the problem I'm trying to solve is one of balance the best guess and uncertainty estimates of a set of Input-Output subject to linear constraints and ancillary informations. The matrix is called an aggregation matrix, and each entry can be 1, 0 or -1. I don't know the cause of its nonzero structure. I'm addressing this problem using a weighted least-squares algorithm. >>>> >>>> I ran the code with a different, more friendly problem topology, logging the load of nonzero entries and the "diagonal load" per processor. >>>> I'm sending images of both matrices nonzero structure. The last email example used matrix1, the example in this email uses matrix2. >>>> Matrix1 (last email example) is 1.098.939 rows x 2.039.681 columns and 5.171.901 nnz. >>>> The matrix2 (this email example) is 800.000 rows x 8.800.000 columns and 16.800.000 nnz. >>>> >>>> With 1,2,3 machines, I have these distributions of nonzeros (using matrix2). I'm sending the logs in this email. 
>>>> 1 machine >>>> [0] Matrix diagonal_nnz:16800000 (100.00 %) >>>> [0] Matrix local nnz: 16800000 (100.00 %), local rows: 800000 (100.00 %) >>>> ExecTime: 4min47sec >>>> >>>> 2 machines >>>> [0] Matrix diagonal_nnz:4400000 (52.38 %) >>>> [1] Matrix diagonal_nnz:4000000 (47.62 %) >>>> >>>> [0] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >>>> [1] Matrix local nnz: 8400000 (50.00 %), local rows: 400000 (50.00 %) >>>> ExecTime: 13min23sec >>>> >>>> 3 machines >>>> [0] Matrix diagonal_nnz:2933334 (52.38 %) >>>> [1] Matrix diagonal_nnz:533327 (9.52 %) >>>> [2] Matrix diagonal_nnz:2399999 (42.86 %) >>>> >>>> [0] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >>>> [1] Matrix local nnz: 5600007 (33.33 %), local rows: 266667 (33.33 %) >>>> [2] Matrix local nnz: 5599986 (33.33 %), local rows: 266666 (33.33 %) >>>> ExecTime: 20min26sec >>>> >>>> As for the network, I ran the make streams NPMAX=3 again. I'm also sending it in this email. >>>> >>>> I too think that these bad results are caused by a combination of bad matrix structure, especially the "diagonal weight", and maybe network. >>>> >>>> I really should find a way to permute these matrices to a more friendly structure. >>>> >>>> Thank you very much for the help. >>>> Nelson >>>> >>>> Em 2015-08-22 22:49, Barry Smith escreveu: >>>> >>>>>> On Aug 22, 2015, at 4:17 PM, Nelson Filipe Lopes da Silva wrote: >>>>>> >>>>>> Hi. >>>>>> >>>>>> I managed to finish the re-implementation. I ran the program with 1,2,3,4,5,6 machines and saved the summary. I send each of them in this email. >>>>>> In these executions, the program performs Matrix-Vector (MatMult, MatMultAdd) products and Vector-Vector operations. From what I understand while reading the logs, the program takes most of the time in "VecScatterEnd". >>>>>> In this example, the matrix taking part on the Matrix-Vector products is not "much diagonal heavy". >>>>>> The following numbers are the percentages of nnz values on the matrix diagonal block for each machine, and each execution time. >>>>>> NMachines %NNZ ExecTime >>>>>> 1 machine0 100%; 16min08sec >>>>>> >>>>>> 2 machine0 91.1%; 24min58sec >>>>>> machine1 69.2%; >>>>>> >>>>>> 3 machine0 90.9% 25min42sec >>>>>> machine1 82.8% >>>>>> machine2 51.6% >>>>>> >>>>>> 4 machine0 91.9% 26min27sec >>>>>> machine1 82.4% >>>>>> machine2 73.1% >>>>>> machine3 39.9% >>>>>> >>>>>> 5 machine0 93.2% 39min23sec >>>>>> machine1 82.8% >>>>>> machine2 74.4% >>>>>> machine3 64.6% >>>>>> machine4 31.6% >>>>>> >>>>>> 6 machine0 94.2% 54min54sec >>>>>> machine1 82.6% >>>>>> machine2 73.1% >>>>>> machine3 65.2% >>>>>> machine4 55.9% >>>>>> machine5 25.4% >>>>> >>>>> Based on this I am guessing the last rows of the matrix have a lot >>>>> of nonzeros away from the diagonal? >>>>> >>>>> There is a big load imbalance in something: for example with 2 >>>>> processes you have >>>>> >>>>> VecMax 10509 1.0 2.0602e+02 4.2 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 1.1e+04 9 0 0 0 72 9 0 0 0 72 0 >>>>> VecScatterEnd 18128 1.0 8.9404e+02 1.3 0.00e+00 0.0 0.0e+00 >>>>> 0.0e+00 0.0e+00 53 0 0 0 0 53 0 0 0 0 0 >>>>> MatMult 10505 1.0 6.5591e+02 1.4 3.16e+10 1.4 2.1e+04 >>>>> 1.2e+06 0.0e+00 37 33 58 38 0 37 33 58 38 0 83 >>>>> MatMultAdd 7624 1.0 7.0028e+02 2.3 3.26e+10 2.1 1.5e+04 >>>>> 2.8e+06 0.0e+00 34 29 42 62 0 34 29 42 62 0 69 >>>>> >>>>> the 5th column has the imbalance between slowest and fastest >>>>> process. It is 4.2 for max, 1.4 for multi and 2.3 for matmultadd, to >>>>> get good speed ups these need to be much closer to 1. 
>>>>> >>>>> How many nonzeros in the matrix are there per process? Is it very >>>>> different for difference processes? You really need to have each >>>>> process have similar number of matrix nonzeros. Do you have a >>>>> picture of the nonzero structure of the matrix? Where does the matrix >>>>> come from, why does it have this structure? >>>>> >>>>> Also likely there are just to many vector entries that need to be >>>>> scattered to the last process for the matmults. >>>>> >>>>>> In this implementation I'm using MatCreate and VecCreate. I'm also leaving the partition sizes in PETSC_DECIDE. >>>>>> >>>>>> Finally, to run the application, I'm using mpirun.hydra from mpich, downloaded by PETSc configure script. >>>>>> I'm checking the process assignment as suggested on the last email. >>>>>> >>>>>> Am I missing anything? >>>>> >>>>> Your network is very poor; likely ethernet. It is had to get much >>>>> speedup with such slow reductions and sends and receives. >>>>> >>>>> Average time to get PetscTime(): 1.19209e-07 >>>>> Average time for MPI_Barrier(): 0.000215769 >>>>> Average time for zero size MPI_Send(): 5.94854e-05 >>>>> >>>>> I think you are seeing such bad results due to an unkind matrix >>>>> nonzero structure giving per load balance and too much communication >>>>> and a very poor computer network that just makes all the needed >>>>> communication totally dominate. >>>>> >>>>> Regards, >>>>> Nelson >>>>> >>>>> Em 2015-08-20 16:17, Matthew Knepley escreveu: >>>>> >>>>> On Thu, Aug 20, 2015 at 6:30 AM, Nelson Filipe Lopes da Silva wrote: >>>>> Hello. >>>>> >>>>> I am sorry for the long time without response. I decided to rewrite my application in a different way and will send the log_summary output when done reimplementing. >>>>> >>>>> As for the machine, I am using mpirun to run jobs in a 8 node cluster. I modified the makefile on the steams folder so it would run using my hostfile. >>>>> The output is attached to this email. It seems reasonable for a cluster with 8 machines. From "lscpu", each machine cpu has 4 cores and 1 socket. >>>>> 1) You launcher is placing processes haphazardly. I would figure out how to assign them to certain nodes >>>>> 2) Each node has enough bandwidth for 1 core, so it does not make much sense to use more than 1. >>>>> Thanks, >>>>> Matt >>>>> >>>>> Cheers, >>>>> Nelson >>>>> >>>>> Em 2015-07-24 16:50, Barry Smith escreveu: >>>>> It would be very helpful if you ran the code on say 1, 2, 4, 8, 16 >>>>> ... processes with the option -log_summary and send (as attachments) >>>>> the log summary information. >>>>> >>>>> Also on the same machine run the streams benchmark; with recent >>>>> releases of PETSc you only need to do >>>>> >>>>> cd $PETSC_DIR >>>>> make streams NPMAX=16 (or whatever your largest process count is) >>>>> >>>>> and send the output. >>>>> >>>>> I suspect that you are doing everything fine and it is more an issue >>>>> with the configuration of your machine. Also read the information at >>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#computers > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener Links: ------ [1] mailto:nelsonflsilva at ist.utl.pt [2] mailto:nelsonflsilva at ist.utl.pt [3] mailto:nelsonflsilva at ist.utl.pt [4] mailto:nelsonflsilva at ist.utl.pt [5] mailto:nelsonflsilva at ist.utl.pt -------------- next part -------------- An HTML attachment was scrubbed... 
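A minimal C sketch of the interlaced row ordering suggested above (0, n/2, 1, n/2+1, ...), kept here for reference. The routine name InterlaceRows and the use of MatPermute are assumptions, not code from this thread; the orientation of the index set (old-to-new versus new-to-old) should be checked against the MatPermute man page, and if MatPermute turns out not to be supported in parallel for the matrix type in use, the same index mapping can instead be applied when the matrix is first assembled.

#include <petscmat.h>

/* Sketch: permute the rows of a (possibly rectangular) parallel matrix A
   into the interlaced order 0, n/2, 1, n/2+1, ... while leaving the
   column ordering unchanged. */
PetscErrorCode InterlaceRows(Mat A, Mat *B)
{
  PetscErrorCode ierr;
  PetscInt       nrows, ncols, rstart, rend, cstart, cend, r, *idx;
  IS             rowperm, colperm;

  ierr = MatGetSize(A, &nrows, &ncols);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  ierr = MatGetOwnershipRangeColumn(A, &cstart, &cend);CHKERRQ(ierr);
  ierr = PetscMalloc1(rend - rstart, &idx);CHKERRQ(ierr);
  for (r = rstart; r < rend; r++) {
    /* rows of the first half go to 2r, rows of the second half go to
       2(r - n/2) + 1; verify whether MatPermute expects this map or its
       inverse before relying on it */
    idx[r - rstart] = (r < nrows/2) ? 2*r : 2*(r - nrows/2) + 1;
  }
  ierr = ISCreateGeneral(PetscObjectComm((PetscObject)A), rend - rstart, idx,
                         PETSC_OWN_POINTER, &rowperm);CHKERRQ(ierr);
  /* identity permutation on the columns */
  ierr = ISCreateStride(PetscObjectComm((PetscObject)A), cend - cstart, cstart, 1,
                        &colperm);CHKERRQ(ierr);
  ierr = MatPermute(A, rowperm, colperm, B);CHKERRQ(ierr);
  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  return 0;
}

Running the MatMult test on the permuted matrix with 1, 2 and 3 processes, as suggested, should then show directly whether the larger diagonal block weight reduces the VecScatter traffic.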
URL: From zonexo at gmail.com Mon Aug 24 21:01:47 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Tue, 25 Aug 2015 10:01:47 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Mon, Aug 24, 2015 at 6:21 PM, Matthew Knepley < knepley at gmail.com [knepley at gmail.com] > wrote: On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay < zonexo at gmail.com [zonexo at gmail.com] > wrote: Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes global vertex numbers. So MatSetValuesStencil() takes global vertex numbers. Do you mean the natural or petsc ordering? Which is a simpler/better option? MatSetValuesStencil() Is there an example in Fortran for MatSetValuesStencil? Timoth?e Nicolas shows one in his reply. Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? No. Thanks, Matt Thanks! -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Mon Aug 24 21:06:23 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Tue, 25 Aug 2015 10:06:23 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: Message-ID: <1440468386078-b9b15ac2-3f34e8e5-a4536505@gmail.com> Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Mon, Aug 24, 2015 at 5:54 PM, Timoth?e Nicolas < timothee.nicolas at gmail.com [timothee.nicolas at gmail.com] > wrote: Hi, ex5 of snes can give you an example of the two routines. The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version ex5f90.F uses MatSetValuesLocal. However, I use MatSetValuesStencil also in Fortran, there is no problem, and no need to mess around with DMDAGetAO, I think. To input values in the matrix, you need to do the following : ! Declare the matstencils for matrix columns and rows MatStencil :: row(4,1),col(4,n) ! Declare the quantity which will store the actual matrix elements PetscScalar :: v(8) The first dimension in row and col is 4 to allow for 3 spatial dimensions (even if you use only 2) plus one degree of freedom if you have several fields in your DMDA. The second dimension is 1 for row (you input one row at a time) and n for col, where n is the number of columns that you input. For instance, if at node (1,i,j) (1 is the index of the degree of freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set n=6 Then you define the row number by naturally doing the following, inside a local loop : row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 the -1 are here because FORTRAN indexing is different from the native C indexing. I put them on the right to make this more apparent. 
Then the column information. For instance to declare the coupling with node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will have to write (still within the same local loop on i and j) col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = whatever_it_is col(MatStencil_i,2) = i-1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = whatever_it_is col(MatStencil_i,3) = i -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 2 -1 v(3) = whatever_it_is ... ... .. ... ... ... Note that the index of the degree of freedom (or what field you are coupling to), is indicated by MatStencil_c Finally use MatSetValuesStencil ione = 1 isix = 6 call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) If it is not clear don't hesitate to ask more details. For me it worked that way, I succesfully computed a Jacobian that way. It is very sensitive. If you slightly depart from the right jacobian, you will see a huge difference compared to using matrix free with -snes_mf, so you can hardly make a mistake because you would see it. That's how I finally got it to work. Best Timothee Hi Timothee, Thanks for the help. So for boundary pts I will just leave blank for non existent locations? Also, can I use PETSc multigrid to solve this problem? This is a poisson eqn. 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay < zonexo at gmail.com [zonexo at gmail.com] > : Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? Which is a simpler/better option? Is there an example in Fortran for MatSetValuesStencil? Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Aug 24 21:11:29 2015 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Aug 2015 21:11:29 -0500 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> References: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> Message-ID: On Mon, Aug 24, 2015 at 9:01 PM, Wee Beng Tay wrote: > > > Sent using CloudMagic > > On Mon, Aug 24, 2015 at 6:21 PM, Matthew Knepley > wrote: > > On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay wrote: > >> Hi, >> >> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >> along 2 directions (y,z) >> >> Previously I was using MatSetValues with global indices. However, now I'm >> using DM and global indices is much more difficult. >> >> I come across MatSetValuesStencil or MatSetValuesLocal. >> >> So what's the difference bet the one since they both seem to work locally? >> > > No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes > global vertex numbers. > > > So MatSetValuesStencil() takes global vertex numbers. Do you mean the > natural or petsc ordering? > There is no PETSc ordering for vertices, only the natural ordering. Thanks, Matt > Which is a simpler/better option? >> > > MatSetValuesStencil() > >> Is there an example in Fortran for MatSetValuesStencil? >> > > Timoth?e Nicolas shows one in his reply. 
> > Do I also need to use DMDAGetAO together with MatSetValuesStencil or >> MatSetValuesLocal? >> > > No. > > Thanks, > > Matt > >> Thanks! >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Tue Aug 25 00:45:33 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 25 Aug 2015 14:45:33 +0900 Subject: [petsc-users] Function evaluation slowness ? Message-ID: Hi, I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. So I have some questions about this. 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. Best regards Timothee NICOLAS -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 25 00:56:41 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 00:56:41 -0500 Subject: [petsc-users] Function evaluation slowness ? In-Reply-To: References: Message-ID: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas wrote: > > Hi, > > I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). 
> > I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. > > Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) > > This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. > > So I have some questions about this. > > 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) { *nfuncs = snes->nfuncs; } PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) { ... snes->nfuncs++; } PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) { ..... if (snes->pc && snes->pcside == PC_LEFT) { ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); } else { ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); } } So, yes I would expect all the function evaluations needed for the matrix-free Jacobian matrix vector product to be counted. You can also look at the number of GMRES Krylov iterations it took (which should have one multiply per iteration) to double check that the numbers make sense. What does your -log_summary output look like? One thing that GMRES does is it introduces a global reduction with each multiple (hence a barrier across all your processes) on some systems this can be deadly. Barry > > 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? > > I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. > > Best regards > > Timothee NICOLAS From timothee.nicolas at gmail.com Tue Aug 25 01:21:24 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 25 Aug 2015 15:21:24 +0900 Subject: [petsc-users] Function evaluation slowness ? In-Reply-To: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> Message-ID: Here is the log summary (attached). At the beginning are personal prints, you can skip. I seem to have a memory crash in the present state after typically 45 iterations (that's why I used 40 here), the log summary indicates some creations without destruction of Petsc objects (I will fix this immediately), that may cause the memory crash, but I don't think it's the cause of the slow function evaluations. The log_summary is consistent with 0.7s per function evaluation (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately the same amount of time (is it normal ?). And the other long operation is VecScatterEnd. I assume it is the time used in process communications ? 
In which case I suppose it is normal that it takes a significant amount of time. So this ~10 times increase does not look normal right ? Best Timothee NICOLAS 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > Hi, > > > > I am testing PETSc on the supercomputer where I used to run my explicit > MHD code. For my tests I use 256 processes on a problem of size 128*128*640 > = 10485760, that is, 40960 grid points per process, and 8 degrees of > freedom (or physical fields). The explicit code was using Runge-Kutta 4 for > the time scheme, which means 4 function evaluation per time step (plus one > operation to put everything together, but let's forget this one). > > > > I could thus easily determine that the typical time required for a > function evaluation was of the order of 50 ms. > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the > present state where for now I have not implemented any Jacobian or > preconditioner whatsoever (so I run with -snes_mf), I measure a typical > time between two time steps of between 5 and 20 seconds, and the number of > function evaluations for each time step obtained with > SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of > course) > > > > This means a time per function evaluation of about 0.5 to 1 second, that > is, 10 to 20 times slower. > > > > So I have some questions about this. > > > > 1. First does SNESGetNumberFunctionEvals take into account the function > evaluations required to evaluate the Jacobian when -snes_mf is used, as > well as the operations required by the GMRES (Krylov) method ? If it were > the case, I would somehow intuitively expect a number larger than 17, which > could explain the increase in time. > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > { > *nfuncs = snes->nfuncs; > } > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > { > ... > snes->nfuncs++; > } > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > { > ..... > if (snes->pc && snes->pcside == PC_LEFT) { > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > } else { > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > } > } > > So, yes I would expect all the function evaluations needed for the > matrix-free Jacobian matrix vector product to be counted. You can also look > at the number of GMRES Krylov iterations it took (which should have one > multiply per iteration) to double check that the numbers make sense. > > What does your -log_summary output look like? One thing that GMRES does > is it introduces a global reduction with each multiple (hence a barrier > across all your processes) on some systems this can be deadly. > > Barry > > > > > > 2. In any case, I thought that all things considered, the function > evaluation would be the most time consuming part of a Newton-Krylov solver, > am I completely wrong about that ? Is the 10-20 factor legit ? > > > > I realize of course that preconditioning should make all this smoother, > in particular allowing larger time steps, but here I am just concerned > about the sheer Function evaluation time. > > > > Best regards > > > > Timothee NICOLAS > > -------------- next part -------------- An HTML attachment was scrubbed... 
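One way to pin down where the ~0.7 s per evaluation goes is to register a user log event around the residual routine and a separate log stage around SNESSolve, so that -log_summary reports them on their own lines. The sketch below is in C (the same logging routines are available from the Fortran interface); MyFormFunction, UserFEval and the stage name are made-up illustrative names, not code from this thread.

#include <petscsnes.h>

static PetscLogEvent USER_FEVAL;  /* hypothetical user event */

/* Wrap the body of the SNES residual routine in the event so its time,
   flop rate and load balance show up as a separate line in -log_summary. */
PetscErrorCode MyFormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  PetscErrorCode ierr;
  ierr = PetscLogEventBegin(USER_FEVAL, 0, 0, 0, 0);CHKERRQ(ierr);
  /* ... existing residual computation ... */
  ierr = PetscLogEventEnd(USER_FEVAL, 0, 0, 0, 0);CHKERRQ(ierr);
  return 0;
}

/* In the main program: register once after PetscInitialize(), and push a
   stage around each SNESSolve so that work done outside the solve
   (initial guess, monitors, diagnostics) is logged in a different stage.

     PetscLogStage solve_stage;
     ierr = PetscLogEventRegister("UserFEval", SNES_CLASSID, &USER_FEVAL);CHKERRQ(ierr);
     ierr = PetscLogStageRegister("SNESSolve only", &solve_stage);CHKERRQ(ierr);

     ierr = PetscLogStagePush(solve_stage);CHKERRQ(ierr);
     ierr = SNESSolve(snes, NULL, X);CHKERRQ(ierr);
     ierr = PetscLogStagePop();CHKERRQ(ierr);
*/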
URL: -------------- next part -------------- This is an implicit MHD code based on the MIPS code Setting all options Start: Reading HINT2 equilibrium file DATA: lr,lz,lphi= 128 128 640 lsymmetry= 1 r_min,r_max= 2.70000000000000 4.60000000000000 z_min,z_max= -0.950000000000000 0.950000000000000 phi_min,phi_max= 0.000000000000000E+000 6.28318530717959 dr,dz,dphi= 1.496062992125984E-002 1.496062992125984E-002 9.817477042468103E-003 pmax= 1.135504809500237E-002 bmax= 2.98086676166910 End: Reading HINT2 equilibrium file Creating nonlinear solver, getting geometrical info, and setting vectors Allocating arrays for which it is required Masks definition Major radius and vacuum definition Initializing PETSc Vecs with equilibrium values Set the initial force local vectors (used to enforce the equilibrium) Add a random perturbation to the velocity Entering the main MHD Loop Iteration number = 1 Time (tau_A) = 1.0000E-02 0 SNES Function norm 5.382589763410e-05 1 SNES Function norm 4.917642384947e-11 2 SNES Function norm 4.187109891461e-16 Kinetic Energy = 9.0028E-16 Magnetic Energy = 1.3544E-16 Total CPU time since PetscInitialize: 7.7581E+00 CPU time used for SNESSolve: 5.0903E-01 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 2 Time (tau_A) = 2.0000E-02 0 SNES Function norm 8.317717709972e-16 1 SNES Function norm 1.373112118906e-16 Kinetic Energy = 8.9656E-16 Magnetic Energy = 5.4177E-16 Total CPU time since PetscInitialize: 5.0395E+01 CPU time used for SNESSolve: 3.0864E+01 Number of linear iterations : 6 Number of function evaluations : 9 Too few linear iterations; Time step increased to dt = 1.3333E-02 Iteration number = 3 Time (tau_A) = 3.3333E-02 0 SNES Function norm 1.732338962433e-05 1 SNES Function norm 3.798031074240e-11 2 SNES Function norm 5.941618837381e-16 Kinetic Energy = 8.8707E-16 Magnetic Energy = 1.4783E-15 Total CPU time since PetscInitialize: 6.2962E+01 CPU time used for SNESSolve: 8.4275E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 4 Time (tau_A) = 4.6667E-02 0 SNES Function norm 8.650290226831e-07 1 SNES Function norm 3.159589761232e-12 2 SNES Function norm 5.559310443305e-16 Kinetic Energy = 8.7490E-16 Magnetic Energy = 2.8512E-15 Total CPU time since PetscInitialize: 7.5298E+01 CPU time used for SNESSolve: 7.2959E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 5 Time (tau_A) = 6.0000E-02 0 SNES Function norm 1.814549199007e-06 1 SNES Function norm 6.550542684026e-12 2 SNES Function norm 5.561417252951e-16 Kinetic Energy = 8.5968E-16 Magnetic Energy = 4.6147E-15 Total CPU time since PetscInitialize: 8.6579E+01 CPU time used for SNESSolve: 8.3290E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 6 Time (tau_A) = 7.3333E-02 0 SNES Function norm 5.143158573116e-07 1 SNES Function norm 1.854573359134e-12 2 SNES Function norm 5.555619916593e-16 Kinetic Energy = 8.4209E-16 Magnetic Energy = 6.7217E-15 Total CPU time since PetscInitialize: 9.8827E+01 CPU time used for SNESSolve: 9.6296E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 7 Time (tau_A) = 8.6667E-02 0 SNES Function norm 1.241769129665e-07 1 SNES Function norm 2.229921948896e-13 2 SNES Function norm 5.547572310422e-16 Kinetic Energy = 8.2295E-16 Magnetic Energy = 9.1167E-15 Total CPU time since PetscInitialize: 1.1428E+02 CPU time used for SNESSolve: 1.1251E+01 Number of linear iterations : 12 Number of 
function evaluations : 17 Iteration number = 8 Time (tau_A) = 1.0000E-01 0 SNES Function norm 4.127596671841e-08 1 SNES Function norm 9.722821486344e-14 2 SNES Function norm 5.547721956164e-16 Kinetic Energy = 8.0310E-16 Magnetic Energy = 1.1740E-14 Total CPU time since PetscInitialize: 1.2603E+02 CPU time used for SNESSolve: 3.1856E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 9 Time (tau_A) = 1.1333E-01 0 SNES Function norm 6.526056598290e-08 1 SNES Function norm 1.551059106556e-13 2 SNES Function norm 5.554664894406e-16 Kinetic Energy = 7.8340E-16 Magnetic Energy = 1.4531E-14 Total CPU time since PetscInitialize: 1.3916E+02 CPU time used for SNESSolve: 1.0360E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 10 Time (tau_A) = 1.2667E-01 0 SNES Function norm 7.216736548862e-08 1 SNES Function norm 1.709747035544e-13 2 SNES Function norm 5.551116565705e-16 Kinetic Energy = 7.6463E-16 Magnetic Energy = 1.7432E-14 Total CPU time since PetscInitialize: 1.5486E+02 CPU time used for SNESSolve: 1.4236E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 11 Time (tau_A) = 1.4000E-01 0 SNES Function norm 7.957568936931e-08 1 SNES Function norm 7.164583362638e-13 2 SNES Function norm 5.547681147827e-16 Kinetic Energy = 7.4749E-16 Magnetic Energy = 2.0389E-14 Total CPU time since PetscInitialize: 1.6355E+02 CPU time used for SNESSolve: 5.0159E+00 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 12 Time (tau_A) = 1.5333E-01 0 SNES Function norm 8.580881227555e-08 1 SNES Function norm 7.177441407594e-13 2 SNES Function norm 5.549122251796e-16 Kinetic Energy = 7.3253E-16 Magnetic Energy = 2.3356E-14 Total CPU time since PetscInitialize: 1.7703E+02 CPU time used for SNESSolve: 8.7550E+00 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 13 Time (tau_A) = 1.6667E-01 0 SNES Function norm 9.036148523559e-08 1 SNES Function norm 7.596864452710e-13 2 SNES Function norm 5.552150394126e-16 Kinetic Energy = 7.2017E-16 Magnetic Energy = 2.6296E-14 Total CPU time since PetscInitialize: 1.9497E+02 CPU time used for SNESSolve: 1.0766E+01 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 14 Time (tau_A) = 1.8000E-01 0 SNES Function norm 9.318522525551e-08 1 SNES Function norm 8.310283593293e-13 2 SNES Function norm 5.549000838284e-16 Kinetic Energy = 7.1065E-16 Magnetic Energy = 2.9180E-14 Total CPU time since PetscInitialize: 2.0960E+02 CPU time used for SNESSolve: 8.9322E+00 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 15 Time (tau_A) = 1.9333E-01 0 SNES Function norm 9.426477938676e-08 1 SNES Function norm 9.230910004645e-13 2 SNES Function norm 5.545358678033e-16 Kinetic Energy = 7.0402E-16 Magnetic Energy = 3.1992E-14 Total CPU time since PetscInitialize: 2.2222E+02 CPU time used for SNESSolve: 3.2721E+00 Number of linear iterations : 11 Number of function evaluations : 16 Iteration number = 16 Time (tau_A) = 2.0667E-01 0 SNES Function norm 9.362326845768e-08 1 SNES Function norm 1.296102793360e-13 2 SNES Function norm 5.547628931478e-16 Kinetic Energy = 7.0019E-16 Magnetic Energy = 3.4725E-14 Total CPU time since PetscInitialize: 2.4133E+02 CPU time used for SNESSolve: 4.5367E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 17 Time (tau_A) = 2.2000E-01 0 SNES Function norm 
9.133250113813e-08 1 SNES Function norm 1.149831225960e-13 2 SNES Function norm 5.548205596917e-16 Kinetic Energy = 6.9892E-16 Magnetic Energy = 3.7381E-14 Total CPU time since PetscInitialize: 2.5612E+02 CPU time used for SNESSolve: 7.0432E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 18 Time (tau_A) = 2.3333E-01 0 SNES Function norm 8.751516705490e-08 1 SNES Function norm 1.044612716823e-13 2 SNES Function norm 5.546955425427e-16 Kinetic Energy = 6.9986E-16 Magnetic Energy = 3.9970E-14 Total CPU time since PetscInitialize: 2.7746E+02 CPU time used for SNESSolve: 1.7779E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 19 Time (tau_A) = 2.4667E-01 0 SNES Function norm 8.234829805103e-08 1 SNES Function norm 9.821327679226e-14 2 SNES Function norm 5.548194234735e-16 Kinetic Energy = 7.0258E-16 Magnetic Energy = 4.2510E-14 Total CPU time since PetscInitialize: 2.8796E+02 CPU time used for SNESSolve: 7.1363E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 20 Time (tau_A) = 2.6000E-01 0 SNES Function norm 7.607020340067e-08 1 SNES Function norm 9.607454720992e-14 2 SNES Function norm 5.546700693492e-16 Kinetic Energy = 7.0658E-16 Magnetic Energy = 4.5021E-14 Total CPU time since PetscInitialize: 3.0141E+02 CPU time used for SNESSolve: 9.5897E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 21 Time (tau_A) = 2.7333E-01 0 SNES Function norm 6.899369214087e-08 1 SNES Function norm 9.762527001937e-14 2 SNES Function norm 5.546534365057e-16 Kinetic Energy = 7.1138E-16 Magnetic Energy = 4.7526E-14 Total CPU time since PetscInitialize: 3.1839E+02 CPU time used for SNESSolve: 1.2120E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 22 Time (tau_A) = 2.8667E-01 0 SNES Function norm 6.153013990112e-08 1 SNES Function norm 1.022809264903e-13 2 SNES Function norm 5.559191241468e-16 Kinetic Energy = 7.1650E-16 Magnetic Energy = 5.0049E-14 Total CPU time since PetscInitialize: 3.3729E+02 CPU time used for SNESSolve: 1.4010E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 23 Time (tau_A) = 3.0000E-01 0 SNES Function norm 5.422883432610e-08 1 SNES Function norm 1.093821278111e-13 2 SNES Function norm 5.545153520497e-16 Kinetic Energy = 7.2152E-16 Magnetic Energy = 5.2614E-14 Total CPU time since PetscInitialize: 3.4506E+02 CPU time used for SNESSolve: 3.1828E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 24 Time (tau_A) = 3.1333E-01 0 SNES Function norm 4.782401973928e-08 1 SNES Function norm 1.182852644926e-13 2 SNES Function norm 5.545048290555e-16 Kinetic Energy = 7.2610E-16 Magnetic Energy = 5.5240E-14 Total CPU time since PetscInitialize: 3.5751E+02 CPU time used for SNESSolve: 3.4274E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 25 Time (tau_A) = 3.2667E-01 0 SNES Function norm 4.322955973001e-08 1 SNES Function norm 1.283072300260e-13 2 SNES Function norm 5.545681547284e-16 Kinetic Energy = 7.2996E-16 Magnetic Energy = 5.7944E-14 Total CPU time since PetscInitialize: 3.7469E+02 CPU time used for SNESSolve: 2.5263E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 26 Time (tau_A) = 3.4000E-01 0 SNES Function norm 4.132314115916e-08 1 SNES Function norm 1.386969354039e-13 2 SNES Function norm 5.540520147662e-16 Kinetic 
Energy = 7.3297E-16 Magnetic Energy = 6.0737E-14 Total CPU time since PetscInitialize: 3.8264E+02 CPU time used for SNESSolve: 3.1395E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 27 Time (tau_A) = 3.5333E-01 0 SNES Function norm 4.245684761661e-08 1 SNES Function norm 1.485504132442e-13 2 SNES Function norm 5.545487475751e-16 Kinetic Energy = 7.3503E-16 Magnetic Energy = 6.3627E-14 Total CPU time since PetscInitialize: 3.9247E+02 CPU time used for SNESSolve: 5.9592E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 28 Time (tau_A) = 3.6667E-01 0 SNES Function norm 4.616289783185e-08 1 SNES Function norm 1.567803392197e-13 2 SNES Function norm 5.548329671559e-16 Kinetic Energy = 7.3618E-16 Magnetic Energy = 6.6616E-14 Total CPU time since PetscInitialize: 4.0173E+02 CPU time used for SNESSolve: 5.8294E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 29 Time (tau_A) = 3.8000E-01 0 SNES Function norm 5.149150954830e-08 1 SNES Function norm 1.622328092002e-13 2 SNES Function norm 5.545725405186e-16 Kinetic Energy = 7.3651E-16 Magnetic Energy = 6.9706E-14 Total CPU time since PetscInitialize: 4.1289E+02 CPU time used for SNESSolve: 7.4448E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 30 Time (tau_A) = 3.9333E-01 0 SNES Function norm 5.751421272236e-08 1 SNES Function norm 1.639703075993e-13 2 SNES Function norm 5.551145331563e-16 Kinetic Energy = 7.3617E-16 Magnetic Energy = 7.2890E-14 Total CPU time since PetscInitialize: 4.2473E+02 CPU time used for SNESSolve: 9.8094E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 31 Time (tau_A) = 4.0667E-01 0 SNES Function norm 6.353011943021e-08 1 SNES Function norm 1.616610594001e-13 2 SNES Function norm 5.542606202109e-16 Kinetic Energy = 7.3536E-16 Magnetic Energy = 7.6164E-14 Total CPU time since PetscInitialize: 4.3767E+02 CPU time used for SNESSolve: 9.9831E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 32 Time (tau_A) = 4.2000E-01 0 SNES Function norm 6.905745342631e-08 1 SNES Function norm 1.557783742592e-13 2 SNES Function norm 5.544338434073e-16 Kinetic Energy = 7.3427E-16 Magnetic Energy = 7.9520E-14 Total CPU time since PetscInitialize: 4.6580E+02 CPU time used for SNESSolve: 2.7991E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 33 Time (tau_A) = 4.3333E-01 0 SNES Function norm 7.377639093458e-08 1 SNES Function norm 1.474466471932e-13 2 SNES Function norm 5.543376070989e-16 Kinetic Energy = 7.3312E-16 Magnetic Energy = 8.2950E-14 Total CPU time since PetscInitialize: 4.8419E+02 CPU time used for SNESSolve: 1.4762E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 34 Time (tau_A) = 4.4667E-01 0 SNES Function norm 7.748319943328e-08 1 SNES Function norm 1.381088301393e-13 2 SNES Function norm 5.550610629314e-16 Kinetic Energy = 7.3209E-16 Magnetic Energy = 8.6449E-14 Total CPU time since PetscInitialize: 4.9802E+02 CPU time used for SNESSolve: 1.3800E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 35 Time (tau_A) = 4.6000E-01 0 SNES Function norm 8.006178457271e-08 1 SNES Function norm 1.290725563320e-13 2 SNES Function norm 5.542186770005e-16 Kinetic Energy = 7.3134E-16 Magnetic Energy = 9.0010E-14 Total CPU time since PetscInitialize: 5.2157E+02 CPU time 
used for SNESSolve: 1.5918E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 36 Time (tau_A) = 4.7333E-01 0 SNES Function norm 8.146728382443e-08 1 SNES Function norm 1.213819113579e-13 2 SNES Function norm 5.543903059087e-16 Kinetic Energy = 7.3097E-16 Magnetic Energy = 9.3631E-14 Total CPU time since PetscInitialize: 5.3299E+02 CPU time used for SNESSolve: 7.5341E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 37 Time (tau_A) = 4.8667E-01 0 SNES Function norm 8.171674223552e-08 1 SNES Function norm 1.157134150395e-13 2 SNES Function norm 5.542672888432e-16 Kinetic Energy = 7.3106E-16 Magnetic Energy = 9.7311E-14 Total CPU time since PetscInitialize: 5.5695E+02 CPU time used for SNESSolve: 1.8279E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 38 Time (tau_A) = 5.0000E-01 0 SNES Function norm 8.088378896911e-08 1 SNES Function norm 1.124615100310e-13 2 SNES Function norm 5.544771436497e-16 Kinetic Energy = 7.3164E-16 Magnetic Energy = 1.0105E-13 Total CPU time since PetscInitialize: 5.7506E+02 CPU time used for SNESSolve: 4.0146E+00 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 39 Time (tau_A) = 5.1333E-01 0 SNES Function norm 7.909554806664e-08 1 SNES Function norm 1.117368923673e-13 2 SNES Function norm 5.552869590808e-16 Kinetic Energy = 7.3268E-16 Magnetic Energy = 1.0485E-13 Total CPU time since PetscInitialize: 5.9949E+02 CPU time used for SNESSolve: 1.7531E+01 Number of linear iterations : 12 Number of function evaluations : 17 Iteration number = 40 Time (tau_A) = 5.2667E-01 0 SNES Function norm 7.653064797874e-08 1 SNES Function norm 1.134332671074e-13 2 SNES Function norm 5.551034395178e-16 Kinetic Energy = 7.3414E-16 Magnetic Energy = 1.0872E-13 Total CPU time since PetscInitialize: 6.2212E+02 CPU time used for SNESSolve: 1.4068E+01 Number of linear iterations : 12 Number of function evaluations : 17 Exiting the main MHD Loop Deallocating remaining arrays Destroying remaining Petsc elements ************************************************************************************************************************ *** WIDEN YOUR WINDOW TO 120 CHARACTERS. 
Use 'enscript -r -fCourier9' to print this document *** ************************************************************************************************************************ ---------------------------------------------- PETSc Performance Summary: ---------------------------------------------- ./a.out.local on a arch-linux2-cxx-opt named helios1497 with 256 processors, by tnicolas Tue Aug 25 14:55:05 2015 Using Petsc Release Version 3.6.0, Jun, 09, 2015 Max Max/Min Avg Total Time (sec): 6.518e+02 1.00002 6.518e+02 Objects: 1.666e+03 1.00000 1.666e+03 Flops: 4.582e+09 1.00000 4.582e+09 1.173e+12 Flops/sec: 7.029e+06 1.00002 7.029e+06 1.799e+09 MPI Messages: 8.081e+03 1.49454 6.746e+03 1.727e+06 MPI Message Lengths: 7.891e+08 3.82180 3.905e+04 6.744e+10 MPI Reductions: 3.368e+03 1.00000 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.5182e+02 100.0% 1.1729e+12 100.0% 1.727e+06 100.0% 3.905e+04 100.0% 3.367e+03 100.0% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 SNESJacobianEval 79 1.0 2.8245e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 VecView 2 1.0 3.2757e+01 6.9 0.00e+00 0.0 1.6e+04 1.9e+05 9.0e+00 3 0 1 5 0 3 0 1 5 0 0 VecDot 79 1.0 2.1557e-01 5.1 4.53e+07 1.0 0.0e+00 0.0e+00 7.9e+01 0 1 0 0 2 0 1 0 0 2 53797 VecMDot 468 1.0 2.1886e+00 2.4 9.31e+08 1.0 0.0e+00 0.0e+00 4.7e+02 0 20 0 0 14 0 20 0 0 14 108862 VecNorm 781 1.0 6.5641e-01 1.3 4.48e+08 1.0 0.0e+00 0.0e+00 7.8e+02 0 10 0 0 23 0 10 0 0 23 174663 VecScale 1202 1.0 5.7695e-01 2.1 3.45e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 8 0 0 0 0 8 0 0 0 152921 VecCopy 1092 1.0 1.3964e+00 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 827 1.0 1.2017e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 1661 1.0 2.1384e+00 1.3 9.52e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 21 0 0 0 0 21 0 0 0 114027 VecWAXPY 1365 1.0 2.9492e+00 1.1 5.48e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 12 0 0 0 0 12 0 0 0 47586 VecMAXPY 547 1.0 1.7110e+00 1.0 1.20e+09 1.0 0.0e+00 0.0e+00 0.0e+00 0 26 0 0 0 0 26 0 0 0 179407 VecAssemblyBegin 5 1.0 2.7040e-0213.4 0.00e+00 0.0 0.0e+00 0.0e+00 1.5e+01 0 0 0 0 0 0 0 0 0 0 0 VecAssemblyEnd 5 1.0 8.6308e-0511.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecPointwiseMult 80 1.0 1.4662e-01 1.2 2.29e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 40050 VecScatterBegin 1335 1.0 2.4852e+00 1.2 0.00e+00 0.0 1.7e+06 3.8e+04 0.0e+00 0 0 99 97 0 0 0 99 97 0 0 VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 VecReduceArith 158 1.0 7.9838e-02 1.1 9.06e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 290521 VecReduceComm 79 1.0 1.2613e-02 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 7.9e+01 0 0 0 0 2 0 0 0 0 2 0 VecNormalize 547 1.0 4.6664e-01 1.2 4.25e+08 1.0 0.0e+00 0.0e+00 4.7e+02 0 9 0 0 14 0 9 0 0 14 233267 MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 MatAssemblyBegin 79 1.0 1.8835e-0519.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatAssemblyEnd 79 1.0 2.2821e-03 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 2.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 468 1.0 3.4735e+00 1.6 1.86e+09 1.0 0.0e+00 0.0e+00 4.7e+02 0 41 0 0 14 0 41 0 0 14 137185 KSPSetUp 79 1.0 7.9496e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 79 1.0 1.4808e+01 1.0 3.69e+09 1.0 1.2e+06 3.8e+04 1.9e+03 2 81 69 67 58 2 81 69 67 58 
63857 PCSetUp 79 1.0 2.1935e-05 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 PCApply 547 1.0 6.8284e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage SNES 1 1 1332 0 SNESLineSearch 1 1 864 0 DMSNES 2 1 664 0 Vector 1630 290 654183152 0 Vector Scatter 3 1 1600 0 MatMFFD 1 1 784 0 Matrix 1 1 2304 0 Distributed Mesh 3 2 9456 0 Star Forest Bipartite Graph 6 4 3312 0 Discrete System 3 2 1696 0 Index Set 6 6 187248 0 IS L to G Mapping 2 1 948 0 Krylov Solver 1 1 18360 0 DMKSP interface 1 0 0 0 Preconditioner 1 1 880 0 Viewer 3 2 1536 0 PetscRandom 1 0 0 0 ======================================================================================================================== Average time to get PetscTime(): 0 Average time for MPI_Barrier(): 9.39369e-06 Average time for zero size MPI_Send(): 0.000582926 #PETSc Option Table entries: -dt 0.01 -log_summary -nts 40 -skip 500 -snes_mf -snes_monitor -time_limit 35950 #End of PETSc Option Table entries Compiled without FORTRAN kernels Compiled with full precision matrices (default) sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4 Configure options: --prefix=/csc/softs/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real --with-clanguage=cxx --with-debugging=0 --with-x=0 --with-cc=mpicc --with-fc=mpif90 --with-cxx=mpicxx --with-fortran --known-mpi-shared-libraries=1 --with-scalar-type=real --with-precision=double --CFLAGS="-g -O3 -mavx -mkl" --CXXFLAGS="-g -O3 -mavx -mkl" --FFLAGS="-g -O3 -mavx -mkl" ----------------------------------------- Libraries compiled on Mon Jun 22 11:05:39 2015 on helios85 Machine characteristics: Linux-2.6.32-504.16.2.el6.Bull.74.x86_64-x86_64-with-redhat-6.4-Santiago Using PETSc directory: /csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0 Using PETSc arch: arch-linux2-cxx-opt ----------------------------------------- Using C compiler: mpicxx -g -O3 -mavx -mkl -fPIC ${COPTFLAGS} ${CFLAGS} Using Fortran compiler: mpif90 -g -O3 -mavx -mkl -fPIC ${FOPTFLAGS} ${FFLAGS} ----------------------------------------- Using include paths: -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/include -I/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/include -I/opt/mpi/bullxmpi/1.2.8.2/include ----------------------------------------- Using C linker: mpicxx Using Fortran linker: mpif90 Using libraries: -Wl,-rpath,/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/lib -L/csc/releases/buildlog/anl/petsc-3.6.0/intel-15.0.0.090/bullxmpi-1.2.8.2/real/petsc-3.6.0/arch-linux2-cxx-opt/lib -lpetsc -lhwloc -lxml2 -lssl -lcrypto -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 
-L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -lmpi_f90 -lmpi_f77 -lm -lifport -lifcore -lm -lmpi_cxx -ldl -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -lmpi -lnuma -lrt -lnsl -lutil -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -limf -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lpthread -lirc_s -Wl,-rpath,/opt/mpi/bullxmpi/1.2.8.2/lib -L/opt/mpi/bullxmpi/1.2.8.2/lib -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -L/usr/lib/gcc/x86_64-redhat-linux/4.4.7 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/compiler/lib/intel64 -Wl,-rpath,/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -L/opt/intel/composer_xe_2015.0.090/mkl/lib/intel64 -ldl ----------------------------------------- From bsmith at mcs.anl.gov Tue Aug 25 01:47:37 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 01:47:37 -0500 Subject: [petsc-users] Function evaluation slowness ? 
In-Reply-To: References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> Message-ID: <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> The results are kind of funky, ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 look at the %T time for global SNES solve is 46 % of the total time, function evaluations are 45% but MatMult are only 2% (and yet matmult should contain most of the function evaluations). I cannot explain this. Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are there so many more scatters than function evaluations? What other operations are you doing that require scatters? It's almost like you have some mysterious "extra" function calls outside of the SNESSolve that are killing the performance? It might help to understand the performance to strip out all extraneous computations not needed (like in custom monitors etc). Barry > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas wrote: > > Here is the log summary (attached). At the beginning are personal prints, you can skip. I seem to have a memory crash in the present state after typically 45 iterations (that's why I used 40 here), the log summary indicates some creations without destruction of Petsc objects (I will fix this immediately), that may cause the memory crash, but I don't think it's the cause of the slow function evaluations. > > The log_summary is consistent with 0.7s per function evaluation (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately the same amount of time (is it normal ?). And the other long operation is VecScatterEnd. I assume it is the time used in process communications ? In which case I suppose it is normal that it takes a significant amount of time. > > So this ~10 times increase does not look normal right ? > > Best > > Timothee NICOLAS > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas wrote: > > > > Hi, > > > > I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). > > > > I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. 
> > > > Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) > > > > This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. > > > > So I have some questions about this. > > > > 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > { > *nfuncs = snes->nfuncs; > } > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > { > ... > snes->nfuncs++; > } > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > { > ..... > if (snes->pc && snes->pcside == PC_LEFT) { > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > } else { > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > } > } > > So, yes I would expect all the function evaluations needed for the matrix-free Jacobian matrix vector product to be counted. You can also look at the number of GMRES Krylov iterations it took (which should have one multiply per iteration) to double check that the numbers make sense. > > What does your -log_summary output look like? One thing that GMRES does is it introduces a global reduction with each multiple (hence a barrier across all your processes) on some systems this can be deadly. > > Barry > > > > > > 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? > > > > I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. > > > > Best regards > > > > Timothee NICOLAS > > > From timothee.nicolas at gmail.com Tue Aug 25 02:06:54 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Tue, 25 Aug 2015 16:06:54 +0900 Subject: [petsc-users] Function evaluation slowness ? In-Reply-To: <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> Message-ID: OK, I see, Might it be that I do something a bit funky to obtain a good guess for solve ? I had he following idea, which I used with great success on a very different problem (much simpler, maybe that's why it worked) : obtain the initial guess as a cubic extrapolation of the preceding solutions. The idea is that I expect my solution to be reasonably smooth over time, so considering this, the increment of the fields should also be continuous (I solve for the increments, not the fields themselves). Therefore, I store in my user context the current vector Xk as well as the last three solutions Xkm1 and Xkm2. 
I define dxm2 = Xkm1 - Xkm2 dxm1 = Xk - Xkm1 And I use the result of the last SNESSolve as dx = Xkp1 - Xk Then I set the new dx initial guess as the pointwise cubic extrapolation of (dxm2,dxm1,dx) However it seems pretty local and I don't see why scatters would be required for this. I printed the routine I use to do this below. In any case I will clean up a bit, remove the extra stuff (not much there however). If it is not sufficient, I will transform my form function in a dummy which does not require computations and see what happens. Timothee PetscErrorCode :: ierr PetscScalar :: M(3,3) Vec :: xkm2,xkm1 Vec :: coef1,coef2,coef3 PetscScalar :: a,b,c,t,det a = user%tkm1 b = user%tk c = user%t t = user%t+user%dt det = b*a**2 + c*b**2 + a*c**2 - (c*a**2 + a*b**2 + b*c**2) M(1,1) = (b-c)/det M(2,1) = (c**2-b**2)/det M(3,1) = (c*b**2-b*c**2)/det M(1,2) = (c-a)/det M(2,2) = (a**2-c**2)/det M(3,2) = (a*c**2-c*a**2)/det M(1,3) = (a-b)/det M(2,3) = (b**2-a**2)/det M(3,3) = (b*a**2-a*b**2)/det call VecDuplicate(x,xkm1,ierr) call VecDuplicate(x,xkm2,ierr) call VecDuplicate(x,coef1,ierr) call VecDuplicate(x,coef2,ierr) call VecDuplicate(x,coef3,ierr) call VecWAXPY(xkm2,-one,user%Xkm2,user%Xkm1,ierr) call VecWAXPY(xkm1,-one,user%Xkm1,user%Xk,ierr) ! The following lines correspond to the following simple operation ! coef1 = M(1,1)*alpha + M(1,2)*beta + M(1,3)*gamma ! coef2 = M(2,1)*alpha + M(2,2)*beta + M(2,3)*gamma ! coef3 = M(3,1)*alpha + M(3,2)*beta + M(3,3)*gamma call VecCopy(xkm2,coef1,ierr) call VecScale(coef1,M(1,1),ierr) call VecAXPY(coef1,M(1,2),xkm1,ierr) call VecAXPY(coef1,M(1,3),x,ierr) call VecCopy(xkm2,coef2,ierr) call VecScale(coef2,M(2,1),ierr) call VecAXPY(coef2,M(2,2),xkm1,ierr) call VecAXPY(coef2,M(2,3),x,ierr) call VecCopy(xkm2,coef3,ierr) call VecScale(coef3,M(3,1),ierr) call VecAXPY(coef3,M(3,2),xkm1,ierr) call VecAXPY(coef3,M(3,3),x,ierr) call VecCopy(coef3,x,ierr) call VecAXPY(x,t,coef2,ierr) call VecAXPY(x,t**2,coef1,ierr) call VecDestroy(xkm2,ierr) call VecDestroy(xkm1,ierr) call VecDestroy(coef1,ierr) call VecDestroy(coef2,ierr) call VecDestroy(coef3,ierr) 2015-08-25 15:47 GMT+09:00 Barry Smith : > > The results are kind of funky, > > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 > 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 > SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 > 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 > SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 > 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 > VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 > MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 > 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 > MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 > 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 > > look at the %T time for global SNES solve is 46 % of the total time, > function evaluations are 45% but MatMult are only 2% (and yet matmult > should contain most of the function evaluations). I cannot explain this. 
> Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are > there so many more scatters than function evaluations? What other > operations are you doing that require scatters? > > It's almost like you have some mysterious "extra" function calls outside > of the SNESSolve that are killing the performance? It might help to > understand the performance to strip out all extraneous computations not > needed (like in custom monitors etc). > > Barry > > > > > > > > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > Here is the log summary (attached). At the beginning are personal > prints, you can skip. I seem to have a memory crash in the present state > after typically 45 iterations (that's why I used 40 here), the log summary > indicates some creations without destruction of Petsc objects (I will fix > this immediately), that may cause the memory crash, but I don't think it's > the cause of the slow function evaluations. > > > > The log_summary is consistent with 0.7s per function evaluation > (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately > the same amount of time (is it normal ?). And the other long operation is > VecScatterEnd. I assume it is the time used in process communications ? In > which case I suppose it is normal that it takes a significant amount of > time. > > > > So this ~10 times increase does not look normal right ? > > > > Best > > > > Timothee NICOLAS > > > > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > > > Hi, > > > > > > I am testing PETSc on the supercomputer where I used to run my > explicit MHD code. For my tests I use 256 processes on a problem of size > 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 > degrees of freedom (or physical fields). The explicit code was using > Runge-Kutta 4 for the time scheme, which means 4 function evaluation per > time step (plus one operation to put everything together, but let's forget > this one). > > > > > > I could thus easily determine that the typical time required for a > function evaluation was of the order of 50 ms. > > > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the > present state where for now I have not implemented any Jacobian or > preconditioner whatsoever (so I run with -snes_mf), I measure a typical > time between two time steps of between 5 and 20 seconds, and the number of > function evaluations for each time step obtained with > SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of > course) > > > > > > This means a time per function evaluation of about 0.5 to 1 second, > that is, 10 to 20 times slower. > > > > > > So I have some questions about this. > > > > > > 1. First does SNESGetNumberFunctionEvals take into account the > function evaluations required to evaluate the Jacobian when -snes_mf is > used, as well as the operations required by the GMRES (Krylov) method ? If > it were the case, I would somehow intuitively expect a number larger than > 17, which could explain the increase in time. > > > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > > { > > *nfuncs = snes->nfuncs; > > } > > > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > > { > > ... > > snes->nfuncs++; > > } > > > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > > { > > ..... 
> > if (snes->pc && snes->pcside == PC_LEFT) { > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > > } else { > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > > } > > } > > > > So, yes I would expect all the function evaluations needed for the > matrix-free Jacobian matrix vector product to be counted. You can also look > at the number of GMRES Krylov iterations it took (which should have one > multiply per iteration) to double check that the numbers make sense. > > > > What does your -log_summary output look like? One thing that GMRES > does is it introduces a global reduction with each multiple (hence a > barrier across all your processes) on some systems this can be deadly. > > > > Barry > > > > > > > > > > 2. In any case, I thought that all things considered, the function > evaluation would be the most time consuming part of a Newton-Krylov solver, > am I completely wrong about that ? Is the 10-20 factor legit ? > > > > > > I realize of course that preconditioning should make all this > smoother, in particular allowing larger time steps, but here I am just > concerned about the sheer Function evaluation time. > > > > > > Best regards > > > > > > Timothee NICOLAS > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Tue Aug 25 09:06:25 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 25 Aug 2015 10:06:25 -0400 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> Message-ID: <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> Regarding the MUMPS issue, I?m not sure if this is useful, but when I run with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs at this point: ... Structural symmetry (in percent)= 75 Density: NBdense, Average, Median = 2 9 7 Ordering based on METIS -gideon > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: > > >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson wrote: >> >> I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: >> >> 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. > > You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. >> >> 2. When running with SuperLU dist, I got the following error, with no further information: > > The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. 
We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes > > Barry > > >> >> [3]PETSC ERROR: ------------------------------------------------------------------------ >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [3]PETSC ERROR: likely location of problem given in stack below >> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [3]PETSC ERROR: INSTEAD the line number of the start of the function >> [3]PETSC ERROR: is given. >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [3]PETSC ERROR: Signal received >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >> with errorcode 59. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. 
>> -------------------------------------------------------------------------- >> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >> [6]PETSC ERROR: ------------------------------------------------------------------------ >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [6]PETSC ERROR: likely location of problem given in stack below >> [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [6]PETSC ERROR: INSTEAD the line number of the start of the function >> [6]PETSC ERROR: is given. >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [6]PETSC ERROR: Signal received >> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [7]PETSC ERROR: ------------------------------------------------------------------------ >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [7]PETSC ERROR: likely location of problem given in stack below >> [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [7]PETSC ERROR: INSTEAD the line number of the start of the function >> [7]PETSC ERROR: is given. 
>> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [7]PETSC ERROR: Signal received >> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [0]PETSC ERROR: ------------------------------------------------------------------------ >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [0]PETSC ERROR: likely location of problem given in stack below >> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [0]PETSC ERROR: INSTEAD the line number of the start of the function >> [0]PETSC ERROR: is given. >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Signal received >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [1]PETSC ERROR: ------------------------------------------------------------------------ >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [1]PETSC ERROR: likely location of problem given in stack below >> [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [1]PETSC ERROR: INSTEAD the line number of the start of the function >> [1]PETSC ERROR: is given. >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Signal received >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [2]PETSC ERROR: ------------------------------------------------------------------------ >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [2]PETSC ERROR: likely location of problem given in stack below >> [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [2]PETSC ERROR: INSTEAD the line number of the start of the function >> [2]PETSC ERROR: is given. >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [2]PETSC ERROR: Signal received >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [4]PETSC ERROR: ------------------------------------------------------------------------ >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [4]PETSC ERROR: likely location of problem given in stack below >> [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [4]PETSC ERROR: INSTEAD the line number of the start of the function >> [4]PETSC ERROR: is given. >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [4]PETSC ERROR: Signal received >> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file >> [5]PETSC ERROR: ------------------------------------------------------------------------ >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end >> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> [5]PETSC ERROR: likely location of problem given in stack below >> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> [5]PETSC ERROR: INSTEAD the line number of the start of the function >> [5]PETSC ERROR: is given. >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [5]PETSC ERROR: Signal received >> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> -gideon >> >> > From zonexo at gmail.com Tue Aug 25 10:19:13 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Tue, 25 Aug 2015 23:19:13 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> Message-ID: <1440515955447-95e4864a-05f6749c-5e9fdfbc@gmail.com> Hi, So can I use multigrid directly if using matsetvalues stencil? 
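(For reference, since a Fortran example of MatSetValuesStencil was asked about in this thread: a minimal sketch filling one row of a 7-point stencil on a 3D DMDA. All indices, values and variable names are illustrative, not taken from anyone's code.)

   Mat            :: J
   MatStencil     :: row(4,1), col(4,7)
   PetscScalar    :: v(7)
   PetscInt       :: i, j, k, c, n, ione, iseven
   PetscInt       :: di(6), dj(6), dk(6)
   PetscErrorCode :: ierr

   di = (/ -1, 1,  0, 0,  0, 0 /)
   dj = (/  0, 0, -1, 1,  0, 0 /)
   dk = (/  0, 0,  0, 0, -1, 1 /)
   ione   = 1
   iseven = 7

   ! i, j, k, c are assumed to come from the usual DMDAGetCorners loop,
   ! i.e. the DMDA's global (0-based) grid numbering
   row(MatStencil_i,1) = i
   row(MatStencil_j,1) = j
   row(MatStencil_k,1) = k
   row(MatStencil_c,1) = c

   ! first column is the diagonal, then the six neighbours
   col(:,1) = row(:,1)
   v(1)     = 6.0
   do n = 1, 6
      col(MatStencil_i,1+n) = i + di(n)
      col(MatStencil_j,1+n) = j + dj(n)
      col(MatStencil_k,1+n) = k + dk(n)
      col(MatStencil_c,1+n) = c
      v(1+n) = -1.0
   end do

   call MatSetValuesStencil(J,ione,row,iseven,col,v,INSERT_VALUES,ierr)

Because the locations are given as (i,j,k,c) grid indices rather than global matrix rows, the same assembly code works on whatever DMDA grid is handed to it, which is what makes this route convenient when multigrid creates coarser levels.
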
Thanks Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Tue, Aug 25, 2015 at 10:11 AM, Matthew Knepley < knepley at gmail.com [knepley at gmail.com] > wrote: On Mon, Aug 24, 2015 at 9:01 PM, Wee Beng Tay < zonexo at gmail.com [zonexo at gmail.com] > wrote: Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Mon, Aug 24, 2015 at 6:21 PM, Matthew Knepley < knepley at gmail.com [knepley at gmail.com] > wrote: On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay < zonexo at gmail.com [zonexo at gmail.com] > wrote: Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes global vertex numbers. So MatSetValuesStencil() takes global vertex numbers. Do you mean the natural or petsc ordering? There is no PETSc ordering for vertices, only the natural ordering. Thanks, Matt Which is a simpler/better option? MatSetValuesStencil() Is there an example in Fortran for MatSetValuesStencil? Timoth?e Nicolas shows one in his reply. Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? No. Thanks, Matt Thanks! -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Aug 25 10:31:09 2015 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 25 Aug 2015 10:31:09 -0500 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: <1440515955447-95e4864a-05f6749c-5e9fdfbc@gmail.com> References: <1440468109831-51ceb105-9c281f1a-bc754585@gmail.com> <1440515955447-95e4864a-05f6749c-5e9fdfbc@gmail.com> Message-ID: On Tue, Aug 25, 2015 at 10:19 AM, Wee Beng Tay wrote: > Hi, > > So can I use multigrid directly if using matsetvalues stencil? > Do you mean, if you use MatSetStencil() then you statements will work no matter what grid comes in to your residual function? That is true. Thanks, Matt > Thanks > > > Sent using CloudMagic > > On Tue, Aug 25, 2015 at 10:11 AM, Matthew Knepley > wrote: > > On Mon, Aug 24, 2015 at 9:01 PM, Wee Beng Tay wrote: > >> >> >> Sent using CloudMagic >> >> On Mon, Aug 24, 2015 at 6:21 PM, Matthew Knepley >> wrote: >> >> On Mon, Aug 24, 2015 at 4:09 AM, Wee-Beng Tay wrote: >> >>> Hi, >>> >>> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >>> along 2 directions (y,z) >>> >>> Previously I was using MatSetValues with global indices. However, now >>> I'm using DM and global indices is much more difficult. >>> >>> I come across MatSetValuesStencil or MatSetValuesLocal. >>> >>> So what's the difference bet the one since they both seem to work >>> locally? >>> >> >> No. MatSetValuesLocal() takes local indices. MatSetValuesStencil() takes >> global vertex numbers. >> >> >> So MatSetValuesStencil() takes global vertex numbers. 
Do you mean the >> natural or petsc ordering? >> > > There is no PETSc ordering for vertices, only the natural ordering. > > Thanks, > > Matt > >> Which is a simpler/better option? >>> >> >> MatSetValuesStencil() >> >>> Is there an example in Fortran for MatSetValuesStencil? >>> >> >> Timoth?e Nicolas shows one in his reply. >> >> Do I also need to use DMDAGetAO together with MatSetValuesStencil or >>> MatSetValuesLocal? >>> >> >> No. >> >> Thanks, >> >> Matt >> >>> Thanks! >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Tue Aug 25 11:24:41 2015 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 25 Aug 2015 11:24:41 -0500 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> Message-ID: Gideon: -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) This is for algorithmic diagnosis, not for regular runs. Use default '0' for it. Hong On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson wrote: > Regarding the MUMPS issue, I?m not sure if this is useful, but when I run > with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs > at this point: > > > ... Structural symmetry (in percent)= 75 > Density: NBdense, Average, Median = 2 9 7 > Ordering based on METIS > > -gideon > > > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: > > > > > >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson > wrote: > >> > >> I?m having issues with both SuperLU dist and MUMPS, as compiled by > PETsc, in the following sense: > >> > >> 1. For large enough systems, which seems to vary depending on which > computer I?m on, MUMPS seems to just die and never start, when it?s used as > the linear solver within a SNES. There?s no error message, it just sits > there and doesn?t do anything. > > > > You will need to use a debugger to figure out where it is "hanging"; we > haven't heard reports about this. > >> > >> 2. When running with SuperLU dist, I got the following error, with no > further information: > > > > The last release of SuperLU_DIST had some pretty nasty bugs, memory > corruption that caused crashes etc. 
We think they are now fixed if you use > the maint branch of the PETSc repository and --download-superlu_dist If > you stick with the PETSc release and SuperLU_Dist you are using you will > keep seeing these crashes > > > > Barry > > > > > >> > >> [3]PETSC ERROR: > ------------------------------------------------------------------------ > >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > >> [3]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [3]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [3]PETSC ERROR: likely location of problem given in stack below > >> [3]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [3]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [3]PETSC ERROR: is given. > >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [3]PETSC ERROR: [3] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [3]PETSC ERROR: [3] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [3]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [3]PETSC ERROR: Signal received > >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [3]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> > -------------------------------------------------------------------------- > >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD > >> with errorcode 59. > >> > >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > >> You may or may not see output from other processes, depending on > >> exactly when Open MPI kills them. 
> >> > -------------------------------------------------------------------------- > >> [proteusi01:14037] 1 more process has sent help message > help-mpi-api.txt / mpi-abort > >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to > see all help / error messages > >> [6]PETSC ERROR: > ------------------------------------------------------------------------ > >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [6]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [6]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [6]PETSC ERROR: likely location of problem given in stack below > >> [6]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [6]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [6]PETSC ERROR: is given. > >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [6]PETSC ERROR: [6] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [6]PETSC ERROR: [6] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [6]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [6]PETSC ERROR: Signal received > >> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [6]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [7]PETSC ERROR: > ------------------------------------------------------------------------ > >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [7]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [7]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [7]PETSC ERROR: likely location of problem given in stack below > >> [7]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [7]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [7]PETSC ERROR: is given. 
> >> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [7]PETSC ERROR: [7] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [7]PETSC ERROR: [7] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [7]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [7]PETSC ERROR: Signal received > >> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [7]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [0]PETSC ERROR: > ------------------------------------------------------------------------ > >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [0]PETSC ERROR: likely location of problem given in stack below > >> [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [0]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [0]PETSC ERROR: is given. > >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [0]PETSC ERROR: [0] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [0]PETSC ERROR: [0] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [0]PETSC ERROR: Signal received > >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> >> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [0]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [1]PETSC ERROR: > ------------------------------------------------------------------------ > >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [1]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [1]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [1]PETSC ERROR: likely location of problem given in stack below > >> [1]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [1]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [1]PETSC ERROR: is given. > >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [1]PETSC ERROR: [1] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [1]PETSC ERROR: [1] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [1]PETSC ERROR: Signal received > >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> >> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [1]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [2]PETSC ERROR: > ------------------------------------------------------------------------ > >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [2]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [2]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [2]PETSC ERROR: likely location of problem given in stack below > >> [2]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [2]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [2]PETSC ERROR: is given. > >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [2]PETSC ERROR: [2] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [2]PETSC ERROR: [2] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [2]PETSC ERROR: Signal received > >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> >> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [2]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [4]PETSC ERROR: > ------------------------------------------------------------------------ > >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [4]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [4]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [4]PETSC ERROR: likely location of problem given in stack below > >> [4]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [4]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [4]PETSC ERROR: is given. > >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [4]PETSC ERROR: [4] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [4]PETSC ERROR: [4] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [4]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [4]PETSC ERROR: Signal received > >> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> >> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [4]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [5]PETSC ERROR: > ------------------------------------------------------------------------ > >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the > batch system) has told this process to end > >> [5]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > >> [5]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > >> [5]PETSC ERROR: likely location of problem given in stack below > >> [5]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not > available, > >> [5]PETSC ERROR: INSTEAD the line number of the start of the > function > >> [5]PETSC ERROR: is given. > >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 > /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [5]PETSC ERROR: [5] MatSolve line 3104 > /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [5]PETSC ERROR: [5] PCApply_LU line 194 > /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 > /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [5]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >> [5]PETSC ERROR: Signal received > >> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named > proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [5]PETSC ERROR: Configure options > --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed > --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a > --with-lapack-lib=/liblapack.a --download-suitesparse=yes > --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes > --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> > >> -gideon > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gideon.simpson at gmail.com Tue Aug 25 12:06:53 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 25 Aug 2015 13:06:53 -0400 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> Message-ID: <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> Hi Hong, I ran with that flag because, while solving a SNES with MUMPS, the code would just sit there as though it had died, and never seem to recover. I tried using that flag just to determine where it had stalled, which was at the "ordering based on METIS? bit. -gideon > On Aug 25, 2015, at 12:24 PM, Hong wrote: > > Gideon: > -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) > This is for algorithmic diagnosis, not for regular runs. Use default '0' for it. > > Hong > > On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson > wrote: > Regarding the MUMPS issue, I?m not sure if this is useful, but when I run with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs at this point: > > > ... Structural symmetry (in percent)= 75 > Density: NBdense, Average, Median = 2 9 7 > Ordering based on METIS > > -gideon > > > On Aug 22, 2015, at 5:12 PM, Barry Smith > wrote: > > > > > >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson > wrote: > >> > >> I?m having issues with both SuperLU dist and MUMPS, as compiled by PETsc, in the following sense: > >> > >> 1. For large enough systems, which seems to vary depending on which computer I?m on, MUMPS seems to just die and never start, when it?s used as the linear solver within a SNES. There?s no error message, it just sits there and doesn?t do anything. > > > > You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. > >> > >> 2. When running with SuperLU dist, I got the following error, with no further information: > > > > The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc. We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes > > > > Barry > > > > > >> > >> [3]PETSC ERROR: ------------------------------------------------------------------------ > >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [3]PETSC ERROR: likely location of problem given in stack below > >> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [3]PETSC ERROR: INSTEAD the line number of the start of the function > >> [3]PETSC ERROR: is given. 
> >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [3]PETSC ERROR: Signal received > >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> -------------------------------------------------------------------------- > >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD > >> with errorcode 59. > >> > >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > >> You may or may not see output from other processes, depending on > >> exactly when Open MPI kills them. > >> -------------------------------------------------------------------------- > >> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort > >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > >> [6]PETSC ERROR: ------------------------------------------------------------------------ > >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [6]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [6]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [6]PETSC ERROR: likely location of problem given in stack below > >> [6]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [6]PETSC ERROR: INSTEAD the line number of the start of the function > >> [6]PETSC ERROR: is given. 
> >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [6]PETSC ERROR: [6] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [6]PETSC ERROR: [6] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [6]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [6]PETSC ERROR: Signal received > >> [6]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [6]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [7]PETSC ERROR: ------------------------------------------------------------------------ > >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [7]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [7]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [7]PETSC ERROR: likely location of problem given in stack below > >> [7]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [7]PETSC ERROR: INSTEAD the line number of the start of the function > >> [7]PETSC ERROR: is given. > >> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [7]PETSC ERROR: [7] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [7]PETSC ERROR: [7] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [7]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [7]PETSC ERROR: Signal received > >> [7]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [7]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [0]PETSC ERROR: ------------------------------------------------------------------------ > >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [0]PETSC ERROR: likely location of problem given in stack below > >> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [0]PETSC ERROR: INSTEAD the line number of the start of the function > >> [0]PETSC ERROR: is given. > >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [0]PETSC ERROR: [0] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [0]PETSC ERROR: [0] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [0]PETSC ERROR: Signal received > >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [0]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [1]PETSC ERROR: ------------------------------------------------------------------------ > >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [1]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [1]PETSC ERROR: likely location of problem given in stack below > >> [1]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [1]PETSC ERROR: INSTEAD the line number of the start of the function > >> [1]PETSC ERROR: is given. > >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [1]PETSC ERROR: [1] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [1]PETSC ERROR: [1] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [1]PETSC ERROR: Signal received > >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [1]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [2]PETSC ERROR: ------------------------------------------------------------------------ > >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [2]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [2]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [2]PETSC ERROR: likely location of problem given in stack below > >> [2]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [2]PETSC ERROR: INSTEAD the line number of the start of the function > >> [2]PETSC ERROR: is given. > >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [2]PETSC ERROR: [2] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [2]PETSC ERROR: [2] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [2]PETSC ERROR: Signal received > >> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [2]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [4]PETSC ERROR: ------------------------------------------------------------------------ > >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [4]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [4]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [4]PETSC ERROR: likely location of problem given in stack below > >> [4]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [4]PETSC ERROR: INSTEAD the line number of the start of the function > >> [4]PETSC ERROR: is given. > >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [4]PETSC ERROR: [4] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [4]PETSC ERROR: [4] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [4]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [4]PETSC ERROR: Signal received > >> [4]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [4]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> [5]PETSC ERROR: ------------------------------------------------------------------------ > >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end > >> [5]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > >> [5]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > >> [5]PETSC ERROR: likely location of problem given in stack below > >> [5]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > >> [5]PETSC ERROR: INSTEAD the line number of the start of the function > >> [5]PETSC ERROR: is given. > >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c > >> [5]PETSC ERROR: [5] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c > >> [5]PETSC ERROR: [5] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c > >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h > >> [5]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >> [5]PETSC ERROR: Signal received > >> [5]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 > >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 > >> [5]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes > >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file > >> > >> -gideon > >> > >> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 25 12:39:16 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 12:39:16 -0500 Subject: [petsc-users] Function evaluation slowness ? 
In-Reply-To: References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> Message-ID: <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov> > On Aug 25, 2015, at 2:06 AM, Timothée Nicolas wrote: > > OK, I see, > > Might it be that I do something a bit funky to obtain a good guess for solve ? I had the following idea, which I used with great success on a very different problem (much simpler, maybe that's why it worked) : obtain the initial guess as a cubic extrapolation of the preceding solutions. The idea is that I expect my solution to be reasonably smooth over time, so considering this, the increment of the fields should also be continuous (I solve for the increments, not the fields themselves). Therefore, I store in my user context the current vector Xk as well as the two previous solutions Xkm1 and Xkm2. > > I define > > dxm2 = Xkm1 - Xkm2 > dxm1 = Xk - Xkm1 > > And I use the result of the last SNESSolve as > > dx = Xkp1 - Xk > > Then I set the new dx initial guess as the pointwise cubic extrapolation of (dxm2,dxm1,dx) > > However it seems pretty local and I don't see why scatters would be required for this. Yes, no scatters here. > > I printed the routine I use to do this below. In any case I will clean up a bit, remove the extra stuff (not much there however). If it is not sufficient, I will transform my form function in a dummy which does not require computations and see what happens. > > Timothee > > PetscErrorCode :: ierr > > PetscScalar :: M(3,3) > Vec :: xkm2,xkm1 > Vec :: coef1,coef2,coef3 > PetscScalar :: a,b,c,t,det > > a = user%tkm1 > b = user%tk > c = user%t > t = user%t+user%dt > > det = b*a**2 + c*b**2 + a*c**2 - (c*a**2 + a*b**2 + b*c**2) > > M(1,1) = (b-c)/det > M(2,1) = (c**2-b**2)/det > M(3,1) = (c*b**2-b*c**2)/det > > M(1,2) = (c-a)/det > M(2,2) = (a**2-c**2)/det > M(3,2) = (a*c**2-c*a**2)/det > > M(1,3) = (a-b)/det > M(2,3) = (b**2-a**2)/det > M(3,3) = (b*a**2-a*b**2)/det > > call VecDuplicate(x,xkm1,ierr) > call VecDuplicate(x,xkm2,ierr) > > call VecDuplicate(x,coef1,ierr) > call VecDuplicate(x,coef2,ierr) > call VecDuplicate(x,coef3,ierr) > > call VecWAXPY(xkm2,-one,user%Xkm2,user%Xkm1,ierr) > call VecWAXPY(xkm1,-one,user%Xkm1,user%Xk,ierr) > > ! The following lines correspond to the following simple operation > ! coef1 = M(1,1)*alpha + M(1,2)*beta + M(1,3)*gamma > ! coef2 = M(2,1)*alpha + M(2,2)*beta + M(2,3)*gamma > !
coef3 = M(3,1)*alpha + M(3,2)*beta + M(3,3)*gamma > call VecCopy(xkm2,coef1,ierr) > call VecScale(coef1,M(1,1),ierr) > call VecAXPY(coef1,M(1,2),xkm1,ierr) > call VecAXPY(coef1,M(1,3),x,ierr) > > call VecCopy(xkm2,coef2,ierr) > call VecScale(coef2,M(2,1),ierr) > call VecAXPY(coef2,M(2,2),xkm1,ierr) > call VecAXPY(coef2,M(2,3),x,ierr) > > call VecCopy(xkm2,coef3,ierr) > call VecScale(coef3,M(3,1),ierr) > call VecAXPY(coef3,M(3,2),xkm1,ierr) > call VecAXPY(coef3,M(3,3),x,ierr) > > call VecCopy(coef3,x,ierr) > call VecAXPY(x,t,coef2,ierr) > call VecAXPY(x,t**2,coef1,ierr) > > call VecDestroy(xkm2,ierr) > call VecDestroy(xkm1,ierr) > > call VecDestroy(coef1,ierr) > call VecDestroy(coef2,ierr) > call VecDestroy(coef3,ierr) > > > > 2015-08-25 15:47 GMT+09:00 Barry Smith : > > The results are kind of funky, > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 > SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 > SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 > VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 > MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 > MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 > > look at the %T time for global SNES solve is 46 % of the total time, function evaluations are 45% but MatMult are only 2% (and yet matmult should contain most of the function evaluations). I cannot explain this. Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are there so many more scatters than function evaluations? What other operations are you doing that require scatters? > > It's almost like you have some mysterious "extra" function calls outside of the SNESSolve that are killing the performance? It might help to understand the performance to strip out all extraneous computations not needed (like in custom monitors etc). > > Barry > > > > > > > > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas wrote: > > > > Here is the log summary (attached). At the beginning are personal prints, you can skip. I seem to have a memory crash in the present state after typically 45 iterations (that's why I used 40 here), the log summary indicates some creations without destruction of Petsc objects (I will fix this immediately), that may cause the memory crash, but I don't think it's the cause of the slow function evaluations. > > > > The log_summary is consistent with 0.7s per function evaluation (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately the same amount of time (is it normal ?). And the other long operation is VecScatterEnd. I assume it is the time used in process communications ? In which case I suppose it is normal that it takes a significant amount of time. > > > > So this ~10 times increase does not look normal right ? 
> > > > Best > > > > Timothee NICOLAS > > > > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas wrote: > > > > > > Hi, > > > > > > I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). > > > > > > I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. > > > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) > > > > > > This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. > > > > > > So I have some questions about this. > > > > > > 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. > > > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > > { > > *nfuncs = snes->nfuncs; > > } > > > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > > { > > ... > > snes->nfuncs++; > > } > > > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > > { > > ..... > > if (snes->pc && snes->pcside == PC_LEFT) { > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > > } else { > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > > } > > } > > > > So, yes I would expect all the function evaluations needed for the matrix-free Jacobian matrix vector product to be counted. You can also look at the number of GMRES Krylov iterations it took (which should have one multiply per iteration) to double check that the numbers make sense. > > > > What does your -log_summary output look like? One thing that GMRES does is it introduces a global reduction with each multiple (hence a barrier across all your processes) on some systems this can be deadly. > > > > Barry > > > > > > > > > > 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? > > > > > > I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. 
> > > > > > Best regards > > > Timothee NICOLAS > > > > > > From bsmith at mcs.anl.gov Tue Aug 25 14:14:56 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 14:14:56 -0500 Subject: [petsc-users] on the data size problem In-Reply-To: References: Message-ID: <7CF7408A-9540-4ABA-BFAF-E16B5085A0B4@mcs.anl.gov> Convergence of iterative schemes depends on problem sizes and problem properties. You need to debug your code/algorithm to determine what is going on. Some advice: NEVER NEVER NEVER run in parallel until you are getting correct behavior and solutions on one process consistently. Increase the problem size slightly from 5 until you start seeing bad behavior; don't just jump from 5 to 5000. Barry > On Aug 19, 2015, at 10:51 AM, Hongliang Lu wrote: > > Dear all, > I am trying to implement a BFS algorithm using PETSc, and I have tested my code on a graph of 5 nodes, but when I tested on a larger graph, whose size is 5000 nodes, the program went wrong and could not finish; could someone help me out? Thank you very much!!!!! > I tried to run the following code in a cluster with 10 nodes. > > int main(int argc,char **args) > { > Vec curNodes,tmp; > Mat oriGraph; > PetscInt rows, cols; > PetscScalar one=1; > PetscScalar nodeVecSum=1; > char filein[PETSC_MAX_PATH_LEN],fileout[PETSC_MAX_PATH_LEN],buf[PETSC_MAX_PATH_LEN]; > PetscViewer fd; > PetscInitialize(&argc,&args,(char *)0,help); > > PetscOptionsGetString(PETSC_NULL,"-fin",filein,PETSC_MAX_PATH_LEN-1,PETSC_NULL); > PetscViewerBinaryOpen(PETSC_COMM_WORLD,filein,FILE_MODE_READ,&fd); > MatCreate(PETSC_COMM_WORLD,&oriGraph); > > MatLoad(oriGraph,fd); > MatGetSize(oriGraph,&rows,&cols); > MatSetOption(oriGraph,MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); > MatSetUp(oriGraph); > VecCreate(PETSC_COMM_WORLD,&curNodes); > > VecSetSizes(curNodes,PETSC_DECIDE,rows); > VecSetFromOptions(curNodes); > VecCreate(PETSC_COMM_WORLD,&tmp); > VecSetSizes(tmp,PETSC_DECIDE,rows); > VecSetFromOptions(tmp); > VecZeroEntries(tmp); > srand(time(0)); > PetscInt node=rand()%rows; > PetscPrintf(PETSC_COMM_SELF,"The node ID is: %d \n",node); > VecSetValues(curNodes,1,&node,&one,INSERT_VALUES); > VecAssemblyBegin(curNodes); > VecAssemblyEnd(curNodes); > > PetscViewerDestroy(&fd); > > const PetscInt *colsv; > const PetscScalar *valsv; > PetscInt ncols,i,zero=0; > PetscInt iter=0; > > nodeVecSum=1; > for(;iter<10;iter++) > { > VecAssemblyBegin(curNodes); > VecAssemblyEnd(curNodes); > MatMult(oriGraph,curNodes,tmp); > VecAssemblyBegin(tmp); > VecAssemblyEnd(tmp); > VecSum(tmp,&nodeVecSum); > PetscPrintf(PETSC_COMM_SELF,"There are neighbors: %d \n",(int)nodeVecSum); > VecSum(curNodes,&nodeVecSum); > if(nodeVecSum<1) > break; > > PetscScalar y; > PetscInt indices; > PetscInt n,m,rstart,rend; > IS isrow; > Mat curMat; > MatGetLocalSize(oriGraph,&n,&m); > MatGetOwnershipRange(oriGraph,&rstart,&rend); > ISCreateStride(PETSC_COMM_SELF,n,rstart,1,&isrow); > MatGetSubMatrix(oriGraph,isrow,NULL,MAT_INITIAL_MATRIX,&curMat); > > MatGetSize(curMat,&n,&m); > for(i=rstart;i<rend;i++) { > indices=i; > VecGetValues(curNodes,1,&indices,&y); > if(y>0){ > MatGetRow(oriGraph,indices,&ncols,&colsv,&valsv); > PetscScalar *v,zero=0; > PetscMalloc1(cols,&v); > for(int j=0;j<ncols;j++){ > v[j]=zero; > } > MatSetValues(oriGraph,1,&indices,ncols,colsv,v,INSERT_VALUES); > PetscFree(v); > > } > > } > MatAssemblyBegin(oriGraph,MAT_FINAL_ASSEMBLY); > MatAssemblyEnd(oriGraph,MAT_FINAL_ASSEMBLY); > ISDestroy(&isrow); > > MatDestroy(&curMat); > > VecCopy(tmp,curNodes); > VecAssemblyBegin(curNodes); >
VecAssemblyEnd(curNodes); > > } > PetscPrintf(PETSC_COMM_SELF,"Finished in iterations of: %d\n",iter); > MatDestroy(&oriGraph); > VecDestroy(&curNodes); > VecDestroy(&tmp); > PetscFinalize(); > return 0; > } > The PETSc version I have installed is 3.6.1. > > From hzhang at mcs.anl.gov Tue Aug 25 15:35:29 2015 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 25 Aug 2015 15:35:29 -0500 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> Message-ID: Gideon : > > > I ran with that flag because, while solving a SNES with MUMPS, the code > would just sit there as though it had died, and never seem to recover. I > tried using that flag just to determine where it had stalled, which was at > the "ordering based on METIS" bit. > If you suspect METIS/ParMetis hangs, then turn to other sequential matrix orderings, e.g., ' -mat_mumps_icntl_29 0 -mat_mumps_icntl_7 2', which I found the most robust ordering. Run your code with '-help |grep mumps', it will display mumps options. Hong > > -gideon > > On Aug 25, 2015, at 12:24 PM, Hong wrote: > > Gideon: > -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) > This is for algorithmic diagnosis, not for regular runs. Use default '0' for it. > > Hong > > On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson > wrote: >> Regarding the MUMPS issue, I'm not sure if this is useful, but when I run >> with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs >> at this point: >> >> >> ... Structural symmetry (in percent)= 75 >> Density: NBdense, Average, Median = 2 9 7 >> Ordering based on METIS >> >> -gideon >> >> > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: >> > >> > >> >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson >> wrote: >> >> >> >> I'm having issues with both SuperLU dist and MUMPS, as compiled by >> PETSc, in the following sense: >> >> >> >> 1. For large enough systems, which seems to vary depending on which >> computer I'm on, MUMPS seems to just die and never start, when it's used as >> the linear solver within a SNES. There's no error message, it just sits >> there and doesn't do anything. >> > >> > You will need to use a debugger to figure out where it is "hanging"; >> we haven't heard reports about this. >> >> >> >> 2. When running with SuperLU dist, I got the following error, with no >> further information: >> > >> > The last release of SuperLU_DIST had some pretty nasty bugs, memory >> corruption that caused crashes etc.
We think they are now fixed if you use >> the maint branch of the PETSc repository and --download-superlu_dist If >> you stick with the PETSc release and SuperLU_Dist you are using you will >> keep seeing these crashes >> > >> > Barry >> > >> > >> >> >> >> [3]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> >> [3]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [3]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [3]PETSC ERROR: likely location of problem given in stack below >> >> [3]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [3]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [3]PETSC ERROR: is given. >> >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [3]PETSC ERROR: [3] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [3]PETSC ERROR: [3] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [3]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [3]PETSC ERROR: Signal received >> >> [3]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [3]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> >> -------------------------------------------------------------------------- >> >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >> >> with errorcode 59. >> >> >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> >> You may or may not see output from other processes, depending on >> >> exactly when Open MPI kills them. 
>> >> >> -------------------------------------------------------------------------- >> >> [proteusi01:14037] 1 more process has sent help message >> help-mpi-api.txt / mpi-abort >> >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 >> to see all help / error messages >> >> [6]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [6]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [6]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [6]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [6]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [6]PETSC ERROR: likely location of problem given in stack below >> >> [6]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [6]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [6]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [6]PETSC ERROR: is given. >> >> [6]PETSC ERROR: [6] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [6]PETSC ERROR: [6] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [6]PETSC ERROR: [6] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [6]PETSC ERROR: [6] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [6]PETSC ERROR: [6] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [6]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [6]PETSC ERROR: Signal received >> >> [6]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [6]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [6]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [6]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [6]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [7]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [7]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [7]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [7]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [7]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [7]PETSC ERROR: likely location of problem given in stack below >> >> [7]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [7]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [7]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [7]PETSC ERROR: is given. >> >> [7]PETSC ERROR: [7] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [7]PETSC ERROR: [7] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [7]PETSC ERROR: [7] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [7]PETSC ERROR: [7] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [7]PETSC ERROR: [7] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [7]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [7]PETSC ERROR: Signal received >> >> [7]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [7]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [7]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [7]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [7]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [0]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [0]PETSC ERROR: likely location of problem given in stack below >> >> [0]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [0]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [0]PETSC ERROR: is given. >> >> [0]PETSC ERROR: [0] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [0]PETSC ERROR: [0] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [0]PETSC ERROR: [0] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [0]PETSC ERROR: [0] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [0]PETSC ERROR: [0] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Signal received >> >> [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [0]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [0]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [0]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [1]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [1]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [1]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [1]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [1]PETSC ERROR: likely location of problem given in stack below >> >> [1]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [1]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [1]PETSC ERROR: is given. >> >> [1]PETSC ERROR: [1] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [1]PETSC ERROR: [1] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [1]PETSC ERROR: [1] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [1]PETSC ERROR: [1] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [1]PETSC ERROR: [1] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [1]PETSC ERROR: Signal received >> >> [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [1]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [1]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [1]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [2]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [2]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [2]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [2]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [2]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [2]PETSC ERROR: likely location of problem given in stack below >> >> [2]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [2]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [2]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [2]PETSC ERROR: is given. >> >> [2]PETSC ERROR: [2] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [2]PETSC ERROR: [2] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [2]PETSC ERROR: [2] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [2]PETSC ERROR: [2] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [2]PETSC ERROR: [2] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [2]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [2]PETSC ERROR: Signal received >> >> [2]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [2]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [2]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [2]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [2]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [4]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [4]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [4]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [4]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [4]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [4]PETSC ERROR: likely location of problem given in stack below >> >> [4]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [4]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [4]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [4]PETSC ERROR: is given. >> >> [4]PETSC ERROR: [4] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [4]PETSC ERROR: [4] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [4]PETSC ERROR: [4] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [4]PETSC ERROR: [4] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [4]PETSC ERROR: [4] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [4]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [4]PETSC ERROR: Signal received >> >> [4]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >> [4]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [4]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [4]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [4]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> [5]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [5]PETSC ERROR: Caught signal number 15 Terminate: Some process (or >> the batch system) has told this process to end >> >> [5]PETSC ERROR: Try option -start_in_debugger or >> -on_error_attach_debugger >> >> [5]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [5]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >> OS X to find memory corruption errors >> >> [5]PETSC ERROR: likely location of problem given in stack below >> >> [5]PETSC ERROR: --------------------- Stack Frames >> ------------------------------------ >> >> [5]PETSC ERROR: Note: The EXACT line numbers in the stack are not >> available, >> >> [5]PETSC ERROR: INSTEAD the line number of the start of the >> function >> >> [5]PETSC ERROR: is given. >> >> [5]PETSC ERROR: [5] SuperLU_DIST:pdgssvx line 161 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [5]PETSC ERROR: [5] MatSolve_SuperLU_DIST line 121 >> /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [5]PETSC ERROR: [5] MatSolve line 3104 >> /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [5]PETSC ERROR: [5] PCApply_LU line 194 >> /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [5]PETSC ERROR: [5] KSP_PCApplyBAorAB line 258 >> /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [5]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [5]PETSC ERROR: Signal received >> >> [5]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> >> [5]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [5]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named >> proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [5]PETSC ERROR: Configure options >> --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed >> --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a >> --with-lapack-lib=/liblapack.a --download-suitesparse=yes >> --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes >> --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [5]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> >> >> -gideon >> >> >> >> >> > >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gideon.simpson at gmail.com Tue Aug 25 15:54:59 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Tue, 25 Aug 2015 16:54:59 -0400 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> Message-ID: running with -mat_mumps_icntl_7 4 got it to run on problems that it couldn't do before, thanks. How should I understand how this choice of flag is impacting whether or not it stalls? -gideon > On Aug 25, 2015, at 4:35 PM, Hong wrote: > > Gideon : > > I ran with that flag because, while solving a SNES with MUMPS, the code would just sit there as though it had died, and never seem to recover. I tried using that flag just to determine where it had stalled, which was at the "ordering based on METIS" bit. > > If you suspect METIS/ParMetis hangs, > then turn to other sequential matrix orderings, e.g., > ' -mat_mumps_icntl_29 0 -mat_mumps_icntl_7 2', which I found the most robust ordering. > Run your code with '-help |grep mumps', it will display mumps options. > > Hong > > > -gideon > >> On Aug 25, 2015, at 12:24 PM, Hong > wrote: >> >> Gideon: >> -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) >> This is for algorithmic diagnosis, not for regular runs. Use default '0' for it. >> >> Hong >> >> On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson > wrote: >> Regarding the MUMPS issue, I'm not sure if this is useful, but when I run with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it hangs at this point: >> >> >> ... Structural symmetry (in percent)= 75 >> Density: NBdense, Average, Median = 2 9 7 >> Ordering based on METIS >> >> -gideon >> >> > On Aug 22, 2015, at 5:12 PM, Barry Smith > wrote: >> > >> > >> >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson > wrote: >> >> >> >> I'm having issues with both SuperLU dist and MUMPS, as compiled by PETSc, in the following sense: >> >> >> >> 1. For large enough systems, which seems to vary depending on which computer I'm on, MUMPS seems to just die and never start, when it's used as the linear solver within a SNES. There's no error message, it just sits there and doesn't do anything. >> > >> > You will need to use a debugger to figure out where it is "hanging"; we haven't heard reports about this. >> >> >> >> 2. When running with SuperLU dist, I got the following error, with no further information: >> > >> > The last release of SuperLU_DIST had some pretty nasty bugs, memory corruption that caused crashes etc.
We think they are now fixed if you use the maint branch of the PETSc repository and --download-superlu_dist If you stick with the PETSc release and SuperLU_Dist you are using you will keep seeing these crashes >> > >> > Barry >> > >> > >> >> >> >> [3]PETSC ERROR: ------------------------------------------------------------------------ >> >> [3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >> >> [3]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [3]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [3]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors >> >> [3]PETSC ERROR: likely location of problem given in stack below >> >> [3]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >> >> [3]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, >> >> [3]PETSC ERROR: INSTEAD the line number of the start of the function >> >> [3]PETSC ERROR: is given. >> >> [3]PETSC ERROR: [3] SuperLU_DIST:pdgssvx line 161 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [3]PETSC ERROR: [3] MatSolve_SuperLU_DIST line 121 /home/simpson/software/petsc-3.5.4/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c >> >> [3]PETSC ERROR: [3] MatSolve line 3104 /home/simpson/software/petsc-3.5.4/src/mat/interface/matrix.c >> >> [3]PETSC ERROR: [3] PCApply_LU line 194 /home/simpson/software/petsc-3.5.4/src/ksp/pc/impls/factor/lu/lu.c >> >> [3]PETSC ERROR: [3] KSP_PCApplyBAorAB line 258 /home/simpson/software/petsc-3.5.4/include/petsc-private/kspimpl.h >> >> [3]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> >> [3]PETSC ERROR: Signal received >> >> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> >> [3]PETSC ERROR: Petsc Release Version 3.5.4, May, 23, 2015 >> >> [3]PETSC ERROR: ./blowup_batch2 on a arch-linux2-c-debug named proteusi01 by simpson Sat Aug 22 17:01:41 2015 >> >> [3]PETSC ERROR: Configure options --with-mpi-dir=/mnt/HA/opt/openmpi/intel/64/1.8.1-mlnx-ofed --with-blas-lib=/mnt/HA/opt/blas/gcc/64/20110419/libblas.a --with-lapack-lib=/liblapack.a --download-suitesparse=yes --download-superlu=yes --download-superlu_dist=yes --download-mumps=yes --download-metis=yes --download-parmetis=yes --download-scalapack=yes >> >> [3]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> -------------------------------------------------------------------------- >> >> MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD >> >> with errorcode 59. >> >> >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> >> You may or may not see output from other processes, depending on >> >> exactly when Open MPI kills them. 
>> >> --------------------------------------------------------------------------
>> >> [proteusi01:14037] 1 more process has sent help message help-mpi-api.txt / mpi-abort
>> >> [proteusi01:14037] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
>> >> [Ranks 6, 7, 0, 1, 2, 4 and 5 then each report "Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end" followed by the same SuperLU_DIST:pdgssvx / MatSolve_SuperLU_DIST / MatSolve / PCApply_LU / KSP_PCApplyBAorAB stack trace and configure options already shown above for rank 3; the duplicated traces are omitted.]
>> >>
>> >> -gideon
>> >>
>> >>
>> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL: From hzhang at mcs.anl.gov Tue Aug 25 17:23:28 2015 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 25 Aug 2015 17:23:28 -0500 Subject: [petsc-users] issues with sparse direct solvers In-Reply-To: References: <1F79415D-8468-4FB7-9821-54D71165CE11@mcs.anl.gov> <7CB56462-D17B-48A5-B014-B80604501DB5@gmail.com> <83DCD281-86DE-475B-B0CC-74DCA1515E61@gmail.com> Message-ID: Gideon : > running with -mat_mumps_icntl_7 4 got it to run on problems that it > couldn?t do before, thanks. How should I understand how this choice of > flag is impacting whether or not it stalls? > Good to know that your code starts running :-) I've never encountered hang with mumps. It turns that the hang occurred in metis, which I rarely use. Debugging what causes hang, I usually run code with a debugger, e.g. use opiton '-start_in_debugger', when code hangs, hit control^C to see where it hangs. Hong > > On Aug 25, 2015, at 4:35 PM, Hong wrote: > > Gideon : >> >> >> I ran with that flag because, while solving a SNES with MUMPS, the code >> would just sit there as though it had died, and never seem to recover. I >> tried using that flag just to determine where it had stalled, which was at >> the "ordering based on METIS? bit. >> > > If you suspect METIS/ParMetis hangs, > then turn to other sequential matrix orderings, e.g., > ' -mat_mumps_icntl_29 0 -mat_mumps_icntl_7 2', which I found the most > robust ordering. > Run your code with '-help |grep mumps', it will display mumps options. > > Hong > > >> >> -gideon >> >> On Aug 25, 2015, at 12:24 PM, Hong wrote: >> >> Gideon: >> -mat_mumps_icntl_4 <0>: ICNTL(4): level of printing (0 to 4) (None) >> This is for algorithmic diagnosis, not for regular runs. Use default '0' >> for it. >> >> Hong >> >> On Tue, Aug 25, 2015 at 9:06 AM, Gideon Simpson > > wrote: >> >>> Regarding the MUMPS issue, I?m not sure if this is useful, but when I >>> run with the mumps flags -mat_mumps_icntl_4 4, to see the progress, it >>> hangs at this point: >>> >>> >>> ... Structural symmetry (in percent)= 75 >>> Density: NBdense, Average, Median = 2 9 7 >>> Ordering based on METIS >>> >>> -gideon >>> >>> > On Aug 22, 2015, at 5:12 PM, Barry Smith wrote: >>> > >>> > >>> >> On Aug 22, 2015, at 4:04 PM, Gideon Simpson >>> wrote: >>> >> >>> >> I?m having issues with both SuperLU dist and MUMPS, as compiled by >>> PETsc, in the following sense: >>> >> >>> >> 1. For large enough systems, which seems to vary depending on which >>> computer I?m on, MUMPS seems to just die and never start, when it?s used as >>> the linear solver within a SNES. There?s no error message, it just sits >>> there and doesn?t do anything. >>> > >>> > You will need to use a debugger to figure out where it is "hanging"; >>> we haven't heard reports about this. >>> >> >>> >> 2. When running with SuperLU dist, I got the following error, with >>> no further information: >>> > >>> > The last release of SuperLU_DIST had some pretty nasty bugs, memory >>> corruption that caused crashes etc. 
We think they are now fixed if you use
>>> the maint branch of the PETSc repository and --download-superlu_dist If
>>> you stick with the PETSc release and SuperLU_Dist you are using you will
>>> keep seeing these crashes
>>> >
>>> > Barry
>>> >
>>> [The rest of the quoted message repeats, verbatim, the rank 0-7 SEGV/Terminate stack traces and configure options reproduced earlier in this thread; the duplicated output is omitted.]
>>> >>
>>> >> -gideon
>>> >>
>>> >>
>>> >
>> >
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From timothee.nicolas at gmail.com Tue Aug 25 21:19:56 2015
From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=)
Date: Wed, 26 Aug 2015 11:19:56 +0900
Subject: [petsc-users] Function evaluation slowness ?
In-Reply-To: <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov> References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov> Message-ID: Hi, Problem solved ! Several points to be noticed : 1. First the discrepancy between creations and destructions showed I had forgotten some VecDestroy, and also a VecRestoreArrayReadF90. Repairing this seemed to increase a bit the speed but did not solve the actual more serious problem seen in the log_summary. 2. The actual problem was a very stupid one on my side. At some point I print small diagnostics at every time step to a text file with standard Fortran write statement rather than a viewer to a binary file. I had simply forgotten to put the statement between an if statement on the rank if (rank.eq.0) then write(50) .... end if So all the processors were trying to write together to the file, which, I suppose, somehow caused all the Scatters. After adding the if statement, I recover a fast speed (about 10 ms per function evaluation). Thank you so much for your help, I would never have made it so far without it !!! 3. Last minor point, a discrepancy of 1 remains between creations and destructions (see below extract of log_summary) for the category "viewer". I have checked that it is also the case for the examples ex5f90.F and ex5.c on which my code is based. I can't track it down, but it's probably a minor point anyway. --- Event Stage 0: Main Stage SNES 1 1 1332 0 SNESLineSearch 1 1 864 0 DMSNES 2 2 1328 0 Vector 20 20 34973120 0 Vector Scatter 3 3 503488 0 MatMFFD 1 1 768 0 Matrix 1 1 2304 0 Distributed Mesh 3 3 14416 0 Star Forest Bipartite Graph 6 6 5024 0 Discrete System 3 3 2544 0 Index Set 6 6 187248 0 IS L to G Mapping 2 2 184524 0 Krylov Solver 1 1 1304 0 DMKSP interface 1 1 648 0 Preconditioner 1 1 880 0 Viewer 3 2 1536 0 PetscRandom 1 1 624 0 ======================================================================================================================== Best Timothee 2015-08-26 2:39 GMT+09:00 Barry Smith : > > > On Aug 25, 2015, at 2:06 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > OK, I see, > > > > Might it be that I do something a bit funky to obtain a good guess for > solve ? I had he following idea, which I used with great success on a very > different problem (much simpler, maybe that's why it worked) : obtain the > initial guess as a cubic extrapolation of the preceding solutions. The idea > is that I expect my solution to be reasonably smooth over time, so > considering this, the increment of the fields should also be continuous (I > solve for the increments, not the fields themselves). Therefore, I store in > my user context the current vector Xk as well as the last three solutions > Xkm1 and Xkm2. > > > > I define > > > > dxm2 = Xkm1 - Xkm2 > > dxm1 = Xk - Xkm1 > > > > And I use the result of the last SNESSolve as > > > > dx = Xkp1 - Xk > > > > Then I set the new dx initial guess as the pointwise cubic extrapolation > of (dxm2,dxm1,dx) > > > > However it seems pretty local and I don't see why scatters would be > required for this. > > Yes, no scatters here. > > > > > I printed the routine I use to do this below. In any case I will clean > up a bit, remove the extra stuff (not much there however). If it is not > sufficient, I will transform my form function in a dummy which does not > require computations and see what happens. 
> > > > Timothee > > > > PetscErrorCode :: ierr > > > > PetscScalar :: M(3,3) > > Vec :: xkm2,xkm1 > > Vec :: coef1,coef2,coef3 > > PetscScalar :: a,b,c,t,det > > > > a = user%tkm1 > > b = user%tk > > c = user%t > > t = user%t+user%dt > > > > det = b*a**2 + c*b**2 + a*c**2 - (c*a**2 + a*b**2 + b*c**2) > > > > M(1,1) = (b-c)/det > > M(2,1) = (c**2-b**2)/det > > M(3,1) = (c*b**2-b*c**2)/det > > > > M(1,2) = (c-a)/det > > M(2,2) = (a**2-c**2)/det > > M(3,2) = (a*c**2-c*a**2)/det > > > > M(1,3) = (a-b)/det > > M(2,3) = (b**2-a**2)/det > > M(3,3) = (b*a**2-a*b**2)/det > > > > call VecDuplicate(x,xkm1,ierr) > > call VecDuplicate(x,xkm2,ierr) > > > > call VecDuplicate(x,coef1,ierr) > > call VecDuplicate(x,coef2,ierr) > > call VecDuplicate(x,coef3,ierr) > > > > call VecWAXPY(xkm2,-one,user%Xkm2,user%Xkm1,ierr) > > call VecWAXPY(xkm1,-one,user%Xkm1,user%Xk,ierr) > > > > ! The following lines correspond to the following simple operation > > ! coef1 = M(1,1)*alpha + M(1,2)*beta + M(1,3)*gamma > > ! coef2 = M(2,1)*alpha + M(2,2)*beta + M(2,3)*gamma > > ! coef3 = M(3,1)*alpha + M(3,2)*beta + M(3,3)*gamma > > call VecCopy(xkm2,coef1,ierr) > > call VecScale(coef1,M(1,1),ierr) > > call VecAXPY(coef1,M(1,2),xkm1,ierr) > > call VecAXPY(coef1,M(1,3),x,ierr) > > > > call VecCopy(xkm2,coef2,ierr) > > call VecScale(coef2,M(2,1),ierr) > > call VecAXPY(coef2,M(2,2),xkm1,ierr) > > call VecAXPY(coef2,M(2,3),x,ierr) > > > > call VecCopy(xkm2,coef3,ierr) > > call VecScale(coef3,M(3,1),ierr) > > call VecAXPY(coef3,M(3,2),xkm1,ierr) > > call VecAXPY(coef3,M(3,3),x,ierr) > > > > call VecCopy(coef3,x,ierr) > > call VecAXPY(x,t,coef2,ierr) > > call VecAXPY(x,t**2,coef1,ierr) > > > > call VecDestroy(xkm2,ierr) > > call VecDestroy(xkm1,ierr) > > > > call VecDestroy(coef1,ierr) > > call VecDestroy(coef2,ierr) > > call VecDestroy(coef3,ierr) > > > > > > > > 2015-08-25 15:47 GMT+09:00 Barry Smith : > > > > The results are kind of funky, > > > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 > 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 > > SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 > 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 > > SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 > 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 > > VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 > > MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 > 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 > > MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 > 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 > > > > look at the %T time for global SNES solve is 46 % of the total time, > function evaluations are 45% but MatMult are only 2% (and yet matmult > should contain most of the function evaluations). I cannot explain this. > Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are > there so many more scatters than function evaluations? What other > operations are you doing that require scatters? 
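[A minimal sketch, not code from this thread, of the kind of instrumentation this advice points at: wrapping user work done outside SNESSolve, here the per-time-step diagnostic write that turned out to be the culprit, in a registered PetscLogEvent so that -log_summary reports its time and load balance on its own line instead of folding it into the scatter numbers. It is written against the PETSc 3.5 API used in this thread. The names "UserDiag", "UserDiagWrite", RegisterUserDiagEvent() and WriteStepDiagnostics() are illustrative placeholders, not part of the original code; the rank-0 guard mirrors the fix described at the top of this message.]

#include <petscsys.h>

static PetscClassId  USER_DIAG_CLASSID;
static PetscLogEvent USER_DIAG_EVENT;

/* Call once after PetscInitialize(): register an event that -log_summary
   will report on its own line, so time spent in user diagnostics is not
   silently attributed to SNESFunctionEval or VecScatter. */
PetscErrorCode RegisterUserDiagEvent(void)
{
  PetscErrorCode ierr;
  ierr = PetscClassIdRegister("UserDiag",&USER_DIAG_CLASSID);CHKERRQ(ierr);
  ierr = PetscLogEventRegister("UserDiagWrite",USER_DIAG_CLASSID,&USER_DIAG_EVENT);CHKERRQ(ierr);
  return 0;
}

/* Per-time-step diagnostics: only rank 0 writes, every other rank falls
   straight through, so no hidden synchronization is added here. */
PetscErrorCode WriteStepDiagnostics(MPI_Comm comm,PetscInt step,PetscReal t)
{
  PetscErrorCode ierr;
  PetscMPIInt    rank;

  ierr = PetscLogEventBegin(USER_DIAG_EVENT,0,0,0,0);CHKERRQ(ierr);
  ierr = MPI_Comm_rank(comm,&rank);CHKERRQ(ierr);
  if (!rank) {   /* only rank 0 touches the output */
    ierr = PetscPrintf(PETSC_COMM_SELF,"step %D  t = %g\n",step,(double)t);CHKERRQ(ierr);
  }
  ierr = PetscLogEventEnd(USER_DIAG_EVENT,0,0,0,0);CHKERRQ(ierr);
  return 0;
}

[Called once per time step, the "UserDiagWrite" line in -log_summary should then show directly how much wall time the diagnostics cost and how unevenly it is distributed across ranks, which would have exposed an unguarded all-ranks write immediately.]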
> > > > It's almost like you have some mysterious "extra" function calls outside > of the SNESSolve that are killing the performance? It might help to > understand the performance to strip out all extraneous computations not > needed (like in custom monitors etc). > > > > Barry > > > > > > > > > > > > > > > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > > > Here is the log summary (attached). At the beginning are personal > prints, you can skip. I seem to have a memory crash in the present state > after typically 45 iterations (that's why I used 40 here), the log summary > indicates some creations without destruction of Petsc objects (I will fix > this immediately), that may cause the memory crash, but I don't think it's > the cause of the slow function evaluations. > > > > > > The log_summary is consistent with 0.7s per function evaluation > (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately > the same amount of time (is it normal ?). And the other long operation is > VecScatterEnd. I assume it is the time used in process communications ? In > which case I suppose it is normal that it takes a significant amount of > time. > > > > > > So this ~10 times increase does not look normal right ? > > > > > > Best > > > > > > Timothee NICOLAS > > > > > > > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > > > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > > > > > > > > Hi, > > > > > > > > I am testing PETSc on the supercomputer where I used to run my > explicit MHD code. For my tests I use 256 processes on a problem of size > 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 > degrees of freedom (or physical fields). The explicit code was using > Runge-Kutta 4 for the time scheme, which means 4 function evaluation per > time step (plus one operation to put everything together, but let's forget > this one). > > > > > > > > I could thus easily determine that the typical time required for a > function evaluation was of the order of 50 ms. > > > > > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the > present state where for now I have not implemented any Jacobian or > preconditioner whatsoever (so I run with -snes_mf), I measure a typical > time between two time steps of between 5 and 20 seconds, and the number of > function evaluations for each time step obtained with > SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of > course) > > > > > > > > This means a time per function evaluation of about 0.5 to 1 second, > that is, 10 to 20 times slower. > > > > > > > > So I have some questions about this. > > > > > > > > 1. First does SNESGetNumberFunctionEvals take into account the > function evaluations required to evaluate the Jacobian when -snes_mf is > used, as well as the operations required by the GMRES (Krylov) method ? If > it were the case, I would somehow intuitively expect a number larger than > 17, which could explain the increase in time. > > > > > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > > > { > > > *nfuncs = snes->nfuncs; > > > } > > > > > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > > > { > > > ... > > > snes->nfuncs++; > > > } > > > > > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > > > { > > > ..... 
> > > if (snes->pc && snes->pcside == PC_LEFT) { > > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > > > } else { > > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode > (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > > > } > > > } > > > > > > So, yes I would expect all the function evaluations needed for the > matrix-free Jacobian matrix vector product to be counted. You can also look > at the number of GMRES Krylov iterations it took (which should have one > multiply per iteration) to double check that the numbers make sense. > > > > > > What does your -log_summary output look like? One thing that GMRES > does is it introduces a global reduction with each multiple (hence a > barrier across all your processes) on some systems this can be deadly. > > > > > > Barry > > > > > > > > > > > > > > 2. In any case, I thought that all things considered, the function > evaluation would be the most time consuming part of a Newton-Krylov solver, > am I completely wrong about that ? Is the 10-20 factor legit ? > > > > > > > > I realize of course that preconditioning should make all this > smoother, in particular allowing larger time steps, but here I am just > concerned about the sheer Function evaluation time. > > > > > > > > Best regards > > > > > > > > Timothee NICOLAS > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Aug 25 21:25:14 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 25 Aug 2015 21:25:14 -0500 Subject: [petsc-users] Function evaluation slowness ? In-Reply-To: References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov> <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov> <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov> Message-ID: <9A6ECF0C-15C2-4993-8F48-8B7D7441BD3C@mcs.anl.gov> > On Aug 25, 2015, at 9:19 PM, Timoth?e Nicolas wrote: > > Hi, > > Problem solved ! > > Several points to be noticed : > > 1. First the discrepancy between creations and destructions showed I had forgotten some VecDestroy, and also a VecRestoreArrayReadF90. Repairing this seemed to increase a bit the speed but did not solve the actual more serious problem seen in the log_summary. > > 2. The actual problem was a very stupid one on my side. At some point I print small diagnostics at every time step to a text file with standard Fortran write statement rather than a viewer to a binary file. I had simply forgotten to put the statement between an if statement on the rank > > if (rank.eq.0) then > > write(50) .... > > end if > > So all the processors were trying to write together to the file, which, I suppose, somehow caused all the Scatters. After adding the if statement, I recover a fast speed (about 10 ms per function evaluation). Thank you so much for your help, I would never have made it so far without it !!! > > 3. Last minor point, a discrepancy of 1 remains between creations and destructions (see below extract of log_summary) for the category "viewer". I have checked that it is also the case for the examples ex5f90.F and ex5.c on which my code is based. I can't track it down, but it's probably a minor point anyway. 
> > --- Event Stage 0: Main Stage > > SNES 1 1 1332 0 > SNESLineSearch 1 1 864 0 > DMSNES 2 2 1328 0 > Vector 20 20 34973120 0 > Vector Scatter 3 3 503488 0 > MatMFFD 1 1 768 0 > Matrix 1 1 2304 0 > Distributed Mesh 3 3 14416 0 > Star Forest Bipartite Graph 6 6 5024 0 > Discrete System 3 3 2544 0 > Index Set 6 6 187248 0 > IS L to G Mapping 2 2 184524 0 > Krylov Solver 1 1 1304 0 > DMKSP interface 1 1 648 0 > Preconditioner 1 1 880 0 > Viewer 3 2 1536 0 > PetscRandom 1 1 624 0 > ======================================================================================================================== > Because the -log_summary output is done with a viewer there has to be a viewer not yet destroyed with the output is made. Hence it will indicate one viewer still exists. This does not mean that it does not get destroyed eventually. Barry > > Best > > Timothee > > > > 2015-08-26 2:39 GMT+09:00 Barry Smith : > > > On Aug 25, 2015, at 2:06 AM, Timoth?e Nicolas wrote: > > > > OK, I see, > > > > Might it be that I do something a bit funky to obtain a good guess for solve ? I had he following idea, which I used with great success on a very different problem (much simpler, maybe that's why it worked) : obtain the initial guess as a cubic extrapolation of the preceding solutions. The idea is that I expect my solution to be reasonably smooth over time, so considering this, the increment of the fields should also be continuous (I solve for the increments, not the fields themselves). Therefore, I store in my user context the current vector Xk as well as the last three solutions Xkm1 and Xkm2. > > > > I define > > > > dxm2 = Xkm1 - Xkm2 > > dxm1 = Xk - Xkm1 > > > > And I use the result of the last SNESSolve as > > > > dx = Xkp1 - Xk > > > > Then I set the new dx initial guess as the pointwise cubic extrapolation of (dxm2,dxm1,dx) > > > > However it seems pretty local and I don't see why scatters would be required for this. > > Yes, no scatters here. > > > > > I printed the routine I use to do this below. In any case I will clean up a bit, remove the extra stuff (not much there however). If it is not sufficient, I will transform my form function in a dummy which does not require computations and see what happens. > > > > Timothee > > > > PetscErrorCode :: ierr > > > > PetscScalar :: M(3,3) > > Vec :: xkm2,xkm1 > > Vec :: coef1,coef2,coef3 > > PetscScalar :: a,b,c,t,det > > > > a = user%tkm1 > > b = user%tk > > c = user%t > > t = user%t+user%dt > > > > det = b*a**2 + c*b**2 + a*c**2 - (c*a**2 + a*b**2 + b*c**2) > > > > M(1,1) = (b-c)/det > > M(2,1) = (c**2-b**2)/det > > M(3,1) = (c*b**2-b*c**2)/det > > > > M(1,2) = (c-a)/det > > M(2,2) = (a**2-c**2)/det > > M(3,2) = (a*c**2-c*a**2)/det > > > > M(1,3) = (a-b)/det > > M(2,3) = (b**2-a**2)/det > > M(3,3) = (b*a**2-a*b**2)/det > > > > call VecDuplicate(x,xkm1,ierr) > > call VecDuplicate(x,xkm2,ierr) > > > > call VecDuplicate(x,coef1,ierr) > > call VecDuplicate(x,coef2,ierr) > > call VecDuplicate(x,coef3,ierr) > > > > call VecWAXPY(xkm2,-one,user%Xkm2,user%Xkm1,ierr) > > call VecWAXPY(xkm1,-one,user%Xkm1,user%Xk,ierr) > > > > ! The following lines correspond to the following simple operation > > ! coef1 = M(1,1)*alpha + M(1,2)*beta + M(1,3)*gamma > > ! coef2 = M(2,1)*alpha + M(2,2)*beta + M(2,3)*gamma > > ! 
coef3 = M(3,1)*alpha + M(3,2)*beta + M(3,3)*gamma > > call VecCopy(xkm2,coef1,ierr) > > call VecScale(coef1,M(1,1),ierr) > > call VecAXPY(coef1,M(1,2),xkm1,ierr) > > call VecAXPY(coef1,M(1,3),x,ierr) > > > > call VecCopy(xkm2,coef2,ierr) > > call VecScale(coef2,M(2,1),ierr) > > call VecAXPY(coef2,M(2,2),xkm1,ierr) > > call VecAXPY(coef2,M(2,3),x,ierr) > > > > call VecCopy(xkm2,coef3,ierr) > > call VecScale(coef3,M(3,1),ierr) > > call VecAXPY(coef3,M(3,2),xkm1,ierr) > > call VecAXPY(coef3,M(3,3),x,ierr) > > > > call VecCopy(coef3,x,ierr) > > call VecAXPY(x,t,coef2,ierr) > > call VecAXPY(x,t**2,coef1,ierr) > > > > call VecDestroy(xkm2,ierr) > > call VecDestroy(xkm1,ierr) > > > > call VecDestroy(coef1,ierr) > > call VecDestroy(coef2,ierr) > > call VecDestroy(coef3,ierr) > > > > > > > > 2015-08-25 15:47 GMT+09:00 Barry Smith : > > > > The results are kind of funky, > > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > SNESSolve 40 1.0 4.9745e+02 3.3 4.25e+09 1.0 1.7e+06 3.8e+04 2.7e+03 46 93 99 95 80 46 93 99 95 80 2187 > > SNESFunctionEval 666 1.0 4.8990e+02 3.4 5.73e+08 1.0 1.7e+06 3.8e+04 1.3e+03 45 13 99 95 40 45 13 99 95 40 299 > > SNESLineSearch 79 1.0 3.8578e+00 1.0 4.98e+08 1.0 4.0e+05 3.8e+04 6.3e+02 1 11 23 23 19 1 11 23 23 19 33068 > > VecScatterEnd 1335 1.0 3.4761e+02 5.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 31 0 0 0 0 31 0 0 0 0 0 > > MatMult MF 547 1.0 1.2570e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25962 > > MatMult 547 1.0 1.2571e+01 1.1 1.27e+09 1.0 1.4e+06 3.8e+04 1.1e+03 2 28 81 78 34 2 28 81 78 34 25960 > > > > look at the %T time for global SNES solve is 46 % of the total time, function evaluations are 45% but MatMult are only 2% (and yet matmult should contain most of the function evaluations). I cannot explain this. Also the VecScatterEnd is HUGE and has a bad load balance of 5.8 Why are there so many more scatters than function evaluations? What other operations are you doing that require scatters? > > > > It's almost like you have some mysterious "extra" function calls outside of the SNESSolve that are killing the performance? It might help to understand the performance to strip out all extraneous computations not needed (like in custom monitors etc). > > > > Barry > > > > > > > > > > > > > > > On Aug 25, 2015, at 1:21 AM, Timoth?e Nicolas wrote: > > > > > > Here is the log summary (attached). At the beginning are personal prints, you can skip. I seem to have a memory crash in the present state after typically 45 iterations (that's why I used 40 here), the log summary indicates some creations without destruction of Petsc objects (I will fix this immediately), that may cause the memory crash, but I don't think it's the cause of the slow function evaluations. > > > > > > The log_summary is consistent with 0.7s per function evaluation (4.8990e+02/666 = 0.736). In addition, SNESSolve itself takes approximately the same amount of time (is it normal ?). And the other long operation is VecScatterEnd. I assume it is the time used in process communications ? In which case I suppose it is normal that it takes a significant amount of time. 
> > > > > > So this ~10 times increase does not look normal right ? > > > > > > Best > > > > > > Timothee NICOLAS > > > > > > > > > 2015-08-25 14:56 GMT+09:00 Barry Smith : > > > > > > > On Aug 25, 2015, at 12:45 AM, Timoth?e Nicolas wrote: > > > > > > > > Hi, > > > > > > > > I am testing PETSc on the supercomputer where I used to run my explicit MHD code. For my tests I use 256 processes on a problem of size 128*128*640 = 10485760, that is, 40960 grid points per process, and 8 degrees of freedom (or physical fields). The explicit code was using Runge-Kutta 4 for the time scheme, which means 4 function evaluation per time step (plus one operation to put everything together, but let's forget this one). > > > > > > > > I could thus easily determine that the typical time required for a function evaluation was of the order of 50 ms. > > > > > > > > Now with the implicit Newton-Krylov solver written in PETSc, in the present state where for now I have not implemented any Jacobian or preconditioner whatsoever (so I run with -snes_mf), I measure a typical time between two time steps of between 5 and 20 seconds, and the number of function evaluations for each time step obtained with SNESGetNumberFunctionEvals is 17 (I am speaking of a particular case of course) > > > > > > > > This means a time per function evaluation of about 0.5 to 1 second, that is, 10 to 20 times slower. > > > > > > > > So I have some questions about this. > > > > > > > > 1. First does SNESGetNumberFunctionEvals take into account the function evaluations required to evaluate the Jacobian when -snes_mf is used, as well as the operations required by the GMRES (Krylov) method ? If it were the case, I would somehow intuitively expect a number larger than 17, which could explain the increase in time. > > > > > > PetscErrorCode SNESGetNumberFunctionEvals(SNES snes, PetscInt *nfuncs) > > > { > > > *nfuncs = snes->nfuncs; > > > } > > > > > > PetscErrorCode SNESComputeFunction(SNES snes,Vec x,Vec y) > > > { > > > ... > > > snes->nfuncs++; > > > } > > > > > > PetscErrorCode MatCreateSNESMF(SNES snes,Mat *J) > > > { > > > ..... > > > if (snes->pc && snes->pcside == PC_LEFT) { > > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunctionDefaultNPC,snes);CHKERRQ(ierr); > > > } else { > > > ierr = MatMFFDSetFunction(*J,(PetscErrorCode (*)(void*,Vec,Vec))SNESComputeFunction,snes);CHKERRQ(ierr); > > > } > > > } > > > > > > So, yes I would expect all the function evaluations needed for the matrix-free Jacobian matrix vector product to be counted. You can also look at the number of GMRES Krylov iterations it took (which should have one multiply per iteration) to double check that the numbers make sense. > > > > > > What does your -log_summary output look like? One thing that GMRES does is it introduces a global reduction with each multiple (hence a barrier across all your processes) on some systems this can be deadly. > > > > > > Barry > > > > > > > > > > > > > > 2. In any case, I thought that all things considered, the function evaluation would be the most time consuming part of a Newton-Krylov solver, am I completely wrong about that ? Is the 10-20 factor legit ? > > > > > > > > I realize of course that preconditioning should make all this smoother, in particular allowing larger time steps, but here I am just concerned about the sheer Function evaluation time. 
> > > >
> > > > Best regards
> > > >
> > > > Timothee NICOLAS
> > > >
> > > >
> > >
> >
>

From jed at jedbrown.org  Tue Aug 25 22:18:03 2015
From: jed at jedbrown.org (Jed Brown)
Date: Tue, 25 Aug 2015 21:18:03 -0600
Subject: [petsc-users] Function evaluation slowness ?
In-Reply-To:
References: <0FCEEEF7-2374-47A5-9C1B-83D37E3E87CD@mcs.anl.gov>
 <26E1E799-28B2-460E-B8C9-4FE0EE8CA61B@mcs.anl.gov>
 <3D98A4A3-C588-4109-8988-79BA2F541F98@mcs.anl.gov>
Message-ID: <87k2siagb8.fsf@jedbrown.org>

Timothée Nicolas writes:
> 2. The actual problem was a very stupid one on my side. At some point I
> print small diagnostics at every time step to a text file with standard
> Fortran write statement rather than a viewer to a binary file. I had simply
> forgotten to put the statement between an if statement on the rank
>
> if (rank.eq.0) then
>
> write(50) ....
>
> end if
>
> So all the processors were trying to write together to the file, which, I
> suppose, somehow caused all the Scatters.

It doesn't create Scatters, but it likely creates load imbalance that
will be paid for in the subsequent VecScatter.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 818 bytes
Desc: not available
URL:

From zonexo at gmail.com  Tue Aug 25 23:12:49 2015
From: zonexo at gmail.com (TAY wee-beng)
Date: Wed, 26 Aug 2015 12:12:49 +0800
Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal
In-Reply-To:
References:
Message-ID: <55DD3CC1.5070801@gmail.com>

Hi,

I have wrote the routine for my Poisson eqn. I have only 1 DOF, which is
for pressure. The center cell is coupled with 6 other cells (north, south,
east, west, front, back), so together 7 couplings.

size x/y/z = 4/8/10

MatStencil :: row(4,1),col(4,7)

PetscScalar :: value_insert(7)

PetscInt :: ione,iseven

ione = 1; iseven = 7

do k=ksta,kend
   do j = jsta,jend
      do i=1,size_x

         row(MatStencil_i,1) = i - 1
         row(MatStencil_j,1) = j - 1
         row(MatStencil_k,1) = k - 1
         row(MatStencil_c,1) = 0 ! 1 - 1

         value_insert = 0.d0

         if (i /= size_x) then
            col(MatStencil_i,3) = i + 1 - 1 !east
            col(MatStencil_j,3) = j - 1
            col(MatStencil_k,3) = k - 1
            col(MatStencil_c,3) = 0
            value_insert(3) = (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W)
         end if

         if (i /= 1) then
            col(MatStencil_i,5) = i - 1 - 1 !west
            col(MatStencil_j,5) = j - 1
            col(MatStencil_k,5) = k - 1
            col(MatStencil_c,5) = 0
            value_insert(5) = (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E)
         end if

         if (j /= size_y) then
            col(MatStencil_i,2) = i - 1 !north
            col(MatStencil_j,2) = j + 1 - 1
            col(MatStencil_k,2) = k - 1
            col(MatStencil_c,2) = 0
            value_insert(2) = (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S)
         end if

         ...

         col(MatStencil_i,1) = i - 1
         col(MatStencil_j,1) = j - 1
         col(MatStencil_k,1) = k - 1
         col(MatStencil_c,1) = 0
         value_insert(1) = -value_insert(2) - value_insert(3) - value_insert(4) - value_insert(5) - value_insert(6) - value_insert(7)

         call MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)

      end do
   end do
end do

but I got the error :

[0]PETSC ERROR: Argument out of range
[0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix.

The error happens at i = 4, j = 1, k = 1. So I guess it has something to
do with the boundary condition. However, I can't figure out what's wrong.
Can someone help?

Thank you

Yours sincerely,

TAY wee-beng

On 24/8/2015 5:54 PM, Timothée Nicolas wrote:
> Hi,
>
> ex5 of snes can give you an example of the two routines.
>
> The C version ex5.c uses MatSetValuesStencil whereas the Fortran90
> version ex5f90.F uses MatSetValuesLocal.
>
> However, I use MatSetValuesStencil also in Fortran, there is no
> problem, and no need to mess around with DMDAGetAO, I think.
>
> To input values in the matrix, you need to do the following :
>
> ! Declare the matstencils for matrix columns and rows
> MatStencil :: row(4,1),col(4,n)
> ! Declare the quantity which will store the actual matrix elements
> PetscScalar :: v(8)
>
> The first dimension in row and col is 4 to allow for 3 spatial
> dimensions (even if you use only 2) plus one degree of freedom if you
> have several fields in your DMDA. The second dimension is 1 for row
> (you input one row at a time) and n for col, where n is the number of
> columns that you input. For instance, if at node (1,i,j) (1 is the
> index of the degree of freedom), you have, say, 6 couplings, with
> nodes (1,i,j), (1,i+1,j), (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for
> example, then you need to set n=6
>
> Then you define the row number by naturally doing the following,
> inside a local loop :
>
> row(MatStencil_i,1) = i -1
> row(MatStencil_j,1) = j -1
> row(MatStencil_c,1) = 1 -1
>
> the -1 are here because FORTRAN indexing is different from the native
> C indexing. I put them on the right to make this more apparent.
>
> Then the column information.
For instance to declare the coupling with > node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you > will have to write (still within the same local loop on i and j) > > col(MatStencil_i,1) = i -1 > col(MatStencil_j,1) = j -1 > col(MatStencil_c,1) = 1 -1 > v(1) = whatever_it_is > > col(MatStencil_i,2) = i-1 -1 > col(MatStencil_j,2) = j -1 > col(MatStencil_c,2) = 1 -1 > v(2) = whatever_it_is > > col(MatStencil_i,3) = i -1 > col(MatStencil_j,3) = j -1 > col(MatStencil_c,3) = 2 -1 > v(3) = whatever_it_is > > ... > ... > .. > > ... > ... > ... > > Note that the index of the degree of freedom (or what field you are > coupling to), is indicated by MatStencil_c > > > Finally use MatSetValuesStencil > > ione = 1 > isix = 6 > call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) > > If it is not clear don't hesitate to ask more details. For me it > worked that way, I succesfully computed a Jacobian that way. It is > very sensitive. If you slightly depart from the right jacobian, you > will see a huge difference compared to using matrix free with > -snes_mf, so you can hardly make a mistake because you would see it. > That's how I finally got it to work. > > Best > > Timothee > > > 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay >: > > Hi, > > I'm modifying my 3d fortran code from MPI along 1 direction (z) to > MPI along 2 directions (y,z) > > Previously I was using MatSetValues with global indices. However, > now I'm using DM and global indices is much more difficult. > > I come across MatSetValuesStencil or MatSetValuesLocal. > > So what's the difference bet the one since they both seem to work > locally? > > Which is a simpler/better option? > > Is there an example in Fortran for MatSetValuesStencil? > > Do I also need to use DMDAGetAO together with MatSetValuesStencil > or MatSetValuesLocal? > > Thanks! > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timothee.nicolas at gmail.com Tue Aug 25 23:24:18 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Wed, 26 Aug 2015 13:24:18 +0900 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: <55DD3CC1.5070801@gmail.com> References: <55DD3CC1.5070801@gmail.com> Message-ID: What is the definition of ksta, kend, jsta, jend ? Etc ? You are parallelized only in j and k ? What I said about the "-1" holds only if you have translated the start and end points to FORTRAN numbering after getting the corners and ghost corners from the DMDA (see ex ex5f90.F from snes) Would you mind sending the complete routine with the complete definitions of ksta,kend,jsta,jend,and size_x ? Timothee 2015-08-26 13:12 GMT+09:00 TAY wee-beng : > Hi, > > I have wrote the routine for my Poisson eqn. I have only 1 DOF, which is > for pressure. The center cell is coupled with 6 other cells (north, south, > east, west, front, back), so together 7 couplings. > > size x/y/z = 4/8/10 > > *MatStencil :: row(4,1),col(4,7)* > > *PetscScalar :: value_insert(7)* > > *PetscInt :: ione,iseven* > > *ione = 1; iseven = 7* > > *do k=ksta,kend* > > * do j = jsta,jend* > > * do i=1,size_x* > > * row(MatStencil_i,1) = i - 1* > > * row(MatStencil_j,1) = j - 1* > > * row(MatStencil_k,1) = k - 1* > > * row(MatStencil_c,1) = 0 ! 
1 - 1* > > * value_insert = 0.d0* > > * if (i /= size_x) then* > > * col(MatStencil_i,3) = i + 1 - 1 !east* > > * col(MatStencil_j,3) = j - 1* > > * col(MatStencil_k,3) = k - 1* > > * col(MatStencil_c,3) = 0* > > * value_insert(3) = > (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W)* > > * end if* > > * if (i /= 1) then* > > * col(MatStencil_i,5) = i - 1 - 1 !west* > > * col(MatStencil_j,5) = j - 1* > > * col(MatStencil_k,5) = k - 1* > > * col(MatStencil_c,5) = 0* > > * value_insert(5) = > (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E)* > > * end if* > > * if (j /= size_y) then* > > * col(MatStencil_i,2) = i - 1 !north* > > * col(MatStencil_j,2) = j + 1 - 1* > > * col(MatStencil_k,2) = k - 1* > > * col(MatStencil_c,2) = 0* > > * value_insert(2) = > (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S)* > > * end if* > > * ...* > > * col(MatStencil_i,1) = i - 1* > > * col(MatStencil_j,1) = j - 1* > > * col(MatStencil_k,1) = k - 1* > > * col(MatStencil_c,1) = 0* > > * value_insert(1) = -value_insert(2) - value_insert(3) - > value_insert(4) - value_insert(5) - value_insert(6) - value_insert(7)* > > * call > MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)* > > * end do* > > * end do* > > * end do* > > but I got the error : > > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix. > > The error happens at i = 4, j = 1, k = 1. So I guess it has something to > do with the boundary condition. However, I can't figure out what's wrong. > Can someone help? > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 24/8/2015 5:54 PM, Timoth?e Nicolas wrote: > > Hi, > > ex5 of snes can give you an example of the two routines. > > The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version > ex5f90.F uses MatSetValuesLocal. > > However, I use MatSetValuesStencil also in Fortran, there is no problem, > and no need to mess around with DMDAGetAO, I think. > > To input values in the matrix, you need to do the following : > > ! Declare the matstencils for matrix columns and rows > MatStencil :: row(4,1),col(4,n) > ! Declare the quantity which will store the actual matrix elements > PetscScalar :: v(8) > > The first dimension in row and col is 4 to allow for 3 spatial dimensions > (even if you use only 2) plus one degree of freedom if you have several > fields in your DMDA. The second dimension is 1 for row (you input one row > at a time) and n for col, where n is the number of columns that you input. > For instance, if at node (1,i,j) (1 is the index of the degree of > freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), > (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set > n=6 > > Then you define the row number by naturally doing the following, inside a > local loop : > > row(MatStencil_i,1) = i -1 > row(MatStencil_j,1) = j -1 > row(MatStencil_c,1) = 1 -1 > > the -1 are here because FORTRAN indexing is different from the native C > indexing. I put them on the right to make this more apparent. > > Then the column information. 
For instance to declare the coupling with > node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will > have to write (still within the same local loop on i and j) > > col(MatStencil_i,1) = i -1 > col(MatStencil_j,1) = j -1 > col(MatStencil_c,1) = 1 -1 > v(1) = whatever_it_is > > col(MatStencil_i,2) = i-1 -1 > col(MatStencil_j,2) = j -1 > col(MatStencil_c,2) = 1 -1 > v(2) = whatever_it_is > > col(MatStencil_i,3) = i -1 > col(MatStencil_j,3) = j -1 > col(MatStencil_c,3) = 2 -1 > v(3) = whatever_it_is > > ... > ... > .. > > ... > ... > ... > > Note that the index of the degree of freedom (or what field you are > coupling to), is indicated by MatStencil_c > > > Finally use MatSetValuesStencil > > ione = 1 > isix = 6 > call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) > > If it is not clear don't hesitate to ask more details. For me it worked > that way, I succesfully computed a Jacobian that way. It is very sensitive. > If you slightly depart from the right jacobian, you will see a huge > difference compared to using matrix free with -snes_mf, so you can hardly > make a mistake because you would see it. That's how I finally got it to > work. > > Best > > Timothee > > > 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay : > >> Hi, >> >> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >> along 2 directions (y,z) >> >> Previously I was using MatSetValues with global indices. However, now I'm >> using DM and global indices is much more difficult. >> >> I come across MatSetValuesStencil or MatSetValuesLocal. >> >> So what's the difference bet the one since they both seem to work locally? >> >> Which is a simpler/better option? >> >> Is there an example in Fortran for MatSetValuesStencil? >> >> Do I also need to use DMDAGetAO together with MatSetValuesStencil or >> MatSetValuesLocal? >> >> Thanks! >> > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Aug 26 00:02:33 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Wed, 26 Aug 2015 13:02:33 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: <55DD3CC1.5070801@gmail.com> Message-ID: <55DD4869.2000006@gmail.com> Hi Timothee, Yes, I only parallelized in j and k. ksta,jsta are the starting k and j values. kend,jend are the ending k and j values. However, now I am using only 1 procs. I was going to resend you my code but then I realised my mistake. I used: */call MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)/**/ /* for all pts, including those at the boundary. Hence, those coupling outside the boundary is also included. I changed to: */call MatSetValuesStencil(A_mat,ione,row,ione,col(:,7),value_insert(7),INSERT_VALUES,ierr)/* so I am now entering values individually. Is there anyway I can use the 1st option to enter all the values together even those some pts are invalid. I think it should be faster. Can I somehow tell PETSc to ignore them? Thank you Yours sincerely, TAY wee-beng On 26/8/2015 12:24 PM, Timoth?e Nicolas wrote: > What is the definition of ksta, kend, jsta, jend ? Etc ? You are > parallelized only in j and k ? 
> > What I said about the "-1" holds only if you have translated the start > and end points to FORTRAN numbering after getting the corners and > ghost corners from the DMDA (see ex ex5f90.F from snes) > > Would you mind sending the complete routine with the complete > definitions of ksta,kend,jsta,jend,and size_x ? > > Timothee > > 2015-08-26 13:12 GMT+09:00 TAY wee-beng >: > > Hi, > > I have wrote the routine for my Poisson eqn. I have only 1 DOF, > which is for pressure. The center cell is coupled with 6 other > cells (north, south, east, west, front, back), so together 7 > couplings. > > size x/y/z = 4/8/10 > > */MatStencil :: row(4,1),col(4,7)/**/ > /**/ > /**/PetscScalar :: value_insert(7)/**/ > /**/ > /**/PetscInt :: ione,iseven/**/ > /**/ > /**/ione = 1; iseven = 7/**/ > /**/ > /**/do k=ksta,kend/**/ > /**/ > /**/ do j = jsta,jend/**/ > /**/ > /**/ do i=1,size_x/**/ > /**//**/ > /**/ row(MatStencil_i,1) = i - 1/**/ > /**//**/ > /**/ row(MatStencil_j,1) = j - 1/**/ > /**//**/ > /**/ row(MatStencil_k,1) = k - 1/**/ > /**//**/ > /**/ row(MatStencil_c,1) = 0 ! 1 - 1/**/ > /**//**/ > /**/ value_insert = 0.d0/**/ > /**//**/ > /**/ if (i /= size_x) then/**/ > /**//**/ > /**/ col(MatStencil_i,3) = i + 1 - 1 !east/**/ > /**//**/ > /**/ col(MatStencil_j,3) = j - 1/**/ > /**//**/ > /**/ col(MatStencil_k,3) = k - 1/**/ > /**//**/ > /**/ col(MatStencil_c,3) = 0/**/ > /**//**/ > /**/ value_insert(3) = > (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W)/**/ > /**//**/ > /**/ end if/**/ > /**//**/ > /**/ if (i /= 1) then/**/ > /**//**/ > /**/ col(MatStencil_i,5) = i - 1 - 1 !west/**/ > /**//**/ > /**/ col(MatStencil_j,5) = j - 1/**/ > /**//**/ > /**/ col(MatStencil_k,5) = k - 1/**/ > /**//**/ > /**/ col(MatStencil_c,5) = 0/**/ > /**//**/ > /**/ value_insert(5) = > (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E)/**/ > /**//**/ > /**/ end if/**/ > /**//**/ > /**/ if (j /= size_y) then/**/ > /**//**/ > /**/ col(MatStencil_i,2) = i - 1 !north/**/ > /**//**/ > /**/ col(MatStencil_j,2) = j + 1 - 1/**/ > /**//**/ > /**/ col(MatStencil_k,2) = k - 1/**/ > /**//**/ > /**/ col(MatStencil_c,2) = 0/**/ > /**//**/ > /**/ value_insert(2) = > (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S)/**/ > /**//**/ > /**/ end if/**/ > /**//**/ > /**/ .../**/ > /**//**/ > /**/ col(MatStencil_i,1) = i - 1/**/ > /**//**/ > /**/ col(MatStencil_j,1) = j - 1/**/ > /**//**/ > /**/ col(MatStencil_k,1) = k - 1/**/ > /**//**/ > /**/ col(MatStencil_c,1) = 0/**/ > /**//**/ > /**/ value_insert(1) = -value_insert(2) - > value_insert(3) - value_insert(4) - value_insert(5) - > value_insert(6) - value_insert(7)/**/ > /**//**/ > /**/ call > MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)/**/ > /**//**/ > /**/ end do/**/ > /**//**/ > /**/ end do/**/ > /**//**/ > /**/ end do/* > > but I got the error : > > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix. > > The error happens at i = 4, j = 1, k = 1. So I guess it has > something to do with the boundary condition. However, I can't > figure out what's wrong. Can someone help? > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 24/8/2015 5:54 PM, Timoth?e Nicolas wrote: >> Hi, >> >> ex5 of snes can give you an example of the two routines. >> >> The C version ex5.c uses MatSetValuesStencil whereas the >> Fortran90 version ex5f90.F uses MatSetValuesLocal. >> >> However, I use MatSetValuesStencil also in Fortran, there is no >> problem, and no need to mess around with DMDAGetAO, I think. 
>> >> To input values in the matrix, you need to do the following : >> >> ! Declare the matstencils for matrix columns and rows >> MatStencil :: row(4,1),col(4,n) >> ! Declare the quantity which will store the actual matrix elements >> PetscScalar :: v(8) >> >> The first dimension in row and col is 4 to allow for 3 spatial >> dimensions (even if you use only 2) plus one degree of freedom if >> you have several fields in your DMDA. The second dimension is 1 >> for row (you input one row at a time) and n for col, where n is >> the number of columns that you input. For instance, if at node >> (1,i,j) (1 is the index of the degree of freedom), you have, >> say, 6 couplings, with nodes (1,i,j), (1,i+1,j), (1,i-1,j), >> (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set n=6 >> >> Then you define the row number by naturally doing the following, >> inside a local loop : >> >> row(MatStencil_i,1) = i -1 >> row(MatStencil_j,1) = j -1 >> row(MatStencil_c,1) = 1 -1 >> >> the -1 are here because FORTRAN indexing is different from the >> native C indexing. I put them on the right to make this more >> apparent. >> >> Then the column information. For instance to declare the coupling >> with node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the >> rest) you will have to write (still within the same local loop on >> i and j) >> >> col(MatStencil_i,1) = i -1 >> col(MatStencil_j,1) = j -1 >> col(MatStencil_c,1) = 1 -1 >> v(1) = whatever_it_is >> >> col(MatStencil_i,2) = i-1 -1 >> col(MatStencil_j,2) = j -1 >> col(MatStencil_c,2) = 1 -1 >> v(2) = whatever_it_is >> >> col(MatStencil_i,3) = i -1 >> col(MatStencil_j,3) = j -1 >> col(MatStencil_c,3) = 2 -1 >> v(3) = whatever_it_is >> >> ... >> ... >> .. >> >> ... >> ... >> ... >> >> Note that the index of the degree of freedom (or what field you >> are coupling to), is indicated by MatStencil_c >> >> >> Finally use MatSetValuesStencil >> >> ione = 1 >> isix = 6 >> call >> MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) >> >> If it is not clear don't hesitate to ask more details. For me it >> worked that way, I succesfully computed a Jacobian that way. It >> is very sensitive. If you slightly depart from the right >> jacobian, you will see a huge difference compared to using >> matrix free with -snes_mf, so you can hardly make a mistake >> because you would see it. That's how I finally got it to work. >> >> Best >> >> Timothee >> >> >> 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay > >: >> >> Hi, >> >> I'm modifying my 3d fortran code from MPI along 1 direction >> (z) to MPI along 2 directions (y,z) >> >> Previously I was using MatSetValues with global indices. >> However, now I'm using DM and global indices is much more >> difficult. >> >> I come across MatSetValuesStencil or MatSetValuesLocal. >> >> So what's the difference bet the one since they both seem to >> work locally? >> >> Which is a simpler/better option? >> >> Is there an example in Fortran for MatSetValuesStencil? >> >> Do I also need to use DMDAGetAO together >> with MatSetValuesStencil or MatSetValuesLocal? >> >> Thanks! >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From timothee.nicolas at gmail.com Wed Aug 26 00:15:10 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Wed, 26 Aug 2015 14:15:10 +0900 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: <55DD4869.2000006@gmail.com> References: <55DD3CC1.5070801@gmail.com> <55DD4869.2000006@gmail.com> Message-ID: I don't really understand what you say, but it does not sound right. You can enter the boundary points separately and then the points outside the boundary on separate calls, like this : do j=user%ys,user%ye do i=user%xs,user%xe if (i.eq.1 .or. i.eq.user%mx .or. j .eq. 1 .or. j .eq. user%my) then ! boundary point row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = one call MatSetValuesStencil(jac_prec,ione,row,ione,col,v, & & INSERT_VALUES,ierr) else row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = undemi*dxm1*(vx_ip1j-vx_im1j) + two*user%nu*(dxm1**2+dym1**2) col(MatStencil_i,2) = i+1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = undemi*dxm1*(vx_ij-vx_ip1j) - user%nu*dxm1**2 col(MatStencil_i,3) = i-1 -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 1 -1 v(3) = -undemi*dxm1*(vx_ij-vx_im1j) - user%nu*dxm1**2 col(MatStencil_i,4) = i -1 col(MatStencil_j,4) = j+1 -1 col(MatStencil_c,4) = 1 -1 v(4) = undemi*dym1*vy_ij - user%nu*dym1**2 col(MatStencil_i,5) = i -1 col(MatStencil_j,5) = j-1 -1 col(MatStencil_c,5) = 1 -1 v(5) = -undemi*dym1*vy_ij - user%nu*dym1**2 col(MatStencil_i,6) = i -1 col(MatStencil_j,6) = j -1 col(MatStencil_c,6) = 2 -1 v(6) = undemi*dym1*(vx_ijp1-vx_ijm1) col(MatStencil_i,7) = i+1 -1 col(MatStencil_j,7) = j -1 col(MatStencil_c,7) = 2 -1 v(7) = -undemi*dxm1*vy_ip1j col(MatStencil_i,8) = i-1 -1 col(MatStencil_j,8) = j -1 col(MatStencil_c,8) = 2 -1 v(8) = undemi*dxm1*vy_im1j call MatSetValuesStencil(jac_prec,ione,row,ieight,col,v, & & INSERT_VALUES,ierr) endif enddo enddo Timothee 2015-08-26 14:02 GMT+09:00 TAY wee-beng : > Hi Timothee, > > Yes, I only parallelized in j and k. ksta,jsta are the starting k and j > values. kend,jend are the ending k and j values. > > However, now I am using only 1 procs. > > I was going to resend you my code but then I realised my mistake. I used: > > *call > MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)* > > for all pts, including those at the boundary. Hence, those coupling > outside the boundary is also included. > > I changed to: > > *call > MatSetValuesStencil(A_mat,ione,row,ione,col(:,7),value_insert(7),INSERT_VALUES,ierr)* > > so I am now entering values individually. > > Is there anyway I can use the 1st option to enter all the values together > even those some pts are invalid. I think it should be faster. Can I somehow > tell PETSc to ignore them? > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 26/8/2015 12:24 PM, Timoth?e Nicolas wrote: > > What is the definition of ksta, kend, jsta, jend ? Etc ? You are > parallelized only in j and k ? > > What I said about the "-1" holds only if you have translated the start and > end points to FORTRAN numbering after getting the corners and ghost corners > from the DMDA (see ex ex5f90.F from snes) > > Would you mind sending the complete routine with the complete definitions > of ksta,kend,jsta,jend,and size_x ? 
> > Timothee > > 2015-08-26 13:12 GMT+09:00 TAY wee-beng : > >> Hi, >> >> I have wrote the routine for my Poisson eqn. I have only 1 DOF, which is >> for pressure. The center cell is coupled with 6 other cells (north, south, >> east, west, front, back), so together 7 couplings. >> >> size x/y/z = 4/8/10 >> >> *MatStencil :: row(4,1),col(4,7)* >> >> *PetscScalar :: value_insert(7)* >> >> *PetscInt :: ione,iseven* >> >> *ione = 1; iseven = 7* >> >> *do k=ksta,kend* >> >> * do j = jsta,jend* >> >> * do i=1,size_x* >> >> * row(MatStencil_i,1) = i - 1* >> >> * row(MatStencil_j,1) = j - 1* >> >> * row(MatStencil_k,1) = k - 1* >> >> * row(MatStencil_c,1) = 0 ! 1 - 1* >> >> * value_insert = 0.d0* >> >> * if (i /= size_x) then* >> >> * col(MatStencil_i,3) = i + 1 - 1 !east* >> >> * col(MatStencil_j,3) = j - 1* >> >> * col(MatStencil_k,3) = k - 1* >> >> * col(MatStencil_c,3) = 0* >> >> * value_insert(3) = >> (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W)* >> >> * end if* >> >> * if (i /= 1) then* >> >> * col(MatStencil_i,5) = i - 1 - 1 !west* >> >> * col(MatStencil_j,5) = j - 1* >> >> * col(MatStencil_k,5) = k - 1* >> >> * col(MatStencil_c,5) = 0* >> >> * value_insert(5) = >> (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E)* >> >> * end if* >> >> * if (j /= size_y) then* >> >> * col(MatStencil_i,2) = i - 1 !north* >> >> * col(MatStencil_j,2) = j + 1 - 1* >> >> * col(MatStencil_k,2) = k - 1* >> >> * col(MatStencil_c,2) = 0* >> >> * value_insert(2) = >> (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S)* >> >> * end if* >> >> * ...* >> >> * col(MatStencil_i,1) = i - 1* >> >> * col(MatStencil_j,1) = j - 1* >> >> * col(MatStencil_k,1) = k - 1* >> >> * col(MatStencil_c,1) = 0* >> >> * value_insert(1) = -value_insert(2) - value_insert(3) - >> value_insert(4) - value_insert(5) - value_insert(6) - value_insert(7)* >> >> * call >> MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr)* >> >> * end do* >> >> * end do* >> >> * end do* >> >> but I got the error : >> >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix. >> >> The error happens at i = 4, j = 1, k = 1. So I guess it has something to >> do with the boundary condition. However, I can't figure out what's wrong. >> Can someone help? >> >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 24/8/2015 5:54 PM, Timoth?e Nicolas wrote: >> >> Hi, >> >> ex5 of snes can give you an example of the two routines. >> >> The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 >> version ex5f90.F uses MatSetValuesLocal. >> >> However, I use MatSetValuesStencil also in Fortran, there is no problem, >> and no need to mess around with DMDAGetAO, I think. >> >> To input values in the matrix, you need to do the following : >> >> ! Declare the matstencils for matrix columns and rows >> MatStencil :: row(4,1),col(4,n) >> ! Declare the quantity which will store the actual matrix elements >> PetscScalar :: v(8) >> >> The first dimension in row and col is 4 to allow for 3 spatial dimensions >> (even if you use only 2) plus one degree of freedom if you have several >> fields in your DMDA. The second dimension is 1 for row (you input one row >> at a time) and n for col, where n is the number of columns that you input. 
>> For instance, if at node (1,i,j) (1 is the index of the degree of >> freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), >> (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set >> n=6 >> >> Then you define the row number by naturally doing the following, inside a >> local loop : >> >> row(MatStencil_i,1) = i -1 >> row(MatStencil_j,1) = j -1 >> row(MatStencil_c,1) = 1 -1 >> >> the -1 are here because FORTRAN indexing is different from the native C >> indexing. I put them on the right to make this more apparent. >> >> Then the column information. For instance to declare the coupling with >> node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will >> have to write (still within the same local loop on i and j) >> >> col(MatStencil_i,1) = i -1 >> col(MatStencil_j,1) = j -1 >> col(MatStencil_c,1) = 1 -1 >> v(1) = whatever_it_is >> >> col(MatStencil_i,2) = i-1 -1 >> col(MatStencil_j,2) = j -1 >> col(MatStencil_c,2) = 1 -1 >> v(2) = whatever_it_is >> >> col(MatStencil_i,3) = i -1 >> col(MatStencil_j,3) = j -1 >> col(MatStencil_c,3) = 2 -1 >> v(3) = whatever_it_is >> >> ... >> ... >> .. >> >> ... >> ... >> ... >> >> Note that the index of the degree of freedom (or what field you are >> coupling to), is indicated by MatStencil_c >> >> >> Finally use MatSetValuesStencil >> >> ione = 1 >> isix = 6 >> call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) >> >> If it is not clear don't hesitate to ask more details. For me it worked >> that way, I succesfully computed a Jacobian that way. It is very sensitive. >> If you slightly depart from the right jacobian, you will see a huge >> difference compared to using matrix free with -snes_mf, so you can hardly >> make a mistake because you would see it. That's how I finally got it to >> work. >> >> Best >> >> Timothee >> >> >> 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay < >> zonexo at gmail.com>: >> >>> Hi, >>> >>> I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI >>> along 2 directions (y,z) >>> >>> Previously I was using MatSetValues with global indices. However, now >>> I'm using DM and global indices is much more difficult. >>> >>> I come across MatSetValuesStencil or MatSetValuesLocal. >>> >>> So what's the difference bet the one since they both seem to work >>> locally? >>> >>> Which is a simpler/better option? >>> >>> Is there an example in Fortran for MatSetValuesStencil? >>> >>> Do I also need to use DMDAGetAO together with MatSetValuesStencil or >>> MatSetValuesLocal? >>> >>> Thanks! >>> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.pozin at inria.fr Wed Aug 26 13:00:36 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 20:00:36 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> Message-ID: <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> Dear all, Given a set of vectors V1, V2,...,Vn, is there an efficient way to form the dense matrix [V1 V2 ... Vn]? 
Thanks, Regards Nicolas From jed at jedbrown.org Wed Aug 26 13:38:37 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 26 Aug 2015 12:38:37 -0600 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> Message-ID: <87pp2999oy.fsf@jedbrown.org> Nicolas Pozin writes: > Given a set of vectors V1, V2,...,Vn, is there an efficient way to form the dense matrix [V1 V2 ... Vn]? What do you want to do with that matrix? The vector representation is pretty flexible and the memory semantics are similar unless you store the dense matrix row-aligned (not the default). -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From nicolas.pozin at inria.fr Wed Aug 26 15:06:32 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 22:06:32 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <87pp2999oy.fsf@jedbrown.org> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> Message-ID: <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> Thank you for this answer. What I want to do is to get the lines of this matrix and store them in vectors. ----- Mail original ----- > De: "Jed Brown" > ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov > Envoy?: Mercredi 26 Ao?t 2015 20:38:37 > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > Nicolas Pozin writes: > > Given a set of vectors V1, V2,...,Vn, is there an efficient way to form the > > dense matrix [V1 V2 ... Vn]? > > What do you want to do with that matrix? The vector representation is > pretty flexible and the memory semantics are similar unless you store > the dense matrix row-aligned (not the default). > From bsmith at mcs.anl.gov Wed Aug 26 15:21:04 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Aug 2015 15:21:04 -0500 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> Message-ID: > On Aug 26, 2015, at 3:06 PM, Nicolas Pozin wrote: > > Thank you for this answer. > > What I want to do is to get the lines of this matrix and store them in vectors. If you want to treat the columns of the dense matrix as vectors then use MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the first row of each column of the obtained array (PETSc dense matrices are stored by column; same as for example LAPACK). But if you explained more why you want to treat something sometimes as a Mat (which is a linear operator on vectors) and sometimes as vectors we might be able to suggest how to organize your code. 
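A rough sequential sketch of the column-as-Vec idea described above, assuming a SeqDense matrix with the default column-major layout whose leading dimension equals the number of rows; the helper name and the norm computation are made up for illustration only:

#include <petscmat.h>

/* Wrap each column of a dense Mat as a Vec without copying, using
   MatDenseGetArray(), and print the 2-norm of each column. */
PetscErrorCode PrintColumnNorms(Mat A)
{
  PetscScalar    *a;
  PetscReal      nrm;
  PetscInt       m,n,j;
  Vec            col;
  PetscErrorCode ierr;

  ierr = MatGetSize(A,&m,&n);CHKERRQ(ierr);
  ierr = MatDenseGetArray(A,&a);CHKERRQ(ierr);       /* stored by column */
  for (j=0; j<n; j++) {
    /* column j starts at a + j*m because the leading dimension is m here */
    ierr = VecCreateSeqWithArray(PETSC_COMM_SELF,1,m,a+j*m,&col);CHKERRQ(ierr);
    ierr = VecNorm(col,NORM_2,&nrm);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF,"column %D: norm %g\n",j,(double)nrm);CHKERRQ(ierr);
    ierr = VecDestroy(&col);CHKERRQ(ierr);
  }
  ierr = MatDenseRestoreArray(A,&a);CHKERRQ(ierr);
  return 0;
}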
Barry > > > ----- Mail original ----- >> De: "Jed Brown" >> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov >> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 >> Objet: Re: [petsc-users] forming a matrix from a set of vectors >> >> Nicolas Pozin writes: >>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form the >>> dense matrix [V1 V2 ... Vn]? >> >> What do you want to do with that matrix? The vector representation is >> pretty flexible and the memory semantics are similar unless you store >> the dense matrix row-aligned (not the default). >> From nicolas.pozin at inria.fr Wed Aug 26 15:41:07 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 22:41:07 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> Message-ID: <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Actually I want to get the diagonal of the matrix : transpose(d)*A*d where -d is a sparse matrix of size (n1,m1) -A is a dense symetric matrix of size size (n1,n1) with m1 very big compared to n1 (1 million against a few dozens). The problem is too big to allow the use of MatMatMult. What I planned to do : -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th column of A : quick since d is sparse and n1 is small -deduce the matrix transpose(d)*A = [V1 ... Vn] and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through -transpose([V1 ...Vn]) and get its columns C1 ... Cn -conclude on the i-th diagonal value which is the i-th component of tranpose(d)*Ci ----- Mail original ----- > De: "Barry Smith" > ?: "Nicolas Pozin" > Cc: "Jed Brown" , petsc-users at mcs.anl.gov > Envoy?: Mercredi 26 Ao?t 2015 22:21:04 > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > > > On Aug 26, 2015, at 3:06 PM, Nicolas Pozin wrote: > > > > Thank you for this answer. > > > > What I want to do is to get the lines of this matrix and store them in > > vectors. > > If you want to treat the columns of the dense matrix as vectors then use > MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the > first row of each column of the obtained array (PETSc dense matrices are > stored by column; same as for example LAPACK). > > But if you explained more why you want to treat something sometimes as a > Mat (which is a linear operator on vectors) and sometimes as vectors we > might be able to suggest how to organize your code. > > Barry > > > > > > > ----- Mail original ----- > >> De: "Jed Brown" > >> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov > >> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 > >> Objet: Re: [petsc-users] forming a matrix from a set of vectors > >> > >> Nicolas Pozin writes: > >>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form > >>> the > >>> dense matrix [V1 V2 ... Vn]? > >> > >> What do you want to do with that matrix? The vector representation is > >> pretty flexible and the memory semantics are similar unless you store > >> the dense matrix row-aligned (not the default). 
> >> > > From bsmith at mcs.anl.gov Wed Aug 26 15:58:12 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Aug 2015 15:58:12 -0500 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Message-ID: <154030DD-BAED-4F54-829D-0B2F089301F5@mcs.anl.gov> Since A is tiny I am assuming you are doing this sequentially only? Do you have d stored as a AIJ matrix or is transpose(d) stored as a AIJ matrix? > On Aug 26, 2015, at 3:41 PM, Nicolas Pozin wrote: > > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > -d is a sparse matrix of size (n1,m1) > -A is a dense symetric matrix of size size (n1,n1) > with m1 very big compared to n1 (1 million against a few dozens). > > The problem is too big to allow the use of MatMatMult. > What I planned to do : > -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th column of A : quick since d is sparse and n1 is small > -deduce the matrix transpose(d)*A = [V1 ... Vn] > and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through > -transpose([V1 ...Vn]) and get its columns C1 ... Cn > -conclude on the i-th diagonal value which is the i-th component of tranpose(d)*Ci > > > > ----- Mail original ----- >> De: "Barry Smith" >> ?: "Nicolas Pozin" >> Cc: "Jed Brown" , petsc-users at mcs.anl.gov >> Envoy?: Mercredi 26 Ao?t 2015 22:21:04 >> Objet: Re: [petsc-users] forming a matrix from a set of vectors >> >> >>> On Aug 26, 2015, at 3:06 PM, Nicolas Pozin wrote: >>> >>> Thank you for this answer. >>> >>> What I want to do is to get the lines of this matrix and store them in >>> vectors. >> >> If you want to treat the columns of the dense matrix as vectors then use >> MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the >> first row of each column of the obtained array (PETSc dense matrices are >> stored by column; same as for example LAPACK). >> >> But if you explained more why you want to treat something sometimes as a >> Mat (which is a linear operator on vectors) and sometimes as vectors we >> might be able to suggest how to organize your code. >> >> Barry >> >>> >>> >>> ----- Mail original ----- >>>> De: "Jed Brown" >>>> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov >>>> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 >>>> Objet: Re: [petsc-users] forming a matrix from a set of vectors >>>> >>>> Nicolas Pozin writes: >>>>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form >>>>> the >>>>> dense matrix [V1 V2 ... Vn]? >>>> >>>> What do you want to do with that matrix? The vector representation is >>>> pretty flexible and the memory semantics are similar unless you store >>>> the dense matrix row-aligned (not the default). 
>>>> >> >> From nicolas.pozin at inria.fr Wed Aug 26 16:01:29 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 23:01:29 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <154030DD-BAED-4F54-829D-0B2F089301F5@mcs.anl.gov> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> <154030DD-BAED-4F54-829D-0B2F089301F5@mcs.anl.gov> Message-ID: <1199295077.10054108.1440622889251.JavaMail.zimbra@inria.fr> Yes, this is sequentially and yes again, d is stored as a AIJ matrix ----- Mail original ----- > De: "Barry Smith" > ?: "Nicolas Pozin" > Cc: "Jed Brown" , petsc-users at mcs.anl.gov > Envoy?: Mercredi 26 Ao?t 2015 22:58:12 > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > > Since A is tiny I am assuming you are doing this sequentially only? > > Do you have d stored as a AIJ matrix or is transpose(d) stored as a AIJ > matrix? > > > > > > On Aug 26, 2015, at 3:41 PM, Nicolas Pozin wrote: > > > > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > > -d is a sparse matrix of size (n1,m1) > > -A is a dense symetric matrix of size size (n1,n1) > > with m1 very big compared to n1 (1 million against a few dozens). > > > > The problem is too big to allow the use of MatMatMult. > > What I planned to do : > > -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th > > column of A : quick since d is sparse and n1 is small > > -deduce the matrix transpose(d)*A = [V1 ... Vn] > > and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through > > -transpose([V1 ...Vn]) and get its columns C1 ... Cn > > -conclude on the i-th diagonal value which is the i-th component of > > tranpose(d)*Ci > > > > > > > > ----- Mail original ----- > >> De: "Barry Smith" > >> ?: "Nicolas Pozin" > >> Cc: "Jed Brown" , petsc-users at mcs.anl.gov > >> Envoy?: Mercredi 26 Ao?t 2015 22:21:04 > >> Objet: Re: [petsc-users] forming a matrix from a set of vectors > >> > >> > >>> On Aug 26, 2015, at 3:06 PM, Nicolas Pozin > >>> wrote: > >>> > >>> Thank you for this answer. > >>> > >>> What I want to do is to get the lines of this matrix and store them in > >>> vectors. > >> > >> If you want to treat the columns of the dense matrix as vectors then use > >> MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the > >> first row of each column of the obtained array (PETSc dense matrices are > >> stored by column; same as for example LAPACK). > >> > >> But if you explained more why you want to treat something sometimes as a > >> Mat (which is a linear operator on vectors) and sometimes as vectors we > >> might be able to suggest how to organize your code. > >> > >> Barry > >> > >>> > >>> > >>> ----- Mail original ----- > >>>> De: "Jed Brown" > >>>> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov > >>>> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 > >>>> Objet: Re: [petsc-users] forming a matrix from a set of vectors > >>>> > >>>> Nicolas Pozin writes: > >>>>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form > >>>>> the > >>>>> dense matrix [V1 V2 ... Vn]? > >>>> > >>>> What do you want to do with that matrix? 
The vector representation is > >>>> pretty flexible and the memory semantics are similar unless you store > >>>> the dense matrix row-aligned (not the default). > >>>> > >> > >> > > From patrick.sanan at gmail.com Wed Aug 26 16:02:41 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 26 Aug 2015 23:02:41 +0200 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Message-ID: On Wed, Aug 26, 2015 at 10:41 PM, Nicolas Pozin wrote: > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > -d is a sparse matrix of size (n1,m1) > -A is a dense symetric matrix of size size (n1,n1) > with m1 very big compared to n1 (1 million against a few dozens). > If I read this correctly, another way to phrase what you need is ||d_i||_A^2 = , for a few dozen values of i . Naively you could do that by iterating through an array of Vec objects (which need not all be stored in memory simultaneously), calling MatMult followed by VecDot. You could perhaps get more clever later (if the size of the system justifies it) by doing things like using non-blocking/split versions of VecDot (or VecMDot) so that you can overlap the matrix multiplications with the dot products. > > The problem is too big to allow the use of MatMatMult. > What I planned to do : > -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th > column of A : quick since d is sparse and n1 is small > -deduce the matrix transpose(d)*A = [V1 ... Vn] > and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through > -transpose([V1 ...Vn]) and get its columns C1 ... Cn > -conclude on the i-th diagonal value which is the i-th component of > tranpose(d)*Ci > > > > ----- Mail original ----- > > De: "Barry Smith" > > ?: "Nicolas Pozin" > > Cc: "Jed Brown" , petsc-users at mcs.anl.gov > > Envoy?: Mercredi 26 Ao?t 2015 22:21:04 > > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > > > > > > On Aug 26, 2015, at 3:06 PM, Nicolas Pozin > wrote: > > > > > > Thank you for this answer. > > > > > > What I want to do is to get the lines of this matrix and store them in > > > vectors. > > > > If you want to treat the columns of the dense matrix as vectors then > use > > MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to > the > > first row of each column of the obtained array (PETSc dense matrices > are > > stored by column; same as for example LAPACK). > > > > But if you explained more why you want to treat something sometimes as > a > > Mat (which is a linear operator on vectors) and sometimes as vectors we > > might be able to suggest how to organize your code. > > > > Barry > > > > > > > > > > > ----- Mail original ----- > > >> De: "Jed Brown" > > >> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov > > >> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 > > >> Objet: Re: [petsc-users] forming a matrix from a set of vectors > > >> > > >> Nicolas Pozin writes: > > >>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to > form > > >>> the > > >>> dense matrix [V1 V2 ... Vn]? > > >> > > >> What do you want to do with that matrix? 
The vector representation is > > >> pretty flexible and the memory semantics are similar unless you store > > >> the dense matrix row-aligned (not the default). > > >> > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Wed Aug 26 16:04:17 2015 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Wed, 26 Aug 2015 23:04:17 +0200 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Message-ID: On Wed, Aug 26, 2015 at 11:02 PM, Patrick Sanan wrote: > > > On Wed, Aug 26, 2015 at 10:41 PM, Nicolas Pozin > wrote: > >> Actually I want to get the diagonal of the matrix : transpose(d)*A*d where >> -d is a sparse matrix of size (n1,m1) >> -A is a dense symetric matrix of size size (n1,n1) >> with m1 very big compared to n1 (1 million against a few dozens). >> > If I read this correctly, another way to phrase what you need is > ||d_i||_A^2 = , for a few dozen values of i . Naively you could > do that by iterating through an array of Vec objects (which need not all be > stored in memory simultaneously), calling MatMult followed by VecDot. You > could perhaps get more clever later (if the size of the system justifies > it) by doing things like using non-blocking/split versions of VecDot (or > VecMDot) so that you can overlap the matrix multiplications with the dot > products. > Ah, sorry, I had the sparsity of A and d reversed in my reading. > >> The problem is too big to allow the use of MatMatMult. >> What I planned to do : >> -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th >> column of A : quick since d is sparse and n1 is small >> -deduce the matrix transpose(d)*A = [V1 ... Vn] >> and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through >> -transpose([V1 ...Vn]) and get its columns C1 ... Cn >> -conclude on the i-th diagonal value which is the i-th component of >> tranpose(d)*Ci >> >> >> >> ----- Mail original ----- >> > De: "Barry Smith" >> > ?: "Nicolas Pozin" >> > Cc: "Jed Brown" , petsc-users at mcs.anl.gov >> > Envoy?: Mercredi 26 Ao?t 2015 22:21:04 >> > Objet: Re: [petsc-users] forming a matrix from a set of vectors >> > >> > >> > > On Aug 26, 2015, at 3:06 PM, Nicolas Pozin >> wrote: >> > > >> > > Thank you for this answer. >> > > >> > > What I want to do is to get the lines of this matrix and store them in >> > > vectors. >> > >> > If you want to treat the columns of the dense matrix as vectors then >> use >> > MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to >> the >> > first row of each column of the obtained array (PETSc dense matrices >> are >> > stored by column; same as for example LAPACK). >> > >> > But if you explained more why you want to treat something sometimes >> as a >> > Mat (which is a linear operator on vectors) and sometimes as vectors >> we >> > might be able to suggest how to organize your code. 
>> > >> > Barry >> > >> > > >> > > >> > > ----- Mail original ----- >> > >> De: "Jed Brown" >> > >> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov >> > >> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 >> > >> Objet: Re: [petsc-users] forming a matrix from a set of vectors >> > >> >> > >> Nicolas Pozin writes: >> > >>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to >> form >> > >>> the >> > >>> dense matrix [V1 V2 ... Vn]? >> > >> >> > >> What do you want to do with that matrix? The vector representation >> is >> > >> pretty flexible and the memory semantics are similar unless you store >> > >> the dense matrix row-aligned (not the default). >> > >> >> > >> > >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Wed Aug 26 16:45:05 2015 From: jed at jedbrown.org (Jed Brown) Date: Wed, 26 Aug 2015 15:45:05 -0600 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> Message-ID: <87k2sh9126.fsf@jedbrown.org> Nicolas Pozin writes: > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > -d is a sparse matrix of size (n1,m1) > -A is a dense symetric matrix of size size (n1,n1) > with m1 very big compared to n1 (1 million against a few dozens). The result will be m1 ? m1, but at most rank n1. Why would you want that monster as a matrix? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 818 bytes Desc: not available URL: From nicolas.pozin at inria.fr Wed Aug 26 16:55:38 2015 From: nicolas.pozin at inria.fr (Nicolas Pozin) Date: Wed, 26 Aug 2015 23:55:38 +0200 (CEST) Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <87k2sh9126.fsf@jedbrown.org> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> <87k2sh9126.fsf@jedbrown.org> Message-ID: <457017974.10055822.1440626138775.JavaMail.zimbra@inria.fr> I'm working on a finite element system of the type (A+B)X=Y where A is a classic sparse symetric matrix and B is this transpose(d)*A*d. All the degrees of freedom are coupled (B dense), this is a physical property of the problem I deal with... To solve it I use a conjugate gradient with jacobi preconditionner (which proves to be satisying here) . So I need the diagonal of B... and for now this is clearly the most time-consuming part of my code. ----- Mail original ----- > De: "Jed Brown" > ?: "Nicolas Pozin" , "Barry Smith" > Cc: petsc-users at mcs.anl.gov > Envoy?: Mercredi 26 Ao?t 2015 23:45:05 > Objet: Re: [petsc-users] forming a matrix from a set of vectors > > Nicolas Pozin writes: > > > Actually I want to get the diagonal of the matrix : transpose(d)*A*d where > > -d is a sparse matrix of size (n1,m1) > > -A is a dense symetric matrix of size size (n1,n1) > > with m1 very big compared to n1 (1 million against a few dozens). > > The result will be m1 ? m1, but at most rank n1. 
Why would you want > that monster as a matrix? > From bsmith at mcs.anl.gov Wed Aug 26 22:00:44 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 26 Aug 2015 22:00:44 -0500 Subject: [petsc-users] forming a matrix from a set of vectors In-Reply-To: <1199295077.10054108.1440622889251.JavaMail.zimbra@inria.fr> References: <1728363781.5802570.1438184949520.JavaMail.zimbra@inria.fr> <1236537078.10036213.1440612036075.JavaMail.zimbra@inria.fr> <87pp2999oy.fsf@jedbrown.org> <1398659529.10050070.1440619592053.JavaMail.zimbra@inria.fr> <1142490980.10053474.1440621667465.JavaMail.zimbra@inria.fr> <154030DD-BAED-4F54-829D-0B2F089301F5@mcs.anl.gov> <1199295077.10054108.1440622889251.JavaMail.zimbra@inria.fr> Message-ID: <44BE97E7-4D65-46BE-B8E0-0B77D808DC0B@mcs.anl.gov> Nicolas, I believe the best way to do this is to write a code that specifically knows the SeqAIJ matrix structure and does the product directly with that data structure instead of trying to "cook something up" using the standard higher level PETSc routines. So you need to have #include <../src/mat/impls/aij/seq/aij.h> directly in your code. See for example src/mat/impls/aij/seq/aij.c Next you need to store c = transpose(d) since storing the matrix d in PETSc sparse format (which is row based) is terrible for the operation you want to perform. So ideally just create the matrix c and put its values in; if that is too difficult you can use MatTranspose() to get c from d. Now to the algorithm T = c *(A*transpose(c)) note that the columns of transpose(c) (i.e. the columns of d) are the rows of c, let c(i,*) represent a row of c. The diagonals of T are T(i,i) = c(i,*) * (A * c(i,*)) Let s = A * c(i,*) now to implement this just loop over i. Compute s (an array) using directly the data in the sparse matrix c (use MatDenseGetArray()) to access the values in A) then compute c(i,*) * s; then move on to the next i. This will be very efficient (computes only exactly what is needed) and requires only the array s which is small (size of number of rows of A, not c). Barry > On Aug 26, 2015, at 4:01 PM, Nicolas Pozin wrote: > > Yes, this is sequentially and yes again, d is stored as a AIJ matrix > > ----- Mail original ----- >> De: "Barry Smith" >> ?: "Nicolas Pozin" >> Cc: "Jed Brown" , petsc-users at mcs.anl.gov >> Envoy?: Mercredi 26 Ao?t 2015 22:58:12 >> Objet: Re: [petsc-users] forming a matrix from a set of vectors >> >> >> Since A is tiny I am assuming you are doing this sequentially only? >> >> Do you have d stored as a AIJ matrix or is transpose(d) stored as a AIJ >> matrix? >> >> >> >> >>> On Aug 26, 2015, at 3:41 PM, Nicolas Pozin wrote: >>> >>> Actually I want to get the diagonal of the matrix : transpose(d)*A*d where >>> -d is a sparse matrix of size (n1,m1) >>> -A is a dense symetric matrix of size size (n1,n1) >>> with m1 very big compared to n1 (1 million against a few dozens). >>> >>> The problem is too big to allow the use of MatMatMult. >>> What I planned to do : >>> -compute the vectors Vi defined by transpose(d)*Ai where Ai is the i-th >>> column of A : quick since d is sparse and n1 is small >>> -deduce the matrix transpose(d)*A = [V1 ... Vn] >>> and then get the diagonal of transpose(d)*transpose([V1 ...Vn]) through >>> -transpose([V1 ...Vn]) and get its columns C1 ... 
Cn >>> -conclude on the i-th diagonal value which is the i-th component of >>> tranpose(d)*Ci >>> >>> >>> >>> ----- Mail original ----- >>>> De: "Barry Smith" >>>> ?: "Nicolas Pozin" >>>> Cc: "Jed Brown" , petsc-users at mcs.anl.gov >>>> Envoy?: Mercredi 26 Ao?t 2015 22:21:04 >>>> Objet: Re: [petsc-users] forming a matrix from a set of vectors >>>> >>>> >>>>> On Aug 26, 2015, at 3:06 PM, Nicolas Pozin >>>>> wrote: >>>>> >>>>> Thank you for this answer. >>>>> >>>>> What I want to do is to get the lines of this matrix and store them in >>>>> vectors. >>>> >>>> If you want to treat the columns of the dense matrix as vectors then use >>>> MatDenseGetArray() and call VecCreateMPIWithArray() with a pointer to the >>>> first row of each column of the obtained array (PETSc dense matrices are >>>> stored by column; same as for example LAPACK). >>>> >>>> But if you explained more why you want to treat something sometimes as a >>>> Mat (which is a linear operator on vectors) and sometimes as vectors we >>>> might be able to suggest how to organize your code. >>>> >>>> Barry >>>> >>>>> >>>>> >>>>> ----- Mail original ----- >>>>>> De: "Jed Brown" >>>>>> ?: "Nicolas Pozin" , petsc-users at mcs.anl.gov >>>>>> Envoy?: Mercredi 26 Ao?t 2015 20:38:37 >>>>>> Objet: Re: [petsc-users] forming a matrix from a set of vectors >>>>>> >>>>>> Nicolas Pozin writes: >>>>>>> Given a set of vectors V1, V2,...,Vn, is there an efficient way to form >>>>>>> the >>>>>>> dense matrix [V1 V2 ... Vn]? >>>>>> >>>>>> What do you want to do with that matrix? The vector representation is >>>>>> pretty flexible and the memory semantics are similar unless you store >>>>>> the dense matrix row-aligned (not the default). >>>>>> >>>> >>>> >> >> From zonexo at gmail.com Thu Aug 27 01:05:59 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 27 Aug 2015 14:05:59 +0800 Subject: [petsc-users] Problem with linking PETSc Message-ID: <55DEA8C7.5010100@gmail.com> Hi, I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. However, I now have problem linking the files on VS2008 to create the final exe. The error is: /*1>Compiling manifest to resources...*//* *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* *//*1>Linking...*//* *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* *//*1>global.obj : error LNK2019: unresolved external symbol MATSETFROMOPTIONS referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* *//*...*//* *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol VECGETARRAY referenced in function PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol VECRESTOREARRAY referenced in function PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol DMLOCALTOLOCALBEGIN referenced in function PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol DMLOCALTOLOCALEND referenced in function PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol PETSCINITIALIZE referenced in function MAIN__*//* *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error LNK1120: 74 unresolved externals*//* *//*1>*//* *//*1>Build log written to "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ I did not do much changes since the prev PETSc worked. I only changed the directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 environment variables. I wonder what's wrong. -- Thank you Yours sincerely, TAY wee-beng -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Thu Aug 27 01:08:15 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 27 Aug 2015 14:08:15 +0800 Subject: [petsc-users] Problem with linking PETSc 2 Message-ID: <55DEA94F.9040407@gmail.com> Hi, I forgot to add that I also changed the MPI lib to those used by Intel MPI. -- Thank you Yours sincerely, TAY wee-beng From balay at mcs.anl.gov Thu Aug 27 10:38:59 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 27 Aug 2015 10:38:59 -0500 Subject: [petsc-users] Problem with linking PETSc In-Reply-To: <55DEA8C7.5010100@gmail.com> References: <55DEA8C7.5010100@gmail.com> Message-ID: Are you able to compile and run both C and fortran petsc examples using the corresponding makefile? Satish On Thu, 27 Aug 2015, TAY wee-beng wrote: > Hi, > > I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. > > Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). > Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. > > However, I now have problem linking the files on VS2008 to create the final > exe. The error is: > > /*1>Compiling manifest to resources...*//* > *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* > *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* > *//*1>Linking...*//* > *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ > referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* > *//*1>global.obj : error LNK2019: unresolved external symbol MATSETFROMOPTIONS > referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* > *//*...*//* > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > VECGETARRAY referenced in function PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > VECRESTOREARRAY referenced in function > PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > DMLOCALTOLOCALBEGIN referenced in function > PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > DMLOCALTOLOCALEND referenced in function PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* > *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol > PETSCINITIALIZE referenced in function MAIN__*//* > *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error > LNK1120: 74 unresolved externals*//* > *//*1>*//* > *//*1>Build log written to > "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* > *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* > *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ > > I did not do much changes since the prev PETSc worked. I only changed the > directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 environment > variables. I wonder what's wrong. > > From zonexo at gmail.com Thu Aug 27 10:45:12 2015 From: zonexo at gmail.com (Wee Beng Tay) Date: Thu, 27 Aug 2015 23:45:12 +0800 Subject: [petsc-users] Insert values into matrix using MatSetValuesStencil or MatSetValuesLocal In-Reply-To: References: <55DD3CC1.5070801@gmail.com> <55DD4869.2000006@gmail.com> Message-ID: <1440690313501-f2c9870c-a80854bd-4d8201cd@gmail.com> Hi Timothee, That's a better way. Thanks Sent using CloudMagic [https://cloudmagic.com/k/d/mailapp?ct=pa&cv=7.2.9&pv=5.0.2] On Wed, Aug 26, 2015 at 1:15 PM, Timoth?e Nicolas < timothee.nicolas at gmail.com [timothee.nicolas at gmail.com] > wrote: I don't really understand what you say, but it does not sound right. You can enter the boundary points separately and then the points outside the boundary on separate calls, like this : do j=user%ys,user%ye do i=user%xs,user%xe if (i.eq.1 .or. i.eq.user%mx .or. j .eq. 1 .or. j .eq. user%my) then ! 
boundary point row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = one call MatSetValuesStencil(jac_prec,ione,row,ione,col,v, & & INSERT_VALUES,ierr) else row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = undemi*dxm1*(vx_ip1j-vx_im1j) + two*user%nu*(dxm1**2+dym1**2) col(MatStencil_i,2) = i+1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = undemi*dxm1*(vx_ij-vx_ip1j) - user%nu*dxm1**2 col(MatStencil_i,3) = i-1 -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 1 -1 v(3) = -undemi*dxm1*(vx_ij-vx_im1j) - user%nu*dxm1**2 col(MatStencil_i,4) = i -1 col(MatStencil_j,4) = j+1 -1 col(MatStencil_c,4) = 1 -1 v(4) = undemi*dym1*vy_ij - user%nu*dym1**2 col(MatStencil_i,5) = i -1 col(MatStencil_j,5) = j-1 -1 col(MatStencil_c,5) = 1 -1 v(5) = -undemi*dym1*vy_ij - user%nu*dym1**2 col(MatStencil_i,6) = i -1 col(MatStencil_j,6) = j -1 col(MatStencil_c,6) = 2 -1 v(6) = undemi*dym1*(vx_ijp1-vx_ijm1) col(MatStencil_i,7) = i+1 -1 col(MatStencil_j,7) = j -1 col(MatStencil_c,7) = 2 -1 v(7) = -undemi*dxm1*vy_ip1j col(MatStencil_i,8) = i-1 -1 col(MatStencil_j,8) = j -1 col(MatStencil_c,8) = 2 -1 v(8) = undemi*dxm1*vy_im1j call MatSetValuesStencil(jac_prec,ione,row,ieight,col,v, & & INSERT_VALUES,ierr) endif enddo enddo Timothee 2015-08-26 14:02 GMT+09:00 TAY wee-beng < zonexo at gmail.com [zonexo at gmail.com] > : Hi Timothee, Yes, I only parallelized in j and k. ksta,jsta are the starting k and j values. kend,jend are the ending k and j values. However, now I am using only 1 procs. I was going to resend you my code but then I realised my mistake. I used: call MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr) for all pts, including those at the boundary. Hence, those coupling outside the boundary is also included. I changed to: call MatSetValuesStencil(A_mat,ione,row,ione,col(:,7),value_insert(7),INSERT_VALUES,ierr) so I am now entering values individually. Is there anyway I can use the 1st option to enter all the values together even those some pts are invalid. I think it should be faster. Can I somehow tell PETSc to ignore them? Thank you Yours sincerely, TAY wee-beng On 26/8/2015 12:24 PM, Timoth?e Nicolas wrote: What is the definition of ksta, kend, jsta, jend ? Etc ? You are parallelized only in j and k ? What I said about the "-1" holds only if you have translated the start and end points to FORTRAN numbering after getting the corners and ghost corners from the DMDA (see ex ex5f90.F from snes) Would you mind sending the complete routine with the complete definitions of ksta,kend,jsta,jend,and size_x ? Timothee 2015-08-26 13:12 GMT+09:00 TAY wee-beng < zonexo at gmail.com [zonexo at gmail.com] > : Hi, I have wrote the routine for my Poisson eqn. I have only 1 DOF, which is for pressure. The center cell is coupled with 6 other cells (north, south, east, west, front, back), so together 7 couplings. size x/y/z = 4/8/10 MatStencil :: row(4,1),col(4,7) PetscScalar :: value_insert(7) PetscInt :: ione,iseven ione = 1; iseven = 7 do k=ksta,kend do j = jsta,jend do i=1,size_x row(MatStencil_i,1) = i - 1 row(MatStencil_j,1) = j - 1 row(MatStencil_k,1) = k - 1 row(MatStencil_c,1) = 0 ! 
1 - 1 value_insert = 0.d0 if (i /= size_x) then col(MatStencil_i,3) = i + 1 - 1 !east col(MatStencil_j,3) = j - 1 col(MatStencil_k,3) = k - 1 col(MatStencil_c,3) = 0 value_insert(3) = (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_E+cp_x(i+1)%pd_W) end if if (i /= 1) then col(MatStencil_i,5) = i - 1 - 1 !west col(MatStencil_j,5) = j - 1 col(MatStencil_k,5) = k - 1 col(MatStencil_c,5) = 0 value_insert(5) = (cp_yz(j,k)%fc_E)/(cp_x(i)%pd_W+cp_x(i-1)%pd_E) end if if (j /= size_y) then col(MatStencil_i,2) = i - 1 !north col(MatStencil_j,2) = j + 1 - 1 col(MatStencil_k,2) = k - 1 col(MatStencil_c,2) = 0 value_insert(2) = (cp_zx(i,k)%fc_N)/(cp_y(j)%pd_N+cp_y(j+1)%pd_S) end if ... col(MatStencil_i,1) = i - 1 col(MatStencil_j,1) = j - 1 col(MatStencil_k,1) = k - 1 col(MatStencil_c,1) = 0 value_insert(1) = -value_insert(2) - value_insert(3) - value_insert(4) - value_insert(5) - value_insert(6) - value_insert(7) call MatSetValuesStencil(A_mat,ione,row,iseven,col,value_insert,INSERT_VALUES,ierr) end do end do end do but I got the error : [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: Inserting a new nonzero at (3,0) in the matrix. The error happens at i = 4, j = 1, k = 1. So I guess it has something to do with the boundary condition. However, I can't figure out what's wrong. Can someone help? Thank you Yours sincerely, TAY wee-beng On 24/8/2015 5:54 PM, Timoth?e Nicolas wrote: Hi, ex5 of snes can give you an example of the two routines. The C version ex5.c uses MatSetValuesStencil whereas the Fortran90 version ex5f90.F uses MatSetValuesLocal. However, I use MatSetValuesStencil also in Fortran, there is no problem, and no need to mess around with DMDAGetAO, I think. To input values in the matrix, you need to do the following : ! Declare the matstencils for matrix columns and rows MatStencil :: row(4,1),col(4,n) ! Declare the quantity which will store the actual matrix elements PetscScalar :: v(8) The first dimension in row and col is 4 to allow for 3 spatial dimensions (even if you use only 2) plus one degree of freedom if you have several fields in your DMDA. The second dimension is 1 for row (you input one row at a time) and n for col, where n is the number of columns that you input. For instance, if at node (1,i,j) (1 is the index of the degree of freedom), you have, say, 6 couplings, with nodes (1,i,j), (1,i+1,j), (1,i-1,j), (1,i,j-1), (1,i,j+1), (2,i,j) for example, then you need to set n=6 Then you define the row number by naturally doing the following, inside a local loop : row(MatStencil_i,1) = i -1 row(MatStencil_j,1) = j -1 row(MatStencil_c,1) = 1 -1 the -1 are here because FORTRAN indexing is different from the native C indexing. I put them on the right to make this more apparent. Then the column information. For instance to declare the coupling with node (1,i,j), (1,i-1,j) and (2,i,j) (you can make up for the rest) you will have to write (still within the same local loop on i and j) col(MatStencil_i,1) = i -1 col(MatStencil_j,1) = j -1 col(MatStencil_c,1) = 1 -1 v(1) = whatever_it_is col(MatStencil_i,2) = i-1 -1 col(MatStencil_j,2) = j -1 col(MatStencil_c,2) = 1 -1 v(2) = whatever_it_is col(MatStencil_i,3) = i -1 col(MatStencil_j,3) = j -1 col(MatStencil_c,3) = 2 -1 v(3) = whatever_it_is ... ... .. ... ... ... Note that the index of the degree of freedom (or what field you are coupling to), is indicated by MatStencil_c Finally use MatSetValuesStencil ione = 1 isix = 6 call MatSetValuesStencil(Matrix,ione,row,isix,col,v,INSERT_VALUES,ierr) If it is not clear don't hesitate to ask more details. 
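For reference, the same call in C (zero-based indices, so the "-1" shifts disappear) looks roughly like the sketch below; the couplings and values are placeholders rather than code from this thread, and snes ex5.c shows the full pattern inside the usual local i/j loops:

#include <petscmat.h>

/* Sketch: insert one Jacobian row at grid point (i,j), component 0, coupled to
   itself, its west neighbour, and component 1 at the same point.  a_self,
   a_west, a_other are placeholder values. */
static PetscErrorCode InsertOneRow(Mat J, PetscInt i, PetscInt j,
                                   PetscScalar a_self, PetscScalar a_west, PetscScalar a_other)
{
  MatStencil     row, col[3];
  PetscScalar    v[3];
  PetscErrorCode ierr;

  row.i = i;      row.j = j; row.c = 0;          /* the row being inserted        */
  col[0].i = i;   col[0].j = j; col[0].c = 0; v[0] = a_self;   /* self coupling   */
  col[1].i = i-1; col[1].j = j; col[1].c = 0; v[1] = a_west;   /* west neighbour  */
  col[2].i = i;   col[2].j = j; col[2].c = 1; v[2] = a_other;  /* other component */
  ierr = MatSetValuesStencil(J, 1, &row, 3, col, v, INSERT_VALUES);CHKERRQ(ierr);
  return 0;
}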
For me it worked that way, I succesfully computed a Jacobian that way. It is very sensitive. If you slightly depart from the right jacobian, you will see a huge difference compared to using matrix free with -snes_mf, so you can hardly make a mistake because you would see it. That's how I finally got it to work. Best Timothee 2015-08-24 18:09 GMT+09:00 Wee-Beng Tay < [zonexo at gmail.com] zonexo at gmail.com [zonexo at gmail.com] > : Hi, I'm modifying my 3d fortran code from MPI along 1 direction (z) to MPI along 2 directions (y,z) Previously I was using MatSetValues with global indices. However, now I'm using DM and global indices is much more difficult. I come across MatSetValuesStencil or MatSetValuesLocal. So what's the difference bet the one since they both seem to work locally? Which is a simpler/better option? Is there an example in Fortran for MatSetValuesStencil? Do I also need to use DMDAGetAO together with MatSetValuesStencil or MatSetValuesLocal? Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Thu Aug 27 19:00:56 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 20:00:56 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields Message-ID: I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? -gideon From bsmith at mcs.anl.gov Thu Aug 27 21:02:45 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Aug 2015 21:02:45 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: Message-ID: Gideon, Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. 
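A rough sketch of that manual grid-sequencing loop for a single DM follows. This is illustrative only: SolveWithGridSequencing and nlevels are made-up names, in petsc 3.5 the interpolation routine is spelled DMCreateInterpolation() (older releases called it DMGetInterpolation()), and for a DMComposite the redundant scalar unknowns would simply be carried over to the next level.

#include <petscsnes.h>

/* Sketch: solve on a coarse DM, then repeatedly refine, interpolate the
   solution as the initial guess, and re-solve on the finer level. */
PetscErrorCode SolveWithGridSequencing(SNES snes, DM dmcoarse, PetscInt nlevels, Vec *Xfine)
{
  DM             dmc, dmf;
  Vec            Xc, Xf;
  Mat            interp;
  PetscInt       l;
  PetscErrorCode ierr;

  dmc  = dmcoarse;
  ierr = PetscObjectReference((PetscObject)dmc);CHKERRQ(ierr); /* keep our own reference */
  ierr = DMCreateGlobalVector(dmc, &Xc);CHKERRQ(ierr);
  ierr = SNESSetDM(snes, dmc);CHKERRQ(ierr);
  ierr = SNESSolve(snes, NULL, Xc);CHKERRQ(ierr);              /* solve on the coarsest grid */
  for (l = 1; l < nlevels; l++) {
    ierr = DMRefine(dmc, PetscObjectComm((PetscObject)dmc), &dmf);CHKERRQ(ierr);
    ierr = DMCreateInterpolation(dmc, dmf, &interp, NULL);CHKERRQ(ierr);
    ierr = DMCreateGlobalVector(dmf, &Xf);CHKERRQ(ierr);
    ierr = MatInterpolate(interp, Xc, Xf);CHKERRQ(ierr);       /* coarse solution -> fine initial guess */
    ierr = SNESSetDM(snes, dmf);CHKERRQ(ierr);
    ierr = SNESSolve(snes, NULL, Xf);CHKERRQ(ierr);            /* re-solve on the refined grid */
    ierr = MatDestroy(&interp);CHKERRQ(ierr);
    ierr = VecDestroy(&Xc);CHKERRQ(ierr);
    ierr = DMDestroy(&dmc);CHKERRQ(ierr);
    dmc = dmf;  Xc = Xf;
  }
  ierr = DMDestroy(&dmc);CHKERRQ(ierr);
  *Xfine = Xc;
  return 0;
}

Note that the user's form function and Jacobian must then obtain the current DM with SNESGetDM() instead of using a DM stored in the user context, which is exactly the point that comes up later in this thread.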
Barry > On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: > > I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form > > -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > > Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). > > Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). > > Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? > > -gideon > From knepley at gmail.com Thu Aug 27 21:04:57 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Aug 2015 21:04:57 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: Message-ID: On Thu, Aug 27, 2015 at 7:00 PM, Gideon Simpson wrote: > I?m working on a problem which, morally, can be posed as a system of > coupled semi linear elliptic PDEs together with unknown nonlinear > eigenvalue parameters, loosely, of the form > > -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > > Currently, I have it set up with a DMComposite with two sub da?s, one for > the parameters (lam, mu), and one for the vector field (u_1, u_2) on the > mesh. I have had success in solving this as a fully coupled system with > SNES + sparse direct solvers (MUMPS, SuperLU). > > Lately, I am finding that, when the mesh resolution gets fine enough > (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm > = O(10^{-4}), eventually returning reason -6 (failed line search). > > Perhaps there is another way around the above problem, but one thing I was > thinking of trying would be to get away from direct solvers, and I was > hoping to use field split for this. However, it?s a bit beyond what I?ve > seen examples for because it has 2 types of variables: scalar parameters > which appear globally in the system and vector valued field variables. Any > suggestions on how to get started? Barry is right. However, I also really think we should have a nonlinear fieldsplit. I tried to write one (SNES multiblock), but no one has ever used it. I would be willing to put some time in if you need it. You would likely nonlinearly precondition the Newton solve with this, which is what X. Cai does to great effect in some problems he works on. Matt > > -gideon > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gideon.simpson at gmail.com Thu Aug 27 21:11:38 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 22:11:38 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: Message-ID: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> HI Barry, Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation -gideon > On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: > > > Gideon, > > Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. > > Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. > > Barry > >> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >> >> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >> >> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >> >> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >> >> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >> >> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >> >> -gideon >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Aug 27 21:15:58 2015 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Aug 2015 21:15:58 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> Message-ID: On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: > HI Barry, > > Nope, I?m not doing any grid sequencing. Clearly that makes a lot of > sense, to solve on a spatially coarse mesh for the field variables, > interpolate onto the finer mesh, and then solve again. I?m not entirely > clear on the practical implementation > SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. Matt -gideon > > On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: > > > Gideon, > > Are you using grid sequencing? 
Simply solve on a coarse grid, > interpolate u1 and u2 to a once refined version of the grid and use that > plus the mu lam as initial guess for the next level. Repeat to as fine a > grid as you want. You can use DMRefine() and DMGetInterpolation() to get > the interpolation needed to interpolate from the coarse to finer mesh. > > Then and only then you can use multigrid (with or without fieldsplit) > to solve the linear problems for finer meshes. Once you have the grid > sequencing working we can help you with this. > > Barry > > On Aug 27, 2015, at 7:00 PM, Gideon Simpson > wrote: > > I?m working on a problem which, morally, can be posed as a system of > coupled semi linear elliptic PDEs together with unknown nonlinear > eigenvalue parameters, loosely, of the form > > -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > > Currently, I have it set up with a DMComposite with two sub da?s, one for > the parameters (lam, mu), and one for the vector field (u_1, u_2) on the > mesh. I have had success in solving this as a fully coupled system with > SNES + sparse direct solvers (MUMPS, SuperLU). > > Lately, I am finding that, when the mesh resolution gets fine enough (i.e. > 10^6-10^8 lattice points), my SNES gets stuck with the function norm = > O(10^{-4}), eventually returning reason -6 (failed line search). > > Perhaps there is another way around the above problem, but one thing I was > thinking of trying would be to get away from direct solvers, and I was > hoping to use field split for this. However, it?s a bit beyond what I?ve > seen examples for because it has 2 types of variables: scalar parameters > which appear globally in the system and vector valued field variables. Any > suggestions on how to get started? > > -gideon > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Thu Aug 27 21:32:18 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 22:32:18 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> Message-ID: <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> I?m getting the following errors: [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? -gideon > On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: > > On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson > wrote: > HI Barry, > > Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation > > SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. > > Matt > > -gideon > >> On Aug 27, 2015, at 10:02 PM, Barry Smith > wrote: >> >> >> Gideon, >> >> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. 
You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >> >> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >> >> Barry >> >>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson > wrote: >>> >>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>> >>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>> >>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>> >>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>> >>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>> >>> -gideon >>> >> > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Aug 27 21:37:12 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Aug 2015 21:37:12 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> Message-ID: <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> We need the full error message. But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. Barry Though you should not get this error even if you are using a DMDA there. > On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: > > I?m getting the following errors: > > [1]PETSC ERROR: Argument out of range > [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > > Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? > > -gideon > >> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >> >> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >> HI Barry, >> >> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >> >> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >> >> Matt >> >> -gideon >> >>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>> >>> >>> Gideon, >>> >>> Are you using grid sequencing? 
Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>> >>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>> >>> Barry >>> >>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>> >>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>> >>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>> >>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>> >>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>> >>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>> >>>> -gideon >>>> >>> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener > From gideon.simpson at gmail.com Thu Aug 27 21:42:44 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 22:42:44 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> Message-ID: I have it set up as: DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); DMCompositeAddDM(user.packer,user.p_dm); DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, nx, 4, 1, NULL, &user.Q_dm); DMCompositeAddDM(user.packer,user.Q_dm); DMCreateGlobalVector(user.packer,&U); where the user.packer structure has DM packer; DM p_dm, Q_dm; Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). Here are some of the errors that are generated: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (0,3) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.5.3, unknown [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Argument out of range [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c -gideon > On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: > > > We need the full error message. > > But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. > > Barry > > Though you should not get this error even if you are using a DMDA there. 
> >> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >> >> I?m getting the following errors: >> >> [1]PETSC ERROR: Argument out of range >> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >> >> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >> >> -gideon >> >>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>> >>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>> HI Barry, >>> >>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>> >>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>> >>> Matt >>> >>> -gideon >>> >>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>> >>>> >>>> Gideon, >>>> >>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>> >>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>>> >>>> Barry >>>> >>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>> >>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>> >>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>> >>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>> >>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>> >>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>> >>>>> -gideon >>>>> >>>> >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>> -- Norbert Wiener >> > -------------- next part -------------- An HTML attachment was scrubbed... 
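While debugging the preallocation failure above, one stopgap (the one the error text itself suggests) is to let the composite DM build the Jacobian and relax the new-nonzero checks, so missing coupling entries only cost performance instead of aborting. This is only a sketch; snes, user and FormJacobian stand in for the objects in the code shown above:

Mat J;
ierr = DMCreateMatrix(user.packer, &J);CHKERRQ(ierr);                        /* preallocated from the composite DM */
ierr = MatSetOption(J, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);CHKERRQ(ierr);
ierr = MatSetOption(J, MAT_NEW_NONZERO_LOCATION_ERR, PETSC_FALSE);CHKERRQ(ierr);
ierr = SNESSetJacobian(snes, J, J, FormJacobian, &user);CHKERRQ(ierr);       /* FormJacobian is the user's routine */

The longer-term fix, if the scalar parameters really couple into every field equation, is to tell the composite DM about those off-block couplings (see DMCompositeSetCoupling()) or to assemble a properly preallocated matrix by hand.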
URL: From bsmith at mcs.anl.gov Thu Aug 27 22:09:22 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Aug 2015 22:09:22 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> Message-ID: <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> Can you send the code, that will be the easiest way to find the problem. My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. Barry > On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: > > I have it set up as: > > DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); > DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); > DMCompositeAddDM(user.packer,user.p_dm); > DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, > nx, 4, 1, NULL, &user.Q_dm); > DMCompositeAddDM(user.packer,user.Q_dm); > DMCreateGlobalVector(user.packer,&U); > > where the user.packer structure has > > DM packer; > DM p_dm, Q_dm; > > Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). > > Here are some of the errors that are generated: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Argument out of range > [0]PETSC ERROR: New nonzero at (0,3) caused a malloc > Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown > [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Argument out of range > [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown > [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c > > > > -gideon > >> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >> >> >> We need the full error message. >> >> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. >> >> Barry >> >> Though you should not get this error even if you are using a DMDA there. 
>> >>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >>> >>> I?m getting the following errors: >>> >>> [1]PETSC ERROR: Argument out of range >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>> >>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >>> >>> -gideon >>> >>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>>> >>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>>> HI Barry, >>>> >>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>>> >>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>>> >>>> Matt >>>> >>>> -gideon >>>> >>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>>> >>>>> >>>>> Gideon, >>>>> >>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>>> >>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>>>> >>>>> Barry >>>>> >>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>>> >>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>>> >>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>>> >>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>>> >>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>>> >>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>>> >>>>>> -gideon >>>>>> >>>>> >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
>>>> -- Norbert Wiener >>> >> > From gideon.simpson at gmail.com Thu Aug 27 22:15:50 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 23:15:50 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> Message-ID: That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. Here is my form function. I can send more code if needed. /* Form the system of equations for computing a blowup solution*/ PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ blowup_ctx *user = (blowup_ctx *) ctx; PetscInt i; PetscScalar dx, dx2, xmax,x; PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; DMDALocalInfo info; Vec p_vec, Q_vec, Fp_vec, FQ_vec; PetscScalar *p_array, *Fp_array; Q *Qvals, *FQvals; PetscScalar Q2sig, W2sig; PetscScalar a,a2, b, u0, sigma; dx = user->dx; dx2 = dx *dx; xmax = user->xmax; sigma = user->sigma; /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ /* Extract raw arrays */ DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); DMCompositeScatter(user->packer, U, p_vec, Q_vec); /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ VecGetArray(p_vec,&p_array); VecGetArray(Fp_vec,&Fp_array); DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); DMDAGetLocalInfo(user->Q_dm, &info); a = p_array[0]; a2 = a*a; b = p_array[1]; u0 = p_array[2]; /* Set boundary conditions at the origin*/ if(info.xs ==0){ set_origin_bcs(u0, Qvals); } /* Set boundray conditions in the far field */ if(info.xs+ info.xm == info.mx){ set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); } /* Solve auxiliary equations */ if(info.xs ==0){ uxx = (2 * Qvals[0].u-2 * u0)/dx2; vxx = (Qvals[0].v + Qvals[0].g)/dx2; vx = (Qvals[0].v - Qvals[0].g)/(2*dx); Fp_array[0] = Qvals[0].u - Qvals[0].f; Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; Fp_array[2] = -uxx + (1/a2) * u0 + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); } /* Solve equations in the bulk */ for(i=info.xs; i < info.xs + info.xm;i++){ u = Qvals[i].u; v = Qvals[i].v; f = Qvals[i].f; g = Qvals[i].g; x = (i+1) * dx; Q2sig = PetscPowScalar(u*u + v*v,sigma); W2sig= PetscPowScalar(f*f + g*g, sigma); ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); FQvals[i].u = -uxx +1/a2 * u + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); FQvals[i].v = -vxx +1/a2 * v - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); FQvals[i].f = -fxx +1/a2 * f + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); FQvals[i].g =-gxx +1/a2 * g - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); } /* Restore raw arrays */ VecRestoreArray(p_vec, &p_array); 
VecRestoreArray(Fp_vec, &Fp_array); DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); return 0; } Here is the form function: -gideon > On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: > > > Can you send the code, that will be the easiest way to find the problem. > > My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. > > Barry > >> On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: >> >> I have it set up as: >> >> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); >> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); >> DMCompositeAddDM(user.packer,user.p_dm); >> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, >> nx, 4, 1, NULL, &user.Q_dm); >> DMCompositeAddDM(user.packer,user.Q_dm); >> DMCreateGlobalVector(user.packer,&U); >> >> where the user.packer structure has >> >> DM packer; >> DM p_dm, Q_dm; >> >> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). >> >> Here are some of the errors that are generated: >> >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Argument out of range >> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc >> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown >> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Argument out of range >> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown >> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c >> >> >> >> -gideon >> >>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >>> >>> >>> We need the full error message. >>> >>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. 
>>> >>> Barry >>> >>> Though you should not get this error even if you are using a DMDA there. >>> >>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >>>> >>>> I?m getting the following errors: >>>> >>>> [1]PETSC ERROR: Argument out of range >>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>>> >>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >>>> >>>> -gideon >>>> >>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>>>> >>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>>>> HI Barry, >>>>> >>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>>>> >>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>>>> >>>>> Matt >>>>> >>>>> -gideon >>>>> >>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>>>> >>>>>> >>>>>> Gideon, >>>>>> >>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>>>> >>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>>>>> >>>>>> Barry >>>>>> >>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>>>> >>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>>>> >>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>>>> >>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>>>> >>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>>>> >>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>>>> >>>>>>> -gideon >>>>>>> >>>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
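As a concrete reading of the refine-and-interpolate recipe Barry gives above: in petsc-3.5 the interpolation routine is spelled DMCreateInterpolation() rather than DMGetInterpolation(), the variable names below are made up, and whether the composite DM supplies the interpolation you want here is worth verifying.

    DM  packer_fine;
    Mat Interp;
    Vec U_fine, scale;
    DMRefine(user.packer, PETSC_COMM_WORLD, &packer_fine);
    DMCreateGlobalVector(packer_fine, &U_fine);
    DMCreateInterpolation(user.packer, packer_fine, &Interp, &scale);
    MatInterpolate(Interp, U, U_fine);   /* the DMRedundant entries are simply copied across */
    /* point the SNES at packer_fine and solve again using U_fine as the initial guess */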
>>>>> -- Norbert Wiener >>>> >>> >> > From bsmith at mcs.anl.gov Thu Aug 27 22:23:31 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 27 Aug 2015 22:23:31 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> Message-ID: <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> > On Aug 27, 2015, at 10:15 PM, Gideon Simpson wrote: > > That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. Nothing ever gets coarsened in grid sequencing, it only gets refined. The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiples them by the matrix 1 which is just a copy). > > Here is my form function. I can send more code if needed. > Just change the user->packer that you use to be the DM obtained with SNESGetDM() > /* Form the system of equations for computing a blowup solution*/ > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ > > blowup_ctx *user = (blowup_ctx *) ctx; > PetscInt i; > PetscScalar dx, dx2, xmax,x; > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; > DMDALocalInfo info; > Vec p_vec, Q_vec, Fp_vec, FQ_vec; > PetscScalar *p_array, *Fp_array; > Q *Qvals, *FQvals; > PetscScalar Q2sig, W2sig; > PetscScalar a,a2, b, u0, sigma; > > dx = user->dx; dx2 = dx *dx; > xmax = user->xmax; > sigma = user->sigma; > > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ > > /* Extract raw arrays */ > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > DMCompositeScatter(user->packer, U, p_vec, Q_vec); > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ > > VecGetArray(p_vec,&p_array); > VecGetArray(Fp_vec,&Fp_array); > > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); > > DMDAGetLocalInfo(user->Q_dm, &info); > > a = p_array[0]; a2 = a*a; > b = p_array[1]; > u0 = p_array[2]; > > /* Set boundary conditions at the origin*/ > if(info.xs ==0){ > set_origin_bcs(u0, Qvals); > } > /* Set boundray conditions in the far field */ > if(info.xs+ info.xm == info.mx){ > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); > } > > /* Solve auxiliary equations */ > if(info.xs ==0){ > uxx = (2 * Qvals[0].u-2 * u0)/dx2; > vxx = (Qvals[0].v + Qvals[0].g)/dx2; > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); > Fp_array[0] = Qvals[0].u - Qvals[0].f; > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; > Fp_array[2] = -uxx + (1/a2) * u0 > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); > } > > /* Solve equations in the bulk */ > for(i=info.xs; i < info.xs + info.xm;i++){ > > u = Qvals[i].u; > v = Qvals[i].v; > f = Qvals[i].f; > g = Qvals[i].g; > > x = (i+1) * dx; > > Q2sig = PetscPowScalar(u*u + v*v,sigma); > W2sig= PetscPowScalar(f*f + g*g, 
sigma); > > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); > > uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); > > FQvals[i].u = -uxx +1/a2 * u > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); > > FQvals[i].v = -vxx +1/a2 * v > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); > > FQvals[i].f = -fxx +1/a2 * f > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); > > FQvals[i].g =-gxx +1/a2 * g > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); > } > > /* Restore raw arrays */ > VecRestoreArray(p_vec, &p_array); > VecRestoreArray(Fp_vec, &Fp_array); > > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); > DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); > > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > return 0; > } > > > Here is the form function: > > > > -gideon > >> On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: >> >> >> Can you send the code, that will be the easiest way to find the problem. >> >> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. >> >> Barry >> >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: >>> >>> I have it set up as: >>> >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); >>> DMCompositeAddDM(user.packer,user.p_dm); >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, >>> nx, 4, 1, NULL, &user.Q_dm); >>> DMCompositeAddDM(user.packer,user.Q_dm); >>> DMCreateGlobalVector(user.packer,&U); >>> >>> where the user.packer structure has >>> >>> DM packer; >>> DM p_dm, Q_dm; >>> >>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). >>> >>> Here are some of the errors that are generated: >>> >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [0]PETSC ERROR: Argument out of range >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>> [1]PETSC ERROR: Argument out of range >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c >>> >>> >>> >>> -gideon >>> >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >>>> >>>> >>>> We need the full error message. >>>> >>>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. 
>>>> >>>> Barry >>>> >>>> Though you should not get this error even if you are using a DMDA there. >>>> >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >>>>> >>>>> I?m getting the following errors: >>>>> >>>>> [1]PETSC ERROR: Argument out of range >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>>>> >>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >>>>> >>>>> -gideon >>>>> >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>>>>> >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>>>>> HI Barry, >>>>>> >>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>>>>> >>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>>>>> >>>>>> Matt >>>>>> >>>>>> -gideon >>>>>> >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>>>>> >>>>>>> >>>>>>> Gideon, >>>>>>> >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>>>>> >>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>>>>> >>>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>>>>> >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>>>>> >>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>>>>> >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>>>>> >>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>>>>> >>>>>>>> -gideon >>>>>>>> >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
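A minimal sketch of the change Barry describes in this message: take the DM from the SNES at the top of the residual routine instead of using the one cached in the context. The sub-DM order is assumed to match the DMCompositeAddDM() calls quoted earlier in the thread (redundant scalars first, then the DMDA), and dx would likewise need to be recomputed from the refined DMDA rather than read from the context.

    PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx)
    {
      blowup_ctx    *user = (blowup_ctx *) ctx;
      DM             packer, p_dm, Q_dm;
      DMDALocalInfo  info;

      SNESGetDM(snes, &packer);                     /* current, possibly refined, composite DM */
      DMCompositeGetEntries(packer, &p_dm, &Q_dm);  /* order as added: DMRedundant, then DMDA */
      DMDAGetLocalInfo(Q_dm, &info);
      /* ... use packer, p_dm, Q_dm (and a dx recomputed from info.mx and user->xmax)
         everywhere the version above used user->packer, user->Q_dm and user->dx ... */
      return 0;
    }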
>>>>>> -- Norbert Wiener >>>>> >>>> >>> >> > From gideon.simpson at gmail.com Thu Aug 27 22:56:05 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Thu, 27 Aug 2015 23:56:05 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> Message-ID: Ok, it seems to work with that switch, however, when I try to use DM sequence, I get errors like: 0 SNES Function norm 5.067205249874e-03 1 SNES Function norm 7.983917252341e-08 2 SNES Function norm 7.291012540201e-11 0 SNES Function norm 2.228951406196e+02 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (0,3) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 23:53:19 2015 [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp [0]PETSC ERROR: #1 MatSetValues_SeqAIJ() line 487 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Argument out of range [0]PETSC ERROR: New nonzero at (1,4) caused a malloc Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.5.3, unknown [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 23:53:19 2015 [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp [0]PETSC ERROR: #3 MatSetValues_SeqAIJ() line 487 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/seq/aij.c [0]PETSC ERROR: #4 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c And in my main program, I did set MatSetOption(J, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE); for the Jacobian. -gideon > On Aug 27, 2015, at 11:23 PM, Barry Smith wrote: > > >> On Aug 27, 2015, at 10:15 PM, Gideon Simpson wrote: >> >> That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. > > Nothing ever gets coarsened in grid sequencing, it only gets refined. > > The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiples them by the matrix 1 which is just a copy). >> >> Here is my form function. I can send more code if needed. 
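One guess at why the failure only reappears after the refinement step: with grid sequencing the solver presumably builds a fresh Jacobian from the refined DM, so a MatSetOption() call made once on the original J in main would not carry over. A hedged sketch (this is a guess, not a confirmed diagnosis) of re-applying the option inside the Jacobian routine, called form_jacobian here purely for illustration, on whatever matrix the solver passes in:

    PetscErrorCode form_jacobian(SNES snes, Vec U, Mat J, Mat Jpre, void *ctx)
    {
      /* allow entries outside the preallocation on every (re)created Jacobian */
      MatSetOption(Jpre, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);
      /* ... fill the entries as before ... */
      MatAssemblyBegin(Jpre, MAT_FINAL_ASSEMBLY);
      MatAssemblyEnd(Jpre, MAT_FINAL_ASSEMBLY);
      return 0;
    }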
>> > > Just change the user->packer that you use to be the DM obtained with SNESGetDM() > > >> /* Form the system of equations for computing a blowup solution*/ >> PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ >> >> blowup_ctx *user = (blowup_ctx *) ctx; >> PetscInt i; >> PetscScalar dx, dx2, xmax,x; >> PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; >> DMDALocalInfo info; >> Vec p_vec, Q_vec, Fp_vec, FQ_vec; >> PetscScalar *p_array, *Fp_array; >> Q *Qvals, *FQvals; >> PetscScalar Q2sig, W2sig; >> PetscScalar a,a2, b, u0, sigma; >> >> dx = user->dx; dx2 = dx *dx; >> xmax = user->xmax; >> sigma = user->sigma; >> >> /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ >> >> /* Extract raw arrays */ >> DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); >> DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); >> >> DMCompositeScatter(user->packer, U, p_vec, Q_vec); >> /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ >> >> VecGetArray(p_vec,&p_array); >> VecGetArray(Fp_vec,&Fp_array); >> >> DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); >> DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); >> >> DMDAGetLocalInfo(user->Q_dm, &info); >> >> a = p_array[0]; a2 = a*a; >> b = p_array[1]; >> u0 = p_array[2]; >> >> /* Set boundary conditions at the origin*/ >> if(info.xs ==0){ >> set_origin_bcs(u0, Qvals); >> } >> /* Set boundray conditions in the far field */ >> if(info.xs+ info.xm == info.mx){ >> set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); >> } >> >> /* Solve auxiliary equations */ >> if(info.xs ==0){ >> uxx = (2 * Qvals[0].u-2 * u0)/dx2; >> vxx = (Qvals[0].v + Qvals[0].g)/dx2; >> vx = (Qvals[0].v - Qvals[0].g)/(2*dx); >> Fp_array[0] = Qvals[0].u - Qvals[0].f; >> Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; >> Fp_array[2] = -uxx + (1/a2) * u0 >> + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); >> } >> >> /* Solve equations in the bulk */ >> for(i=info.xs; i < info.xs + info.xm;i++){ >> >> u = Qvals[i].u; >> v = Qvals[i].v; >> f = Qvals[i].f; >> g = Qvals[i].g; >> >> x = (i+1) * dx; >> >> Q2sig = PetscPowScalar(u*u + v*v,sigma); >> W2sig= PetscPowScalar(f*f + g*g, sigma); >> >> ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); >> vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); >> fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); >> gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); >> >> uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); >> vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); >> fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); >> gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); >> >> FQvals[i].u = -uxx +1/a2 * u >> + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); >> >> FQvals[i].v = -vxx +1/a2 * v >> - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); >> >> FQvals[i].f = -fxx +1/a2 * f >> + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); >> >> FQvals[i].g =-gxx +1/a2 * g >> - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); >> } >> >> /* Restore raw arrays */ >> VecRestoreArray(p_vec, &p_array); >> VecRestoreArray(Fp_vec, &Fp_array); >> >> DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); >> DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); >> >> DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); >> DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); >> DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); >> >> return 0; >> } >> >> >> Here is the form function: >> >> >> >> -gideon >> >>> On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: >>> >>> >>> Can you send the code, that will be the easiest way to find the problem. 
>>> >>> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. >>> >>> Barry >>> >>>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: >>>> >>>> I have it set up as: >>>> >>>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); >>>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); >>>> DMCompositeAddDM(user.packer,user.p_dm); >>>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, >>>> nx, 4, 1, NULL, &user.Q_dm); >>>> DMCompositeAddDM(user.packer,user.Q_dm); >>>> DMCreateGlobalVector(user.packer,&U); >>>> >>>> where the user.packer structure has >>>> >>>> DM packer; >>>> DM p_dm, Q_dm; >>>> >>>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). >>>> >>>> Here are some of the errors that are generated: >>>> >>>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>> [0]PETSC ERROR: Argument out of range >>>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc >>>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check >>>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >>>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown >>>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >>>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >>>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >>>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >>>> [1]PETSC ERROR: Argument out of range >>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>>>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown >>>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >>>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >>>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >>>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c >>>> >>>> >>>> >>>> -gideon >>>> >>>>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >>>>> >>>>> >>>>> We need the full error message. >>>>> >>>>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. >>>>> >>>>> Barry >>>>> >>>>> Though you should not get this error even if you are using a DMDA there. >>>>> >>>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >>>>>> >>>>>> I?m getting the following errors: >>>>>> >>>>>> [1]PETSC ERROR: Argument out of range >>>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >>>>>> >>>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >>>>>> >>>>>> -gideon >>>>>> >>>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >>>>>>> >>>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >>>>>>> HI Barry, >>>>>>> >>>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation >>>>>>> >>>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >>>>>>> >>>>>>> Matt >>>>>>> >>>>>>> -gideon >>>>>>> >>>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >>>>>>>> >>>>>>>> >>>>>>>> Gideon, >>>>>>>> >>>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >>>>>>>> >>>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. 
Once you have the grid sequencing working we can help you with this. >>>>>>>> >>>>>>>> Barry >>>>>>>> >>>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >>>>>>>>> >>>>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >>>>>>>>> >>>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >>>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >>>>>>>>> >>>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >>>>>>>>> >>>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >>>>>>>>> >>>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? >>>>>>>>> >>>>>>>>> -gideon >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >>>>>>> -- Norbert Wiener >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Thu Aug 27 23:32:09 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Fri, 28 Aug 2015 12:32:09 +0800 Subject: [petsc-users] Problem with linking PETSc In-Reply-To: References: <55DEA8C7.5010100@gmail.com> Message-ID: <55DFE449.3020401@gmail.com> On 27/8/2015 11:38 PM, Satish Balay wrote: > Are you able to compile and run both C and fortran petsc examples > using the corresponding makefile? 
> > Satish Hi Satish, Yes, there is no problem except for a minor warning: /*$ make ex2*//* *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -o ex2.o -c -MT -wd4996 -Z7 -I/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/include -I/cygdrive/c*//* *//*/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/include -I/cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/include `pwd`/e*//* *//*x2.c*//* *//*ex2.c*//* *//*You are using an Intel supplied intrinsic header file with a third-party compiler.*//* *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -MT -wd4996 -Z7 -o ex2 ex2.o -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_im*//* *//*pi_vs2008/lib -lpetsc -L/cygdrive/c/wtay/Lib/petsc-3.6.1_win64_impi_vs2008/lib -lflapack -lfblas /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110*//* *//*/intel64/lib/debug/impi.lib /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/impicxx.lib /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTo*//* *//*ols/mpi/5.1.1.110/intel64/lib/impicxxd.lib /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/libmpi_ilp64.lib Gdi32.lib User32.lib Adva*//* *//*pi32.lib Kernel32.lib Ws2_32.lib*//* *//*/usr/bin/rm -f ex2.o*//* *//* *//*tsltaywb at 1C3YYY1 /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* *//*$ ./ex2f*//* *//*Norm of error 0.1192E-05 iterations 4*//* *//* *//*tsltaywb at 1C3YYY1 /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* *//*$ mpiexec -n 2 ./ex2f*//* *//*Norm of error < 1.e-12,iterations 7*/ > > On Thu, 27 Aug 2015, TAY wee-beng wrote: > >> Hi, >> >> I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. >> >> Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). >> Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. >> >> However, I now have problem linking the files on VS2008 to create the final >> exe. The error is: >> >> /*1>Compiling manifest to resources...*//* >> *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* >> *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* >> *//*1>Linking...*//* >> *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ >> referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* >> *//*1>global.obj : error LNK2019: unresolved external symbol MATSETFROMOPTIONS >> referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* >> *//*...*//* >> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >> VECGETARRAY referenced in function PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* >> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >> VECRESTOREARRAY referenced in function >> PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* >> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >> DMLOCALTOLOCALBEGIN referenced in function >> PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* >> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >> DMLOCALTOLOCALEND referenced in function PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* >> *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol >> PETSCINITIALIZE referenced in function MAIN__*//* >> *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error >> LNK1120: 74 unresolved externals*//* >> *//*1>*//* >> *//*1>Build log written to >> "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* >> *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* >> *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ >> >> I did not do much changes since the prev PETSc worked. I only changed the >> directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 environment >> variables. I wonder what's wrong. >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Aug 27 23:46:48 2015 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 27 Aug 2015 23:46:48 -0500 Subject: [petsc-users] Problem with linking PETSc In-Reply-To: <55DFE449.3020401@gmail.com> References: <55DEA8C7.5010100@gmail.com> <55DFE449.3020401@gmail.com> Message-ID: I don't see a compile of ex2f in the copy/paste. Assuming that ran correctly and [ex2f was not an old binary lying arround] - it implies that your project file has bugs. Perhaps there is a verbose mode that it provides that you can use to see the exact compile command its using. I suspect its missing the equivalent of the following options: -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/lib -lpetsc Satish On Thu, 27 Aug 2015, TAY wee-beng wrote: > > On 27/8/2015 11:38 PM, Satish Balay wrote: > > Are you able to compile and run both C and fortran petsc examples > > using the corresponding makefile? 
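If it helps to separate project-configuration problems from source problems, a standalone link check along the lines of what Satish suggests above; nothing in it is specific to this thread beyond the PETSc include and library paths that the working make run already demonstrates.

    /* link_check.c: if this builds and links with the same include and
       library settings as the failing project, the LNK2019 errors are a
       missing-library/dependency issue rather than a source issue */
    #include <petsc.h>
    int main(int argc, char **argv)
    {
      PetscInitialize(&argc, &argv, NULL, NULL);
      PetscPrintf(PETSC_COMM_WORLD, "PETSc linked and initialized\n");
      PetscFinalize();
      return 0;
    }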
> > > > Satish > Hi Satish, > > Yes, there is no problem except for a minor warning: > > /*$ make ex2*//* > *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -o ex2.o > -c -MT -wd4996 -Z7 -I/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/include > -I/cygdrive/c*//* > *//*/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/include > -I/cygdrive/c/Program\ Files\ > \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/include `pwd`/e*//* > *//*x2.c*//* > *//*ex2.c*//* > *//*You are using an Intel supplied intrinsic header file with a third-party > compiler.*//* > *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -MT > -wd4996 -Z7 -o ex2 ex2.o > -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_im*//* > *//*pi_vs2008/lib -lpetsc > -L/cygdrive/c/wtay/Lib/petsc-3.6.1_win64_impi_vs2008/lib -lflapack -lfblas > /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110*//* > *//*/intel64/lib/debug/impi.lib /cygdrive/c/Program\ Files\ > \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/impicxx.lib > /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTo*//* > *//*ols/mpi/5.1.1.110/intel64/lib/impicxxd.lib /cygdrive/c/Program\ Files\ > \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/libmpi_ilp64.lib Gdi32.lib > User32.lib Adva*//* > *//*pi32.lib Kernel32.lib Ws2_32.lib*//* > *//*/usr/bin/rm -f ex2.o*//* > *//* > *//*tsltaywb at 1C3YYY1 > /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* > *//*$ ./ex2f*//* > *//*Norm of error 0.1192E-05 iterations 4*//* > *//* > *//*tsltaywb at 1C3YYY1 > /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* > *//*$ mpiexec -n 2 ./ex2f*//* > *//*Norm of error < 1.e-12,iterations 7*/ > > > > > On Thu, 27 Aug 2015, TAY wee-beng wrote: > > > > > Hi, > > > > > > I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. > > > > > > Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). > > > Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. > > > > > > However, I now have problem linking the files on VS2008 to create the > > > final > > > exe. The error is: > > > > > > /*1>Compiling manifest to resources...*//* > > > *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* > > > *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* > > > *//*1>Linking...*//* > > > *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ > > > referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* > > > *//*1>global.obj : error LNK2019: unresolved external symbol > > > MATSETFROMOPTIONS > > > referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* > > > *//*...*//* > > > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > > > VECGETARRAY referenced in function > > > PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* > > > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > > > VECRESTOREARRAY referenced in function > > > PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* > > > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > > > DMLOCALTOLOCALBEGIN referenced in function > > > PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* > > > *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol > > > DMLOCALTOLOCALEND referenced in function > > > PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* > > > *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol > > > PETSCINITIALIZE referenced in function MAIN__*//* > > > *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error > > > LNK1120: 74 unresolved externals*//* > > > *//*1>*//* > > > *//*1>Build log written to > > > "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* > > > *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* > > > *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ > > > > > > I did not do much changes since the prev PETSc worked. I only changed the > > > directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 > > > environment > > > variables. I wonder what's wrong. > > > > > > > > From zonexo at gmail.com Fri Aug 28 03:20:22 2015 From: zonexo at gmail.com (TAY wee-beng) Date: Fri, 28 Aug 2015 16:20:22 +0800 Subject: [petsc-users] Problem with linking PETSc In-Reply-To: References: <55DEA8C7.5010100@gmail.com> <55DFE449.3020401@gmail.com> Message-ID: <55E019C6.5030003@gmail.com> Hi Satish, Was searching high and low at the wrong place! I accidentally removed the libpetsc.lib from the library files... Now it worked. Thank you Yours sincerely, TAY wee-beng On 28/8/2015 12:46 PM, Satish Balay wrote: > I don't see a compile of ex2f in the copy/paste. Assuming that ran > correctly and [ex2f was not an old binary lying arround] - it implies > that your project file has bugs. > > Perhaps there is a verbose mode that it provides that you can use to > see the exact compile command its using. > > I suspect its missing the equivalent of the following options: > > -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/lib -lpetsc > > Satish > > > On Thu, 27 Aug 2015, TAY wee-beng wrote: > >> On 27/8/2015 11:38 PM, Satish Balay wrote: >>> Are you able to compile and run both C and fortran petsc examples >>> using the corresponding makefile? 
>>> >>> Satish >> Hi Satish, >> >> Yes, there is no problem except for a minor warning: >> >> /*$ make ex2*//* >> *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -o ex2.o >> -c -MT -wd4996 -Z7 -I/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/include >> -I/cygdrive/c*//* >> *//*/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_impi_vs2008/include >> -I/cygdrive/c/Program\ Files\ >> \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/include `pwd`/e*//* >> *//*x2.c*//* >> *//*ex2.c*//* >> *//*You are using an Intel supplied intrinsic header file with a third-party >> compiler.*//* >> *//*/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/bin/win32fe/win32fe cl -MT >> -wd4996 -Z7 -o ex2 ex2.o >> -L/cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/petsc-3.6.1_win64_im*//* >> *//*pi_vs2008/lib -lpetsc >> -L/cygdrive/c/wtay/Lib/petsc-3.6.1_win64_impi_vs2008/lib -lflapack -lfblas >> /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTools/mpi/5.1.1.110*//* >> *//*/intel64/lib/debug/impi.lib /cygdrive/c/Program\ Files\ >> \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/impicxx.lib >> /cygdrive/c/Program\ Files\ \(x86\)/IntelSWTo*//* >> *//*ols/mpi/5.1.1.110/intel64/lib/impicxxd.lib /cygdrive/c/Program\ Files\ >> \(x86\)/IntelSWTools/mpi/5.1.1.110/intel64/lib/libmpi_ilp64.lib Gdi32.lib >> User32.lib Adva*//* >> *//*pi32.lib Kernel32.lib Ws2_32.lib*//* >> *//*/usr/bin/rm -f ex2.o*//* >> *//* >> *//*tsltaywb at 1C3YYY1 >> /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* >> *//*$ ./ex2f*//* >> *//*Norm of error 0.1192E-05 iterations 4*//* >> *//* >> *//*tsltaywb at 1C3YYY1 >> /cygdrive/c/wtay/Backup/Codes/petsc-3.6.1/src/ksp/ksp/examples/tutorials*//* >> *//*$ mpiexec -n 2 ./ex2f*//* >> *//*Norm of error < 1.e-12,iterations 7*/ >> >>> On Thu, 27 Aug 2015, TAY wee-beng wrote: >>> >>>> Hi, >>>> >>>> I used to compile and link using PETSc 3.6.0 + MPICH2 on VS2008. >>>> >>>> Due to some MPICH2 issues, I am trying to use Intel MPI (newest version). >>>> Building and testing of PETSc 3.6.1 with Intel MPI all went smoothly. >>>> >>>> However, I now have problem linking the files on VS2008 to create the >>>> final >>>> exe. The error is: >>>> >>>> /*1>Compiling manifest to resources...*//* >>>> *//*1>Microsoft (R) Windows (R) Resource Compiler Version 6.0.5724.0*//* >>>> *//*1>Copyright (C) Microsoft Corporation. 
All rights reserved.*//* >>>> *//*1>Linking...*//* >>>> *//*1>global.obj : error LNK2019: unresolved external symbol MATCREATEAIJ >>>> referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* >>>> *//*1>global.obj : error LNK2019: unresolved external symbol >>>> MATSETFROMOPTIONS >>>> referenced in function GLOBAL_DATA_mp_ALLO_VAR*//* >>>> *//*...*//* >>>> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >>>> VECGETARRAY referenced in function >>>> PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* >>>> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >>>> VECRESTOREARRAY referenced in function >>>> PETSC_SOLVERS_mp_P_MATRIX_SOLV_PETSC*//* >>>> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >>>> DMLOCALTOLOCALBEGIN referenced in function >>>> PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* >>>> *//*1>PETSc_solvers.obj : error LNK2019: unresolved external symbol >>>> DMLOCALTOLOCALEND referenced in function >>>> PETSC_SOLVERS_mp_DM_UPDATE_ARRAY*//* >>>> *//*1>ibm3d_high_Re.obj : error LNK2019: unresolved external symbol >>>> PETSCINITIALIZE referenced in function MAIN__*//* >>>> *//*1>C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\ibm3d_IIB_mpi.exe : fatal error >>>> LNK1120: 74 unresolved externals*//* >>>> *//*1>*//* >>>> *//*1>Build log written to >>>> "file://C:\Obj_tmp\ibm3d_IIB_mpi_old\Debug\BuildLog.htm"*//* >>>> *//*1>ibm3d_IIB_mpi_old - 165 error(s), 0 warning(s)*//* >>>> *//*========== Rebuild All: 0 succeeded, 1 failed, 0 skipped ==========*/ >>>> >>>> I did not do much changes since the prev PETSc worked. I only changed the >>>> directory $(PETSC_DIR) and $(IMPI) to the new directory in win7 >>>> environment >>>> variables. I wonder what's wrong. >>>> >>>> >> From knepley at gmail.com Fri Aug 28 05:20:02 2015 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 28 Aug 2015 05:20:02 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> Message-ID: On Thu, Aug 27, 2015 at 10:23 PM, Barry Smith wrote: > > > On Aug 27, 2015, at 10:15 PM, Gideon Simpson > wrote: > > > > That?s correct, I am not using the SNESGetDM. I suppose I could. Keep > in mind that I?m trying to solve, simultaneously, for the scalar parameters > and the vector field. I guess what I am unclear about is how DMRefine is > to know that the unknown associated with the scalar parameters can never be > coarsened out, but must be retained at all iterations. > > Nothing ever gets coarsened in grid sequencing, it only gets refined. > > The reason it knows not to "refine" the scalars is because the scalars > are created with DMRedundant and the DMRedundant object knows that > refinement means "leave as is, since there is no grid" while the DMDA knows > it is a grid and knows how to refine itself. So when it "interpolates" the > DMRedundant variables it just copies them (or multiples them by the matrix > 1 which is just a copy). > I think you might be misunderstanding the "scalars" part. He is solving a nonlinear eigenproblem (which he did not write down) for some variables. Then he uses those variable in the coupled diffusion equations he did write down. 
He has wrapped the whole problem in a SNES with 2 parts: the nonlinear eigenproblem and the diffusion equations. He uses DMComposite to deal with all the unknowns. I think Nonlinear Block Gauss-Siedel on the different problems would be a useful starting point, but we do not have that. Thanks, Matt > > > > Here is my form function. I can send more code if needed. > > > > Just change the user->packer that you use to be the DM obtained with > SNESGetDM() > > > > /* Form the system of equations for computing a blowup solution*/ > > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ > > > > blowup_ctx *user = (blowup_ctx *) ctx; > > PetscInt i; > > PetscScalar dx, dx2, xmax,x; > > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; > > DMDALocalInfo info; > > Vec p_vec, Q_vec, Fp_vec, FQ_vec; > > PetscScalar *p_array, *Fp_array; > > Q *Qvals, *FQvals; > > PetscScalar Q2sig, W2sig; > > PetscScalar a,a2, b, u0, sigma; > > > > dx = user->dx; dx2 = dx *dx; > > xmax = user->xmax; > > sigma = user->sigma; > > > > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); > */ > > > > /* Extract raw arrays */ > > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > DMCompositeScatter(user->packer, U, p_vec, Q_vec); > > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ > > > > VecGetArray(p_vec,&p_array); > > VecGetArray(Fp_vec,&Fp_array); > > > > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMDAGetLocalInfo(user->Q_dm, &info); > > > > a = p_array[0]; a2 = a*a; > > b = p_array[1]; > > u0 = p_array[2]; > > > > /* Set boundary conditions at the origin*/ > > if(info.xs ==0){ > > set_origin_bcs(u0, Qvals); > > } > > /* Set boundray conditions in the far field */ > > if(info.xs+ info.xm == info.mx){ > > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); > > } > > > > /* Solve auxiliary equations */ > > if(info.xs ==0){ > > uxx = (2 * Qvals[0].u-2 * u0)/dx2; > > vxx = (Qvals[0].v + Qvals[0].g)/dx2; > > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); > > Fp_array[0] = Qvals[0].u - Qvals[0].f; > > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; > > Fp_array[2] = -uxx + (1/a2) * u0 > > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); > > } > > > > /* Solve equations in the bulk */ > > for(i=info.xs; i < info.xs + info.xm;i++){ > > > > u = Qvals[i].u; > > v = Qvals[i].v; > > f = Qvals[i].f; > > g = Qvals[i].g; > > > > x = (i+1) * dx; > > > > Q2sig = PetscPowScalar(u*u + v*v,sigma); > > W2sig= PetscPowScalar(f*f + g*g, sigma); > > > > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); > > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); > > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); > > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); > > > > uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); > > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); > > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); > > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); > > > > FQvals[i].u = -uxx +1/a2 * u > > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); > > > > FQvals[i].v = -vxx +1/a2 * v > > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); > > > > FQvals[i].f = -fxx +1/a2 * f > > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); > > > > FQvals[i].g =-gxx +1/a2 * g > > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); > > } > > > > /* Restore raw arrays */ > > VecRestoreArray(p_vec, &p_array); > > VecRestoreArray(Fp_vec, &Fp_array); > > > > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); > > 
DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); > > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > return 0; > > } > > > > > > Here is the form function: > > > > > > > > -gideon > > > >> On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: > >> > >> > >> Can you send the code, that will be the easiest way to find the problem. > >> > >> My guess is that you have hardwired in your function/Jacobian > computation the use of the original DM for computations instead of using > the current DM (with refinement there will be a new DM on the second level > different than your original DM). So what you need to do in writing your > FormFunction and FormJacobian is to call SNESGetDM() to get the current DM > and then use DMComputeGet... to access the individual DMDA and DMRedundent > for the parts. I notice you have this user.Q_dm I bet inside your form > functions you use this DM? You have to remove this logic. > >> > >> Barry > >> > >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson > wrote: > >>> > >>> I have it set up as: > >>> > >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); > >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); > >>> DMCompositeAddDM(user.packer,user.p_dm); > >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, > >>> nx, 4, 1, NULL, &user.Q_dm); > >>> DMCompositeAddDM(user.packer,user.Q_dm); > >>> DMCreateGlobalVector(user.packer,&U); > >>> > >>> where the user.packer structure has > >>> > >>> DM packer; > >>> DM p_dm, Q_dm; > >>> > >>> Q_dm holds the field variables and p_dm holds the scalar values (the > nonlinear eigenvalues). > >>> > >>> Here are some of the errors that are generated: > >>> > >>> [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >>> [0]PETSC ERROR: Argument out of range > >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc > >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to > turn off this check > >>> [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by > gideon Thu Aug 27 22:40:54 2015 > >>> [0]PETSC ERROR: Configure options --prefix=/opt/local > --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries > --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 > --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate > --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local > --with-superlu-dir=/opt/local --with-metis-dir=/opt/local > --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local > --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local > CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp > FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp > F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os > FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" > CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os > FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports > --with-mpiexec=mpiexec-mpich-mp > >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in > /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > >>> [1]PETSC ERROR: Argument out of range > >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>> [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by > gideon Thu Aug 27 22:40:54 2015 > >>> [1]PETSC ERROR: Configure options --prefix=/opt/local > --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries > --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 > --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate > --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local > --with-superlu-dir=/opt/local --with-metis-dir=/opt/local > --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local > --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local > CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp > FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp > F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os > FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" > CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os > FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports > --with-mpiexec=mpiexec-mpich-mp > >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in > /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in > /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c > >>> > >>> > >>> > >>> -gideon > >>> > >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: > >>>> > >>>> > >>>> We need the full error message. 
> >>>> > >>>> But are you using a DMDA for the scalars? You should not be, you > should be using a DMRedundant for the scalars. > >>>> > >>>> Barry > >>>> > >>>> Though you should not get this error even if you are using a DMDA > there. > >>>> > >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson < > gideon.simpson at gmail.com> wrote: > >>>>> > >>>>> I?m getting the following errors: > >>>>> > >>>>> [1]PETSC ERROR: Argument out of range > >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>>>> > >>>>> Could this have to do with me using the DMComposite with one da > holding the scalar parameters and the other holding the field variables? > >>>>> > >>>>> -gideon > >>>>> > >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley > wrote: > >>>>>> > >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson < > gideon.simpson at gmail.com> wrote: > >>>>>> HI Barry, > >>>>>> > >>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot > of sense, to solve on a spatially coarse mesh for the field variables, > interpolate onto the finer mesh, and then solve again. I?m not entirely > clear on the practical implementation > >>>>>> > >>>>>> SNES should do this automatically using -snes_grid_sequence . > If this does not work, complain. Loudly. > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> -gideon > >>>>>> > >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith > wrote: > >>>>>>> > >>>>>>> > >>>>>>> Gideon, > >>>>>>> > >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, > interpolate u1 and u2 to a once refined version of the grid and use that > plus the mu lam as initial guess for the next level. Repeat to as fine a > grid as you want. You can use DMRefine() and DMGetInterpolation() to get > the interpolation needed to interpolate from the coarse to finer mesh. > >>>>>>> > >>>>>>> Then and only then you can use multigrid (with or without > fieldsplit) to solve the linear problems for finer meshes. Once you have > the grid sequencing working we can help you with this. > >>>>>>> > >>>>>>> Barry > >>>>>>> > >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson < > gideon.simpson at gmail.com> wrote: > >>>>>>>> > >>>>>>>> I?m working on a problem which, morally, can be posed as a system > of coupled semi linear elliptic PDEs together with unknown nonlinear > eigenvalue parameters, loosely, of the form > >>>>>>>> > >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > >>>>>>>> > >>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, > one for the parameters (lam, mu), and one for the vector field (u_1, u_2) > on the mesh. I have had success in solving this as a fully coupled system > with SNES + sparse direct solvers (MUMPS, SuperLU). > >>>>>>>> > >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine > enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the > function norm = O(10^{-4}), eventually returning reason -6 (failed line > search). > >>>>>>>> > >>>>>>>> Perhaps there is another way around the above problem, but one > thing I was thinking of trying would be to get away from direct solvers, > and I was hoping to use field split for this. However, it?s a bit beyond > what I?ve seen examples for because it has 2 types of variables: scalar > parameters which appear globally in the system and vector valued field > variables. Any suggestions on how to get started? 
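A rough sketch of one way to get started with the fieldsplit part of that question, not taken from the thread itself: hand PCFieldSplit the index sets that the DMComposite already knows about. This assumes the layout quoted elsewhere in the thread (a DMRedundant p_dm for the scalar parameters and a DMDA Q_dm packed into user.packer, attached to the SNES), one assembled Jacobian, and made-up split names "params" and "fields"; error checking is omitted as in the rest of the thread's code.

  KSP ksp;
  PC  pc;
  IS  *is_split;   /* one IS per sub-DM, in the order they were added to the composite */

  SNESGetKSP(snes, &ksp);
  KSPGetPC(ksp, &pc);
  PCSetType(pc, PCFIELDSPLIT);

  DMCompositeGetGlobalISs(user.packer, &is_split);
  PCFieldSplitSetIS(pc, "params", is_split[0]);   /* the DMRedundant scalars */
  PCFieldSplitSetIS(pc, "fields", is_split[1]);   /* the DMDA field variables */

  ISDestroy(&is_split[0]);
  ISDestroy(&is_split[1]);
  PetscFree(is_split);

The "params" split is only a handful of unknowns, so something like -fieldsplit_params_pc_type lu (or a Schur-complement arrangement of the two splits) would be the natural thing to try on it.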
> >>>>>>>> > >>>>>>>> -gideon > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >>>>>> -- Norbert Wiener > >>>>> > >>>> > >>> > >> > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Fri Aug 28 12:21:36 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Fri, 28 Aug 2015 13:21:36 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> Message-ID: <3BD30839-963B-427A-B65A-F20D794606B9@gmail.com> Yes, to clarify, the problem with two scalar fields and two scalar ?eigenvalues? is -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx (u_1, u_2) are defined on the mesh, and (lam, mu) are unknown scalars. My actual problem has a 4 degrees of freedom at each mesh point and 3 unknown scalars, but this makes the point. -gideon > On Aug 28, 2015, at 6:20 AM, Matthew Knepley wrote: > > On Thu, Aug 27, 2015 at 10:23 PM, Barry Smith > wrote: > > > On Aug 27, 2015, at 10:15 PM, Gideon Simpson > wrote: > > > > That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. > > Nothing ever gets coarsened in grid sequencing, it only gets refined. > > The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiples them by the matrix 1 which is just a copy). > > I think you might be misunderstanding the "scalars" part. He is solving a nonlinear eigenproblem (which he did not write down) for some variables. Then he > uses those variable in the coupled diffusion equations he did write down. He has wrapped the whole problem in a SNES with 2 parts: the nonlinear eigenproblem > and the diffusion equations. He uses DMComposite to deal with all the unknowns. > > I think Nonlinear Block Gauss-Siedel on the different problems would be a useful starting point, but we do not have that. > > Thanks, > > Matt > > > > > Here is my form function. I can send more code if needed. 
> > > > Just change the user->packer that you use to be the DM obtained with SNESGetDM() > > > > /* Form the system of equations for computing a blowup solution*/ > > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ > > > > blowup_ctx *user = (blowup_ctx *) ctx; > > PetscInt i; > > PetscScalar dx, dx2, xmax,x; > > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; > > DMDALocalInfo info; > > Vec p_vec, Q_vec, Fp_vec, FQ_vec; > > PetscScalar *p_array, *Fp_array; > > Q *Qvals, *FQvals; > > PetscScalar Q2sig, W2sig; > > PetscScalar a,a2, b, u0, sigma; > > > > dx = user->dx; dx2 = dx *dx; > > xmax = user->xmax; > > sigma = user->sigma; > > > > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ > > > > /* Extract raw arrays */ > > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > DMCompositeScatter(user->packer, U, p_vec, Q_vec); > > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ > > > > VecGetArray(p_vec,&p_array); > > VecGetArray(Fp_vec,&Fp_array); > > > > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMDAGetLocalInfo(user->Q_dm, &info); > > > > a = p_array[0]; a2 = a*a; > > b = p_array[1]; > > u0 = p_array[2]; > > > > /* Set boundary conditions at the origin*/ > > if(info.xs ==0){ > > set_origin_bcs(u0, Qvals); > > } > > /* Set boundray conditions in the far field */ > > if(info.xs+ info.xm == info.mx ){ > > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx ); > > } > > > > /* Solve auxiliary equations */ > > if(info.xs ==0){ > > uxx = (2 * Qvals[0].u-2 * u0)/dx2; > > vxx = (Qvals[0].v + Qvals[0].g)/dx2; > > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); > > Fp_array[0] = Qvals[0].u - Qvals[0].f; > > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; > > Fp_array[2] = -uxx + (1/a2) * u0 > > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); > > } > > > > /* Solve equations in the bulk */ > > for(i=info.xs; i < info.xs + info.xm;i++){ > > > > u = Qvals[i].u; > > v = Qvals[i].v; > > f = Qvals[i].f; > > g = Qvals[i].g; > > > > x = (i+1) * dx; > > > > Q2sig = PetscPowScalar(u*u + v*v,sigma); > > W2sig= PetscPowScalar(f*f + g*g, sigma); > > > > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); > > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); > > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); > > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); > > > > uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); > > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); > > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); > > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); > > > > FQvals[i].u = -uxx +1/a2 * u > > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); > > > > FQvals[i].v = -vxx +1/a2 * v > > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); > > > > FQvals[i].f = -fxx +1/a2 * f > > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); > > > > FQvals[i].g =-gxx +1/a2 * g > > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); > > } > > > > /* Restore raw arrays */ > > VecRestoreArray(p_vec, &p_array); > > VecRestoreArray(Fp_vec, &Fp_array); > > > > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); > > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > return 0; > > } > > > > > > Here is the form function: > > > > > > > > -gideon > > > >> On Aug 27, 2015, at 11:09 
PM, Barry Smith > wrote: > >> > >> > >> Can you send the code, that will be the easiest way to find the problem. > >> > >> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. > >> > >> Barry > >> > >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson > wrote: > >>> > >>> I have it set up as: > >>> > >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); > >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); > >>> DMCompositeAddDM(user.packer,user.p_dm); > >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, > >>> nx, 4, 1, NULL, &user.Q_dm); > >>> DMCompositeAddDM(user.packer,user.Q_dm); > >>> DMCreateGlobalVector(user.packer,&U); > >>> > >>> where the user.packer structure has > >>> > >>> DM packer; > >>> DM p_dm, Q_dm; > >>> > >>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). > >>> > >>> Here are some of the errors that are generated: > >>> > >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >>> [0]PETSC ERROR: Argument out of range > >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc > >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check > >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > >>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >>> [1]PETSC ERROR: Argument out of range > >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > >>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c > >>> > >>> > >>> > >>> -gideon > >>> > >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith > wrote: > >>>> > >>>> > >>>> We need the full error message. > >>>> > >>>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. > >>>> > >>>> Barry > >>>> > >>>> Though you should not get this error even if you are using a DMDA there. > >>>> > >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson > wrote: > >>>>> > >>>>> I?m getting the following errors: > >>>>> > >>>>> [1]PETSC ERROR: Argument out of range > >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>>>> > >>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? > >>>>> > >>>>> -gideon > >>>>> > >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley > wrote: > >>>>>> > >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson > wrote: > >>>>>> HI Barry, > >>>>>> > >>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation > >>>>>> > >>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> -gideon > >>>>>> > >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith > wrote: > >>>>>>> > >>>>>>> > >>>>>>> Gideon, > >>>>>>> > >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. 
> >>>>>>> > >>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. > >>>>>>> > >>>>>>> Barry > >>>>>>> > >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson > wrote: > >>>>>>>> > >>>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form > >>>>>>>> > >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > >>>>>>>> > >>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). > >>>>>>>> > >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). > >>>>>>>> > >>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? > >>>>>>>> > >>>>>>>> -gideon > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>>>>> -- Norbert Wiener > >>>>> > >>>> > >>> > >> > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From gideon.simpson at gmail.com Fri Aug 28 14:41:26 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Fri, 28 Aug 2015 15:41:26 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> Message-ID: <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> Hi Barry, Matt, Barry, your solution worked, but I wanted to follow up on a few points on grid sequencing 1. If, in using the grid sequence, there is no preallocation of the matrix, does that mean I?m going to get hit (badly) with the mallocing as I increase the number of levels? 2. 
In my problem, with real data, I see behavior like:

0 SNES Function norm 7.948742655505e-03
    0 KSP Residual norm 3.666593515373e-03
    1 KSP Residual norm 7.943650614441e-16
1 SNES Function norm 9.001557371893e-07
    0 KSP Residual norm 8.814810163693e-06
    1 KSP Residual norm 6.638031123907e-18
2 SNES Function norm 4.176927119066e-11
0 SNES Function norm 1.500187158175e+02
    0 KSP Residual norm 1.006776821797e-01
    1 KSP Residual norm 2.010368372645e-13
1 SNES Function norm 5.899853203939e-03
    0 KSP Residual norm 1.752660743738e-02
    1 KSP Residual norm 1.244868008219e-14
2 SNES Function norm 1.748583606371e-06
    0 KSP Residual norm 4.933624839470e-06
    1 KSP Residual norm 5.789658241868e-18
3 SNES Function norm 2.034638891687e-10

Where, when it gets to the first refinement, it's not clear how much advantage it's taking of the coarser solution.

3. This problem is actually part of a continuation problem that roughly looks like this

for( continuation parameter p = 0 to 1){
    solve with parameter p_i using solution from p_{i-1},
}

What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh.

4. When I do SNESSolve(snes, NULL, U) with grid sequencing, U is not the solution on the fine mesh. But what is it? Is it still the starting guess, or is it the solution on the coarse mesh?

-gideon

> On Aug 28, 2015, at 6:20 AM, Matthew Knepley wrote: > > On Thu, Aug 27, 2015 at 10:23 PM, Barry Smith wrote: > > > On Aug 27, 2015, at 10:15 PM, Gideon Simpson wrote: > > > > That's correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I'm trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. > > Nothing ever gets coarsened in grid sequencing, it only gets refined. > > The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiplies them by the matrix 1, which is just a copy). > > I think you might be misunderstanding the "scalars" part. He is solving a nonlinear eigenproblem (which he did not write down) for some variables. Then he > uses those variables in the coupled diffusion equations he did write down. He has wrapped the whole problem in a SNES with 2 parts: the nonlinear eigenproblem > and the diffusion equations. He uses DMComposite to deal with all the unknowns. > > I think Nonlinear Block Gauss-Seidel on the different problems would be a useful starting point, but we do not have that. > > Thanks, > > Matt > > > > > Here is my form function. I can send more code if needed.
> > > > Just change the user->packer that you use to be the DM obtained with SNESGetDM() > > > > /* Form the system of equations for computing a blowup solution*/ > > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ > > > > blowup_ctx *user = (blowup_ctx *) ctx; > > PetscInt i; > > PetscScalar dx, dx2, xmax,x; > > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; > > DMDALocalInfo info; > > Vec p_vec, Q_vec, Fp_vec, FQ_vec; > > PetscScalar *p_array, *Fp_array; > > Q *Qvals, *FQvals; > > PetscScalar Q2sig, W2sig; > > PetscScalar a,a2, b, u0, sigma; > > > > dx = user->dx; dx2 = dx *dx; > > xmax = user->xmax; > > sigma = user->sigma; > > > > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ > > > > /* Extract raw arrays */ > > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > DMCompositeScatter(user->packer, U, p_vec, Q_vec); > > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ > > > > VecGetArray(p_vec,&p_array); > > VecGetArray(Fp_vec,&Fp_array); > > > > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMDAGetLocalInfo(user->Q_dm, &info); > > > > a = p_array[0]; a2 = a*a; > > b = p_array[1]; > > u0 = p_array[2]; > > > > /* Set boundary conditions at the origin*/ > > if(info.xs ==0){ > > set_origin_bcs(u0, Qvals); > > } > > /* Set boundray conditions in the far field */ > > if(info.xs+ info.xm == info.mx ){ > > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx ); > > } > > > > /* Solve auxiliary equations */ > > if(info.xs ==0){ > > uxx = (2 * Qvals[0].u-2 * u0)/dx2; > > vxx = (Qvals[0].v + Qvals[0].g)/dx2; > > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); > > Fp_array[0] = Qvals[0].u - Qvals[0].f; > > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; > > Fp_array[2] = -uxx + (1/a2) * u0 > > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); > > } > > > > /* Solve equations in the bulk */ > > for(i=info.xs; i < info.xs + info.xm;i++){ > > > > u = Qvals[i].u; > > v = Qvals[i].v; > > f = Qvals[i].f; > > g = Qvals[i].g; > > > > x = (i+1) * dx; > > > > Q2sig = PetscPowScalar(u*u + v*v,sigma); > > W2sig= PetscPowScalar(f*f + g*g, sigma); > > > > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); > > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); > > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); > > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); > > > > uxx = (Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); > > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); > > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); > > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); > > > > FQvals[i].u = -uxx +1/a2 * u > > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); > > > > FQvals[i].v = -vxx +1/a2 * v > > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); > > > > FQvals[i].f = -fxx +1/a2 * f > > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); > > > > FQvals[i].g =-gxx +1/a2 * g > > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); > > } > > > > /* Restore raw arrays */ > > VecRestoreArray(p_vec, &p_array); > > VecRestoreArray(Fp_vec, &Fp_array); > > > > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); > > DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); > > > > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); > > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); > > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); > > > > return 0; > > } > > > > > > Here is the form function: > > > > > > > > -gideon > > > >> On Aug 27, 2015, at 11:09 
PM, Barry Smith > wrote: > >> > >> > >> Can you send the code, that will be the easiest way to find the problem. > >> > >> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. > >> > >> Barry > >> > >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson > wrote: > >>> > >>> I have it set up as: > >>> > >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); > >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); > >>> DMCompositeAddDM(user.packer,user.p_dm); > >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, > >>> nx, 4, 1, NULL, &user.Q_dm); > >>> DMCompositeAddDM(user.packer,user.Q_dm); > >>> DMCreateGlobalVector(user.packer,&U); > >>> > >>> where the user.packer structure has > >>> > >>> DM packer; > >>> DM p_dm, Q_dm; > >>> > >>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). > >>> > >>> Here are some of the errors that are generated: > >>> > >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >>> [0]PETSC ERROR: Argument out of range > >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc > >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check > >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > >>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > >>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > >>> [1]PETSC ERROR: Argument out of range > >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown > >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 > >>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp > >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c > >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c > >>> > >>> > >>> > >>> -gideon > >>> > >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith > wrote: > >>>> > >>>> > >>>> We need the full error message. > >>>> > >>>> But are you using a DMDA for the scalars? You should not be, you should be using a DMRedundant for the scalars. > >>>> > >>>> Barry > >>>> > >>>> Though you should not get this error even if you are using a DMDA there. > >>>> > >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson > wrote: > >>>>> > >>>>> I?m getting the following errors: > >>>>> > >>>>> [1]PETSC ERROR: Argument out of range > >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix > >>>>> > >>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? > >>>>> > >>>>> -gideon > >>>>> > >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley > wrote: > >>>>>> > >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson > wrote: > >>>>>> HI Barry, > >>>>>> > >>>>>> Nope, I?m not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I?m not entirely clear on the practical implementation > >>>>>> > >>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. > >>>>>> > >>>>>> Matt > >>>>>> > >>>>>> -gideon > >>>>>> > >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith > wrote: > >>>>>>> > >>>>>>> > >>>>>>> Gideon, > >>>>>>> > >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. 
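For reference, a minimal sketch of that refine-and-interpolate step; DMGetInterpolation() is spelled DMCreateInterpolation() in the 3.5/3.6 releases, and the names dm_coarse, Ucoarse, Ufine and the single refinement level are just placeholders, not code from the thread.

  DM  dm_fine;
  Mat interp;
  Vec Ufine;

  DMRefine(dm_coarse, PetscObjectComm((PetscObject)dm_coarse), &dm_fine);
  DMCreateGlobalVector(dm_fine, &Ufine);
  DMCreateInterpolation(dm_coarse, dm_fine, &interp, NULL);
  MatInterpolate(interp, Ucoarse, Ufine);   /* coarse solution -> initial guess on the fine grid */
  MatDestroy(&interp);
  /* solve again with dm_fine and Ufine; destroy dm_fine and Ufine when done */

With the DMComposite used here, refinement is expected to refine the DMDA part and carry the DMRedundant scalars over unchanged, which is what the reply goes on to describe.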
> >>>>>>> > >>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. > >>>>>>> > >>>>>>> Barry > >>>>>>> > >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson > wrote: > >>>>>>>> > >>>>>>>> I?m working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form > >>>>>>>> > >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx > >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx > >>>>>>>> > >>>>>>>> Currently, I have it set up with a DMComposite with two sub da?s, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). > >>>>>>>> > >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). > >>>>>>>> > >>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it?s a bit beyond what I?ve seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started? > >>>>>>>> > >>>>>>>> -gideon > >>>>>>>> > >>>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > >>>>>> -- Norbert Wiener > >>>>> > >>>> > >>> > >> > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 28 14:55:18 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 14:55:18 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> Message-ID: <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> > On Aug 28, 2015, at 2:41 PM, Gideon Simpson wrote: > > Hi Barry, Matt, > > Barry, your solution worked, but I wanted to follow up on a few points on grid sequencing > > 1. If, in using the grid sequence, there is no preallocation of the matrix, does that mean I?m going to get hit (badly) with the mallocing as I increase the number of levels? Actually the preallocation of the matrix is the same with or without grid sequencing! It is doing the proper preallocation for the DMDA part of the matrix, it is the coupling between the DMDA variables and the DMREDUNDANT variables that is not preallocated. This may be a performance hit for large problems but lets cross that bridge when we get to it. > > > 2. 
In my problem, with real data, I see behavior like: > > 0 SNES Function norm 7.948742655505e-03 > 0 KSP Residual norm 3.666593515373e-03 > 1 KSP Residual norm 7.943650614441e-16 > 1 SNES Function norm 9.001557371893e-07 > 0 KSP Residual norm 8.814810163693e-06 > 1 KSP Residual norm 6.638031123907e-18 > 2 SNES Function norm 4.176927119066e-11 > 0 SNES Function norm 1.500187158175e+02 > 0 KSP Residual norm 1.006776821797e-01 > 1 KSP Residual norm 2.010368372645e-13 > 1 SNES Function norm 5.899853203939e-03 > 0 KSP Residual norm 1.752660743738e-02 > 1 KSP Residual norm 1.244868008219e-14 > 2 SNES Function norm 1.748583606371e-06 > 0 KSP Residual norm 4.933624839470e-06 > 1 KSP Residual norm 5.789658241868e-18 > 3 SNES Function norm 2.034638891687e-10 > > Where, when it gets to the first refinement, it?s not clear how much advantage its taking of the coarser solution. I see this often with grid sequencing. When you interpolate to the next mesh the residual norm is still pretty big so it looks like it does not help, but actually even though the residual norm is big the initial guess is still much better for Newton's method. So the question is not if the interpolated residual norm is big, instead the question is can you get convergence on finer and finer meshes that you could not get before. You'll need to run the refinement for several levels to see, I predict you will. > > > 3. This problem is actually part of a continuation problem that roughly looks like this > > for( continuation parameter p = 0 to 1){ > > solve with parameter p_i using solution from p_{i-1}, > } > > What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: Do not use -snes_grid_sequencing Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. Call SNESSetGridSequence() Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. > > 4. When I do SNESSolve(snes, NULL, U) with grid sequencing, U is not the solution on the fine mesh. But what is it? Is it still the starting guess, or is it the solution on the coarse mesh? It will contain the solution on the coarse mesh. After SNESSolve() call SNESGetSolution() and it will give back the solution on the fine mesh. Barry > > > -gideon > >> On Aug 28, 2015, at 6:20 AM, Matthew Knepley wrote: >> >> On Thu, Aug 27, 2015 at 10:23 PM, Barry Smith wrote: >> >> > On Aug 27, 2015, at 10:15 PM, Gideon Simpson wrote: >> > >> > That?s correct, I am not using the SNESGetDM. I suppose I could. Keep in mind that I?m trying to solve, simultaneously, for the scalar parameters and the vector field. I guess what I am unclear about is how DMRefine is to know that the unknown associated with the scalar parameters can never be coarsened out, but must be retained at all iterations. >> >> Nothing ever gets coarsened in grid sequencing, it only gets refined. 
>> >> The reason it knows not to "refine" the scalars is because the scalars are created with DMRedundant and the DMRedundant object knows that refinement means "leave as is, since there is no grid" while the DMDA knows it is a grid and knows how to refine itself. So when it "interpolates" the DMRedundant variables it just copies them (or multiples them by the matrix 1 which is just a copy). >> >> I think you might be misunderstanding the "scalars" part. He is solving a nonlinear eigenproblem (which he did not write down) for some variables. Then he >> uses those variable in the coupled diffusion equations he did write down. He has wrapped the whole problem in a SNES with 2 parts: the nonlinear eigenproblem >> and the diffusion equations. He uses DMComposite to deal with all the unknowns. >> >> I think Nonlinear Block Gauss-Siedel on the different problems would be a useful starting point, but we do not have that. >> >> Thanks, >> >> Matt >> >> > >> > Here is my form function. I can send more code if needed. >> > >> >> Just change the user->packer that you use to be the DM obtained with SNESGetDM() >> >> >> > /* Form the system of equations for computing a blowup solution*/ >> > PetscErrorCode form_function(SNES snes, Vec U, Vec F, void *ctx){ >> > >> > blowup_ctx *user = (blowup_ctx *) ctx; >> > PetscInt i; >> > PetscScalar dx, dx2, xmax,x; >> > PetscScalar u, v, f,g, ux, vx, uxx, vxx, fx,gx, fxx, gxx; >> > DMDALocalInfo info; >> > Vec p_vec, Q_vec, Fp_vec, FQ_vec; >> > PetscScalar *p_array, *Fp_array; >> > Q *Qvals, *FQvals; >> > PetscScalar Q2sig, W2sig; >> > PetscScalar a,a2, b, u0, sigma; >> > >> > dx = user->dx; dx2 = dx *dx; >> > xmax = user->xmax; >> > sigma = user->sigma; >> > >> > /* PetscPrintf(PETSC_COMM_SELF, " dx = %g, sigma = %g\n", dx, sigma); */ >> > >> > /* Extract raw arrays */ >> > DMCompositeGetLocalVectors(user->packer, &p_vec, &Q_vec); >> > DMCompositeGetLocalVectors(user->packer, &Fp_vec, &FQ_vec); >> > >> > DMCompositeScatter(user->packer, U, p_vec, Q_vec); >> > /* VecView(Q_vec, PETSC_VIEWER_STDOUT_SELF); */ >> > >> > VecGetArray(p_vec,&p_array); >> > VecGetArray(Fp_vec,&Fp_array); >> > >> > DMDAVecGetArray(user->Q_dm, Q_vec, &Qvals); >> > DMDAVecGetArray(user->Q_dm, FQ_vec, &FQvals); >> > >> > DMDAGetLocalInfo(user->Q_dm, &info); >> > >> > a = p_array[0]; a2 = a*a; >> > b = p_array[1]; >> > u0 = p_array[2]; >> > >> > /* Set boundary conditions at the origin*/ >> > if(info.xs ==0){ >> > set_origin_bcs(u0, Qvals); >> > } >> > /* Set boundray conditions in the far field */ >> > if(info.xs+ info.xm == info.mx){ >> > set_farfield_bcs(xmax,sigma, a, b, dx, Qvals,info.mx); >> > } >> > >> > /* Solve auxiliary equations */ >> > if(info.xs ==0){ >> > uxx = (2 * Qvals[0].u-2 * u0)/dx2; >> > vxx = (Qvals[0].v + Qvals[0].g)/dx2; >> > vx = (Qvals[0].v - Qvals[0].g)/(2*dx); >> > Fp_array[0] = Qvals[0].u - Qvals[0].f; >> > Fp_array[1] = -vxx - (1/a) * (.5/sigma) * u0; >> > Fp_array[2] = -uxx + (1/a2) * u0 >> > + (1/a) * (-b * vx + PetscPowScalar(u0 * u0, sigma) * vx); >> > } >> > >> > /* Solve equations in the bulk */ >> > for(i=info.xs; i < info.xs + info.xm;i++){ >> > >> > u = Qvals[i].u; >> > v = Qvals[i].v; >> > f = Qvals[i].f; >> > g = Qvals[i].g; >> > >> > x = (i+1) * dx; >> > >> > Q2sig = PetscPowScalar(u*u + v*v,sigma); >> > W2sig= PetscPowScalar(f*f + g*g, sigma); >> > >> > ux = (Qvals[i+1].u-Qvals[i-1].u)/(2*dx); >> > vx = (Qvals[i+1].v-Qvals[i-1].v)/(2*dx); >> > fx = (Qvals[i+1].f-Qvals[i-1].f)/(2*dx); >> > gx = (Qvals[i+1].g-Qvals[i-1].g)/(2*dx); >> > >> > uxx = 
(Qvals[i+1].u+Qvals[i-1].u - 2 *u)/(dx2); >> > vxx = (Qvals[i+1].v+Qvals[i-1].v- 2 *v)/(dx2); >> > fxx = (Qvals[i+1].f+Qvals[i-1].f -2*f)/(dx2); >> > gxx = (Qvals[i+1].g+Qvals[i-1].g -2*g)/(dx2); >> > >> > FQvals[i].u = -uxx +1/a2 * u >> > + 1/a *(.5/sigma* v +x * vx- b* vx +Q2sig* vx); >> > >> > FQvals[i].v = -vxx +1/a2 * v >> > - 1/a *(.5/sigma * u +x * ux- b* ux +Q2sig* ux); >> > >> > FQvals[i].f = -fxx +1/a2 * f >> > + 1/a *(.5/sigma * g +x * gx+ b* gx -W2sig* gx); >> > >> > FQvals[i].g =-gxx +1/a2 * g >> > - 1/a *(.5/sigma * f +x * fx+ b* fx -W2sig* fx); >> > } >> > >> > /* Restore raw arrays */ >> > VecRestoreArray(p_vec, &p_array); >> > VecRestoreArray(Fp_vec, &Fp_array); >> > >> > DMDAVecRestoreArray(user->Q_dm, Q_vec, &Qvals); >> > DMDAVecRestoreArray(user->Q_dm, FQ_vec, &FQvals); >> > >> > DMCompositeGather(user->packer,F,INSERT_VALUES, Fp_vec, FQ_vec); >> > DMCompositeRestoreLocalVectors(user->packer, &p_vec, &Q_vec); >> > DMCompositeRestoreLocalVectors(user->packer, &Fp_vec, &FQ_vec); >> > >> > return 0; >> > } >> > >> > >> > Here is the form function: >> > >> > >> > >> > -gideon >> > >> >> On Aug 27, 2015, at 11:09 PM, Barry Smith wrote: >> >> >> >> >> >> Can you send the code, that will be the easiest way to find the problem. >> >> >> >> My guess is that you have hardwired in your function/Jacobian computation the use of the original DM for computations instead of using the current DM (with refinement there will be a new DM on the second level different than your original DM). So what you need to do in writing your FormFunction and FormJacobian is to call SNESGetDM() to get the current DM and then use DMComputeGet... to access the individual DMDA and DMRedundent for the parts. I notice you have this user.Q_dm I bet inside your form functions you use this DM? You have to remove this logic. >> >> >> >> Barry >> >> >> >>> On Aug 27, 2015, at 9:42 PM, Gideon Simpson wrote: >> >>> >> >>> I have it set up as: >> >>> >> >>> DMCompositeCreate(PETSC_COMM_WORLD, &user.packer); >> >>> DMRedundantCreate(PETSC_COMM_WORLD, 0, 3, &user.p_dm); >> >>> DMCompositeAddDM(user.packer,user.p_dm); >> >>> DMDACreate1d(PETSC_COMM_WORLD,DM_BOUNDARY_GHOSTED, >> >>> nx, 4, 1, NULL, &user.Q_dm); >> >>> DMCompositeAddDM(user.packer,user.Q_dm); >> >>> DMCreateGlobalVector(user.packer,&U); >> >>> >> >>> where the user.packer structure has >> >>> >> >>> DM packer; >> >>> DM p_dm, Q_dm; >> >>> >> >>> Q_dm holds the field variables and p_dm holds the scalar values (the nonlinear eigenvalues). >> >>> >> >>> Here are some of the errors that are generated: >> >>> >> >>> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> >>> [0]PETSC ERROR: Argument out of range >> >>> [0]PETSC ERROR: New nonzero at (0,3) caused a malloc >> >>> Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check >> >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> >>> [0]PETSC ERROR: Petsc Release Version 3.5.3, unknown >> >>> [0]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >> >>> [0]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >> >>> [0]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 530 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >> >>> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> >>> [1]PETSC ERROR: Argument out of range >> >>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >> >>> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> >>> [1]PETSC ERROR: Petsc Release Version 3.5.3, unknown >> >>> [1]PETSC ERROR: ./blowup_batch2 on a arch-macports named gs_air by gideon Thu Aug 27 22:40:54 2015 >> >>> [1]PETSC ERROR: Configure options --prefix=/opt/local --prefix=/opt/local/lib/petsc --with-valgrind=0 --with-shared-libraries --with-debugging=0 --with-c2html-dir=/opt/local --with-x=0 --with-blas-lapack-lib=/System/Library/Frameworks/Accelerate.framework/Versions/Current/Accelerate --with-hwloc-dir=/opt/local --with-suitesparse-dir=/opt/local --with-superlu-dir=/opt/local --with-metis-dir=/opt/local --with-parmetis-dir=/opt/local --with-scalapack-dir=/opt/local --with-mumps-dir=/opt/local --with-superlu_dist-dir=/opt/local CC=/opt/local/bin/mpicc-mpich-mp CXX=/opt/local/bin/mpicxx-mpich-mp FC=/opt/local/bin/mpif90-mpich-mp F77=/opt/local/bin/mpif90-mpich-mp F90=/opt/local/bin/mpif90-mpich-mp COPTFLAGS=-Os CXXOPTFLAGS=-Os FOPTFLAGS=-Os LDFLAGS="-L/opt/local/lib -Wl,-headerpad_max_install_names" CPPFLAGS=-I/opt/local/include CFLAGS="-Os -arch x86_64" CXXFLAGS=-Os FFLAGS=-Os FCFLAGS=-Os F90FLAGS=-Os PETSC_ARCH=arch-macports --with-mpiexec=mpiexec-mpich-mp >> >>> [1]PETSC ERROR: #1 MatSetValues_MPIAIJ() line 561 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/impls/aij/mpi/mpiaij.c >> >>> [1]PETSC ERROR: #2 MatSetValues() line 1135 in /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_release_tarballs_ports_math_petsc/petsc/work/v3.5.3/src/mat/interface/matrix.c >> >>> >> >>> >> >>> >> >>> -gideon >> >>> >> >>>> On Aug 27, 2015, at 10:37 PM, Barry Smith wrote: >> >>>> >> >>>> >> >>>> We need the full error message. >> >>>> >> >>>> But are you using a DMDA for the scalars? 
You should not be, you should be using a DMRedundant for the scalars. >> >>>> >> >>>> Barry >> >>>> >> >>>> Though you should not get this error even if you are using a DMDA there. >> >>>> >> >>>>> On Aug 27, 2015, at 9:32 PM, Gideon Simpson wrote: >> >>>>> >> >>>>> I'm getting the following errors: >> >>>>> >> >>>>> [1]PETSC ERROR: Argument out of range >> >>>>> [1]PETSC ERROR: Inserting a new nonzero (40003, 0) into matrix >> >>>>> >> >>>>> Could this have to do with me using the DMComposite with one da holding the scalar parameters and the other holding the field variables? >> >>>>> >> >>>>> -gideon >> >>>>> >> >>>>>> On Aug 27, 2015, at 10:15 PM, Matthew Knepley wrote: >> >>>>>> >> >>>>>> On Thu, Aug 27, 2015 at 9:11 PM, Gideon Simpson wrote: >> >>>>>> HI Barry, >> >>>>>> >> >>>>>> Nope, I'm not doing any grid sequencing. Clearly that makes a lot of sense, to solve on a spatially coarse mesh for the field variables, interpolate onto the finer mesh, and then solve again. I'm not entirely clear on the practical implementation >> >>>>>> >> >>>>>> SNES should do this automatically using -snes_grid_sequence . If this does not work, complain. Loudly. >> >>>>>> >> >>>>>> Matt >> >>>>>> >> >>>>>> -gideon >> >>>>>> >> >>>>>>> On Aug 27, 2015, at 10:02 PM, Barry Smith wrote: >> >>>>>>> >> >>>>>>> >> >>>>>>> Gideon, >> >>>>>>> >> >>>>>>> Are you using grid sequencing? Simply solve on a coarse grid, interpolate u1 and u2 to a once refined version of the grid and use that plus the mu lam as initial guess for the next level. Repeat to as fine a grid as you want. You can use DMRefine() and DMGetInterpolation() to get the interpolation needed to interpolate from the coarse to finer mesh. >> >>>>>>> >> >>>>>>> Then and only then you can use multigrid (with or without fieldsplit) to solve the linear problems for finer meshes. Once you have the grid sequencing working we can help you with this. >> >>>>>>> >> >>>>>>> Barry >> >>>>>>> >> >>>>>>>> On Aug 27, 2015, at 7:00 PM, Gideon Simpson wrote: >> >>>>>>>> >> >>>>>>>> I'm working on a problem which, morally, can be posed as a system of coupled semi linear elliptic PDEs together with unknown nonlinear eigenvalue parameters, loosely, of the form >> >>>>>>>> >> >>>>>>>> -\Delta u_1 + f(u_1, u_2) = lam * u1 - mu * du2/dx >> >>>>>>>> -\Delta u_2 + g(u_1, u_2) = lam * u2 + mu * du1/dx >> >>>>>>>> >> >>>>>>>> Currently, I have it set up with a DMComposite with two sub da's, one for the parameters (lam, mu), and one for the vector field (u_1, u_2) on the mesh. I have had success in solving this as a fully coupled system with SNES + sparse direct solvers (MUMPS, SuperLU). >> >>>>>>>> >> >>>>>>>> Lately, I am finding that, when the mesh resolution gets fine enough (i.e. 10^6-10^8 lattice points), my SNES gets stuck with the function norm = O(10^{-4}), eventually returning reason -6 (failed line search). >> >>>>>>>> >> >>>>>>>> Perhaps there is another way around the above problem, but one thing I was thinking of trying would be to get away from direct solvers, and I was hoping to use field split for this. However, it's a bit beyond what I've seen examples for because it has 2 types of variables: scalar parameters which appear globally in the system and vector valued field variables. Any suggestions on how to get started?
>> >>>>>>>> >> >>>>>>>> -gideon >> >>>>>>>> >> >>>>>>> >> >>>>>> >> >>>>>> >> >>>>>> >> >>>>>> -- >> >>>>>> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> >>>>>> -- Norbert Wiener >> >>>>> >> >>>> >> >>> >> >> >> > >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener From gideon.simpson at gmail.com Fri Aug 28 15:04:47 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Fri, 28 Aug 2015 16:04:47 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> Message-ID: <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> Yes, if I continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back and refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation parameter is loaded on the coarse mesh, and refined. Perhaps that's the most practical thing to do. -gideon > On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: > >> >> >> 3. This problem is actually part of a continuation problem that roughly looks like this >> >> for( continuation parameter p = 0 to 1){ >> >> solve with parameter p_i using solution from p_{i-1}, >> } >> >> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. > > So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). > > If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: > > Do not use -snes_grid_sequencing > > Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. > > Call SNESSetGridSequence() > > Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc.
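[To make the recipe quoted just above concrete, here is a minimal, untested C sketch of the two-phase workflow: continuation entirely on the coarse mesh, then SNESSetGridSequence() and one more solve. The plain 1d DMDA, the placeholder FormFunction, and the values of nsteps/nlevels are illustrative assumptions only; the actual problem in this thread uses a DMComposite, and as Barry noted earlier the real form function must call SNESGetDM() so it works on the refined levels.]

#include <petscsnes.h>

typedef struct { PetscReal p; } AppCtx;              /* continuation parameter */

/* Placeholder residual F(X) = X; stands in for the real form function. */
static PetscErrorCode FormFunction(SNES snes, Vec X, Vec F, void *ctx)
{
  VecCopy(X, F);
  return 0;
}

int main(int argc, char **argv)
{
  SNES     snes;
  DM       da;
  Vec      U, Ufine;
  AppCtx   user;
  PetscInt i, nsteps = 10, nlevels = 3, nx = 100;    /* illustrative values */

  PetscInitialize(&argc, &argv, NULL, NULL);
  /* PETSc 3.5-style creation; newer releases also need DMSetFromOptions()/DMSetUp() here. */
  DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_GHOSTED, nx, 4, 1, NULL, &da);
  DMCreateGlobalVector(da, &U);
  SNESCreate(PETSC_COMM_WORLD, &snes);
  SNESSetDM(snes, da);
  SNESSetFunction(snes, NULL, FormFunction, &user);
  SNESSetFromOptions(snes);

  /* Phase 1: continuation, every solve on the coarse mesh; the previous
     solution U is reused as the initial guess for the next parameter value. */
  for (i = 0; i < nsteps; i++) {
    user.p = (PetscReal)i / (PetscReal)(nsteps - 1);
    SNESSolve(snes, NULL, U);
  }

  /* Phase 2: turn on grid sequencing and solve once more; the coarse U is
     the starting point and the result lives on the finest refined DM. */
  SNESSetGridSequence(snes, nlevels);
  SNESSolve(snes, NULL, U);
  SNESGetSolution(snes, &Ufine);                     /* owned by the SNES; do not destroy */

  VecDestroy(&U);
  SNESDestroy(&snes);
  DMDestroy(&da);
  PetscFinalize();
  return 0;
}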
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 28 15:21:44 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 15:21:44 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> Message-ID: > On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: > > Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. > > One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. > > The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. I would do the following. Create your DM and create a SNES that will do the continuation loop over continuation parameter SNESSolve(snes,NULL,Ucoarse); if (you decide you want to see the refined solution at this continuation point) { SNESCreate(comm,&snesrefine); SNESSetDM() etc SNESSetGridSequence(snesrefine,) SNESSolve(snesrefine,0,Ucoarse); SNESGetSolution(snesrefine,&Ufine); VecView(Ufine or do whatever you want to do with the Ufine at that continuation point SNESDestroy(snesrefine); end if end loop over continuation parameter. Barry > > -gideon > >> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >> >>> >>> >>> 3. This problem is actually part of a continuation problem that roughly looks like this >>> >>> for( continuation parameter p = 0 to 1){ >>> >>> solve with parameter p_i using solution from p_{i-1}, >>> } >>> >>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. >> >> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >> >> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >> >> Do not use -snes_grid_sequencing >> >> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. 
>> >> Call SNESSetGridSequence() >> >> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. > From hanklammiv at gmail.com Fri Aug 28 16:13:56 2015 From: hanklammiv at gmail.com (Hank Lamm) Date: Fri, 28 Aug 2015 14:13:56 -0700 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. Message-ID: Hi All, I am having a problem running Petsc3.6 and Slepc3.6 on Stampede. My code should be a simple eigenvalue solver, but when I attempt to solve large problems (8488x8488 matrices) I get errors: --------------------- Error Message -------------------------------------------------------------- [1]Total space allocated 1736835920 bytes [1]PETSC ERROR: Out of memory. This could be due to allocating [1]PETSC ERROR: too large an object or bleeding by not properly [1]PETSC ERROR: destroying unneeded objects. [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process 1769742336 [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796 [1]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line 4030 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c [1]PETSC ERROR: #12 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c [1]PETSC ERROR: #16 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c [1]PETSC ERROR: #17 EPSSolve() line 88 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c [1]PETSC ERROR: #18 eigensolver() line 64 in /work/03324/hlammiv/TMSWIFT/src/solver.cpp [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() 1.73684e+09 [1]Current process memory 1.76979e+09 max process memory 1.76979e+09 The curious thing about this error, is that it seems that if I increase the number of nodes, 
from 32 to 64 to 128, the amount of memory per node doesn't decrease. I have used valgrind and it doesn't seem to a memory leak. The relevant code piece is: void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, char **argv) { EPS eps; /* eigenproblem solver context */ EPSType type; ST st; KSP ksp; PC pc; PetscReal tol,error; PetscReal lower,upper; //PetscInt nev=dim,maxit,its; PetscInt nev,maxit,its,nconv; Vec xr,xi; PetscScalar kr,ki; PetscReal re,im; PetscViewer viewer; PetscInt rank; PetscInt size; std::string eig_file_n; std::ofstream eig_file; char ofile[100]; MPI_Comm_rank(PETSC_COMM_WORLD,&rank); MPI_Comm_size(PETSC_COMM_WORLD,&size); ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue Solver---\n");CHKERRV(ierr); ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr); eig_file_n.append(params->ofile_n); eig_file_n.append("_eval"); eig_file.open(eig_file_n.c_str(),std::ofstream::trunc); //Set operators. In this case, it is a standard eigenvalue problem ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr); ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr); ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr); ierr = EPSGetST(eps,&st);CHKERRV(ierr); ierr = STSetType(st,STSINVERT);CHKERRV(ierr); ierr = STGetKSP(st,&ksp);CHKERRV(ierr); ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr); ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr); ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr); ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr); for(PetscInt i=0;inf;i++){ lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0); upper=4.0*params->m[i]*params->m[i]; ierr = EPSSetInterval(eps,lower,upper); ierr = EPSSetWhichEigenpairs(eps,EPS_ALL); //Set solver parameters at runtime ierr = EPSSetFromOptions(eps);CHKERRV(ierr); // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr); ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr); ierr = EPSSolve(eps);CHKERRV(ierr); ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the method: %D\n",its);CHKERRV(ierr); //Optional: Get some information from the solver and display it ierr = EPSGetType(eps,&type);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: %s\n\n",type);CHKERRV(ierr); ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested eigenvalues: %D\n",nev);CHKERRV(ierr); ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr); ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr); ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged eigenpairs: %D\n\n",nconv);CHKERRV(ierr); strcpy(ofile,params->ofile_n); strcat(ofile,"_evecr"); ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr); if (nconv>0) { ierr = PetscPrintf(PETSC_COMM_WORLD, " k ||Ax-kx||/||kx||\n" " ----------------- ------------------\n");CHKERRV(ierr); for (PetscInt i=0;i From gideon.simpson at gmail.com Fri Aug 28 16:35:35 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: Fri, 28 Aug 2015 17:35:35 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> 
<05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> Message-ID: Hi Barry, Ok, I tried that and it works as intended, but there?s something I noticed. If i use that, and do a SNESGetConvergedReason on the snesrefine, it always seems to return 0. Is there a reason for that? -gideon > On Aug 28, 2015, at 4:21 PM, Barry Smith wrote: > >> >> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: >> >> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. >> >> One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. >> >> The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. > > I would do the following. Create your DM and create a SNES that will do the continuation > > loop over continuation parameter > > SNESSolve(snes,NULL,Ucoarse); > > if (you decide you want to see the refined solution at this continuation point) { > SNESCreate(comm,&snesrefine); > SNESSetDM() > etc > SNESSetGridSequence(snesrefine,) > SNESSolve(snesrefine,0,Ucoarse); > SNESGetSolution(snesrefine,&Ufine); > VecView(Ufine or do whatever you want to do with the Ufine at that continuation point > SNESDestroy(snesrefine); > end if > > end loop over continuation parameter. > > Barry > >> >> -gideon >> >>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >>> >>>> >>>> >>>> 3. This problem is actually part of a continuation problem that roughly looks like this >>>> >>>> for( continuation parameter p = 0 to 1){ >>>> >>>> solve with parameter p_i using solution from p_{i-1}, >>>> } >>>> >>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. >>> >>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >>> >>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >>> >>> Do not use -snes_grid_sequencing >>> >>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. >>> >>> Call SNESSetGridSequence() >>> >>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Fri Aug 28 17:03:01 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 17:03:01 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> Message-ID: <76703FF4-3008-4744-B334-C1EB732DFC4C@mcs.anl.gov> > On Aug 28, 2015, at 4:35 PM, Gideon Simpson wrote: > > Hi Barry, > > Ok, I tried that and it works as intended, but there?s something I noticed. If i use that, and do a SNESGetConvergedReason on the snesrefine, it always seems to return 0. Is there a reason for that? Should never do that; are you sure that SNESSolve() has actually been called on it. What does -snes_monitor and -snes_converged_reason show. Barry > > -gideon > >> On Aug 28, 2015, at 4:21 PM, Barry Smith wrote: >> >>> >>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: >>> >>> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. >>> >>> One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. >>> >>> The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. >> >> I would do the following. Create your DM and create a SNES that will do the continuation >> >> loop over continuation parameter >> >> SNESSolve(snes,NULL,Ucoarse); >> >> if (you decide you want to see the refined solution at this continuation point) { >> SNESCreate(comm,&snesrefine); >> SNESSetDM() >> etc >> SNESSetGridSequence(snesrefine,) >> SNESSolve(snesrefine,0,Ucoarse); >> SNESGetSolution(snesrefine,&Ufine); >> VecView(Ufine or do whatever you want to do with the Ufine at that continuation point >> SNESDestroy(snesrefine); >> end if >> >> end loop over continuation parameter. >> >> Barry >> >>> >>> -gideon >>> >>>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >>>> >>>>> >>>>> >>>>> 3. This problem is actually part of a continuation problem that roughly looks like this >>>>> >>>>> for( continuation parameter p = 0 to 1){ >>>>> >>>>> solve with parameter p_i using solution from p_{i-1}, >>>>> } >>>>> >>>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. 
>>>> >>>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >>>> >>>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >>>> >>>> Do not use -snes_grid_sequencing >>>> >>>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. >>>> >>>> Call SNESSetGridSequence() >>>> >>>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. > From bsmith at mcs.anl.gov Fri Aug 28 18:14:40 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 18:14:40 -0500 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. In-Reply-To: References: Message-ID: It is using a SeqAIJ matrix, not a parallel matrix. Increasing the number of cores won't affect the size of a sequential matrix since it must be stored entirely on one process. Perhaps you need to use parallel matrices? [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > On Aug 28, 2015, at 4:13 PM, Hank Lamm wrote: > > Hi All, > > I am having a problem running Petsc3.6 and Slepc3.6 on Stampede. My code should be a simple eigenvalue solver, but when I attempt to solve large problems (8488x8488 matrices) I get errors: > > --------------------- Error Message -------------------------------------------------------------- > [1]Total space allocated 1736835920 bytes > [1]PETSC ERROR: Out of memory. This could be due to allocating > [1]PETSC ERROR: too large an object or bleeding by not properly > [1]PETSC ERROR: destroying unneeded objects. > [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process 1769742336 > [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796 > [1]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line 4030 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c > [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c > [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c > [1]PETSC ERROR: #12 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c > [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c > [1]PETSC ERROR: #16 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c > [1]PETSC ERROR: #17 EPSSolve() line 88 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c > [1]PETSC ERROR: #18 eigensolver() line 64 in /work/03324/hlammiv/TMSWIFT/src/solver.cpp > [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() 1.73684e+09 > [1]Current process memory 1.76979e+09 max process memory 1.76979e+09 > > > The curious thing about this error, is that it seems that if I increase the number of nodes, from 32 to 64 to 128, the amount of memory per node doesn't decrease. I have used valgrind and it doesn't seem to a memory leak. > > The relevant code piece is: > > void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, char **argv) > { > > > EPS eps; /* eigenproblem solver context */ > EPSType type; > ST st; > KSP ksp; > PC pc; > PetscReal tol,error; > PetscReal lower,upper; > //PetscInt nev=dim,maxit,its; > PetscInt nev,maxit,its,nconv; > Vec xr,xi; > PetscScalar kr,ki; > PetscReal re,im; > PetscViewer viewer; > PetscInt rank; > PetscInt size; > std::string eig_file_n; > std::ofstream eig_file; > char ofile[100]; > > MPI_Comm_rank(PETSC_COMM_WORLD,&rank); > MPI_Comm_size(PETSC_COMM_WORLD,&size); > > ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue Solver---\n");CHKERRV(ierr); > ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr); > > eig_file_n.append(params->ofile_n); > eig_file_n.append("_eval"); > eig_file.open(eig_file_n.c_str(),std::ofstream::trunc); > > //Set operators. 
In this case, it is a standard eigenvalue problem > ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr); > ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr); > > ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr); > > ierr = EPSGetST(eps,&st);CHKERRV(ierr); > ierr = STSetType(st,STSINVERT);CHKERRV(ierr); > > ierr = STGetKSP(st,&ksp);CHKERRV(ierr); > ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr); > ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr); > ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr); > > for(PetscInt i=0;inf;i++){ > lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0); > upper=4.0*params->m[i]*params->m[i]; > ierr = EPSSetInterval(eps,lower,upper); > ierr = EPSSetWhichEigenpairs(eps,EPS_ALL); > //Set solver parameters at runtime > ierr = EPSSetFromOptions(eps);CHKERRV(ierr); > // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); > > ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr); > ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr); > > > ierr = EPSSolve(eps);CHKERRV(ierr); > > ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the method: %D\n",its);CHKERRV(ierr); > > > //Optional: Get some information from the solver and display it > ierr = EPSGetType(eps,&type);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: %s\n\n",type);CHKERRV(ierr); > ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested eigenvalues: %D\n",nev);CHKERRV(ierr); > ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr); > > ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr); > ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged eigenpairs: %D\n\n",nconv);CHKERRV(ierr); > > strcpy(ofile,params->ofile_n); > strcat(ofile,"_evecr"); > > ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr); > > if (nconv>0) > { > ierr = PetscPrintf(PETSC_COMM_WORLD, > " k ||Ax-kx||/||kx||\n" > " ----------------- ------------------\n");CHKERRV(ierr); > > for (PetscInt i=0;i { > //Get converged eigenpairs: i-th eigenvalue is stored in kr (real part) and ki (imaginary part) > ierr = EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);CHKERRV(ierr); > //Compute the relative error associated to each eigenpair > ierr = EPSComputeError(eps,i,EPS_ERROR_RELATIVE,&error);CHKERRV(ierr); > > #if defined(PETSC_USE_COMPLEX) > re = PetscRealPart(kr); > im = PetscImaginaryPart(kr); > #else > re = kr; > im = ki; > #endif > > if (im!=0.0) > { > > ierr = PetscPrintf(PETSC_COMM_WORLD," %9f%+9f j %12g\n",re,im,error);CHKERRV(ierr); > if(rank==0) eig_file << re << " " << im << " " << error << std::endl; > } else > { > ierr = PetscPrintf(PETSC_COMM_WORLD," %12f %12g\n",re,error);CHKERRV(ierr); > if(rank==0) eig_file << re << " " << 0 << " " << error << std::endl; > } > > ierr = VecView(xr,viewer);CHKERRV(ierr); > > } > ierr = PetscPrintf(PETSC_COMM_WORLD,"\n");CHKERRV(ierr); > } > } > eig_file.close(); > ierr = EPSDestroy(&eps);CHKERRV(ierr); > ierr = PetscViewerDestroy(&viewer);CHKERRV(ierr); > ierr = VecDestroy(&xr);CHKERRV(ierr); > ierr = VecDestroy(&xi);CHKERRV(ierr); > > ierr = PetscPrintf(PETSC_COMM_WORLD,"---Finishing Eigenvalue Solver---\n");CHKERRV(ierr); > } > > > > Thanks, > Hank From gideon.simpson at gmail.com Fri Aug 28 19:29:03 2015 From: gideon.simpson at gmail.com (Gideon Simpson) Date: 
Fri, 28 Aug 2015 20:29:03 -0400 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: <76703FF4-3008-4744-B334-C1EB732DFC4C@mcs.anl.gov> References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> <76703FF4-3008-4744-B334-C1EB732DFC4C@mcs.anl.gov> Message-ID: That was a mistake on my part. But I did want to ask, what should be the behavior with a grid sequence if the SNES fails during one of the intermediate steps? -gideon > On Aug 28, 2015, at 6:03 PM, Barry Smith wrote: > > >> On Aug 28, 2015, at 4:35 PM, Gideon Simpson wrote: >> >> Hi Barry, >> >> Ok, I tried that and it works as intended, but there?s something I noticed. If i use that, and do a SNESGetConvergedReason on the snesrefine, it always seems to return 0. Is there a reason for that? > > Should never do that; are you sure that SNESSolve() has actually been called on it. What does -snes_monitor and -snes_converged_reason show. > > Barry > > > >> >> -gideon >> >>> On Aug 28, 2015, at 4:21 PM, Barry Smith wrote: >>> >>>> >>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: >>>> >>>> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. >>>> >>>> One subtlety is that I actually want the intermediate continuation solutions too. Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. >>>> >>>> The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. >>> >>> I would do the following. Create your DM and create a SNES that will do the continuation >>> >>> loop over continuation parameter >>> >>> SNESSolve(snes,NULL,Ucoarse); >>> >>> if (you decide you want to see the refined solution at this continuation point) { >>> SNESCreate(comm,&snesrefine); >>> SNESSetDM() >>> etc >>> SNESSetGridSequence(snesrefine,) >>> SNESSolve(snesrefine,0,Ucoarse); >>> SNESGetSolution(snesrefine,&Ufine); >>> VecView(Ufine or do whatever you want to do with the Ufine at that continuation point >>> SNESDestroy(snesrefine); >>> end if >>> >>> end loop over continuation parameter. >>> >>> Barry >>> >>>> >>>> -gideon >>>> >>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >>>>> >>>>>> >>>>>> >>>>>> 3. 
This problem is actually part of a continuation problem that roughly looks like this >>>>>> >>>>>> for( continuation parameter p = 0 to 1){ >>>>>> >>>>>> solve with parameter p_i using solution from p_{i-1}, >>>>>> } >>>>>> >>>>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. >>>>> >>>>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >>>>> >>>>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >>>>> >>>>> Do not use -snes_grid_sequencing >>>>> >>>>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. >>>>> >>>>> Call SNESSetGridSequence() >>>>> >>>>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 28 19:54:59 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 19:54:59 -0500 Subject: [petsc-users] pcfieldsplit for a composite dm with multiple subfields In-Reply-To: References: <22066404-50E3-4BBF-9D27-26384815571A@gmail.com> <8376EFA7-D775-4D65-9E86-2303FA7E47E2@gmail.com> <4136AE9B-AE00-4E39-8012-888BBF920548@mcs.anl.gov> <5714BD56-097D-40B9-8AEF-41273E2B512D@mcs.anl.gov> <05BA24AC-4011-483C-8599-5D8EED7AFE10@mcs.anl.gov> <0DBB158F-6E6B-404A-B477-BC7D5A321F01@gmail.com> <122D6409-96DB-4A1A-A134-525D2CEC2F1D@mcs.anl.gov> <3636A1D1-9B71-4C11-8A7F-424CAB000C3D@gmail.com> <76703FF4-3008-4744-B334-C1EB732DFC4C@mcs.anl.gov> Message-ID: > On Aug 28, 2015, at 7:29 PM, Gideon Simpson wrote: > > That was a mistake on my part. But I did want to ask, what should be the behavior with a grid sequence if the SNES fails during one of the intermediate steps? You'll have to look at the code. So I just did, unless you set the option -snes_error_if_not_converged it will blinding go on. But the final SNESConvergedReason() will hopefully be negative indicating that SNES has not converged. Barry > > -gideon > >> On Aug 28, 2015, at 6:03 PM, Barry Smith wrote: >> >> >>> On Aug 28, 2015, at 4:35 PM, Gideon Simpson wrote: >>> >>> Hi Barry, >>> >>> Ok, I tried that and it works as intended, but there?s something I noticed. If i use that, and do a SNESGetConvergedReason on the snesrefine, it always seems to return 0. Is there a reason for that? >> >> Should never do that; are you sure that SNESSolve() has actually been called on it. What does -snes_monitor and -snes_converged_reason show. >> >> Barry >> >> >> >>> >>> -gideon >>> >>>> On Aug 28, 2015, at 4:21 PM, Barry Smith wrote: >>>> >>>>> >>>>> On Aug 28, 2015, at 3:04 PM, Gideon Simpson wrote: >>>>> >>>>> Yes, if i continue in this parameter on the coarse mesh, I can generally solve at all values. I do find that I need to do some amount of continuation to solve near the endpoint. The problem is that on the coarse mesh, things are not fully resolved at all the values along the continuation parameter, and I would like to do refinement. >>>>> >>>>> One subtlety is that I actually want the intermediate continuation solutions too. 
Currently, without doing any grid sequence, I compute each, write it to disk, and then go on to the next one. So I now need to go back an refine them. I was thinking that perhaps I could refine them on the fly, dump them to disk, and use the coarse solution as the starting guess at the next iteration, but that would seem to require resetting the snes back to the coarse grid. >>>>> >>>>> The alternative would be to just script the mesh refinement in a post processing stage, where each value of the continuation is parameter is loaded on the coarse mesh, and refined. Perhaps that?s the most practical thing to do. >>>> >>>> I would do the following. Create your DM and create a SNES that will do the continuation >>>> >>>> loop over continuation parameter >>>> >>>> SNESSolve(snes,NULL,Ucoarse); >>>> >>>> if (you decide you want to see the refined solution at this continuation point) { >>>> SNESCreate(comm,&snesrefine); >>>> SNESSetDM() >>>> etc >>>> SNESSetGridSequence(snesrefine,) >>>> SNESSolve(snesrefine,0,Ucoarse); >>>> SNESGetSolution(snesrefine,&Ufine); >>>> VecView(Ufine or do whatever you want to do with the Ufine at that continuation point >>>> SNESDestroy(snesrefine); >>>> end if >>>> >>>> end loop over continuation parameter. >>>> >>>> Barry >>>> >>>>> >>>>> -gideon >>>>> >>>>>> On Aug 28, 2015, at 3:55 PM, Barry Smith wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> 3. This problem is actually part of a continuation problem that roughly looks like this >>>>>>> >>>>>>> for( continuation parameter p = 0 to 1){ >>>>>>> >>>>>>> solve with parameter p_i using solution from p_{i-1}, >>>>>>> } >>>>>>> >>>>>>> What I would like to do is to start the solver, for each value of parameter p_i on the coarse mesh, and then do grid sequencing on that. But it appears that after doing grid sequencing on the initial p_0 = 0, the SNES is set to use the finer mesh. >>>>>> >>>>>> So you are using continuation to give you a good enough initial guess on the coarse level to even get convergence on the coarse level? First I would check if you even need the continuation (or can you not even solve the coarse problem without it). >>>>>> >>>>>> If you do need the continuation then you will need to tweak how you do the grid sequencing. I think this will work: >>>>>> >>>>>> Do not use -snes_grid_sequencing >>>>>> >>>>>> Run SNESSolve() as many times as you want with your continuation parameter. This will all happen on the coarse mesh. >>>>>> >>>>>> Call SNESSetGridSequence() >>>>>> >>>>>> Then call SNESSolve() again and it will do one solve on the coarse level and then interpolate to the next level etc. >>> >> > From timothee.nicolas at gmail.com Fri Aug 28 22:15:30 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Sat, 29 Aug 2015 12:15:30 +0900 Subject: [petsc-users] How to extract a slice at a given coordinate and view it ? Message-ID: Hi, I have been thinking for several hours about this problem and can't find an efficient solution, however I imagine this must be possible somehow with Petsc. My problem is the following : I work in 3D (R,Z,Phi) which makes my data quite heavy and I don't want to save all the data of all my fields, even just once in a while. Instead, I would like to save in a binary file a slice at a given angle, say phi=0. As I did not find if it's natively possible in Petsc, I considered creating a second 2D DMDA, on which I can create 2D vectors and view them with the binary viewer. So far so good. 
However, upon creating the 2D DMDA, naturally the distribution of processors does not correspond to the distribution of the 3D DMDA. So I was considering creating global arrays, filling them with the data of the 3D array in phi=0, then doing an MPI_allgather to give the information to all the processors, to be able to read the array and fill the 2D Petsc Vector with it. So the code would be something along the lines of : PetscScalar, pointer :: gX2D(:,:,:) PetscScalar, pointer :: gX(:,:,:,:) ! LocalArray is locally filled ! It is transmitted to GlobalArray via MPI_Allgather real(8) :: LocalArray(user%dof,user%mr,user%mz) real(8) :: GlobalArray(user%dof,user%mr,user%mz) call DMDAVecGetArrayF90(da_phi0,X2D,gX2D,ierr) call DMDAVecGetArrayF90(da,X,gX,ierr) do k = user%phis,user%phie do j = user%zs,user%ze do i = user%rs,user%re do l=1,user%dof if (k.eq.phi_print) then ! Numbering obtained with DMDAGetArrayF90 differs from usual LocalArray(l,i,j) = gX(l-1,i-1,j-1,k-1) end if end do end do end do end do nvals = user%dof*user%rm*user%zm call MPI_AllGather(LocalArray(1,user%rs,user%zs), & & nvals,MPI_REAL, & & GlobalArray, & & nvals,MPI_REAL,MPI_COMM_WORLD,ierr) do j = zs2D,ze2D do i = rs2D,re2D do l=1,user%dof gX2D(l-1,i-1,j-1) = GlobalArray(l,i,j) end do end do end do call DMDAVecRestoreArrayF90(da_phi0,X2D,gX2D,ierr) call DMDAVecRestoreArrayF90(da,X,gX,ierr) The problem is that MPI_allgather is not at all that simple. Exchanging array information is much more complicated that I had anticipated ! See this long post on stackoverflow : http://stackoverflow.com/questions/17508647/sending-2d-arrays-in-fortran-with-mpi-gather I could probably get it to work eventually, but it's pretty complicated, and I was wondering if there was not a simpler alternative I could not see. Besides, I am concerned about what could happen if the number of processors is so large that the 2D Vector gets less than 2 points per processor (I have lots of points in phi, so this can happen easily). Then Petsc would complain. Does anybody have ideas ? Best Timoth?e -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Aug 28 22:40:46 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 28 Aug 2015 22:40:46 -0500 Subject: [petsc-users] How to extract a slice at a given coordinate and view it ? In-Reply-To: References: Message-ID: I wrote a routine DMDAGetRay() that pulls a 1 dimensional slice out of a 2d DMDA and puts it on process 0. It uses AOApplicationToPetsc() so is not truly scalable but perhaps you could take a look at that. Since you say "(I have lots of points in phi, so this can happen easily)" it may be ok for you to just stick the 2d slice on process 0 and then save it? Barry Without using AOApplicationToPetsc() or something similar yes it is in general a nightmare. > On Aug 28, 2015, at 10:15 PM, Timoth?e Nicolas wrote: > > Hi, > > I have been thinking for several hours about this problem and can't find an efficient solution, however I imagine this must be possible somehow with Petsc. > > My problem is the following : > > I work in 3D (R,Z,Phi) which makes my data quite heavy and I don't want to save all the data of all my fields, even just once in a while. Instead, I would like to save in a binary file a slice at a given angle, say phi=0. > > As I did not find if it's natively possible in Petsc, I considered creating a second 2D DMDA, on which I can create 2D vectors and view them with the binary viewer. So far so good. 
However, upon creating the 2D DMDA, naturally the distribution of processors does not correspond to the distribution of the 3D DMDA. So I was considering creating global arrays, filling them with the data of the 3D array in phi=0, then doing an MPI_allgather to give the information to all the processors, to be able to read the array and fill the 2D Petsc Vector with it. So the code would be something along the lines of : > > PetscScalar, pointer :: gX2D(:,:,:) > PetscScalar, pointer :: gX(:,:,:,:) > ! LocalArray is locally filled > ! It is transmitted to GlobalArray via MPI_Allgather > real(8) :: LocalArray(user%dof,user%mr,user%mz) > real(8) :: GlobalArray(user%dof,user%mr,user%mz) > > call DMDAVecGetArrayF90(da_phi0,X2D,gX2D,ierr) > call DMDAVecGetArrayF90(da,X,gX,ierr) > > do k = user%phis,user%phie > do j = user%zs,user%ze > do i = user%rs,user%re > do l=1,user%dof > if (k.eq.phi_print) then > ! Numbering obtained with DMDAGetArrayF90 differs from usual > LocalArray(l,i,j) = gX(l-1,i-1,j-1,k-1) > end if > end do > end do > end do > end do > > nvals = user%dof*user%rm*user%zm > > call MPI_AllGather(LocalArray(1,user%rs,user%zs), & > & nvals,MPI_REAL, & > & GlobalArray, & > & nvals,MPI_REAL,MPI_COMM_WORLD,ierr) > > do j = zs2D,ze2D > do i = rs2D,re2D > do l=1,user%dof > gX2D(l-1,i-1,j-1) = GlobalArray(l,i,j) > end do > end do > end do > > call DMDAVecRestoreArrayF90(da_phi0,X2D,gX2D,ierr) > call DMDAVecRestoreArrayF90(da,X,gX,ierr) > > The problem is that MPI_allgather is not at all that simple. Exchanging array information is much more complicated that I had anticipated ! See this long post on stackoverflow : > > http://stackoverflow.com/questions/17508647/sending-2d-arrays-in-fortran-with-mpi-gather > > I could probably get it to work eventually, but it's pretty complicated, and I was wondering if there was not a simpler alternative I could not see. Besides, I am concerned about what could happen if the number of processors is so large that the 2D Vector gets less than 2 points per processor (I have lots of points in phi, so this can happen easily). Then Petsc would complain. > > Does anybody have ideas ? > > Best > > Timoth?e From jroman at dsic.upv.es Sat Aug 29 02:55:35 2015 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 29 Aug 2015 09:55:35 +0200 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. In-Reply-To: References: Message-ID: <97608CC3-A769-48DD-B3EA-1BEA073ADFB9@dsic.upv.es> You are doing a spectrum slicing run (EPSSetInterval) with ?size? partitions (EPSKrylovSchurSetPartitions), so every single process will be in charge of computing a subinterval. Each subcommunicator needs a redundant copy of the matrix, and in this case this copy is SeqAIJ since subcommunicators consist in just one process. You will probably need to share this memory across a set of processes and use MUMPS for the factorization. Try setting e.g. size/8 partitions. Jose > El 29/8/2015, a las 1:14, Barry Smith escribi?: > > > It is using a SeqAIJ matrix, not a parallel matrix. Increasing the number of cores won't affect the size of a sequential matrix since it must be stored entirely on one process. Perhaps you need to use parallel matrices? 
> > > [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > > >> On Aug 28, 2015, at 4:13 PM, Hank Lamm wrote: >> >> Hi All, >> >> I am having a problem running Petsc3.6 and Slepc3.6 on Stampede. My code should be a simple eigenvalue solver, but when I attempt to solve large problems (8488x8488 matrices) I get errors: >> >> --------------------- Error Message -------------------------------------------------------------- >> [1]Total space allocated 1736835920 bytes >> [1]PETSC ERROR: Out of memory. This could be due to allocating >> [1]PETSC ERROR: too large an object or bleeding by not properly >> [1]PETSC ERROR: destroying unneeded objects. >> [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process 1769742336 >> [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796 >> [1]PETSC ERROR: [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c >> [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line 4030 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c >> [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c >> [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c >> [1]PETSC ERROR: #5 MatDuplicate() line 4252 in /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c >> [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c >> [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c >> [1]PETSC ERROR: #8 STSetUp() line 305 in /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c >> [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c >> [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c >> [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c >> [1]PETSC ERROR: #12 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c >> [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c >> [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c >> [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c >> [1]PETSC ERROR: #16 EPSSetUp() line 121 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c >> [1]PETSC ERROR: #17 EPSSolve() line 88 in /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c >> [1]PETSC ERROR: #18 eigensolver() line 64 in /work/03324/hlammiv/TMSWIFT/src/solver.cpp >> [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() 1.73684e+09 >> [1]Current process memory 1.76979e+09 max process memory 1.76979e+09 >> >> >> The curious thing about this error, is that it seems that if 
I increase the number of nodes, from 32 to 64 to 128, the amount of memory per node doesn't decrease. I have used valgrind and it doesn't seem to a memory leak. >> >> The relevant code piece is: >> >> void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, char **argv) >> { >> >> >> EPS eps; /* eigenproblem solver context */ >> EPSType type; >> ST st; >> KSP ksp; >> PC pc; >> PetscReal tol,error; >> PetscReal lower,upper; >> //PetscInt nev=dim,maxit,its; >> PetscInt nev,maxit,its,nconv; >> Vec xr,xi; >> PetscScalar kr,ki; >> PetscReal re,im; >> PetscViewer viewer; >> PetscInt rank; >> PetscInt size; >> std::string eig_file_n; >> std::ofstream eig_file; >> char ofile[100]; >> >> MPI_Comm_rank(PETSC_COMM_WORLD,&rank); >> MPI_Comm_size(PETSC_COMM_WORLD,&size); >> >> ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue Solver---\n");CHKERRV(ierr); >> ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr); >> >> eig_file_n.append(params->ofile_n); >> eig_file_n.append("_eval"); >> eig_file.open(eig_file_n.c_str(),std::ofstream::trunc); >> >> //Set operators. In this case, it is a standard eigenvalue problem >> ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr); >> ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr); >> >> ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr); >> >> ierr = EPSGetST(eps,&st);CHKERRV(ierr); >> ierr = STSetType(st,STSINVERT);CHKERRV(ierr); >> >> ierr = STGetKSP(st,&ksp);CHKERRV(ierr); >> ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr); >> ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr); >> ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr); >> ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr); >> >> for(PetscInt i=0;inf;i++){ >> lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0); >> upper=4.0*params->m[i]*params->m[i]; >> ierr = EPSSetInterval(eps,lower,upper); >> ierr = EPSSetWhichEigenpairs(eps,EPS_ALL); >> //Set solver parameters at runtime >> ierr = EPSSetFromOptions(eps);CHKERRV(ierr); >> // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); >> >> ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr); >> ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr); >> >> >> ierr = EPSSolve(eps);CHKERRV(ierr); >> >> ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the method: %D\n",its);CHKERRV(ierr); >> >> >> //Optional: Get some information from the solver and display it >> ierr = EPSGetType(eps,&type);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: %s\n\n",type);CHKERRV(ierr); >> ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested eigenvalues: %D\n",nev);CHKERRV(ierr); >> ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr); >> >> ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr); >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged eigenpairs: %D\n\n",nconv);CHKERRV(ierr); >> >> strcpy(ofile,params->ofile_n); >> strcat(ofile,"_evecr"); >> >> ierr = PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr); >> >> if (nconv>0) >> { >> ierr = PetscPrintf(PETSC_COMM_WORLD, >> " k ||Ax-kx||/||kx||\n" >> " ----------------- ------------------\n");CHKERRV(ierr); >> >> for (PetscInt i=0;i> { >> //Get converged eigenpairs: i-th eigenvalue is stored in kr (real part) and ki (imaginary part) >> ierr = EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);CHKERRV(ierr); >> 
//Compute the relative error associated to each eigenpair >> ierr = EPSComputeError(eps,i,EPS_ERROR_RELATIVE,&error);CHKERRV(ierr); >> >> #if defined(PETSC_USE_COMPLEX) >> re = PetscRealPart(kr); >> im = PetscImaginaryPart(kr); >> #else >> re = kr; >> im = ki; >> #endif >> >> if (im!=0.0) >> { >> >> ierr = PetscPrintf(PETSC_COMM_WORLD," %9f%+9f j %12g\n",re,im,error);CHKERRV(ierr); >> if(rank==0) eig_file << re << " " << im << " " << error << std::endl; >> } else >> { >> ierr = PetscPrintf(PETSC_COMM_WORLD," %12f %12g\n",re,error);CHKERRV(ierr); >> if(rank==0) eig_file << re << " " << 0 << " " << error << std::endl; >> } >> >> ierr = VecView(xr,viewer);CHKERRV(ierr); >> >> } >> ierr = PetscPrintf(PETSC_COMM_WORLD,"\n");CHKERRV(ierr); >> } >> } >> eig_file.close(); >> ierr = EPSDestroy(&eps);CHKERRV(ierr); >> ierr = PetscViewerDestroy(&viewer);CHKERRV(ierr); >> ierr = VecDestroy(&xr);CHKERRV(ierr); >> ierr = VecDestroy(&xi);CHKERRV(ierr); >> >> ierr = PetscPrintf(PETSC_COMM_WORLD,"---Finishing Eigenvalue Solver---\n");CHKERRV(ierr); >> } >> >> >> >> Thanks, >> Hank > From timothee.nicolas at gmail.com Sat Aug 29 03:46:35 2015 From: timothee.nicolas at gmail.com (timothee.nicolas at gmail.com) Date: Sat, 29 Aug 2015 17:46:35 +0900 Subject: [petsc-users] How to extract a slice at a given coordinate and view it ? In-Reply-To: References: Message-ID: <92010472-EC99-4820-B52D-AC2308F4AB6C@gmail.com> I see. I will have a look. Regarding memory there would be no problem to put everything on process 0 I believe. If I can't figure out how to process with your routine, I will go back to the initial try with mpi_allgather. Thx Timothee Sent from my iPhone > On 2015/08/29, at 12:40, Barry Smith wrote: > > > I wrote a routine DMDAGetRay() that pulls a 1 dimensional slice out of a 2d DMDA and puts it on process 0. It uses AOApplicationToPetsc() so is not truly scalable but perhaps you could take a look at that. Since you say "(I have lots of points in phi, so this can happen easily)" it may be ok for you to just stick the 2d slice on process 0 and then save it? > > Barry > > Without using AOApplicationToPetsc() or something similar yes it is in general a nightmare. > > >> On Aug 28, 2015, at 10:15 PM, Timoth?e Nicolas wrote: >> >> Hi, >> >> I have been thinking for several hours about this problem and can't find an efficient solution, however I imagine this must be possible somehow with Petsc. >> >> My problem is the following : >> >> I work in 3D (R,Z,Phi) which makes my data quite heavy and I don't want to save all the data of all my fields, even just once in a while. Instead, I would like to save in a binary file a slice at a given angle, say phi=0. >> >> As I did not find if it's natively possible in Petsc, I considered creating a second 2D DMDA, on which I can create 2D vectors and view them with the binary viewer. So far so good. However, upon creating the 2D DMDA, naturally the distribution of processors does not correspond to the distribution of the 3D DMDA. So I was considering creating global arrays, filling them with the data of the 3D array in phi=0, then doing an MPI_allgather to give the information to all the processors, to be able to read the array and fill the 2D Petsc Vector with it. So the code would be something along the lines of : >> >> PetscScalar, pointer :: gX2D(:,:,:) >> PetscScalar, pointer :: gX(:,:,:,:) >> ! LocalArray is locally filled >> ! 
It is transmitted to GlobalArray via MPI_Allgather >> real(8) :: LocalArray(user%dof,user%mr,user%mz) >> real(8) :: GlobalArray(user%dof,user%mr,user%mz) >> >> call DMDAVecGetArrayF90(da_phi0,X2D,gX2D,ierr) >> call DMDAVecGetArrayF90(da,X,gX,ierr) >> >> do k = user%phis,user%phie >> do j = user%zs,user%ze >> do i = user%rs,user%re >> do l=1,user%dof >> if (k.eq.phi_print) then >> ! Numbering obtained with DMDAGetArrayF90 differs from usual >> LocalArray(l,i,j) = gX(l-1,i-1,j-1,k-1) >> end if >> end do >> end do >> end do >> end do >> >> nvals = user%dof*user%rm*user%zm >> >> call MPI_AllGather(LocalArray(1,user%rs,user%zs), & >> & nvals,MPI_REAL, & >> & GlobalArray, & >> & nvals,MPI_REAL,MPI_COMM_WORLD,ierr) >> >> do j = zs2D,ze2D >> do i = rs2D,re2D >> do l=1,user%dof >> gX2D(l-1,i-1,j-1) = GlobalArray(l,i,j) >> end do >> end do >> end do >> >> call DMDAVecRestoreArrayF90(da_phi0,X2D,gX2D,ierr) >> call DMDAVecRestoreArrayF90(da,X,gX,ierr) >> >> The problem is that MPI_allgather is not at all that simple. Exchanging array information is much more complicated that I had anticipated ! See this long post on stackoverflow : >> >> http://stackoverflow.com/questions/17508647/sending-2d-arrays-in-fortran-with-mpi-gather >> >> I could probably get it to work eventually, but it's pretty complicated, and I was wondering if there was not a simpler alternative I could not see. Besides, I am concerned about what could happen if the number of processors is so large that the 2D Vector gets less than 2 points per processor (I have lots of points in phi, so this can happen easily). Then Petsc would complain. >> >> Does anybody have ideas ? >> >> Best >> >> Timoth?e > From jychang48 at gmail.com Sat Aug 29 16:41:36 2015 From: jychang48 at gmail.com (Justin Chang) Date: Sat, 29 Aug 2015 15:41:36 -0600 Subject: [petsc-users] Using the fieldsplit/schur/selfp preconditoner Message-ID: Hi all, I am attempting to solve Darcy's equation: u + grad[p] = g div[u] = f The weak form under the least-squares finite element method (LSFEM) looks like this: (u + grad[p]; v + grad[q]) + div[u]*div[v] = (g; v + grad[q]) + (f; div[v]) The classical mixed formulations using H(div) elements has the following weak form: (u; v) - (p; div[v]) - (div[v]; q) = (g; v) - (f; q) For H(div) elements like RT0 and BDM, I was told that I could use these options: -ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_precondition selfp -fieldsplit_0_ksp_type preonly -fieldsplit_0_pc_type bjacobi -fieldsplit_0_sub_pc_type ilu -fieldsplir_1_ksp_type preonly -fieldsplit_1_pc_type hypre This works nicely for the classical mixed form if g was zero and f was nonzero. It also works if f was zero and g was non-zero although it seems to me the solver requires a few more iterations. Now when I attempt to apply these options to the LSFEM, my u solution is nonsensical while my p is correct for nonzero g. For nonzero f, the solver doesn't converge at all. II have used CG/Jacobi with success for small LSFEM problems, but I was wondering if it's possible (or even necessary) to do a fieldsplit/schur complement for this kind of problem and how I could modify the above options. Or what other preconditioner would work best for this type of problem where its symmetric and positive definite? 
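In case it helps, this is roughly how I am wiring those same options up in code for the classical mixed form. It is only a sketch: the assembled operator A and the index sets is_u and is_p for the velocity/pressure split are assumed to come from my discretization.

  KSP ksp;
  PC  pc;
  PetscErrorCode ierr;
  /* A, is_u and is_p (mixed-form operator and the velocity/pressure
     index sets) are assumed to exist already */
  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPGMRES);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCFIELDSPLIT);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc,"0",is_u);CHKERRQ(ierr);
  ierr = PCFieldSplitSetIS(pc,"1",is_p);CHKERRQ(ierr);
  ierr = PCFieldSplitSetType(pc,PC_COMPOSITE_SCHUR);CHKERRQ(ierr);
  ierr = PCFieldSplitSetSchurPre(pc,PC_FIELDSPLIT_SCHUR_PRE_SELFP,NULL);CHKERRQ(ierr);
  /* the -fieldsplit_0_* and -fieldsplit_1_* sub-solver options listed
     above are still picked up from the command line here */
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);
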
Thanks, Justin From timothee.nicolas at gmail.com Sat Aug 29 23:12:26 2015 From: timothee.nicolas at gmail.com (=?UTF-8?Q?Timoth=C3=A9e_Nicolas?=) Date: Sun, 30 Aug 2015 13:12:26 +0900 Subject: [petsc-users] How to extract a slice at a given coordinate and view it ? In-Reply-To: <92010472-EC99-4820-B52D-AC2308F4AB6C@gmail.com> References: <92010472-EC99-4820-B52D-AC2308F4AB6C@gmail.com> Message-ID: Hi, I finally found a quite simple solution using MPI_WRITE_FILE_AT, even though there may be a more efficient one. Since it is only used to write a single file with reduced dimensions, it should not be a hindrance. I finally don't use Petsc viewers neither a secondary 2D DMDA. The routine looks like this, if anyone is interested : subroutine WriteVectorSelectK0(da,X,k0,filename,flg_open,ierr) implicit none DM :: da Vec :: X PetscErrorCode :: ierr PetscViewer :: viewer PetscBool :: flg_open PetscScalar, pointer :: gX(:,:,:,:) PetscScalar :: LocalArray(user%dof,user%mr,user%mz) character(*) :: filename PetscInt :: thefile integer(kind=MPI_OFFSET_KIND) :: offset PetscInt :: i,j ! The slice to select in the Z direction PetscInt :: k0 ! access the array, indexed with global indices call DMDAVecGetArrayF90(da,X,gX,ierr) ! open a file call MPI_FILE_OPEN(MPI_COMM_WORLD, filename, & MPI_MODE_WRONLY + MPI_MODE_CREATE, & MPI_INFO_NULL, thefile, ierr) ! only the processes which contain data on the slice k=k0 are interesting and will write to the file if (k0.ge.user%phis .and. k0.le.user%phie) then do j=user%zs,user%ze offset = user%dof*((j-1)*user%mr+(user%rs-1))*8 ! Indexing of gX : one has to subtract 1 because of the C like indexing resulting from DMDAVecGetArrayF90 call MPI_FILE_WRITE_AT(thefile,offset,gX(:,user%rs-1,j-1,k0-1), & & user%dof, MPI_DOUBLE_PRECISION, & & MPI_STATUS_IGNORE, ierr) end do end if call MPI_FILE_CLOSE(thefile, ierr) ! restore the array call DMDAVecRestoreArrayF90(da,X,gX,ierr) end subroutine WriteVectorSelectK0 Unfortunately, I did not find a way to avoid the loop on j, which are in principle quite inefficient. That is because when you reach the end of the local block in the first direction (user%re in my example), the next point where j is incremented by 1 is not contiguous in memory in the written file. So you have to change the offset and use a new call to MPI_FILE_WRITE_AT. Two things that one should be careful with : 1. By default, MPI-IO does not seem to include the 4 or 8 bytes at the beginning of the file which are often added by FORTRAN (for example Petsc Viewer add 8 bytes, which include one integer about the size of the data written in the file). When you read your file outside of FORTRAN (e.g. with python), you should be careful about this difference. 2. On my machine, Petsc Viewer writes the data in big endian. However, the routine above gives me little endian (however this may be machine dependent). Best Timoth?e 2015-08-29 17:46 GMT+09:00 : > I see. I will have a look. Regarding memory there would be no problem to > put everything on process 0 I believe. If I can't figure out how to process > with your routine, I will go back to the initial try with mpi_allgather. > > Thx > > Timothee > > Sent from my iPhone > > > On 2015/08/29, at 12:40, Barry Smith wrote: > > > > > > I wrote a routine DMDAGetRay() that pulls a 1 dimensional slice out of > a 2d DMDA and puts it on process 0. It uses AOApplicationToPetsc() so is > not truly scalable but perhaps you could take a look at that. 
Since you > say "(I have lots of points in phi, so this can happen easily)" it may be > ok for you to just stick the 2d slice on process 0 and then save it? > > > > Barry > > > > Without using AOApplicationToPetsc() or something similar yes it is in > general a nightmare. > > > > > >> On Aug 28, 2015, at 10:15 PM, Timoth?e Nicolas < > timothee.nicolas at gmail.com> wrote: > >> > >> Hi, > >> > >> I have been thinking for several hours about this problem and can't > find an efficient solution, however I imagine this must be possible somehow > with Petsc. > >> > >> My problem is the following : > >> > >> I work in 3D (R,Z,Phi) which makes my data quite heavy and I don't want > to save all the data of all my fields, even just once in a while. Instead, > I would like to save in a binary file a slice at a given angle, say phi=0. > >> > >> As I did not find if it's natively possible in Petsc, I considered > creating a second 2D DMDA, on which I can create 2D vectors and view them > with the binary viewer. So far so good. However, upon creating the 2D DMDA, > naturally the distribution of processors does not correspond to the > distribution of the 3D DMDA. So I was considering creating global arrays, > filling them with the data of the 3D array in phi=0, then doing an > MPI_allgather to give the information to all the processors, to be able to > read the array and fill the 2D Petsc Vector with it. So the code would be > something along the lines of : > >> > >> PetscScalar, pointer :: gX2D(:,:,:) > >> PetscScalar, pointer :: gX(:,:,:,:) > >> ! LocalArray is locally filled > >> ! It is transmitted to GlobalArray via MPI_Allgather > >> real(8) :: LocalArray(user%dof,user%mr,user%mz) > >> real(8) :: GlobalArray(user%dof,user%mr,user%mz) > >> > >> call DMDAVecGetArrayF90(da_phi0,X2D,gX2D,ierr) > >> call DMDAVecGetArrayF90(da,X,gX,ierr) > >> > >> do k = user%phis,user%phie > >> do j = user%zs,user%ze > >> do i = user%rs,user%re > >> do l=1,user%dof > >> if (k.eq.phi_print) then > >> ! Numbering obtained with DMDAGetArrayF90 differs from > usual > >> LocalArray(l,i,j) = gX(l-1,i-1,j-1,k-1) > >> end if > >> end do > >> end do > >> end do > >> end do > >> > >> nvals = user%dof*user%rm*user%zm > >> > >> call MPI_AllGather(LocalArray(1,user%rs,user%zs), & > >> & nvals,MPI_REAL, & > >> & GlobalArray, & > >> & nvals,MPI_REAL,MPI_COMM_WORLD,ierr) > >> > >> do j = zs2D,ze2D > >> do i = rs2D,re2D > >> do l=1,user%dof > >> gX2D(l-1,i-1,j-1) = GlobalArray(l,i,j) > >> end do > >> end do > >> end do > >> > >> call DMDAVecRestoreArrayF90(da_phi0,X2D,gX2D,ierr) > >> call DMDAVecRestoreArrayF90(da,X,gX,ierr) > >> > >> The problem is that MPI_allgather is not at all that simple. Exchanging > array information is much more complicated that I had anticipated ! See > this long post on stackoverflow : > >> > >> > http://stackoverflow.com/questions/17508647/sending-2d-arrays-in-fortran-with-mpi-gather > >> > >> I could probably get it to work eventually, but it's pretty > complicated, and I was wondering if there was not a simpler alternative I > could not see. Besides, I am concerned about what could happen if the > number of processors is so large that the 2D Vector gets less than 2 points > per processor (I have lots of points in phi, so this can happen easily). > Then Petsc would complain. > >> > >> Does anybody have ideas ? > >> > >> Best > >> > >> Timoth?e > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From hanklammiv at gmail.com Mon Aug 31 14:50:48 2015 From: hanklammiv at gmail.com (Hank Lamm) Date: Mon, 31 Aug 2015 12:50:48 -0700 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. In-Reply-To: <97608CC3-A769-48DD-B3EA-1BEA073ADFB9@dsic.upv.es> References: <97608CC3-A769-48DD-B3EA-1BEA073ADFB9@dsic.upv.es> Message-ID: If I use MatView to check the type of my matrix, it replies mpiaij, not seqaij. Am I correct in understanding your comments to mean that the reason for the error that when I do the spectrum slicing, it is creating seqaij for each processor? On Sat, Aug 29, 2015 at 12:55 AM, Jose E. Roman wrote: > You are doing a spectrum slicing run (EPSSetInterval) with ?size? > partitions (EPSKrylovSchurSetPartitions), so every single process will be > in charge of computing a subinterval. Each subcommunicator needs a > redundant copy of the matrix, and in this case this copy is SeqAIJ since > subcommunicators consist in just one process. You will probably need to > share this memory across a set of processes and use MUMPS for the > factorization. Try setting e.g. size/8 partitions. > > Jose > > > > El 29/8/2015, a las 1:14, Barry Smith escribi?: > > > > > > It is using a SeqAIJ matrix, not a parallel matrix. Increasing the > number of cores won't affect the size of a sequential matrix since it must > be stored entirely on one process. Perhaps you need to use parallel > matrices? > > > > > > [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: #5 MatDuplicate() line 4252 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > > > > > >> On Aug 28, 2015, at 4:13 PM, Hank Lamm wrote: > >> > >> Hi All, > >> > >> I am having a problem running Petsc3.6 and Slepc3.6 on Stampede. My > code should be a simple eigenvalue solver, but when I attempt to solve > large problems (8488x8488 matrices) I get errors: > >> > >> --------------------- Error Message > -------------------------------------------------------------- > >> [1]Total space allocated 1736835920 bytes > >> [1]PETSC ERROR: Out of memory. This could be due to allocating > >> [1]PETSC ERROR: too large an object or bleeding by not properly > >> [1]PETSC ERROR: destroying unneeded objects. > >> [1]PETSC ERROR: Memory allocated 1736835920 Memory used by process > 1769742336 > >> [1]PETSC ERROR: [0]PETSC ERROR: Memory requested 864587796 > >> [1]PETSC ERROR: [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> >> [1]PETSC ERROR: #8 STSetUp() line 305 in > /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > >> [1]PETSC ERROR: [0]PETSC ERROR: #1 MatDuplicateNoCreate_SeqAIJ() line > 4030 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > >> [1]PETSC ERROR: #2 PetscTrMallocDefault() line 188 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/sys/memory/mtr.c > >> [1]PETSC ERROR: #4 MatDuplicate_SeqAIJ() line 4103 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/impls/aij/seq/aij.c > >> [1]PETSC ERROR: #5 MatDuplicate() line 4252 in > /home1/apps/intel15/mvapich2_2_1/petsc/3.6/src/mat/interface/matrix.c > >> [1]PETSC ERROR: #6 STMatMAXPY_Private() line 379 in > /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > >> [1]PETSC ERROR: #7 STSetUp_Sinvert() line 131 in > /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/impls/sinvert/sinvert.c > >> [1]PETSC ERROR: #8 STSetUp() line 305 in > /work/03324/hlammiv/slepc-3.6.0/src/sys/classes/st/interface/stsolve.c > >> [1]PETSC ERROR: #9 EPSSliceGetInertia() line 295 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > >> [1]PETSC ERROR: #10 EPSSetUp_KrylovSchur_Slice() line 425 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > >> [1]PETSC ERROR: #11 EPSSetUp_KrylovSchur() line 89 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c > >> [1]PETSC ERROR: #12 EPSSetUp() line 121 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c > >> [1]PETSC ERROR: #13 EPSSliceGetEPS() line 267 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > >> [1]PETSC ERROR: #14 EPSSetUp_KrylovSchur_Slice() line 368 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/ks-slice.c > >> [1]PETSC ERROR: #15 EPSSetUp_KrylovSchur() line 89 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/impls/krylov/krylovschur/krylovschur.c > >> [1]PETSC ERROR: #16 EPSSetUp() line 121 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssetup.c > >> [1]PETSC ERROR: #17 EPSSolve() line 88 in > /work/03324/hlammiv/slepc-3.6.0/src/eps/interface/epssolve.c > >> [1]PETSC ERROR: #18 eigensolver() line 64 in > /work/03324/hlammiv/TMSWIFT/src/solver.cpp > >> [1]Current space PetscMalloc()ed 1.73683e+09, max space PetscMalloced() > 1.73684e+09 > >> [1]Current process memory 1.76979e+09 max process memory 1.76979e+09 > >> > >> > >> The curious thing about this error, is that it seems that if I increase > the number of nodes, from 32 to 64 to 128, the amount of memory per node > doesn't decrease. I have used valgrind and it doesn't seem to a memory > leak. 
> >> > >> The relevant code piece is: > >> > >> void eigensolver(PetscErrorCode ierr, params *params, Mat &H, int argc, > char **argv) > >> { > >> > >> > >> EPS eps; /* eigenproblem solver context */ > >> EPSType type; > >> ST st; > >> KSP ksp; > >> PC pc; > >> PetscReal tol,error; > >> PetscReal lower,upper; > >> //PetscInt nev=dim,maxit,its; > >> PetscInt nev,maxit,its,nconv; > >> Vec xr,xi; > >> PetscScalar kr,ki; > >> PetscReal re,im; > >> PetscViewer viewer; > >> PetscInt rank; > >> PetscInt size; > >> std::string eig_file_n; > >> std::ofstream eig_file; > >> char ofile[100]; > >> > >> MPI_Comm_rank(PETSC_COMM_WORLD,&rank); > >> MPI_Comm_size(PETSC_COMM_WORLD,&size); > >> > >> ierr = PetscPrintf(PETSC_COMM_WORLD,"---Beginning Eigenvalue > Solver---\n");CHKERRV(ierr); > >> ierr = EPSCreate(PETSC_COMM_WORLD,&eps);CHKERRV(ierr); > >> > >> eig_file_n.append(params->ofile_n); > >> eig_file_n.append("_eval"); > >> eig_file.open(eig_file_n.c_str(),std::ofstream::trunc); > >> > >> //Set operators. In this case, it is a standard eigenvalue problem > >> ierr = EPSSetOperators(eps,H,NULL);CHKERRV(ierr); > >> ierr = EPSSetProblemType(eps,EPS_HEP);CHKERRV(ierr); > >> > >> ierr = EPSSetType(eps,EPSKRYLOVSCHUR);CHKERRV(ierr); > >> > >> ierr = EPSGetST(eps,&st);CHKERRV(ierr); > >> ierr = STSetType(st,STSINVERT);CHKERRV(ierr); > >> > >> ierr = STGetKSP(st,&ksp);CHKERRV(ierr); > >> ierr = KSPSetType(ksp,KSPPREONLY);CHKERRV(ierr); > >> ierr = KSPGetPC(ksp,&pc);CHKERRV(ierr); > >> ierr = PCSetType(pc,PCCHOLESKY);CHKERRV(ierr); > >> ierr = EPSKrylovSchurSetPartitions(eps,size);CHKERRV(ierr); > >> > >> for(PetscInt i=0;inf;i++){ > >> > lower=std::pow(2.0*params->m[i]-params->m[i]*params->alpha*params->alpha,2.0); > >> upper=4.0*params->m[i]*params->m[i]; > >> ierr = EPSSetInterval(eps,lower,upper); > >> ierr = EPSSetWhichEigenpairs(eps,EPS_ALL); > >> //Set solver parameters at runtime > >> ierr = EPSSetFromOptions(eps);CHKERRV(ierr); > >> // ierr = EPSSetWhichEigenpairs(eps,EPS_SMALLEST_REAL); > >> > >> ierr = MatCreateVecs(H,NULL,&xr);CHKERRV(ierr); > >> ierr = MatCreateVecs(H,NULL,&xi);CHKERRV(ierr); > >> > >> > >> ierr = EPSSolve(eps);CHKERRV(ierr); > >> > >> ierr = EPSGetIterationNumber(eps,&its);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of iterations of the > method: %D\n",its);CHKERRV(ierr); > >> > >> > >> //Optional: Get some information from the solver and display it > >> ierr = EPSGetType(eps,&type);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Solution method: > %s\n\n",type);CHKERRV(ierr); > >> ierr = EPSGetDimensions(eps,&nev,NULL,NULL);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of requested > eigenvalues: %D\n",nev);CHKERRV(ierr); > >> ierr = EPSGetTolerances(eps,&tol,&maxit);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Stopping condition: > tol=%.4g, maxit=%D\n",tol,maxit);CHKERRV(ierr); > >> > >> ierr = EPSGetConverged(eps,&nconv);CHKERRV(ierr); > >> ierr = PetscPrintf(PETSC_COMM_WORLD," Number of converged > eigenpairs: %D\n\n",nconv);CHKERRV(ierr); > >> > >> strcpy(ofile,params->ofile_n); > >> strcat(ofile,"_evecr"); > >> > >> ierr = > PetscViewerASCIIOpen(PETSC_COMM_WORLD,ofile,&viewer);CHKERRV(ierr); > >> > >> if (nconv>0) > >> { > >> ierr = PetscPrintf(PETSC_COMM_WORLD, > >> " k ||Ax-kx||/||kx||\n" > >> " ----------------- > ------------------\n");CHKERRV(ierr); > >> > >> for (PetscInt i=0;i >> { > >> //Get converged eigenpairs: i-th eigenvalue is stored in kr > (real part) and ki (imaginary part) > >> ierr = > 
EPSGetEigenpair(eps,i,&kr,&ki,xr,xi);CHKERRV(ierr); > >> //Compute the relative error associated to each > eigenpair > >> ierr = > EPSComputeError(eps,i,EPS_ERROR_RELATIVE,&error);CHKERRV(ierr); > >> > >> #if defined(PETSC_USE_COMPLEX) > >> re = PetscRealPart(kr); > >> im = PetscImaginaryPart(kr); > >> #else > >> re = kr; > >> im = ki; > >> #endif > >> > >> if (im!=0.0) > >> { > >> > >> ierr = PetscPrintf(PETSC_COMM_WORLD," %9f%+9f j > %12g\n",re,im,error);CHKERRV(ierr); > >> if(rank==0) eig_file << re << " " << im << " " << error > << std::endl; > >> } else > >> { > >> ierr = PetscPrintf(PETSC_COMM_WORLD," %12f > %12g\n",re,error);CHKERRV(ierr); > >> if(rank==0) eig_file << re << " " << 0 << " " << error > << std::endl; > >> } > >> > >> ierr = VecView(xr,viewer);CHKERRV(ierr); > >> > >> } > >> ierr = PetscPrintf(PETSC_COMM_WORLD,"\n");CHKERRV(ierr); > >> } > >> } > >> eig_file.close(); > >> ierr = EPSDestroy(&eps);CHKERRV(ierr); > >> ierr = PetscViewerDestroy(&viewer);CHKERRV(ierr); > >> ierr = VecDestroy(&xr);CHKERRV(ierr); > >> ierr = VecDestroy(&xi);CHKERRV(ierr); > >> > >> ierr = PetscPrintf(PETSC_COMM_WORLD,"---Finishing Eigenvalue > Solver---\n");CHKERRV(ierr); > >> } > >> > >> > >> > >> Thanks, > >> Hank > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Aug 31 15:16:36 2015 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 31 Aug 2015 22:16:36 +0200 Subject: [petsc-users] Increasing nodes doesn't decrease memory per node. In-Reply-To: References: <97608CC3-A769-48DD-B3EA-1BEA073ADFB9@dsic.upv.es> Message-ID: <687CC1A8-C197-4767-A164-1D353CE11F93@dsic.upv.es> > El 31/8/2015, a las 21:50, Hank Lamm escribi?: > > If I use MatView to check the type of my matrix, it replies mpiaij, not seqaij. Am I correct in understanding your comments to mean that the reason for the error that when I do the spectrum slicing, it is creating seqaij for each processor? It is creating a seqaij matrix if partitions=size. The more partitions the more memory per node you need. I presume with partitions=size there is not enough memory to store the copy of the matrix plus the factorized matrix. Use of multiple communicators is explained in section 3.4.5 of SLEPc users guide, including the use of MUMPS. Jose From jychang48 at gmail.com Mon Aug 31 19:36:02 2015 From: jychang48 at gmail.com (Justin Chang) Date: Mon, 31 Aug 2015 19:36:02 -0500 Subject: [petsc-users] Integrating TAO into SNES and TS In-Reply-To: <6F8EDD4E-4AC5-4AEB-B593-6F0E682ABA1B@mcs.anl.gov> References: <16064BCD-8B50-4E95-AF7F-386F6780E645@mcs.anl.gov> <0759EFDD-F7BB-4826-B6E0-241159EF0D21@mcs.anl.gov> <6F8EDD4E-4AC5-4AEB-B593-6F0E682ABA1B@mcs.anl.gov> Message-ID: Coming back to this, Say I now want to ensure the DMP for advection-diffusion equations. The linear operator is now asymmetric and non-self-adjoint (assuming I do something like SUPG or finite volume), meaning I cannot simply solve this problem without any manipulation (e.g. normalizing the equations) using TAO's optimization solvers. Does this statement also hold true for SNESVI? Thanks, Justin On Fri, Apr 3, 2015 at 7:38 PM, Barry Smith wrote: > > > On Apr 3, 2015, at 7:35 PM, Justin Chang wrote: > > > > I guess I will have to write my own code then :) > > > > I am not all that familiar with Variational Inequalities at the moment, > but if my Jacobian is symmetric and positive definite and I only have lower > and upper bounds, doesn't the problem simply reduce to that of a convex > optimization? 
That is, with SNES act as if it were Tao? > > Yes, I think that is essentially correctly. > > Barry > > > > > On Fri, Apr 3, 2015 at 6:35 PM, Barry Smith wrote: > > > > Justin, > > > > We haven't done anything with TS to handle variational inequalities. > So you can either write your own backward Euler (outside of TS) that solves > each time-step problem either as 1) an optimization problem using Tao or 2) > as a variational inequality using SNES. > > > > More adventurously you could look at the TSTHETA code in TS (which is > a general form that includes Euler, Backward Euler and Crank-Nicolson and > see if you can add the constraints to the SNES problem that is solved; in > theory this is straightforward but it would require understanding the > current code (which Jed, of course, overwrote :-). I think you should do > this. > > > > Barry > > > > > > > On Apr 3, 2015, at 12:31 PM, Justin Chang wrote: > > > > > > I am solving the following anisotropic transient diffusion equation > subject to 0 bounds: > > > > > > du/dt = div[D*grad[u]] + f > > > > > > Where the dispersion tensor D(x) is symmetric and positive definite. > This formulation violates the discrete maximum principles so one of the > ways to ensure nonnegative concentrations is to employ convex optimization. > I am following the procedures in Nakshatrala and Valocchi (2009) JCP and > Nagarajan and Nakshatrala (2011) IJNMF. > > > > > > The Variational Inequality method works gives what I want for my > transient case, but what if I want to implement the Tao methodology in TS? > That is, what TS functions do I need to set up steps a) through e) for each > time step (also the Jacobian remains the same for all time steps so I would > only call this once). Normally I would just call TSSolve() and let the > libraries and functions do everything, but I would like to incorporate > TaoSolve into every time step. > > > > > > Thanks, > > > > > > -- > > > Justin Chang > > > PhD Candidate, Civil Engineering - Computational Sciences > > > University of Houston, Department of Civil and Environmental > Engineering > > > Houston, TX 77004 > > > (512) 963-3262 > > > > > > On Thu, Apr 2, 2015 at 6:53 PM, Barry Smith > wrote: > > > > > > An alternative approach is for you to solve it as a (non)linear > variational inequality. See src/snes/examples/tutorials/ex9.c > > > > > > How you should proceed depends on your long term goal. What problem > do you really want to solve? Is it really a linear time dependent problem > with 0 bounds on U? Can the problem always be represented as an > optimization problem easily? What are and what will be the properties of > K? For example if K is positive definite then likely the bounds will remain > try without explicitly providing the constraints. > > > > > > Barry > > > > > > > On Apr 2, 2015, at 6:39 PM, Justin Chang wrote: > > > > > > > > Hi everyone, > > > > > > > > I have a two part question regarding the integration of the > following optimization problem > > > > > > > > min 1/2 u^T*K*u + u^T*f > > > > S.T. 
u >= 0 > > > > > > > > into SNES and TS > > > > > > > > 1) For SNES, assuming I am working with a linear FE equation, I have > the following algorithm/steps for solving my problem > > > > > > > > a) Set an initial guess x > > > > b) Obtain residual r and jacobian A through functions > SNESComputeFunction() and SNESComputeJacobian() respectively > > > > c) Form vector b = r - A*x > > > > d) Set Hessian equal to A, gradient to A*x, objective function value > to 1/2*x^T*A*x + x^T*b, and variable (lower) bounds to a zero vector > > > > e) Call TaoSolve > > > > > > > > This works well at the moment, but my question is there a more > "efficient" way of doing this? Because with my current setup, I am making a > rather bold assumption that my problem would converge in one SNES iteration > without the bounded constraints and does not have any unexpected > nonlinearities. > > > > > > > > 2) How would I go about doing the above for time-stepping problems? > At each time step, I want to solve a convex optimization subject to the > lower bounds constraint. I plan on using backward euler and my resulting > jacobian should still be compatible with the above optimization problem. > > > > > > > > Thanks, > > > > > > > > -- > > > > Justin Chang > > > > PhD Candidate, Civil Engineering - Computational Sciences > > > > University of Houston, Department of Civil and Environmental > Engineering > > > > Houston, TX 77004 > > > > (512) 963-3262 > > > > > > > > > > > > > > > -- > > > Justin Chang > > > PhD Candidate, Civil Engineering - Computational Sciences > > > University of Houston, Department of Civil and Environmental > Engineering > > > Houston, TX 77004 > > > (512) 963-3262 > > > > > > > > > > -- > > Justin Chang > > PhD Candidate, Civil Engineering - Computational Sciences > > University of Houston, Department of Civil and Environmental Engineering > > Houston, TX 77004 > > (512) 963-3262 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Mon Aug 31 20:13:54 2015 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 31 Aug 2015 20:13:54 -0500 Subject: [petsc-users] Integrating TAO into SNES and TS In-Reply-To: References: <16064BCD-8B50-4E95-AF7F-386F6780E645@mcs.anl.gov> <0759EFDD-F7BB-4826-B6E0-241159EF0D21@mcs.anl.gov> <6F8EDD4E-4AC5-4AEB-B593-6F0E682ABA1B@mcs.anl.gov> Message-ID: > On Aug 31, 2015, at 7:36 PM, Justin Chang wrote: > > Coming back to this, > > Say I now want to ensure the DMP for advection-diffusion equations. The linear operator is now asymmetric and non-self-adjoint (assuming I do something like SUPG or finite volume), meaning I cannot simply solve this problem without any manipulation (e.g. normalizing the equations) using TAO's optimization solvers. Does this statement also hold true for SNESVI? SNESVI doesn't care about symmetry etc > > Thanks, > Justin > > On Fri, Apr 3, 2015 at 7:38 PM, Barry Smith wrote: > > > On Apr 3, 2015, at 7:35 PM, Justin Chang wrote: > > > > I guess I will have to write my own code then :) > > > > I am not all that familiar with Variational Inequalities at the moment, but if my Jacobian is symmetric and positive definite and I only have lower and upper bounds, doesn't the problem simply reduce to that of a convex optimization? That is, with SNES act as if it were Tao? > > Yes, I think that is essentially correctly. > > Barry > > > > > On Fri, Apr 3, 2015 at 6:35 PM, Barry Smith wrote: > > > > Justin, > > > > We haven't done anything with TS to handle variational inequalities. 
So you can either write your own backward Euler (outside of TS) that solves each time-step problem either as 1) an optimization problem using Tao or 2) as a variational inequality using SNES. > > > > More adventurously you could look at the TSTHETA code in TS (which is a general form that includes Euler, Backward Euler and Crank-Nicolson and see if you can add the constraints to the SNES problem that is solved; in theory this is straightforward but it would require understanding the current code (which Jed, of course, overwrote :-). I think you should do this. > > > > Barry > > > > > > > On Apr 3, 2015, at 12:31 PM, Justin Chang wrote: > > > > > > I am solving the following anisotropic transient diffusion equation subject to 0 bounds: > > > > > > du/dt = div[D*grad[u]] + f > > > > > > Where the dispersion tensor D(x) is symmetric and positive definite. This formulation violates the discrete maximum principles so one of the ways to ensure nonnegative concentrations is to employ convex optimization. I am following the procedures in Nakshatrala and Valocchi (2009) JCP and Nagarajan and Nakshatrala (2011) IJNMF. > > > > > > The Variational Inequality method works gives what I want for my transient case, but what if I want to implement the Tao methodology in TS? That is, what TS functions do I need to set up steps a) through e) for each time step (also the Jacobian remains the same for all time steps so I would only call this once). Normally I would just call TSSolve() and let the libraries and functions do everything, but I would like to incorporate TaoSolve into every time step. > > > > > > Thanks, > > > > > > -- > > > Justin Chang > > > PhD Candidate, Civil Engineering - Computational Sciences > > > University of Houston, Department of Civil and Environmental Engineering > > > Houston, TX 77004 > > > (512) 963-3262 > > > > > > On Thu, Apr 2, 2015 at 6:53 PM, Barry Smith wrote: > > > > > > An alternative approach is for you to solve it as a (non)linear variational inequality. See src/snes/examples/tutorials/ex9.c > > > > > > How you should proceed depends on your long term goal. What problem do you really want to solve? Is it really a linear time dependent problem with 0 bounds on U? Can the problem always be represented as an optimization problem easily? What are and what will be the properties of K? For example if K is positive definite then likely the bounds will remain try without explicitly providing the constraints. > > > > > > Barry > > > > > > > On Apr 2, 2015, at 6:39 PM, Justin Chang wrote: > > > > > > > > Hi everyone, > > > > > > > > I have a two part question regarding the integration of the following optimization problem > > > > > > > > min 1/2 u^T*K*u + u^T*f > > > > S.T. u >= 0 > > > > > > > > into SNES and TS > > > > > > > > 1) For SNES, assuming I am working with a linear FE equation, I have the following algorithm/steps for solving my problem > > > > > > > > a) Set an initial guess x > > > > b) Obtain residual r and jacobian A through functions SNESComputeFunction() and SNESComputeJacobian() respectively > > > > c) Form vector b = r - A*x > > > > d) Set Hessian equal to A, gradient to A*x, objective function value to 1/2*x^T*A*x + x^T*b, and variable (lower) bounds to a zero vector > > > > e) Call TaoSolve > > > > > > > > This works well at the moment, but my question is there a more "efficient" way of doing this? 
Because with my current setup, I am making a rather bold assumption that my problem would converge in one SNES iteration without the bounded constraints and does not have any unexpected nonlinearities. > > > > > > > > 2) How would I go about doing the above for time-stepping problems? At each time step, I want to solve a convex optimization subject to the lower bounds constraint. I plan on using backward euler and my resulting jacobian should still be compatible with the above optimization problem. > > > > > > > > Thanks, > > > > > > > > -- > > > > Justin Chang > > > > PhD Candidate, Civil Engineering - Computational Sciences > > > > University of Houston, Department of Civil and Environmental Engineering > > > > Houston, TX 77004 > > > > (512) 963-3262 > > > > > > > > > > > > > > > -- > > > Justin Chang > > > PhD Candidate, Civil Engineering - Computational Sciences > > > University of Houston, Department of Civil and Environmental Engineering > > > Houston, TX 77004 > > > (512) 963-3262 > > > > > > > > > > -- > > Justin Chang > > PhD Candidate, Civil Engineering - Computational Sciences > > University of Houston, Department of Civil and Environmental Engineering > > Houston, TX 77004 > > (512) 963-3262 > >
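A minimal sketch of the bound-constrained SNES setup referred to above (illustrative only: the SNES with its residual/Jacobian callbacks, the solution vector x, and the bound vectors xl and xu are all assumed to exist with matching layout):

  SNES snes;    /* assumed: created, with FormFunction/FormJacobian attached */
  Vec  x,xl,xu; /* assumed: solution and bound vectors, same layout */
  PetscErrorCode ierr;
  ierr = SNESSetType(snes,SNESVINEWTONRSLS);CHKERRQ(ierr);
  ierr = VecSet(xl,0.0);CHKERRQ(ierr);              /* enforce u >= 0 */
  ierr = VecSet(xu,PETSC_INFINITY);CHKERRQ(ierr);   /* no upper bound */
  ierr = SNESVISetVariableBounds(snes,xl,xu);CHKERRQ(ierr);
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);
  ierr = SNESSolve(snes,NULL,x);CHKERRQ(ierr);

The VI solver places no symmetry requirement on the Jacobian, so the same setup applies with an SUPG or finite-volume advection-diffusion operator.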