From zonexo at gmail.com Fri Jul 1 01:33:54 2016 From: zonexo at gmail.com (TAY wee-beng) Date: Fri, 1 Jul 2016 14:33:54 +0800 Subject: [petsc-users] DMDAVecRestoreArrayF90 error Message-ID: Hi, I had problems with DMDAVecRestoreArrayF90 last time when I used an old version of the Intel Fortran compiler. It works fine with gfortran and the new version of the Intel compiler. It was determined to be a bug by the PETSc team. To use it with the old version of Intel, I had to use -O1 instead of -O3 -ipo in the subroutines where DMDAVecRestoreArrayF90 is called. Recently, my cluster was reset and all files were deleted. I uploaded my files and compiled my code again. However, this time, with a new version of Intel, I got a segmentation error with DMDAVecRestoreArrayF90. Changing to -O1 works. But I thought it had been working fine, so maybe I need to check with my admin whether the compiler versions before and after the reset are the same. Another thing I compared was PETSc ver 3.6.4 and 3.7.2. Using v3.6.4 (compiled with -O3 -ipo), it encountered a segmentation error right from the start of the code, when DMDAVecRestoreArrayF90 is called. However, for v3.7.2 (compiled with -O3 -ipo), it only happened during the 2nd time step, when I need to use DMDAVecRestoreArrayF90 in order to use KSP to solve the linear equation later on. So I wonder why v3.7.2 can get past the 1st time step w/o problem, getting the right answer, and only gives errors at the 2nd time step. Any explanation for this? Btw, the result is the same whether MPI is used or not. -- Thank you Yours sincerely, TAY wee-beng From bsmith at mcs.anl.gov Fri Jul 1 12:25:49 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 1 Jul 2016 12:25:49 -0500 Subject: [petsc-users] user provided local preconditioner with additive schwarz preconditioner In-Reply-To: References: Message-ID: <32C3C9D7-36FF-4281-A3F6-2E1EF6E88E10@mcs.anl.gov> > On Jun 30, 2016, at 11:48 PM, Duan Zhaowen wrote: > > Thank you Barry. I'll try and follow the code. In my code the global matrix A was partitioned into CSR format. The local preconditioner I want to use only affects the local (diagonal) part of matrix A. What do you mean? In additive Schwarz the subproblems contain overlapping sets of variables that are solved for and then updated. > So in MyApplyFunc(PC, Vec x, Vec y) of the shell preconditioner, should I only take care of the local part of vector y and leave alone its non-local part (or overlap)? The local solve updates the entire y (which has overlap with other y from other subdomains). There are several variants of overlapping Schwarz /*E PCASMType - Type of additive Schwarz method to use $ PC_ASM_BASIC - Symmetric version where residuals from the ghost points are used $ and computed values in ghost regions are added together. $ Classical standard additive Schwarz. $ PC_ASM_RESTRICT - Residuals from ghost points are used but computed values in ghost $ region are discarded. $ Default. $ PC_ASM_INTERPOLATE - Residuals from ghost points are not used, computed values in ghost $ region are added back in. $ PC_ASM_NONE - Residuals from ghost points are not used, computed ghost values are $ discarded. $ Not very good. Level: beginner .seealso: PCASMSetType() E*/ typedef enum {PC_ASM_BASIC = 3,PC_ASM_RESTRICT = 1,PC_ASM_INTERPOLATE = 2,PC_ASM_NONE = 0} PCASMType; but this is all handled inside the PCASM code. Your local function doesn't know or care which of the variants is used. It is the job of your local function to solve for all the values in the y output based on all the values in the x input. Barry > > Thank you again.
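A minimal sketch of the kind of local apply routine described above; the context struct, the names MyShellCtx and MyShellApply, and the use of an inner KSP for the subdomain solve are illustrative assumptions, not code from this thread:

#include <petscksp.h>

/* Illustrative context for the subdomain shell preconditioner. */
typedef struct {
  KSP localksp;   /* any sequential solver set up on the (overlapping) local block */
} MyShellCtx;

/* Apply routine to register with PCShellSetApply(subpc, MyShellApply).
   It must produce ALL entries of y from ALL entries of x; PCASM itself
   handles restriction to and extension from the overlapping subdomain. */
static PetscErrorCode MyShellApply(PC pc, Vec x, Vec y)
{
  MyShellCtx     *ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PCShellGetContext(pc, (void**)&ctx);CHKERRQ(ierr);
  ierr = KSPSolve(ctx->localksp, x, y);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The context would be attached with PCShellSetContext(subpc, ctx) inside the loop over the local sub-KSPs shown in the quoted setup code below.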
If I have more problems I will let you know. > > Zhaowen > > On Thu, Jun 30, 2016 at 6:09 PM, Barry Smith wrote: > > I don't think we have an example that does exactly that. > > If you are working with KSP directly and not SNES, here is how to proceed: > > KSPGetPC(ksp,&pc); > PCSetType(pc,PCASM); > KSPSetOperators() > KSPSetUp() <--- this must be called before the code below otherwise the subksps don't exist yet > > PetscInt n_local; > KSP *subksps; > > PCASMGetSubKSP(pc,&n_local,NULL,&subksps); > for (i=0; i<n_local; i++) { > PC subpc; > > KSPGetPC(subksps[i],&subpc); > PCSetType(subpc,PCSHELL); > PCShellSetApply(subpc,yourapplyfunction); > /* anything else you need to set for your shell preconditioner here */ > } > KSPSetUpOnBlocks(ksp); > > KSPSolve(); > > Now if you want to solve with a different right-hand side or different entries in your matrix just call > KSPSolve() again; you don't need to repeat the code above. > > Barry > > Note that any of PETSc's preconditioners can be used on the subdomains, so normally you can just use -sub_pc_type typeyouwant and you don't need to mess with shell preconditioners. > > > > > > > > > On Jun 30, 2016, at 5:49 PM, Duan Zhaowen wrote: > > > > Hi, > > > > I was trying to define a shell preconditioner for the local partition and let it work with the global additive Schwarz preconditioner for parallel computing. Can anyone give an example of this kind of preconditioner combination? Thanks! > > > > ZW > > From zhangjiang.dudu at gmail.com Fri Jul 1 13:26:31 2016 From: zhangjiang.dudu at gmail.com (=?utf-8?B?5byg5rGf?=) Date: Fri, 1 Jul 2016 13:26:31 -0500 Subject: [petsc-users] pets error Segmentation Violation Message-ID: <734C6F9A-4824-464C-8F84-038AC6BF9AAA@gmail.com> Hi, I am trying to read a large data set (11.7GB) with libmesh (integrated with PETSc) and use it for my application. The program runs well when using just one process. But in parallel (mpirun -n 4), some errors came out: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./ptracer on a arch-linux2-c-debug named compute001 by jiangzhang Fri Jul 1 10:07:07 2016 [0]PETSC ERROR: Configure options --prefix=/nfs/proj-tpeterka/jiang/opt/petsc-3.7.2 --download-fblaslapack --with-mpi-dir=/nfs/proj-tpeterka/jiang/libraries/mpich-3.2 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 Anybody know the possible causes? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Fri Jul 1 14:35:56 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 1 Jul 2016 14:35:56 -0500 Subject: [petsc-users] [petsc-dev] pets error Segmentation Violation In-Reply-To: <734C6F9A-4824-464C-8F84-038AC6BF9AAA@gmail.com> References: <734C6F9A-4824-464C-8F84-038AC6BF9AAA@gmail.com> Message-ID: <4ACC6758-E5B7-44A9-B0D5-A1FACAB7AB90@mcs.anl.gov> No idea. You need to do what it says and run with valgrind or in a debugger. From the crash message it looks like it is crashing in your main program. Barry > On Jul 1, 2016, at 1:26 PM, ?? wrote: > > Hi, > > I am trying to read a large data (11.7GB) with libmesh (integrated with PETSc) and use it for my application. The program runs well when using just one process. But in parallel (mpirun -n 4), some errors came out: > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./ptracer on a arch-linux2-c-debug named compute001 by jiangzhang Fri Jul 1 10:07:07 2016 > [0]PETSC ERROR: Configure options --prefix=/nfs/proj-tpeterka/jiang/opt/petsc-3.7.2 --download-fblaslapack --with-mpi-dir=/nfs/proj-tpeterka/jiang/libraries/mpich-3.2 > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > Anybody know the possible causes? > From zocca.marco at gmail.com Sat Jul 2 12:22:06 2016 From: zocca.marco at gmail.com (Marco Zocca) Date: Sat, 2 Jul 2016 19:22:06 +0200 Subject: [petsc-users] Re. PETSc user meeting 2016 Message-ID: Dear colleagues near and far, it has been a pleasure to meet you all in person during this very interesting and lively meeting. I have learned much about branches of science I had not previously considered and the current state of HPC, and for this I would like to thank all of the speakers and poster presenters, but first and foremost the PETSc team, the sponsors and our excellent host Karl. Hoping to meet you soon again, perhaps at PETSc'17, Kind regards, Marco Zocca ---------------- https://github.com/ocramz/petsc-hs https://github.com/ocramz/petsc-hs-docker From rupp at iue.tuwien.ac.at Sat Jul 2 12:26:17 2016 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Sat, 2 Jul 2016 19:26:17 +0200 Subject: [petsc-users] Re. 
PETSc user meeting 2016 In-Reply-To: References: Message-ID: <5777F939.3000701@iue.tuwien.ac.at> Hi Marco, thank you for your words, it has been a pleasure for us :-) Best regards, Karli On 07/02/2016 07:22 PM, Marco Zocca wrote: > Dear colleagues near and far, > > it has been a pleasure to meet you all in person during this very > interesting and lively meeting. I have learned much about branches of > science I had not previously considered and the current state of HPC, > and for this I would like to thank all of the speakers and poster > presenters, but first and foremost the PETSc team, the sponsors and > our excellent host Karl. > > Hoping to meet you soon again, perhaps at PETSc'17, > Kind regards, > > Marco Zocca > > ---------------- > > https://github.com/ocramz/petsc-hs > https://github.com/ocramz/petsc-hs-docker > From rupp at iue.tuwien.ac.at Sat Jul 2 12:31:33 2016 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Sat, 2 Jul 2016 19:31:33 +0200 Subject: [petsc-users] DMDAVecRestoreArrayF90 error In-Reply-To: References: Message-ID: <5777FA75.3080804@iue.tuwien.ac.at> Hi, what you describe looks a lot like memory corruption. Does your code run cleanly through valgrind? Best regards, Karli On 07/01/2016 08:33 AM, TAY wee-beng wrote: > Hi, > > I had problems with DMDAVecRestoreArrayF90 last time when I used an old > version of Intel Fortran compiler. It works fine in gfortran and new > version of Intel compiler. It was determined as a bug by the PETSc team. > > To use in old version of Intel, I had to use -O1 instead of -O3 -ipo in > subroutines when DMDAVecRestoreArrayF90 is called. > > Recently, my cluster was reset and all files were deleted. I upload my > files and compile my code again. However, this time, with a new version > of Intel, I got segmentation error with DMDAVecRestoreArrayF90. Changing > to -O1 works. But I thought it was working fine. So maybe I need to > check with my admin if the ver before and after reset are the same. > > Another thing was using PETSc ver 3.6.4 and 3.7.2. Using v3.6.4 > (compiled with -O3 -ipo), it encountered segmentation err right from the > code start, when DMDAVecRestoreArrayF90 is called. > > However, for v3.7.2 (compiled with -O3 -ipo), it only happened during > the 2nd time step, when I need to use DMDAVecRestoreArrayF90 in order to > use KSP to solve the linear equation later on. > > So I wonder why v3.7.2 can get pass the 1st time step w/o problem and > getting the right answer and only give errors at the 2nd time step. > > Any explanation for this? > > Btw, same result whether MPI is used or not. > > > From gpau at lbl.gov Sat Jul 2 17:53:36 2016 From: gpau at lbl.gov (George Pau) Date: Sat, 2 Jul 2016 15:53:36 -0700 Subject: [petsc-users] hdf5 libraries Message-ID: Hi, I am trying to debug an error I am getting when using the HDF5 viewer. I am working on NERSC systems, and they have a precompiled hdf5 (module cray-hdf5-parallel). 
When I linked petsc libraries to their hdf5 libraries, it gives the following error at run time when I tried to do a ISView: Rank 0 [Sat Jul 2 15:34:48 2016] [c0-0c0s15n0] Fatal error in MPI_Type_create_hindexed: Invalid argument, error stack: MPI_Type_create_hindexed(150): MPI_Type_create_hindexed(count=1, array_of_blocklengths=0x478b1e0, array_of_displacements=0x478b200, MPI_BYTE, newtype=0x7fffffff3598) failed MPI_Type_create_hindexed(98).: Invalid value for blocklength, must be non-negative but is -1927660792 The hdf5 version on NERSC is 1.8.16 but the version that PETSc downloaded when using --download-hdf5 is 1.8.12. So, could the error be due to this difference? I also see the above error only when the length of the IS is big (tested for about 200M total entries, using 1024 cores). I don't have these errors when I used --download-hdf5=1 during configure step. However, while I was able to use --download-hdf5=1 on Edison, PETSc was not able to compile hdf5 libraries properly on Cori. The OS of Cori was recently updated. My primary interest in trying to use the version provided by NERSC is to see if there is any improvement in the IO performance. Thanks, George -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74R316C Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/profiles/george-shu-heng-pau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Sun Jul 3 03:06:38 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Sun, 3 Jul 2016 10:06:38 +0200 Subject: [petsc-users] Dose Petsc has DMPlex example In-Reply-To: References: <201605030929463862822@163.com> Message-ID: Hi Matt I tried to run ex62 with 1 proc (petsc 3.7.2), but it all produces zero The output is: hbui at bermuda:~/workspace/petsc/snes$ es$ ./ex62 run_type full -bc_type dirichlet -refinement_limit 0.00625 -interpolate 1 -snes_monitor_short -snes_converged_reason -snes_view -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -ksp_monitor_short -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi 0 SNES Function norm 0.265165 0 KSP Residual norm 0.265165 Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=0 total number of function evaluations=1 norm schedule ALWAYS SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000. 
right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=512, cols=512, bs=2 package used to perform factorization: petsc total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=512, cols=512, bs=2 total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=256, cols=256 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=256, cols=256 total: nonzeros=256, allocated nonzeros=256 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=256, cols=512 total: nonzeros=512, allocated nonzeros=512 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=512, cols=512, bs=2 package used to perform factorization: petsc total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=512, cols=512, bs=2 total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=512, cols=256, rbs=2, cbs = 1 total: nonzeros=512, allocated nonzeros=512 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=256, cols=256 total: nonzeros=256, allocated nonzeros=256 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=768, cols=768 total: nonzeros=2304, allocated nonzeros=2304 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 256 nodes, limit used is 5 Number of SNES iterations = 0 L_2 Error: 1.01 [0.929, 0.407] Solution Vec Object: 1 MPI processes type: seq 0. 0. .... Am I doing something wrong? Giang Giang On Tue, May 3, 2016 at 4:44 AM, Matthew Knepley wrote: > On Mon, May 2, 2016 at 8:29 PM, ztdepyahoo at 163.com > wrote: > >> Dear professor: >> I want to write a parallel 3D CFD code based on unstructred grid, >> does Petsc has DMPlex examples to start with. >> > > SNES ex62 is an unstructured grid Stokes problem discretized with > low-order finite elements. > > Of course, all the different possible choices will impact the design. > > Matt > > >> Regards >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jychang48 at gmail.com Sun Jul 3 03:15:09 2016 From: jychang48 at gmail.com (Justin Chang) Date: Sun, 3 Jul 2016 09:15:09 +0100 Subject: [petsc-users] Dose Petsc has DMPlex example In-Reply-To: References: <201605030929463862822@163.com> Message-ID: Hoang, if you run this example shown from the config/builder.py ./ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 it should work On Sun, Jul 3, 2016 at 9:06 AM, Hoang Giang Bui wrote: > Hi Matt > > I tried to run ex62 with 1 proc (petsc 3.7.2), but it all produces zero > > The output is: > hbui at bermuda:~/workspace/petsc/snes$ es$ ./ex62 run_type full -bc_type > dirichlet -refinement_limit 0.00625 -interpolate 1 -snes_monitor_short > -snes_converged_reason -snes_view -ksp_type fgmres -ksp_gmres_restart 100 > -ksp_rtol 1.0e-9 -ksp_monitor_short -pc_type fieldsplit -pc_fieldsplit_type > schur -pc_fieldsplit_schur_factorization_type full > -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu > -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi > 0 SNES Function norm 0.265165 > 0 KSP Residual norm 0.265165 > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 > SNES Object: 1 MPI processes > type: newtonls > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 > total number of linear solver iterations=0 > total number of function evaluations=1 > norm schedule ALWAYS > SNESLineSearch Object: 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI processes > type: fgmres > GMRES: restart=100, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_velocity_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_velocity_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=512, cols=512, bs=2 > package used to perform factorization: petsc > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 256 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (fieldsplit_velocity_) 1 MPI > processes > type: seqaij > rows=512, cols=512, bs=2 > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 256 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (fieldsplit_pressure_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-10, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_pressure_) 1 MPI processes > type: jacobi > linear system matrix followed by preconditioner matrix: > Mat Object: (fieldsplit_pressure_) 1 MPI > processes > type: schurcomplement > rows=256, cols=256 > has attached null space > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: > (fieldsplit_pressure_) 1 MPI processes > type: seqaij > rows=256, cols=256 > total: nonzeros=256, allocated nonzeros=256 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=256, cols=512 > total: nonzeros=512, allocated nonzeros=512 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP of A00 > KSP Object: > (fieldsplit_velocity_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: > (fieldsplit_velocity_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=512, cols=512, bs=2 > package used to perform factorization: petsc > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 256 nodes, limit > used is 5 > linear system matrix = precond matrix: > Mat Object: > (fieldsplit_velocity_) 1 MPI processes > type: seqaij > rows=512, cols=512, bs=2 > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues calls > =0 > using I-node routines: found 256 nodes, limit used > is 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=512, cols=256, rbs=2, cbs = 1 > total: nonzeros=512, allocated nonzeros=512 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 256 nodes, limit used is 5 > Mat Object: (fieldsplit_pressure_) 1 MPI > processes > type: seqaij > rows=256, cols=256 > total: nonzeros=256, allocated nonzeros=256 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=768, cols=768 > total: nonzeros=2304, allocated nonzeros=2304 > total number of mallocs used during MatSetValues calls =0 > has attached null space > using I-node routines: found 256 nodes, limit used is 5 > Number of SNES iterations = 0 > L_2 Error: 1.01 [0.929, 0.407] > Solution > Vec Object: 1 MPI processes > type: seq > 0. > 0. > .... > > Am I doing something wrong? > > Giang > > > Giang > > On Tue, May 3, 2016 at 4:44 AM, Matthew Knepley wrote: > >> On Mon, May 2, 2016 at 8:29 PM, ztdepyahoo at 163.com >> wrote: >> >>> Dear professor: >>> I want to write a parallel 3D CFD code based on unstructred grid, >>> does Petsc has DMPlex examples to start with. >>> >> >> SNES ex62 is an unstructured grid Stokes problem discretized with >> low-order finite elements. >> >> Of course, all the different possible choices will impact the design. >> >> Matt >> >> >>> Regards >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Sun Jul 3 03:49:05 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Sun, 3 Jul 2016 10:49:05 +0200 Subject: [petsc-users] Dose Petsc has DMPlex example In-Reply-To: References: <201605030929463862822@163.com> Message-ID: Thanks Justin. It works. The difference is these parameters: -vel_petscspace_order 2 -pres_petscspace_order 1. Which is quite cool since you can play with those orders to see how LBB condition affects the results. 
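As a rough illustration of what those two options control, here is a sketch of setting up the stable P2/P1 (Taylor-Hood) pair as two prefixed PetscFE objects. This assumes the PETSc 3.7-era PetscFECreateDefault() signature and simplicial cells; the function and variable names are illustrative, not the actual ex62 source.

#include <petscfe.h>

/* One PetscFE per field, with options prefixes matching the command line, so that
   -vel_petscspace_order 2 and -pres_petscspace_order 1 select quadratic velocity
   and linear pressure spaces, a pairing that satisfies the LBB (inf-sup) condition. */
static PetscErrorCode CreateStokesDiscretization(MPI_Comm comm, PetscInt dim, PetscFE *fe_u, PetscFE *fe_p)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscFECreateDefault(comm, dim, dim, PETSC_TRUE, "vel_", PETSC_DEFAULT, fe_u);CHKERRQ(ierr);  /* vector-valued velocity */
  ierr = PetscFECreateDefault(comm, dim, 1, PETSC_TRUE, "pres_", PETSC_DEFAULT, fe_p);CHKERRQ(ierr);   /* scalar pressure */
  PetscFunctionReturn(0);
}

With equal-order spaces the discrete inf-sup condition fails and the pressure is not well determined, which is the kind of behavior one can probe by varying these orders.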
Giang Giang On Sun, Jul 3, 2016 at 10:15 AM, Justin Chang wrote: > Hoang, if you run this example shown from the config/builder.py > > ./ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet > -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type > fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit > -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full > -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres > -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi > -snes_monitor_short -ksp_monitor_short -snes_converged_reason > -ksp_converged_reason -snes_view -show_solution 0 > > > it should work > > On Sun, Jul 3, 2016 at 9:06 AM, Hoang Giang Bui > wrote: > >> Hi Matt >> >> I tried to run ex62 with 1 proc (petsc 3.7.2), but it all produces zero >> >> The output is: >> hbui at bermuda:~/workspace/petsc/snes$ es$ ./ex62 run_type full -bc_type >> dirichlet -refinement_limit 0.00625 -interpolate 1 -snes_monitor_short >> -snes_converged_reason -snes_view -ksp_type fgmres -ksp_gmres_restart 100 >> -ksp_rtol 1.0e-9 -ksp_monitor_short -pc_type fieldsplit -pc_fieldsplit_type >> schur -pc_fieldsplit_schur_factorization_type full >> -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu >> -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi >> 0 SNES Function norm 0.265165 >> 0 KSP Residual norm 0.265165 >> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 >> SNES Object: 1 MPI processes >> type: newtonls >> maximum iterations=50, maximum function evaluations=10000 >> tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 >> total number of linear solver iterations=0 >> total number of function evaluations=1 >> norm schedule ALWAYS >> SNESLineSearch Object: 1 MPI processes >> type: bt >> interpolation: cubic >> alpha=1.000000e-04 >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI processes >> type: fgmres >> GMRES: restart=100, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. >> right preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization FULL >> Preconditioner for the Schur complement formed from A11 >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (fieldsplit_velocity_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (fieldsplit_velocity_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 1. 
>> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=512, cols=512, bs=2 >> package used to perform factorization: petsc >> total: nonzeros=1024, allocated nonzeros=1024 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node routines: found 256 nodes, limit used is >> 5 >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_velocity_) 1 MPI >> processes >> type: seqaij >> rows=512, cols=512, bs=2 >> total: nonzeros=1024, allocated nonzeros=1024 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 256 nodes, limit used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_pressure_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-10, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (fieldsplit_pressure_) 1 MPI processes >> type: jacobi >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_pressure_) 1 MPI >> processes >> type: schurcomplement >> rows=256, cols=256 >> has attached null space >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: >> (fieldsplit_pressure_) 1 MPI processes >> type: seqaij >> rows=256, cols=256 >> total: nonzeros=256, allocated nonzeros=256 >> total number of mallocs used during MatSetValues calls >> =0 >> has attached null space >> not using I-node routines >> A10 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=256, cols=512 >> total: nonzeros=512, allocated nonzeros=512 >> total number of mallocs used during MatSetValues calls >> =0 >> not using I-node routines >> KSP of A00 >> KSP Object: >> (fieldsplit_velocity_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: >> (fieldsplit_velocity_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 1. 
>> Factored matrix follows: >> Mat Object: 1 MPI >> processes >> type: seqaij >> rows=512, cols=512, bs=2 >> package used to perform factorization: petsc >> total: nonzeros=1024, allocated nonzeros=1024 >> total number of mallocs used during >> MatSetValues calls =0 >> using I-node routines: found 256 nodes, limit >> used is 5 >> linear system matrix = precond matrix: >> Mat Object: >> (fieldsplit_velocity_) 1 MPI processes >> type: seqaij >> rows=512, cols=512, bs=2 >> total: nonzeros=1024, allocated nonzeros=1024 >> total number of mallocs used during MatSetValues >> calls =0 >> using I-node routines: found 256 nodes, limit used >> is 5 >> A01 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=512, cols=256, rbs=2, cbs = 1 >> total: nonzeros=512, allocated nonzeros=512 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node routines: found 256 nodes, limit used is >> 5 >> Mat Object: (fieldsplit_pressure_) 1 MPI >> processes >> type: seqaij >> rows=256, cols=256 >> total: nonzeros=256, allocated nonzeros=256 >> total number of mallocs used during MatSetValues calls =0 >> has attached null space >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=768, cols=768 >> total: nonzeros=2304, allocated nonzeros=2304 >> total number of mallocs used during MatSetValues calls =0 >> has attached null space >> using I-node routines: found 256 nodes, limit used is 5 >> Number of SNES iterations = 0 >> L_2 Error: 1.01 [0.929, 0.407] >> Solution >> Vec Object: 1 MPI processes >> type: seq >> 0. >> 0. >> .... >> >> Am I doing something wrong? >> >> Giang >> >> >> Giang >> >> On Tue, May 3, 2016 at 4:44 AM, Matthew Knepley >> wrote: >> >>> On Mon, May 2, 2016 at 8:29 PM, ztdepyahoo at 163.com >>> wrote: >>> >>>> Dear professor: >>>> I want to write a parallel 3D CFD code based on unstructred grid, >>>> does Petsc has DMPlex examples to start with. >>>> >>> >>> SNES ex62 is an unstructured grid Stokes problem discretized with >>> low-order finite elements. >>> >>> Of course, all the different possible choices will impact the design. >>> >>> Matt >>> >>> >>>> Regards >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Jul 3 17:13:33 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 3 Jul 2016 17:13:33 -0500 Subject: [petsc-users] hdf5 libraries In-Reply-To: References: Message-ID: <2EC53254-CC98-42BB-94C1-212F72574F65@mcs.anl.gov> Please send $PETSC_ARCH/lib/petsc/conf/configure.log for both cases to petsc-maint at mcs.anl.gov. The fact that it kicks in for very large problems and reports a negative block length is indicative of a 64 bit integer not fitting into a 32 bit integer location. Barry > On Jul 2, 2016, at 5:53 PM, George Pau wrote: > > Hi, > > I am trying to debug an error I am getting when using the HDF5 viewer. I am working on NERSC systems, and they have a precompiled hdf5 (module cray-hdf5-parallel). 
When I linked petsc libraries to their hdf5 libraries, it gives the following error at run time when I tried to do a ISView: > > Rank 0 [Sat Jul 2 15:34:48 2016] [c0-0c0s15n0] Fatal error in MPI_Type_create_hindexed: Invalid argument, error stack: > MPI_Type_create_hindexed(150): MPI_Type_create_hindexed(count=1, array_of_blocklengths=0x478b1e0, array_of_displacements=0x478b200, MPI_BYTE, newtype=0x7fffffff3598) failed > MPI_Type_create_hindexed(98).: Invalid value for blocklength, must be non-negative but is -1927660792 > > The hdf5 version on NERSC is 1.8.16 but the version that PETSc downloaded when using --download-hdf5 is 1.8.12. So, could the error be due to this difference? I also see the above error only when the length of the IS is big (tested for about 200M total entries, using 1024 cores). > > I don't have these errors when I used --download-hdf5=1 during configure step. However, while I was able to use --download-hdf5=1 on Edison, PETSc was not able to compile hdf5 libraries properly on Cori. The OS of Cori was recently updated. My primary interest in trying to use the version provided by NERSC is to see if there is any improvement in the IO performance. > > Thanks, > George > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74R316C > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/profiles/george-shu-heng-pau/ From Hassan.Raiesi at aero.bombardier.com Mon Jul 4 13:48:08 2016 From: Hassan.Raiesi at aero.bombardier.com (Hassan Raiesi) Date: Mon, 4 Jul 2016 18:48:08 +0000 Subject: [petsc-users] reusing matrix created with MatCreateMPIAIJWithSplitArrays In-Reply-To: <333B1A41-ACE3-49E4-ADB3-8317D177ED14@mcs.anl.gov> References: <333B1A41-ACE3-49E4-ADB3-8317D177ED14@mcs.anl.gov> Message-ID: Thanks Barry, That works, however, the code seems to run a bit faster when I destroy and re-create the matrix at each time step as suggested by Dave!, I added this right after updating the values of the diagonal and off-diagonal parts, ierr = PetscObjectStateIncrease((PetscObject)(mat)); to avoid calls to MatAssemblyBegin/MatAssemblyEnd() and it does the trick! -H -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Thursday, June 30, 2016 6:17 PM To: Hassan Raiesi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] reusing matrix created with MatCreateMPIAIJWithSplitArrays > On Jun 30, 2016, at 2:40 PM, Hassan Raiesi wrote: > > Hello, > > We are using PETSC in our CFD code, and noticed that using ?MatCreateMPIAIJWithSplitArrays? is almost 60% faster for large problem size (i.e DOF > 725M, using GAMG each time-step only takes 5sec, compared to 8.3 sec when assembling the matrix one row at a time using matsetvaluesblocked() as recommended). > > The problem is that the memory usage goes up after each call to MatCreateMPIAIJWithSplitArrays to update the matrix values. As MatCreateMPIAIJWithSplitArrays is not supposed to copy the values, do we need to call it each time to update the values? We tried to just update the values of the diagonal and off-diagonal part of the arrays passed to ?MatCreateMPIAIJWithSplitArrays?, (the sparsity structure is fixed) but it looks like that the values are not updated, what is the proper way to update the values of the matrix created by MatCreateMPIAIJWithSplitArrays? 
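A sketch of the in-place update pattern being asked about here (and answered in the reply quoted just below). The helper name, the array names da/oa, and the uniform scaling are illustrative assumptions; the matrix is assumed to have been created once with MatCreateMPIAIJWithSplitArrays(), with da and oa the caller-owned value arrays of the diagonal and off-diagonal blocks.

#include <petscmat.h>

static PetscErrorCode UpdateSplitArrayValues(Mat A, PetscScalar *da, PetscInt nd, PetscScalar *oa, PetscInt no, PetscScalar scale)
{
  PetscErrorCode ierr;
  PetscInt       k;

  PetscFunctionBeginUser;
  for (k = 0; k < nd; k++) da[k] *= scale;   /* overwrite diagonal-block values in place */
  for (k = 0; k < no; k++) oa[k] *= scale;   /* overwrite off-diagonal-block values in place */
  /* Tell PETSc the numerical values changed (the sparsity pattern must be unchanged): */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* Alternatively, as reported earlier in the thread, bumping the object state also works:
     ierr = PetscObjectStateIncrease((PetscObject)A);CHKERRQ(ierr); */
  PetscFunctionReturn(0);
}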
Since you have direct access to the two numeric arrays passed to MatCreateMPIAIJWithSplitArrays() you can simply change the values in those locations AND THEN immediately CALL MatAssemblyBegin/MatAssemblyEnd() on the matrix; this will increase the the PETSc object state value for the matrix so the matrix routines (and preconditioner) will know you changed the matrix values. If you don't call the MatAssemblyBegin/MatAssemblyEnd() the preconditioner will think the matrix has not been changed so just use its old values as you observed. Barry Of course if you change any nonzero locations in the matrix you need to destroy the matrix and call MatCreateMPIAIJWithSplitArrays() again. > > > Thank you > > Hassan Raiesi, > Advanced Aerodynamics Department > Bombardier Aerospace > > hassan.raiesi at aero.bombardier.com > > 2351 boul. Alfred-Nobel (BAN1) > Ville Saint-Laurent, Qu?bec, H4S 2A9 > > > > T?l. > 514-855-5001 # 62204 > > > > > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. > If you are not the intended recipient or received this communication > by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. From bsmith at mcs.anl.gov Mon Jul 4 13:51:57 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Jul 2016 13:51:57 -0500 Subject: [petsc-users] reusing matrix created with MatCreateMPIAIJWithSplitArrays In-Reply-To: References: <333B1A41-ACE3-49E4-ADB3-8317D177ED14@mcs.anl.gov> Message-ID: <2C797373-AE40-438A-9B00-04C6B4EC76E4@mcs.anl.gov> > On Jul 4, 2016, at 1:48 PM, Hassan Raiesi wrote: > > Thanks Barry, > > That works, however, the code seems to run a bit faster when I destroy and re-create the matrix at each time step as suggested by Dave!, That seems odd, but ok. > > I added this right after updating the values of the diagonal and off-diagonal parts, > > ierr = PetscObjectStateIncrease((PetscObject)(mat)); > > to avoid calls to MatAssemblyBegin/MatAssemblyEnd() and it does the trick! > > -H > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Thursday, June 30, 2016 6:17 PM > To: Hassan Raiesi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] reusing matrix created with MatCreateMPIAIJWithSplitArrays > > >> On Jun 30, 2016, at 2:40 PM, Hassan Raiesi wrote: >> >> Hello, >> >> We are using PETSC in our CFD code, and noticed that using ?MatCreateMPIAIJWithSplitArrays? is almost 60% faster for large problem size (i.e DOF > 725M, using GAMG each time-step only takes 5sec, compared to 8.3 sec when assembling the matrix one row at a time using matsetvaluesblocked() as recommended). >> >> The problem is that the memory usage goes up after each call to MatCreateMPIAIJWithSplitArrays to update the matrix values. As MatCreateMPIAIJWithSplitArrays is not supposed to copy the values, do we need to call it each time to update the values? We tried to just update the values of the diagonal and off-diagonal part of the arrays passed to ?MatCreateMPIAIJWithSplitArrays?, (the sparsity structure is fixed) but it looks like that the values are not updated, what is the proper way to update the values of the matrix created by MatCreateMPIAIJWithSplitArrays? 
> > Since you have direct access to the two numeric arrays passed to MatCreateMPIAIJWithSplitArrays() you can simply change the values in those locations > > AND THEN immediately CALL MatAssemblyBegin/MatAssemblyEnd() on the matrix; this will increase the the PETSc object state value for the matrix so the matrix routines (and preconditioner) will know you changed the matrix values. If you don't call the MatAssemblyBegin/MatAssemblyEnd() the preconditioner will think the matrix has not been changed so just use its old values as you observed. > > Barry > > Of course if you change any nonzero locations in the matrix you need to destroy the matrix and call MatCreateMPIAIJWithSplitArrays() again. > >> >> >> Thank you >> >> Hassan Raiesi, >> Advanced Aerodynamics Department >> Bombardier Aerospace >> >> hassan.raiesi at aero.bombardier.com >> >> 2351 boul. Alfred-Nobel (BAN1) >> Ville Saint-Laurent, Qu?bec, H4S 2A9 >> >> >> >> T?l. >> 514-855-5001 # 62204 >> >> >> >> >> >> >> CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. >> If you are not the intended recipient or received this communication >> by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. > > From gpau at lbl.gov Mon Jul 4 17:53:24 2016 From: gpau at lbl.gov (George Pau) Date: Mon, 4 Jul 2016 15:53:24 -0700 Subject: [petsc-users] hdf5 libraries In-Reply-To: <2EC53254-CC98-42BB-94C1-212F72574F65@mcs.anl.gov> References: <2EC53254-CC98-42BB-94C1-212F72574F65@mcs.anl.gov> Message-ID: Attached is the configure.log for the case where --download-hdf5 failed. For the run time error, the IS has only 298.8M entries (from ISGetSize). So, it shouldn't need a 64 bit integer. In addition, I was able to get the right output when petsc builds its own hdf5. In addition, I don't have an issue as well if I am using the Petsc Binary Viewer. If this is really an issue with NERSC's libraries, then I will work with NERSC to see if they can help me figure out what is wrong. Thanks George On Sun, Jul 3, 2016 at 3:13 PM, Barry Smith wrote: > > Please send $PETSC_ARCH/lib/petsc/conf/configure.log for both cases to > petsc-maint at mcs.anl.gov. > > The fact that it kicks in for very large problems and reports a > negative block length is indicative of a 64 bit integer not fitting into a > 32 bit integer location. > > Barry > > > > On Jul 2, 2016, at 5:53 PM, George Pau wrote: > > > > Hi, > > > > I am trying to debug an error I am getting when using the HDF5 viewer. > I am working on NERSC systems, and they have a precompiled hdf5 (module > cray-hdf5-parallel). When I linked petsc libraries to their hdf5 > libraries, it gives the following error at run time when I tried to do a > ISView: > > > > Rank 0 [Sat Jul 2 15:34:48 2016] [c0-0c0s15n0] Fatal error in > MPI_Type_create_hindexed: Invalid argument, error stack: > > MPI_Type_create_hindexed(150): MPI_Type_create_hindexed(count=1, > array_of_blocklengths=0x478b1e0, array_of_displacements=0x478b200, > MPI_BYTE, newtype=0x7fffffff3598) failed > > MPI_Type_create_hindexed(98).: Invalid value for blocklength, must be > non-negative but is -1927660792 > > > > The hdf5 version on NERSC is 1.8.16 but the version that PETSc > downloaded when using --download-hdf5 is 1.8.12. So, could the error be due > to this difference? I also see the above error only when the length of the > IS is big (tested for about 200M total entries, using 1024 cores). 
> > > > I don't have these errors when I used --download-hdf5=1 during configure > step. However, while I was able to use --download-hdf5=1 on Edison, PETSc > was not able to compile hdf5 libraries properly on Cori. The OS of Cori > was recently updated. My primary interest in trying to use the version > provided by NERSC is to see if there is any improvement in the IO > performance. > > > > Thanks, > > George > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74R316C > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/profiles/george-shu-heng-pau/ > > -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74R316C Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/profiles/george-shu-heng-pau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2822116 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Jul 4 18:48:00 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Jul 2016 18:48:00 -0500 Subject: [petsc-users] hdf5 libraries In-Reply-To: References: <2EC53254-CC98-42BB-94C1-212F72574F65@mcs.anl.gov> Message-ID: We've had lots of people having trouble trying to build the HDF on the Nersc machines; not much we can do about it. /bin/sh: line 4: 115507 Segmentation fault LD_LIBRARY_PATH="$LD_LIBRARY_PATH`echo | sed -e 's/-L/:/g' -e 's/ //g'`" ./H5make_libsettings > H5lib_settings.c Even though I agree it looks like 32 bit integers should be enough you could try configuring PETSc with --with-64-bit-indices to see if this resolves the problem. Barry > On Jul 4, 2016, at 5:53 PM, George Pau wrote: > > Attached is the configure.log for the case where --download-hdf5 failed. > > For the run time error, the IS has only 298.8M entries (from ISGetSize). So, it shouldn't need a 64 bit integer. In addition, I was able to get the right output when petsc builds its own hdf5. In addition, I don't have an issue as well if I am using the Petsc Binary Viewer. If this is really an issue with NERSC's libraries, then I will work with NERSC to see if they can help me figure out what is wrong. > > Thanks > George > > > On Sun, Jul 3, 2016 at 3:13 PM, Barry Smith wrote: > > Please send $PETSC_ARCH/lib/petsc/conf/configure.log for both cases to petsc-maint at mcs.anl.gov. > > The fact that it kicks in for very large problems and reports a negative block length is indicative of a 64 bit integer not fitting into a 32 bit integer location. > > Barry > > > > On Jul 2, 2016, at 5:53 PM, George Pau wrote: > > > > Hi, > > > > I am trying to debug an error I am getting when using the HDF5 viewer. I am working on NERSC systems, and they have a precompiled hdf5 (module cray-hdf5-parallel). 
When I linked petsc libraries to their hdf5 libraries, it gives the following error at run time when I tried to do a ISView: > > > > Rank 0 [Sat Jul 2 15:34:48 2016] [c0-0c0s15n0] Fatal error in MPI_Type_create_hindexed: Invalid argument, error stack: > > MPI_Type_create_hindexed(150): MPI_Type_create_hindexed(count=1, array_of_blocklengths=0x478b1e0, array_of_displacements=0x478b200, MPI_BYTE, newtype=0x7fffffff3598) failed > > MPI_Type_create_hindexed(98).: Invalid value for blocklength, must be non-negative but is -1927660792 > > > > The hdf5 version on NERSC is 1.8.16 but the version that PETSc downloaded when using --download-hdf5 is 1.8.12. So, could the error be due to this difference? I also see the above error only when the length of the IS is big (tested for about 200M total entries, using 1024 cores). > > > > I don't have these errors when I used --download-hdf5=1 during configure step. However, while I was able to use --download-hdf5=1 on Edison, PETSc was not able to compile hdf5 libraries properly on Cori. The OS of Cori was recently updated. My primary interest in trying to use the version provided by NERSC is to see if there is any improvement in the IO performance. > > > > Thanks, > > George > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74R316C > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/profiles/george-shu-heng-pau/ > > > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74R316C > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/profiles/george-shu-heng-pau/ > From mono at dtu.dk Tue Jul 5 04:17:29 2016 From: mono at dtu.dk (=?Windows-1252?Q?Morten_Nobel-J=F8rgensen?=) Date: Tue, 5 Jul 2016 09:17:29 +0000 Subject: [petsc-users] Duplicate cells when exporting a distributed dmplex Message-ID: Hi all, I hope someone can help me with the following: I?m having some problems when exporting a distributed DMPlex ? the cells (+cell types) seems to be duplicated. When I?m running the code on a non-distributed system it works as expected, but when I run it on multiple processors (2 in my case) the output is invalid. I have attached a simple example and the output for np=1 and np=2. Abbreviated the code essentially does the following: ' PetscInt dim = 3; PetscInt cells[] = {1, 1, 2}; PetscInt overlap = 1; PetscInitialize(&argc, &argv, NULL, help); DMPlexCreateHexBoxMesh(PETSC_COMM_WORLD, dim, cells, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, &dm); DMPlexDistribute(dm, overlap, NULL, &dist); dm = dist; SetupDOFs(dm); Vec V; DMCreateGlobalVector(dm, &V); AssignSomeValues(V); PetscViewer viewer; const char* fn = "output.vtk"; PetscViewerVTKOpen(PETSC_COMM_WORLD,fn,FILE_MODE_WRITE,&viewer); VecView(V,viewer); PetscViewerDestroy(&viewer); Kind regards, Morten -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex_vtk_export.cc Type: application/octet-stream Size: 2716 bytes Desc: ex_vtk_export.cc URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: output-np2.vtk Type: application/octet-stream Size: 909 bytes Desc: output-np2.vtk URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: output-np1.vtk Type: application/octet-stream Size: 863 bytes Desc: output-np1.vtk URL: From jychang48 at gmail.com Tue Jul 5 11:46:09 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 5 Jul 2016 11:46:09 -0500 Subject: [petsc-users] View wall-clock time of a PETSc function via command-line Message-ID: Hi all, Is there a quick way (e.g., through command-line options) to output the wall-clock time of a PETSc function (e.g., SNESSolve(), KSPSolve(), etc) without outputting the entire -log_view? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 5 11:50:56 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 11:50:56 -0500 Subject: [petsc-users] View wall-clock time of a PETSc function via command-line In-Reply-To: References: Message-ID: <6ACBC74D-9B09-40DF-ABC0-E5DF485B850C@mcs.anl.gov> ./ex1 -log_view | grep KSPSolve or ./ex1 -log_view | egrep "(KSPSolve|SNESSolve)" > On Jul 5, 2016, at 11:46 AM, Justin Chang wrote: > > Hi all, > > Is there a quick way (e.g., through command-line options) to output the wall-clock time of a PETSc function (e.g., SNESSolve(), KSPSolve(), etc) without outputting the entire -log_view? > > Thanks, > Justin From jychang48 at gmail.com Tue Jul 5 11:52:29 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 5 Jul 2016 11:52:29 -0500 Subject: [petsc-users] View wall-clock time of a PETSc function via command-line In-Reply-To: <6ACBC74D-9B09-40DF-ABC0-E5DF485B850C@mcs.anl.gov> References: <6ACBC74D-9B09-40DF-ABC0-E5DF485B850C@mcs.anl.gov> Message-ID: Okay, thanks! On Tue, Jul 5, 2016 at 11:50 AM, Barry Smith wrote: > > ./ex1 -log_view | grep KSPSolve > > or > > ./ex1 -log_view | egrep "(KSPSolve|SNESSolve)" > > > > On Jul 5, 2016, at 11:46 AM, Justin Chang wrote: > > > > Hi all, > > > > Is there a quick way (e.g., through command-line options) to output the > wall-clock time of a PETSc function (e.g., SNESSolve(), KSPSolve(), etc) > without outputting the entire -log_view? > > > > Thanks, > > Justin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 5 12:14:05 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Jul 2016 12:14:05 -0500 Subject: [petsc-users] View wall-clock time of a PETSc function via command-line In-Reply-To: References: <6ACBC74D-9B09-40DF-ABC0-E5DF485B850C@mcs.anl.gov> Message-ID: On Tue, Jul 5, 2016 at 11:52 AM, Justin Chang wrote: > Okay, thanks! > Or -log_view ::ascii_info_detailed and then upload that module to Python. This is how I do it in scripts. Matt > On Tue, Jul 5, 2016 at 11:50 AM, Barry Smith wrote: > >> >> ./ex1 -log_view | grep KSPSolve >> >> or >> >> ./ex1 -log_view | egrep "(KSPSolve|SNESSolve)" >> >> >> > On Jul 5, 2016, at 11:46 AM, Justin Chang wrote: >> > >> > Hi all, >> > >> > Is there a quick way (e.g., through command-line options) to output the >> wall-clock time of a PETSc function (e.g., SNESSolve(), KSPSolve(), etc) >> without outputting the entire -log_view? >> > >> > Thanks, >> > Justin >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Hassan.Raiesi at aero.bombardier.com Tue Jul 5 15:42:54 2016 From: Hassan.Raiesi at aero.bombardier.com (Hassan Raiesi) Date: Tue, 5 Jul 2016 20:42:54 +0000 Subject: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 Message-ID: Hi, PETSc 3.7.2 seems to have a much higher memory usage when compared with PETSc- 3.1.1 c, to a point that it crashes our code for large problems that we ran with version 3.6.1 in the past. I have re-compiled the code with same options, and ran the same code linked with the two versions, here are the log-summarie: -flow_ksp_max_it 20 -flow_ksp_monitor_true_residual -flow_ksp_rtol 0.1 -flow_ksp_type fgmres -flow_mg_coarse_pc_factor_mat_solver_package mumps -flow_mg_coarse_pc_type lu -flow_mg_levels_ksp_type richardson -flow_mg_levels_pc_type sor -flow_pc_gamg_agg_nsmooths 0 -flow_pc_gamg_coarse_eq_limit 2000 -flow_pc_gamg_process_eq_limit 2500 -flow_pc_gamg_repartition true -flow_pc_gamg_reuse_interpolation true -flow_pc_gamg_square_graph 3 -flow_pc_gamg_sym_graph true -flow_pc_gamg_type agg -flow_pc_mg_cycle v -flow_pc_mg_levels 20 -flow_pc_mg_type kaskade -flow_pc_type gamg -log_summary Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: 2016-07-05 12:04:34 -0400 Max Max/Min Avg Total Time (sec): 6.760e+02 1.00006 6.760e+02 Objects: 1.284e+03 1.00469 1.279e+03 Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 MPI Reductions: 4.023e+03 1.00149 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 99.9% 7.674e+04 99.9% 4.010e+03 99.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 
1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 3.2e+03 22100 96 90 79 22100 96 90 79 91007 PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 
------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 302 299 1992700700 0. Matrix Partitioning 6 6 3888 0. Matrix Coarsen 6 6 3768 0. Vector 600 600 1582204168 0. Vector Scatter 87 87 5614432 0. Krylov Solver 11 11 59472 0. Preconditioner 11 11 11120 0. PetscRandom 1 1 638 0. Viewer 1 0 0 0. Index Set 247 247 9008420 0. Star Forest Bipartite Graph 12 12 10176 0. ======================================================================================================================== And for petsc 3.6.1: Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: 2015-08-06 11:50:34 -0500 Max Max/Min Avg Total Time (sec): 5.515e+02 1.00001 5.515e+02 Objects: 1.231e+03 1.00490 1.226e+03 Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 MPI Reductions: 4.012e+03 1.00150 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 99.9% 5.020e+04 99.9% 3.999e+03 99.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 
0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 3.2e+03 25100 95 83 79 25100 95 83 79 94396 PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 
------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 246 243 1730595756 0 Matrix Partitioning 6 6 3816 0 Matrix Coarsen 6 6 3720 0 Vector 602 602 1603749672 0 Vector Scatter 87 87 4291136 0 Krylov Solver 12 12 60416 0 Preconditioner 12 12 12040 0 Viewer 1 0 0 0 Index Set 247 247 9018060 0 Star Forest Bipartite Graph 12 12 10080 0 ======================================================================================================================== Any idea why there are more matrix created with version 3.7.2? I only have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others are internally created. Thank you, Hassan Raiesi, PhD Advanced Aerodynamics Department Bombardier Aerospace hassan.raiesi at aero.bombardier.com 2351 boul. Alfred-Nobel (BAN1) Ville Saint-Laurent, Qu?bec, H4S 2A9 T?l. 514-855-5001 # 62204 [cid:image001.png at 01D1D6DA.DC1D3010] CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. If you are not the intended recipient or received this communication by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6402 bytes Desc: image001.png URL: From gbisht at lbl.gov Tue Jul 5 16:07:38 2016 From: gbisht at lbl.gov (Gautam Bisht) Date: Tue, 5 Jul 2016 14:07:38 -0700 Subject: [petsc-users] How to determine the type of SNESLineSearch? Message-ID: Hi PETSc, After SNESSolve converges, I want to perform few additional operations only when SNESLineSearchType is not SNESLINESEARCHBASIC. But, there is no SNESLineSearch*Get*Type routine. Any idea on how I can determine the type of LineSearch set by a user using command line option? Thanks, -Gautam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 5 16:13:35 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Jul 2016 16:13:35 -0500 Subject: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: On Tue, Jul 5, 2016 at 3:42 PM, Hassan Raiesi < Hassan.Raiesi at aero.bombardier.com> wrote: > Hi, > > > > PETSc 3.7.2 seems to have a much higher memory usage when compared with > PETSc- 3.1.1 c, to a point that it crashes our code for large problems that > we ran with version 3.6.1 in the past. > > I have re-compiled the code with same options, and ran the same code > linked with the two versions, here are the log-summarie: > According to the log_summary (which you NEED to send in full if we are to understand anything), the memory usage is largely the same. There are more matrices, which leads me to believe that GAMG is not coarsening as quickly. You might consider a non-zero threshold for it. The best way to understand what is happening is to run Massif (from valgrind) on both. 
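Spelled out, the two suggestions above look roughly like the following; the -flow_ prefix comes from Hassan's option list, ./mysolver stands in for the actual executable, the threshold value is only a starting guess, and the Massif flags are standard valgrind usage:

mpiexec -n 8 ./mysolver <usual options> -flow_pc_gamg_threshold 0.01
mpiexec -n 8 valgrind --tool=massif --massif-out-file=massif.%p.out ./mysolver <usual options>
ms_print massif.<pid>.out | head -40

Running both the 3.6.1 and 3.7.2 builds under Massif and comparing the peak snapshots should show which allocation site accounts for the extra memory.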
Thanks, Matt > -flow_ksp_max_it 20 > > -flow_ksp_monitor_true_residual > > -flow_ksp_rtol 0.1 > > -flow_ksp_type fgmres > > -flow_mg_coarse_pc_factor_mat_solver_package mumps > > -flow_mg_coarse_pc_type lu > > -flow_mg_levels_ksp_type richardson > > -flow_mg_levels_pc_type sor > > -flow_pc_gamg_agg_nsmooths 0 > > -flow_pc_gamg_coarse_eq_limit 2000 > > -flow_pc_gamg_process_eq_limit 2500 > > -flow_pc_gamg_repartition true > > -flow_pc_gamg_reuse_interpolation true > > -flow_pc_gamg_square_graph 3 > > -flow_pc_gamg_sym_graph true > > -flow_pc_gamg_type agg > > -flow_pc_mg_cycle v > > -flow_pc_mg_levels 20 > > -flow_pc_mg_type kaskade > > -flow_pc_type gamg > > -log_summary > > > > Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more > memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). > > > > > > > > Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: > 2016-07-05 12:04:34 -0400 > > > > Max Max/Min Avg Total > > Time (sec): 6.760e+02 1.00006 6.760e+02 > > Objects: 1.284e+03 1.00469 1.279e+03 > > Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 > > Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 > > MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 > > MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 > > MPI Reductions: 4.023e+03 1.00149 > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N > --> 2N flops > > and VecAXPY() for complex vectors of length N > --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 > 99.9% 7.674e+04 99.9% 4.010e+03 99.7% > > > > > ------------------------------------------------------------------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all processors > > Mess: number of messages sent > > Avg. len: average message length (bytes) > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this > phase > > %M - percent messages in this phase %L - percent message lengths > in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 > 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 > > MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 > 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 > > MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 > 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 > > MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 > 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 > > MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 > > MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 > > MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 > 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 > > MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 > 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 > > MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 > 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 > > MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 > 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > > MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 > 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > > MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 > 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 > > MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 > 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 > > MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 > 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 > > MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 > 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 > > MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 > 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > > MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 > 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 > > MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 > 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 > > 
VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 > 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 > > VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 0.0e+00 > 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 > > VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 > 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 > > VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 > > VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 > > VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 > > VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 > > VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 > > VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 > 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 > > VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 > 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 > > KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 > 3.2e+03 22100 96 90 79 22100 96 90 79 91007 > > PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > > PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 > 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 > > PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 > 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 > > PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 > 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 > > Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > > MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 > 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > > SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 > 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 > > SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 > 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 > > GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 > 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 > > repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 > 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 > > Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 > > Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 > 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 > > Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 > 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 > > PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 > 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 > > PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 > > PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 > 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 > > SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 > 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > 
> SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 > 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > > SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 > 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 > > > ------------------------------------------------------------------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 302 299 1992700700 0. > > Matrix Partitioning 6 6 3888 0. > > Matrix Coarsen 6 6 3768 0. > > Vector 600 600 1582204168 0. > > Vector Scatter 87 87 5614432 0. > > Krylov Solver 11 11 59472 0. > > Preconditioner 11 11 11120 0. > > PetscRandom 1 1 638 0. > > Viewer 1 0 0 0. > > Index Set 247 247 9008420 0. > > Star Forest Bipartite Graph 12 12 10176 0. > > > ======================================================================================================================== > > > > And for petsc 3.6.1: > > > > Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: > 2015-08-06 11:50:34 -0500 > > > > Max Max/Min Avg Total > > Time (sec): 5.515e+02 1.00001 5.515e+02 > > Objects: 1.231e+03 1.00490 1.226e+03 > > Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 > > Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 > > MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 > > MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 > > MPI Reductions: 4.012e+03 1.00150 > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N > --> 2N flops > > and VecAXPY() for complex vectors of length N > --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 > 99.9% 5.020e+04 99.9% 3.999e+03 99.7% > > > > > ------------------------------------------------------------------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all processors > > Mess: number of messages sent > > Avg. len: average message length (bytes) > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this > phase > > %M - percent messages in this phase %L - percent message lengths > in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) > Flops --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 > 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 > > MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 > 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 > > MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 > 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 > > MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 > 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 > > MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 > > MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 > > MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 > 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 > > MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 > 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 > > MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 > 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 > > MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 > > MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 > 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > > MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 > 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > > MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 > 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 > > MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 > 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 > > MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 > 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 > > MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 > 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 > > MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 > 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > > MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 > 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 > > MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 > 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 > > 
MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 > 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 > > VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 > 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 > > VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 > 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 > > VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 > > VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 > > VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 > > VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 > > VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 > > VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 > 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 > > VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 > 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 > > KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 > 3.2e+03 25100 95 83 79 25100 95 83 79 94396 > > PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > > PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 > 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 > > PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 > 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 > > PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 > 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 > > Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > > MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 > 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > > SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 > 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 > > SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 > 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 > > GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 > 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 > > repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 > 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 > > Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 > > Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 > 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 > > Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 > 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 > > PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 > 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 > > PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 > > PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 > 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 > > SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 
0 > > SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 > 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > > SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 > 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > > SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > ------------------------------------------------------------------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 246 243 1730595756 0 > > Matrix Partitioning 6 6 3816 0 > > Matrix Coarsen 6 6 3720 0 > > Vector 602 602 1603749672 0 > > Vector Scatter 87 87 4291136 0 > > Krylov Solver 12 12 60416 0 > > Preconditioner 12 12 12040 0 > > Viewer 1 0 0 0 > > Index Set 247 247 9018060 0 > > Star Forest Bipartite Graph 12 12 10080 0 > > > ======================================================================================================================== > > > > Any idea why there are more matrix created with version 3.7.2? I only have > 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others > are internally created. > > > > > > Thank you, > > > > > > *Hassan Raiesi, PhD* > > > > Advanced Aerodynamics Department > > Bombardier Aerospace > > > > hassan.raiesi at aero.bombardier.com > > > > *2351 boul. Alfred-Nobel (BAN1)* > > *Ville Saint-Laurent, Qu?bec, H4S 2A9* > > > > > > > > T?l. > > 514-855-5001 # 62204 > > > > > > > > > > > > *CONFIDENTIALITY NOTICE* - This communication may contain privileged or > confidential information. > If you are not the intended recipient or received this communication by > error, please notify the sender > and delete the message without copying, forwarding and/or disclosing it. > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6402 bytes Desc: not available URL: From it.sadr at gmail.com Tue Jul 5 17:13:36 2016 From: it.sadr at gmail.com (ehsan sadrfaridpour) Date: Tue, 5 Jul 2016 18:13:36 -0400 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: I faced a problem with my code. The problem is related to MatCreateSeqAIJ(). I comment the rest of my code and just keeping the below lines cause me the error. 
*Code:* Mat * m_WA_nt_local; MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, m_WA_nt_local); PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); exit(1); *Error:* > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Pointer: Parameter # 2 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [2]PETSC ERROR: Null argument, when expecting valid pointer > [2]PETSC ERROR: Null Pointer: Parameter # 2 > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc > --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [2]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > Null Pointer: Parameter # 2 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc > --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [0]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc > --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [1]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. I tried to debug it with -start_in_debugger, but I got another error. > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o > -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 > -fPIC -I/home/esfp/tools/libraries/petsc/include > -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include > -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o > ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o > partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o > OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis > -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib > -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran > -lm -lquadmath -lm -lmpi_cxx -lstdc++ > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -ldl > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi > -lgcc_s -lpthread -ldl -o ut_main > /bin/rm -f ut_main.o > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display :0 > on machine grappelli > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display :0 > on machine grappelli > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display :0 > on machine grappelli > And I got below error in gdb GUI: [image: Inline image 1] I appreciate your support. Best regards, Ehsan On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > On all other processes don't pass in 1 pass in 0 since all other > processes want 0 sub matrices > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour > wrote: > > > > Thanks, the IS problem is solved. > > But now I have another problem to compile the code. > > > > I use below code: > > Mat m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > &m_WA_nt_local); > > IS set; > > if(rank ==0){ > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > } > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > The error I get is : > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to > ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* > const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > > > I tried to go around it by define a array of Matrices using "Mat * > m_WA_nt_local" > > So, the first 2 lines changed to below and I can compile the code. > > Mat * m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > m_WA_nt_local); > > > > > > > > However, I get errors like below when I run the code with 2 mpi process. 
> > --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Invalid argument > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Wed > Jun 29 16:21:04 2016 > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in > /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > I think I need to do something for other processes, but I don't know > what I need to do. > > > > Best, > > Ehsan > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May > wrote: > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour > wrote: > > I faced the below error during compiling my code for using > MatGetSubMatrices. > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for argument > ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, > _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, MAT_INITIAL_MATRIX, > &m_local_W); > > > > My code : > > PetscMPIInt rank; > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > if(rank ==0){ > > Mat m_local_W; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, > NULL,&m_local_W);// try to reserve space for only number of final non zero > entries for each fine node (e.g. 4) > > IS set; > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, > MAT_INITIAL_MATRIX, &m_local_W); > > > > } > > > > I followed below example: > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > This code won't work in parallel. > > The man page says this function is collective on Mat. You need to move > the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour > wrote: > > Thanks a lot for great support. > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith wrote: > > > > MatGetSubmatrices() just have the first process request all the rows > and columns and the others request none. You can use ISCreateStride() to > create the ISs without having to make an array of all the indices. > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour > wrote: > > > > > > Hi, > > > > > > I need to have access to most of elements of a parallel MPIAIJ matrix > only from 1 process (rank 0). > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > How can I have a local copy of a matrix which is distributed on > multiple process? I don't want to update the matrix, and the read-only > version of it would be enough. > > > > > > Best, > > > Ehsan > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
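The "Null Pointer: Parameter # 2" above is consistent with passing a NULL or uninitialized Mat* where MatCreateSeqAIJ() expects the address of a Mat handle. A minimal sketch of the calling pattern, reusing the variable names from the mail (whether the pre-created matrix is needed at all depends on how the submatrix is used afterwards):

Mat m_WA_nt_local;                          /* a Mat handle, not an uninitialized Mat* */
MatCreateSeqAIJ(PETSC_COMM_SELF, num_points, num_points, pre_init_size, NULL, &m_WA_nt_local);

/* MatGetSubMatrices() allocates and returns an array of Mats, so the output
   argument is a Mat* passed by address; every rank must call it, requesting
   0 submatrices away from rank 0, as Barry suggests earlier in the thread. */
Mat *local = NULL;
IS set;
PetscInt nsub = (rank == 0) ? 1 : 0;
ISCreateStride(PETSC_COMM_SELF, (rank == 0) ? num_points : 0, 0, 1, &set);
MatGetSubMatrices(m_WA_norm_T, nsub, &set, &set, MAT_INITIAL_MATRIX, &local);
/* on rank 0, local[0] now holds the sequential copy */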
Name: image.png Type: image/png Size: 3695 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Jul 5 17:18:51 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 17:18:51 -0500 Subject: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: <45E9D625-7398-4B46-88BF-BF936D5066D8@mcs.anl.gov> Hassan, This memory usage increase is not expected. How are you measuring memory usage? Since the problem occurs even with a simple solver you should debug with the simpler solver and only after resolving that move on to GAMG and see if the problem persists. Also do the test on the smallest case that clearly demonstrates the problem; if you have a 1 process run that shows a nontrivial memory usage increase then debug with that, don't run a huge problem unless you absolutely have to. How much code, if any, did you need to change in your application in going from 3.6.1 to 3.7.2 ? Here is the way to track down the problem. It may seem burdensome but requires no guesswork or speculation. Use the bisection capability of git. First obtain PETSc via git if you have not gotten that way http://www.mcs.anl.gov/petsc/download/index.html Then in the PETSc directory run git bisect start git bisect good v3.6.1 git bisect bad v3.7.2 It will then change to a new commit where you need to run configure and make on PETSc and then compile and run your application If the application uses the excessive memory then in the PETSc directory do git bisect bad otherwise type git bisect good if the code won't compile (if the PETSc API changes you may have to adjust your code slightly to get it to compile and you should do that; but if PETSc won't configure to build with the given commit then just do the skip) or crashes then type git bisect skip Now git will switch to another commit where you need again do the same process of configure make and run the application. After a few iterations git bisect will show the EXACT commit (code changes) that resulted in your very different memory usage and we can take a look at the code changes in PETSc and figure out how to reduce the memory usage. I realize this seems like a burdensome process but remember a great deal of changes took place in the PETSc code and this is the ONLY well defined way to figure out exactly which change caused the problem. Otherwise we can guess until the end of time. Barry > On Jul 5, 2016, at 3:42 PM, Hassan Raiesi wrote: > > Hi, > > PETSc 3.7.2 seems to have a much higher memory usage when compared with PETSc- 3.1.1 c, to a point that it crashes our code for large problems that we ran with version 3.6.1 in the past. 
> I have re-compiled the code with same options, and ran the same code linked with the two versions, here are the log-summarie: > > -flow_ksp_max_it 20 > -flow_ksp_monitor_true_residual > -flow_ksp_rtol 0.1 > -flow_ksp_type fgmres > -flow_mg_coarse_pc_factor_mat_solver_package mumps > -flow_mg_coarse_pc_type lu > -flow_mg_levels_ksp_type richardson > -flow_mg_levels_pc_type sor > -flow_pc_gamg_agg_nsmooths 0 > -flow_pc_gamg_coarse_eq_limit 2000 > -flow_pc_gamg_process_eq_limit 2500 > -flow_pc_gamg_repartition true > -flow_pc_gamg_reuse_interpolation true > -flow_pc_gamg_square_graph 3 > -flow_pc_gamg_sym_graph true > -flow_pc_gamg_type agg > -flow_pc_mg_cycle v > -flow_pc_mg_levels 20 > -flow_pc_mg_type kaskade > -flow_pc_type gamg > -log_summary > > Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). > > > > Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: 2016-07-05 12:04:34 -0400 > > Max Max/Min Avg Total > Time (sec): 6.760e+02 1.00006 6.760e+02 > Objects: 1.284e+03 1.00469 1.279e+03 > Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 > Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 > MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 > MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 > MPI Reductions: 4.023e+03 1.00149 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 99.9% 7.674e+04 99.9% 4.010e+03 99.7% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 > MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 > MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 > MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 > MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 > MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 > MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 > MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 > MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 > MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 > MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 > MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 > MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 > MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 > MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 > VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 > VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 
0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 > VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 > VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 > VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 > VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 > VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 > VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 > VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 > VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 > KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 3.2e+03 22100 96 90 79 22100 96 90 79 91007 > PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 > PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 > PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 > Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 > SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 > GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 > repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 > Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 > Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 > Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 > PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 > PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 > PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 > SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > 
BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 302 299 1992700700 0. > Matrix Partitioning 6 6 3888 0. > Matrix Coarsen 6 6 3768 0. > Vector 600 600 1582204168 0. > Vector Scatter 87 87 5614432 0. > Krylov Solver 11 11 59472 0. > Preconditioner 11 11 11120 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 247 247 9008420 0. > Star Forest Bipartite Graph 12 12 10176 0. > ======================================================================================================================== > > And for petsc 3.6.1: > > Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: 2015-08-06 11:50:34 -0500 > > Max Max/Min Avg Total > Time (sec): 5.515e+02 1.00001 5.515e+02 > Objects: 1.231e+03 1.00490 1.226e+03 > Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 > Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 > MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 > MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 > MPI Reductions: 4.012e+03 1.00150 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 99.9% 5.020e+04 99.9% 3.999e+03 99.7% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 > MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 > MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 > MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 > MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 > MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 > MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 > MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 > MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 > MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 > MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 > MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 > MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 > MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 > MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 > MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 > MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 
0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 > VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 > VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 > VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 > VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 > VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 > VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 > VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 > VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 > VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 > KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 3.2e+03 25100 95 83 79 25100 95 83 79 94396 > PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 > PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 > PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 > Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 > SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 > GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 > repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 > Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 > Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 > Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 > PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 > PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 > PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 > SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 
0 > SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 246 243 1730595756 0 > Matrix Partitioning 6 6 3816 0 > Matrix Coarsen 6 6 3720 0 > Vector 602 602 1603749672 0 > Vector Scatter 87 87 4291136 0 > Krylov Solver 12 12 60416 0 > Preconditioner 12 12 12040 0 > Viewer 1 0 0 0 > Index Set 247 247 9018060 0 > Star Forest Bipartite Graph 12 12 10080 0 > ======================================================================================================================== > > Any idea why there are more matrix created with version 3.7.2? I only have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others are internally created. > > > Thank you, > > > Hassan Raiesi, PhD > > Advanced Aerodynamics Department > Bombardier Aerospace > > hassan.raiesi at aero.bombardier.com > > 2351 boul. Alfred-Nobel (BAN1) > Ville Saint-Laurent, Qu?bec, H4S 2A9 > > > > T?l. > 514-855-5001 # 62204 > > > > > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. > If you are not the intended recipient or received this communication by error, please notify the sender > and delete the message without copying, forwarding and/or disclosing it. From bsmith at mcs.anl.gov Tue Jul 5 17:21:24 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 17:21:24 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: It should be Mat m_WA_nt_local; > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); ^^^^^^^^^^^^ note the & > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour wrote: > > I faced a problem with my code. The problem is related to MatCreateSeqAIJ(). > I comment the rest of my code and just keeping the below lines cause me the error. > Code: > Mat * m_WA_nt_local; > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, m_WA_nt_local); > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > exit(1); > > Error: > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Pointer: Parameter # 2 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [2]PETSC ERROR: Null argument, when expecting valid pointer > [2]PETSC ERROR: Null Pointer: Parameter # 2 > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [2]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > Null Pointer: Parameter # 2 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [0]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [1]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > I tried to debug it with -start_in_debugger, but I got another error. > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC -I/home/esfp/tools/libraries/petsc/include -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi -lgcc_s -lpthread -ldl -o ut_main > /bin/rm -f ut_main.o > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display :0 on machine grappelli > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display :0 on machine grappelli > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display :0 on machine grappelli > > > And I got below error in gdb GUI: > > > I appreciate your support. > > Best regards, > Ehsan > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > On all other processes don't pass in 1 pass in 0 since all other processes want 0 sub matrices > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour wrote: > > > > Thanks, the IS problem is solved. > > But now I have another problem to compile the code. > > > > I use below code: > > Mat m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, &m_WA_nt_local); > > IS set; > > if(rank ==0){ > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > } > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > The error I get is : > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > I tried to go around it by define a array of Matrices using "Mat * m_WA_nt_local" > > So, the first 2 lines changed to below and I can compile the code. > > Mat * m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, m_WA_nt_local); > > > > > > > > However, I get errors like below when I run the code with 2 mpi process. 
> > --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Invalid argument > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Wed Jun 29 16:21:04 2016 > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > I think I need to do something for other processes, but I don't know what I need to do. > > > > Best, > > Ehsan > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May wrote: > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour wrote: > > I faced the below error during compiling my code for using MatGetSubMatrices. > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, MAT_INITIAL_MATRIX, &m_local_W); > > > > My code : > > PetscMPIInt rank; > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > if(rank ==0){ > > Mat m_local_W; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, NULL,&m_local_W);// try to reserve space for only number of final non zero entries for each fine node (e.g. 4) > > IS set; > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, MAT_INITIAL_MATRIX, &m_local_W); > > > > } > > > > I followed below example: > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > This code won't work in parallel. > > The man page says this function is collective on Mat. You need to move the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour wrote: > > Thanks a lot for great support. > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith wrote: > > > > MatGetSubmatrices() just have the first process request all the rows and columns and the others request none. You can use ISCreateStride() to create the ISs without having to make an array of all the indices. > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour wrote: > > > > > > Hi, > > > > > > I need to have access to most of elements of a parallel MPIAIJ matrix only from 1 process (rank 0). > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > How can I have a local copy of a matrix which is distributed on multiple process? I don't want to update the matrix, and the read-only version of it would be enough. 
> > > > > > Best, > > > Ehsan > > > > > > > > > > > > > > > > From hengjiew at uci.edu Tue Jul 5 17:23:55 2016 From: hengjiew at uci.edu (frank) Date: Tue, 5 Jul 2016 15:23:55 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner Message-ID: <577C337B.60909@uci.edu> Hi, I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. The petsc options file is attached. The domain is a 3d box. It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. How can I diagnose what exactly cause the error? Thank you so much. Frank -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left -log_summary # Setting dmdarepart on subcomm -repart_da_processors_x 24 -repart_da_processors_y 2 -repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly #-mg_coarse_telescope_ksp_constant_null_space -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type svd #-mg_coarse_telescope_mg_coarse_pc_type telescope #-mg_coarse_telescope_mg_coarse_pc_telescope_reduction_factor 64 # Second subcomm #-mg_coarse_telescope_mg_coarse_telescope_ksp_type preonly #-mg_coarse_telescope_mg_coarse_telescope_pc_type mg #-mg_coarse_telescope_mg_coarse_telescope_pc_mg_galerkin #-mg_coarse_telescope_mg_coarse_telescope_pc_mg_levels 3 #-mg_coarse_telescope_mg_coarse_telescope_mg_levels_ksp_type richardson #-mg_coarse_telescope_mg_coarse_telescope_mg_levels_ksp_max_it 1 #-mg_coarse_telescope_mg_coarse_telescope_mg_coarse_ksp_type richardson #-mg_coarse_telescope_mg_coarse_telescope_mg_coarse_pc_type svd From it.sadr at gmail.com Tue Jul 5 17:26:58 2016 From: it.sadr at gmail.com (ehsan sadrfaridpour) Date: Tue, 5 Jul 2016 18:26:58 -0400 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: Thanks for your prompt reply. Using & solve this problem, but then I have another problem. 
*Rest of the Code:* Mat m_WA_nt_local; MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); IS set; if(rank ==0){ // - - - - - create local matrix - - - - - PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d\n", rank, num_points); ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); ISView(set, PETSC_VIEWER_STDOUT_SELF); MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); }else{ MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); } *Error in compile:* > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* > Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, > cs_info&)?: > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert > ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode > MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, > _p_Mat***)? > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > ^ > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert > ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode > MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, > _p_Mat***)? > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > ^ On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > > It should be > > Mat m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, > NULL, &m_WA_nt_local); > > ^^^^^^^^^^^^ note > the & > > > > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour > wrote: > > > > I faced a problem with my code. The problem is related to > MatCreateSeqAIJ(). > > I comment the rest of my code and just keeping the below lines cause me > the error. > > Code: > > Mat * m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > pre_init_size, NULL, m_WA_nt_local); > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, > p_init:%d\n", rank, num_points, pre_init_size); > > > > exit(1); > > > > Error: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > [1]PETSC ERROR: Null Pointer: Parameter # 2 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > [2]PETSC ERROR: Null Pointer: Parameter # 2 > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [2]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > Null Pointer: Parameter # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [0]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [1]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > > > I tried to debug it with -start_in_debugger, but I got another error. > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o > -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 > -fPIC -I/home/esfp/tools/libraries/petsc/include > -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include > -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o > ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o > partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o > OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis > -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib > -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran > -lm -lquadmath -lm -lmpi_cxx -lstdc++ > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -ldl > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi > -lgcc_s -lpthread -ldl -o ut_main > > /bin/rm -f ut_main.o > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display > :0 on machine grappelli > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display > :0 on machine grappelli > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display > :0 on machine grappelli > > > > > > And I got below error in gdb GUI: > > > > > > I appreciate your support. > > > > Best regards, > > Ehsan > > > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > > > On all other processes don't pass in 1 pass in 0 since all other > processes want 0 sub matrices > > > > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour > wrote: > > > > > > Thanks, the IS problem is solved. > > > But now I have another problem to compile the code. > > > > > > I use below code: > > > Mat m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > &m_WA_nt_local); > > > IS set; > > > if(rank ==0){ > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > > } > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > > > The error I get is : > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to > ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* > const*, MatReuse, _p_Mat***)? > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > > > > > > I tried to go around it by define a array of Matrices using "Mat * > m_WA_nt_local" > > > So, the first 2 lines changed to below and I can compile the code. > > > Mat * m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > m_WA_nt_local); > > > > > > > > > > > > However, I get errors like below when I run the code with 2 mpi > process. 
> > > --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: Invalid argument > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Wed Jun 29 16:21:04 2016 > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in > /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > > > > I think I need to do something for other processes, but I don't know > what I need to do. > > > > > > Best, > > > Ehsan > > > > > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May > wrote: > > > > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour > wrote: > > > I faced the below error during compiling my code for using > MatGetSubMatrices. > > > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for > argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* > const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, > MAT_INITIAL_MATRIX, &m_local_W); > > > > > > My code : > > > PetscMPIInt rank; > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > > > if(rank ==0){ > > > Mat m_local_W; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, > NULL,&m_local_W);// try to reserve space for only number of final non zero > entries for each fine node (e.g. 4) > > > IS set; > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, > MAT_INITIAL_MATRIX, &m_local_W); > > > > > > } > > > > > > I followed below example: > > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > > > This code won't work in parallel. > > > The man page says this function is collective on Mat. You need to move > the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < > it.sadr at gmail.com> wrote: > > > Thanks a lot for great support. > > > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith > wrote: > > > > > > MatGetSubmatrices() just have the first process request all the > rows and columns and the others request none. You can use ISCreateStride() > to create the ISs without having to make an array of all the indices. > > > > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour > wrote: > > > > > > > > Hi, > > > > > > > > I need to have access to most of elements of a parallel MPIAIJ > matrix only from 1 process (rank 0). > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > > > How can I have a local copy of a matrix which is distributed on > multiple process? I don't want to update the matrix, and the read-only > version of it would be enough. 
> > > > > > > > Best, > > > > Ehsan > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 5 17:36:07 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Jul 2016 17:36:07 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour wrote: > Thanks for your prompt reply. Using & solve this problem, but then I have > another problem. > > *Rest of the Code:* > Mat m_WA_nt_local; > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, > NULL, &m_WA_nt_local); > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, > p_init:%d\n", rank, num_points, pre_init_size); > > IS set; > if(rank ==0){ > // - - - - - create local matrix - - - - - > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, > num_points:%d\n", rank, num_points); > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > ISView(set, PETSC_VIEWER_STDOUT_SELF); > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > }else{ > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > } > This returns an ARRAY of Mat objects, not just one. Matt > > *Error in compile:* > >> /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* >> Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, >> cs_info&)?: >> /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert >> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >> _p_Mat***)? >> MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> >> ^ >> /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert >> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >> _p_Mat***)? >> MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > > ^ > > > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > >> >> It should be >> >> Mat m_WA_nt_local; >> >> > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, >> NULL, &m_WA_nt_local); >> >> ^^^^^^^^^^^^ note >> the & >> >> >> >> > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour >> wrote: >> > >> > I faced a problem with my code. The problem is related to >> MatCreateSeqAIJ(). >> > I comment the rest of my code and just keeping the below lines cause me >> the error. 
>> > Code: >> > Mat * m_WA_nt_local; >> > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> pre_init_size, NULL, m_WA_nt_local); >> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, >> p_init:%d\n", rank, num_points, pre_init_size); >> > >> > exit(1); >> > >> > Error: >> > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > [0]PETSC ERROR: Null argument, when expecting valid pointer >> > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > [1]PETSC ERROR: Null argument, when expecting valid pointer >> > [1]PETSC ERROR: Null Pointer: Parameter # 2 >> > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> > [2]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > [2]PETSC ERROR: Null argument, when expecting valid pointer >> > [2]PETSC ERROR: Null Pointer: Parameter # 2 >> > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > [2]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > Null Pointer: Parameter # 2 >> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. 
>> > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > [0]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > [1]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > [CS][pCalc_P] rank:1, num_points:10, p_init:300 >> > [CS][pCalc_P] rank:2, num_points:10, p_init:300 >> > [CS][pCalc_P] rank:0, num_points:10, p_init:300 >> > >> > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. >> > >> > I tried to debug it with -start_in_debugger, but I got another error. >> > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger >> > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o >> ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >> -Wno-unknown-pragmas -g -O0 -fPIC >> -I/home/esfp/tools/libraries/petsc/include >> -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include >> -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc >> > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
>> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o >> ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o >> partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o >> OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis >> -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib >> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm >> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran >> -lm -lquadmath -lm -lmpi_cxx -lstdc++ >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -ldl >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi >> -lgcc_s -lpthread -ldl -o ut_main >> > /bin/rm -f ut_main.o >> > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display >> :0 on machine grappelli >> > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display >> :0 on machine grappelli >> > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display >> :0 on machine grappelli >> > >> > >> > And I got below error in gdb GUI: >> > >> > >> > I appreciate your support. >> > >> > Best regards, >> > Ehsan >> > >> > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith >> wrote: >> > >> > On all other processes don't pass in 1 pass in 0 since all other >> processes want 0 sub matrices >> > >> > >> > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour >> wrote: >> > > >> > > Thanks, the IS problem is solved. >> > > But now I have another problem to compile the code. >> > > >> > > I use below code: >> > > Mat m_WA_nt_local; >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >> &m_WA_nt_local); >> > > IS set; >> > > if(rank ==0){ >> > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >> > > ISView(set, PETSC_VIEWER_STDOUT_SELF); >> > > } >> > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, >> &m_WA_nt_local); >> > > >> > > The error I get is : >> > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to >> ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* >> const*, MatReuse, _p_Mat***)? >> > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > > >> > > >> > > I tried to go around it by define a array of Matrices using "Mat * >> m_WA_nt_local" >> > > So, the first 2 lines changed to below and I can compile the code. >> > > Mat * m_WA_nt_local; >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >> m_WA_nt_local); >> > > >> > > >> > > >> > > However, I get errors like below when I run the code with 2 mpi >> process. 
>> > > --------------------- Error Message >> -------------------------------------------------------------- >> > > [1]PETSC ERROR: Invalid argument >> > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 >> > > [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Wed Jun 29 16:21:04 2016 >> > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in >> /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c >> > > >> > > >> > > I think I need to do something for other processes, but I don't know >> what I need to do. >> > > >> > > Best, >> > > Ehsan >> > > >> > > >> > > >> > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May >> wrote: >> > > >> > > >> > > On Wednesday, 29 June 2016, ehsan sadrfaridpour >> wrote: >> > > I faced the below error during compiling my code for using >> MatGetSubMatrices. >> > > >> > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for >> argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* >> const*, _p_IS* const*, MatReuse, _p_Mat***)? >> > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, >> MAT_INITIAL_MATRIX, &m_local_W); >> > > >> > > My code : >> > > PetscMPIInt rank; >> > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); >> > > >> > > if(rank ==0){ >> > > Mat m_local_W; >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> num_nz, NULL,&m_local_W);// try to reserve space for only number of final >> non zero entries for each fine node (e.g. 4) >> > > IS set; >> > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); >> > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, >> MAT_INITIAL_MATRIX, &m_local_W); >> > > >> > > } >> > > >> > > I followed below example: >> > > >> http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html >> > > >> > > This code won't work in parallel. >> > > The man page says this function is collective on Mat. You need to >> move the call to MatGetSubMatrices outside of the if(rank==0) loop. >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < >> it.sadr at gmail.com> wrote: >> > > Thanks a lot for great support. >> > > >> > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith >> wrote: >> > > >> > > MatGetSubmatrices() just have the first process request all the >> rows and columns and the others request none. You can use ISCreateStride() >> to create the ISs without having to make an array of all the indices. >> > > >> > > >> > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour >> wrote: >> > > > >> > > > Hi, >> > > > >> > > > I need to have access to most of elements of a parallel MPIAIJ >> matrix only from 1 process (rank 0). >> > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. >> > > > >> > > > How can I have a local copy of a matrix which is distributed on >> multiple process? I don't want to update the matrix, and the read-only >> version of it would be enough. 
>> > > > >> > > > Best, >> > > > Ehsan >> > > > >> > > > >> > > >> > > >> > > >> > > >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 5 17:37:15 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 17:37:15 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <577C337B.60909@uci.edu> References: <577C337B.60909@uci.edu> Message-ID: <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> Frank, You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. Barry > On Jul 5, 2016, at 5:23 PM, frank wrote: > > Hi, > > I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. > I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. > The petsc options file is attached. > > The domain is a 3d box. > It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. > Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. > The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. > In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. > > How can I diagnose what exactly cause the error? > Thank you so much. > > Frank > From bsmith at mcs.anl.gov Tue Jul 5 17:43:06 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 17:43:06 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: > > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour wrote: > Thanks for your prompt reply. Using & solve this problem, but then I have another problem. 
> > Rest of the Code: > Mat m_WA_nt_local; > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > IS set; > if(rank ==0){ > // - - - - - create local matrix - - - - - > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d\n", rank, num_points); > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > ISView(set, PETSC_VIEWER_STDOUT_SELF); > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > }else{ > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > } > > This returns an ARRAY of Mat objects, not just one. Didn't we just do this email a couple of days ago? You need Mat m_WA_nt_local *m_WA_nt_local; > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > Matt > > > Error in compile: > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, cs_info&)?: > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > ^ > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > ^ > > > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > > It should be > > Mat m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); > ^^^^^^^^^^^^ note the & > > > > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour wrote: > > > > I faced a problem with my code. The problem is related to MatCreateSeqAIJ(). > > I comment the rest of my code and just keeping the below lines cause me the error. > > Code: > > Mat * m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, m_WA_nt_local); > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > > > exit(1); > > > > Error: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > [1]PETSC ERROR: Null Pointer: Parameter # 2 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > [2]PETSC ERROR: Null Pointer: Parameter # 2 > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [2]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > Null Pointer: Parameter # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [0]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [1]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > > > I tried to debug it with -start_in_debugger, but I got another error. > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC -I/home/esfp/tools/libraries/petsc/include -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi -lgcc_s -lpthread -ldl -o ut_main > > /bin/rm -f ut_main.o > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display :0 on machine grappelli > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display :0 on machine grappelli > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display :0 on machine grappelli > > > > > > And I got below error in gdb GUI: > > > > > > I appreciate your support. > > > > Best regards, > > Ehsan > > > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > > > On all other processes don't pass in 1 pass in 0 since all other processes want 0 sub matrices > > > > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour wrote: > > > > > > Thanks, the IS problem is solved. > > > But now I have another problem to compile the code. > > > > > > I use below code: > > > Mat m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, &m_WA_nt_local); > > > IS set; > > > if(rank ==0){ > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > > } > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > The error I get is : > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > > I tried to go around it by define a array of Matrices using "Mat * m_WA_nt_local" > > > So, the first 2 lines changed to below and I can compile the code. > > > Mat * m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, m_WA_nt_local); > > > > > > > > > > > > However, I get errors like below when I run the code with 2 mpi process. 
> > > --------------------- Error Message -------------------------------------------------------------- > > > [1]PETSC ERROR: Invalid argument > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Wed Jun 29 16:21:04 2016 > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > > > > I think I need to do something for other processes, but I don't know what I need to do. > > > > > > Best, > > > Ehsan > > > > > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May wrote: > > > > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour wrote: > > > I faced the below error during compiling my code for using MatGetSubMatrices. > > > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, MAT_INITIAL_MATRIX, &m_local_W); > > > > > > My code : > > > PetscMPIInt rank; > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > > > if(rank ==0){ > > > Mat m_local_W; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, NULL,&m_local_W);// try to reserve space for only number of final non zero entries for each fine node (e.g. 4) > > > IS set; > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, MAT_INITIAL_MATRIX, &m_local_W); > > > > > > } > > > > > > I followed below example: > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > > > This code won't work in parallel. > > > The man page says this function is collective on Mat. You need to move the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour wrote: > > > Thanks a lot for great support. > > > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith wrote: > > > > > > MatGetSubmatrices() just have the first process request all the rows and columns and the others request none. You can use ISCreateStride() to create the ISs without having to make an array of all the indices. > > > > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour wrote: > > > > > > > > Hi, > > > > > > > > I need to have access to most of elements of a parallel MPIAIJ matrix only from 1 process (rank 0). > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > > > How can I have a local copy of a matrix which is distributed on multiple process? I don't want to update the matrix, and the read-only version of it would be enough. 
> > > > > > > > Best, > > > > Ehsan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From it.sadr at gmail.com Tue Jul 5 17:58:48 2016 From: it.sadr at gmail.com (ehsan sadrfaridpour) Date: Tue, 5 Jul 2016 18:58:48 -0400 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> Message-ID: Sorry, I think your suggestion needs something, since it doesn't compile. error: expected initializer before ?*? token > Mat m_WA_nt_local *m_WA_nt_local; > Yes, this is the same problem that compiled and worked but it has a bug. I faced this problem and I tried to define the array of Matrices to fix this 4 days ago. However, my first email today is the problem that array of matrices caused me. I get a little confused in the logic. Let me review what is happening: As this method is collective, all the processes needs to run it. Therefore, I need to define a local matrix and create it for all of the processes. Only for the process I want to have the local matrix, I request a matrix (matrices) and for the rest of them I pass 0 in the MatGetSubMatrices. I am suspicious about creating only 1 matrix for any process, while I expect an array of matrices in the MatGetSubMatrices. On Tue, Jul 5, 2016 at 6:43 PM, Barry Smith wrote: > > > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: > > > > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour > wrote: > > Thanks for your prompt reply. Using & solve this problem, but then I > have another problem. > > > > Rest of the Code: > > Mat m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > pre_init_size, NULL, &m_WA_nt_local); > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, > p_init:%d\n", rank, num_points, pre_init_size); > > > > IS set; > > if(rank ==0){ > > // - - - - - create local matrix - - - - - > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, > num_points:%d\n", rank, num_points); > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > }else{ > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > } > > > > This returns an ARRAY of Mat objects, not just one. > > Didn't we just do this email a couple of days ago? > > You need > > Mat m_WA_nt_local *m_WA_nt_local; > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > > > > > > Matt > > > > > > Error in compile: > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* > Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, > cs_info&)?: > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert > ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode > MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, > _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > ^ > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert > ?_p_Mat**? to ?_p_Mat***? for argument ?6? 
to ?PetscErrorCode > MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, > _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > ^ > > > > > > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > > > > It should be > > > > Mat m_WA_nt_local; > > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, > NULL, &m_WA_nt_local); > > > ^^^^^^^^^^^^ note > the & > > > > > > > > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour > wrote: > > > > > > I faced a problem with my code. The problem is related to > MatCreateSeqAIJ(). > > > I comment the rest of my code and just keeping the below lines cause > me the error. > > > Code: > > > Mat * m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > pre_init_size, NULL, m_WA_nt_local); > > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, > num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > > > > > exit(1); > > > > > > Error: > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > > [1]PETSC ERROR: Null Pointer: Parameter # 2 > > > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > > [2]PETSC ERROR: Null Pointer: Parameter # 2 > > > [2]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Tue Jul 5 18:05:15 2016 > > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [2]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > Null Pointer: Parameter # 2 > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Tue Jul 5 18:05:15 2016 > > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [0]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Tue Jul 5 18:05:15 2016 > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [1]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > > > > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > > > > > I tried to debug it with -start_in_debugger, but I got another error. > > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o > ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -g -O0 -fPIC > -I/home/esfp/tools/libraries/petsc/include > -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include > -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o > ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o > partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o > OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis > -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib > -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran > -lm -lquadmath -lm -lmpi_cxx -lstdc++ > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -ldl > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi > -lgcc_s -lpthread -ldl -o ut_main > > > /bin/rm -f ut_main.o > > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display > :0 on machine grappelli > > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display > :0 on machine grappelli > > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display > :0 on machine grappelli > > > > > > > > > And I got below error in gdb GUI: > > > > > > > > > I appreciate your support. > > > > > > Best regards, > > > Ehsan > > > > > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith > wrote: > > > > > > On all other processes don't pass in 1 pass in 0 since all other > processes want 0 sub matrices > > > > > > > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour > wrote: > > > > > > > > Thanks, the IS problem is solved. > > > > But now I have another problem to compile the code. > > > > > > > > I use below code: > > > > Mat m_WA_nt_local; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > &m_WA_nt_local); > > > > IS set; > > > > if(rank ==0){ > > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > > > } > > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > The error I get is : > > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to > ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* > const*, MatReuse, _p_Mat***)? > > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > > > > > I tried to go around it by define a array of Matrices using "Mat * > m_WA_nt_local" > > > > So, the first 2 lines changed to below and I can compile the code. 
> > > > Mat * m_WA_nt_local; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > m_WA_nt_local); > > > > > > > > > > > > > > > > However, I get errors like below when I run the code with 2 mpi > process. > > > > --------------------- Error Message > -------------------------------------------------------------- > > > > [1]PETSC ERROR: Invalid argument > > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > > > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Wed Jun 29 16:21:04 2016 > > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in > /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > > > > > > > I think I need to do something for other processes, but I don't know > what I need to do. > > > > > > > > Best, > > > > Ehsan > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May > wrote: > > > > > > > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour > wrote: > > > > I faced the below error during compiling my code for using > MatGetSubMatrices. > > > > > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for > argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* > const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, > MAT_INITIAL_MATRIX, &m_local_W); > > > > > > > > My code : > > > > PetscMPIInt rank; > > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > > > > > if(rank ==0){ > > > > Mat m_local_W; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > num_nz, NULL,&m_local_W);// try to reserve space for only number of final > non zero entries for each fine node (e.g. 4) > > > > IS set; > > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, > MAT_INITIAL_MATRIX, &m_local_W); > > > > > > > > } > > > > > > > > I followed below example: > > > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > > > > > This code won't work in parallel. > > > > The man page says this function is collective on Mat. You need to > move the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < > it.sadr at gmail.com> wrote: > > > > Thanks a lot for great support. > > > > > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith > wrote: > > > > > > > > MatGetSubmatrices() just have the first process request all the > rows and columns and the others request none. You can use ISCreateStride() > to create the ISs without having to make an array of all the indices. 
> > > > > > > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour < > it.sadr at gmail.com> wrote: > > > > > > > > > > Hi, > > > > > > > > > > I need to have access to most of elements of a parallel MPIAIJ > matrix only from 1 process (rank 0). > > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > > > > > How can I have a local copy of a matrix which is distributed on > multiple process? I don't want to update the matrix, and the read-only > version of it would be enough. > > > > > > > > > > Best, > > > > > Ehsan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 5 19:34:08 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 19:34:08 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> Message-ID: > On Jul 5, 2016, at 5:58 PM, ehsan sadrfaridpour wrote: > > Sorry, I think your suggestion needs something, since it doesn't compile. > > error: expected initializer before ?*? token > Mat m_WA_nt_local *m_WA_nt_local; Mat *m_WA_nt_local; > > > Yes, this is the same problem that compiled and worked but it has a bug. > I faced this problem and I tried to define the array of Matrices to fix this 4 days ago. > > However, my first email today is the problem that array of matrices caused me. > I get a little confused in the logic. > > Let me review what is happening: > As this method is collective, all the processes needs to run it. > Therefore, I need to define a local matrix and create it for all of the processes. > Only for the process I want to have the local matrix, I request a matrix (matrices) and for the rest of them I pass 0 in the MatGetSubMatrices. > I am suspicious about creating only 1 matrix for any process, while I expect an array of matrices in the MatGetSubMatrices. MatGetSubMatrices can return any number of matrices including a different number on different machines. Hence it returns an array containing the matrices. The length of the array is the same as the number of local matrices you requested which could be zero. You should read up a little on the web on using pointers in C; they are confusing at first but once you get the hang of them they are usually straightforward. Barry > > > > > > On Tue, Jul 5, 2016 at 6:43 PM, Barry Smith wrote: > > > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: > > > > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour wrote: > > Thanks for your prompt reply. Using & solve this problem, but then I have another problem. 
> > > > Rest of the Code: > > Mat m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > > > IS set; > > if(rank ==0){ > > // - - - - - create local matrix - - - - - > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d\n", rank, num_points); > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > }else{ > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > } > > > > This returns an ARRAY of Mat objects, not just one. > > Didn't we just do this email a couple of days ago? > > You need > > Mat m_WA_nt_local *m_WA_nt_local; > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > > Matt > > > > > > Error in compile: > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, cs_info&)?: > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > ^ > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > ^ > > > > > > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > > > > It should be > > > > Mat m_WA_nt_local; > > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); > > ^^^^^^^^^^^^ note the & > > > > > > > > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour wrote: > > > > > > I faced a problem with my code. The problem is related to MatCreateSeqAIJ(). > > > I comment the rest of my code and just keeping the below lines cause me the error. > > > Code: > > > Mat * m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, m_WA_nt_local); > > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > > > > > exit(1); > > > > > > Error: > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > > [1]PETSC ERROR: Null Pointer: Parameter # 2 > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > > [2]PETSC ERROR: Null Pointer: Parameter # 2 > > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [2]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > Null Pointer: Parameter # 2 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [0]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [1]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > > > > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > > > > > I tried to debug it with -start_in_debugger, but I got another error. > > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC -I/home/esfp/tools/libraries/petsc/include -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi -lgcc_s -lpthread -ldl -o ut_main > > > /bin/rm -f ut_main.o > > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display :0 on machine grappelli > > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display :0 on machine grappelli > > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display :0 on machine grappelli > > > > > > > > > And I got below error in gdb GUI: > > > > > > > > > I appreciate your support. > > > > > > Best regards, > > > Ehsan > > > > > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > > > > > On all other processes don't pass in 1 pass in 0 since all other processes want 0 sub matrices > > > > > > > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour wrote: > > > > > > > > Thanks, the IS problem is solved. > > > > But now I have another problem to compile the code. > > > > > > > > I use below code: > > > > Mat m_WA_nt_local; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, &m_WA_nt_local); > > > > IS set; > > > > if(rank ==0){ > > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > > > } > > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > The error I get is : > > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > > > > > I tried to go around it by define a array of Matrices using "Mat * m_WA_nt_local" > > > > So, the first 2 lines changed to below and I can compile the code. > > > > Mat * m_WA_nt_local; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, m_WA_nt_local); > > > > > > > > > > > > > > > > However, I get errors like below when I run the code with 2 mpi process. 
> > > > --------------------- Error Message -------------------------------------------------------------- > > > > [1]PETSC ERROR: Invalid argument > > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Wed Jun 29 16:21:04 2016 > > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > > > > > > > I think I need to do something for other processes, but I don't know what I need to do. > > > > > > > > Best, > > > > Ehsan > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May wrote: > > > > > > > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour wrote: > > > > I faced the below error during compiling my code for using MatGetSubMatrices. > > > > > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, MAT_INITIAL_MATRIX, &m_local_W); > > > > > > > > My code : > > > > PetscMPIInt rank; > > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > > > > > if(rank ==0){ > > > > Mat m_local_W; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, NULL,&m_local_W);// try to reserve space for only number of final non zero entries for each fine node (e.g. 4) > > > > IS set; > > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, MAT_INITIAL_MATRIX, &m_local_W); > > > > > > > > } > > > > > > > > I followed below example: > > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > > > > > This code won't work in parallel. > > > > The man page says this function is collective on Mat. You need to move the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour wrote: > > > > Thanks a lot for great support. > > > > > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith wrote: > > > > > > > > MatGetSubmatrices() just have the first process request all the rows and columns and the others request none. You can use ISCreateStride() to create the ISs without having to make an array of all the indices. > > > > > > > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour wrote: > > > > > > > > > > Hi, > > > > > > > > > > I need to have access to most of elements of a parallel MPIAIJ matrix only from 1 process (rank 0). > > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > > > > > How can I have a local copy of a matrix which is distributed on multiple process? I don't want to update the matrix, and the read-only version of it would be enough. 
> > > > > > > > > > Best, > > > > > Ehsan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > From knepley at gmail.com Tue Jul 5 20:02:45 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Jul 2016 20:02:45 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> Message-ID: On Tue, Jul 5, 2016 at 5:58 PM, ehsan sadrfaridpour wrote: > Sorry, I think your suggestion needs something, since it doesn't compile. > > error: expected initializer before ?*? token >> Mat m_WA_nt_local *m_WA_nt_local; >> > > > Yes, this is the same problem that compiled and worked but it has a bug. > I faced this problem and I tried to define the array of Matrices to fix > this 4 days ago. > > However, my first email today is the problem that array of matrices caused > me. > I get a little confused in the logic. > > Let me review what is happening: > As this method is collective, all the processes needs to run it. > Therefore, I need to define a local matrix and create it for all of the > processes. > No no no. Each process extracts a SET of SEQUENTIAL matrices. Each proc choose how many it will extract (could be 0). Thanks, Matt > Only for the process I want to have the local matrix, I request a matrix > (matrices) and for the rest of them I pass 0 in the MatGetSubMatrices. > I am suspicious about creating only 1 matrix for any process, while I > expect an array of matrices in the MatGetSubMatrices. > > > > > > On Tue, Jul 5, 2016 at 6:43 PM, Barry Smith wrote: > >> >> > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: >> > >> > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour >> wrote: >> > Thanks for your prompt reply. Using & solve this problem, but then I >> have another problem. >> > >> > Rest of the Code: >> > Mat m_WA_nt_local; >> > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> pre_init_size, NULL, &m_WA_nt_local); >> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, >> p_init:%d\n", rank, num_points, pre_init_size); >> > >> > IS set; >> > if(rank ==0){ >> > // - - - - - create local matrix - - - - - >> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >> num_points:%d\n", rank, num_points); >> > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >> > ISView(set, PETSC_VIEWER_STDOUT_SELF); >> > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > }else{ >> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > } >> > >> > This returns an ARRAY of Mat objects, not just one. >> >> Didn't we just do this email a couple of days ago? >> >> You need >> >> Mat m_WA_nt_local *m_WA_nt_local; >> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, >> &m_WA_nt_local); >> >> >> >> >> >> > >> > Matt >> > >> > >> > Error in compile: >> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* >> Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, >> cs_info&)?: >> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert >> ?_p_Mat**? to ?_p_Mat***? for argument ?6? 
to ?PetscErrorCode >> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >> _p_Mat***)? >> > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > >> ^ >> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert >> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >> _p_Mat***)? >> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > >> ^ >> > >> > >> > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: >> > >> > It should be >> > >> > Mat m_WA_nt_local; >> > >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, >> NULL, &m_WA_nt_local); >> > >> ^^^^^^^^^^^^ >> note the & >> > >> > >> > >> > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour >> wrote: >> > > >> > > I faced a problem with my code. The problem is related to >> MatCreateSeqAIJ(). >> > > I comment the rest of my code and just keeping the below lines cause >> me the error. >> > > Code: >> > > Mat * m_WA_nt_local; >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> pre_init_size, NULL, m_WA_nt_local); >> > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >> num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); >> > > >> > > exit(1); >> > > >> > > Error: >> > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > [0]PETSC ERROR: Null argument, when expecting valid pointer >> > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > [1]PETSC ERROR: Null argument, when expecting valid pointer >> > > [1]PETSC ERROR: Null Pointer: Parameter # 2 >> > > [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > [2]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > [2]PETSC ERROR: Null argument, when expecting valid pointer >> > > [2]PETSC ERROR: Null Pointer: Parameter # 2 >> > > [2]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > [2]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > > Null Pointer: Parameter # 2 >> > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > [0]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > [1]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 >> > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 >> > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 >> > > >> > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. >> > > >> > > I tried to debug it with -start_in_debugger, but I got another error. >> > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger >> > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o >> ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >> -Wno-unknown-pragmas -g -O0 -fPIC >> -I/home/esfp/tools/libraries/petsc/include >> -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include >> -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc >> > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
>> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o >> ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o >> partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o >> OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis >> -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib >> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm >> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran >> -lm -lquadmath -lm -lmpi_cxx -lstdc++ >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -ldl >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi >> -lgcc_s -lpthread -ldl -o ut_main >> > > /bin/rm -f ut_main.o >> > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on >> display :0 on machine grappelli >> > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on >> display :0 on machine grappelli >> > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on >> display :0 on machine grappelli >> > > >> > > >> > > And I got below error in gdb GUI: >> > > >> > > >> > > I appreciate your support. >> > > >> > > Best regards, >> > > Ehsan >> > > >> > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith >> wrote: >> > > >> > > On all other processes don't pass in 1 pass in 0 since all other >> processes want 0 sub matrices >> > > >> > > >> > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour >> wrote: >> > > > >> > > > Thanks, the IS problem is solved. >> > > > But now I have another problem to compile the code. >> > > > >> > > > I use below code: >> > > > Mat m_WA_nt_local; >> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >> &m_WA_nt_local); >> > > > IS set; >> > > > if(rank ==0){ >> > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >> > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); >> > > > } >> > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > > > >> > > > The error I get is : >> > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to >> ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* >> const*, MatReuse, _p_Mat***)? >> > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > > > >> > > > >> > > > I tried to go around it by define a array of Matrices using "Mat * >> m_WA_nt_local" >> > > > So, the first 2 lines changed to below and I can compile the code. 
>> > > > Mat * m_WA_nt_local; >> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >> m_WA_nt_local); >> > > > >> > > > >> > > > >> > > > However, I get errors like below when I run the code with 2 mpi >> process. >> > > > --------------------- Error Message >> -------------------------------------------------------------- >> > > > [1]PETSC ERROR: Invalid argument >> > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 >> > > > [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by >> esfp Wed Jun 29 16:21:04 2016 >> > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in >> /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c >> > > > >> > > > >> > > > I think I need to do something for other processes, but I don't >> know what I need to do. >> > > > >> > > > Best, >> > > > Ehsan >> > > > >> > > > >> > > > >> > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May >> wrote: >> > > > >> > > > >> > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour >> wrote: >> > > > I faced the below error during compiling my code for using >> MatGetSubMatrices. >> > > > >> > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for >> argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* >> const*, _p_IS* const*, MatReuse, _p_Mat***)? >> > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, >> MAT_INITIAL_MATRIX, &m_local_W); >> > > > >> > > > My code : >> > > > PetscMPIInt rank; >> > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); >> > > > >> > > > if(rank ==0){ >> > > > Mat m_local_W; >> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> num_nz, NULL,&m_local_W);// try to reserve space for only number of final >> non zero entries for each fine node (e.g. 4) >> > > > IS set; >> > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); >> > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, >> MAT_INITIAL_MATRIX, &m_local_W); >> > > > >> > > > } >> > > > >> > > > I followed below example: >> > > > >> http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html >> > > > >> > > > This code won't work in parallel. >> > > > The man page says this function is collective on Mat. You need to >> move the call to MatGetSubMatrices outside of the if(rank==0) loop. >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < >> it.sadr at gmail.com> wrote: >> > > > Thanks a lot for great support. >> > > > >> > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith >> wrote: >> > > > >> > > > MatGetSubmatrices() just have the first process request all the >> rows and columns and the others request none. You can use ISCreateStride() >> to create the ISs without having to make an array of all the indices. 
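For reference, here is a minimal sketch of the pattern described above: rank 0 requests one sequential submatrix covering every row and column of a square parallel matrix A, while the other ranks take part in the collective call but request zero submatrices. The names are illustrative, and it assumes the PETSc 3.6/3.7 interface, where MatGetSubMatrices() fills in an array of Mat that is later freed with MatDestroyMatrices().

#include <petscmat.h>

/* Sketch only: gather a read-only sequential copy of a square parallel
   matrix A onto rank 0.  Every rank must make the collective call;
   rank 0 asks for one submatrix, the other ranks ask for none.        */
PetscErrorCode GatherMatrixOnRankZero(Mat A)
{
  PetscMPIInt    rank;
  PetscInt       M;
  IS             rows;
  Mat            *submats;              /* the result is an ARRAY of Mat */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MPI_Comm_rank(PetscObjectComm((PetscObject)A),&rank);CHKERRQ(ierr);
  ierr = MatGetSize(A,&M,NULL);CHKERRQ(ierr);

  /* rank 0 asks for rows/columns 0..M-1, everyone else for an empty set */
  ierr = ISCreateStride(PETSC_COMM_SELF,rank ? 0 : M,0,1,&rows);CHKERRQ(ierr);
  ierr = MatGetSubMatrices(A,rank ? 0 : 1,&rows,&rows,MAT_INITIAL_MATRIX,&submats);CHKERRQ(ierr);

  if (!rank) {
    /* submats[0] is the sequential copy; it exists only on rank 0 */
    ierr = MatView(submats[0],PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);
  }

  ierr = ISDestroy(&rows);CHKERRQ(ierr);
  ierr = MatDestroyMatrices(rank ? 0 : 1,&submats);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

This mirrors the working code posted later in the thread; the only additions here are the empty index set on the non-root ranks and the cleanup calls at the end.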
>> > > > >> > > > >> > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour < >> it.sadr at gmail.com> wrote: >> > > > > >> > > > > Hi, >> > > > > >> > > > > I need to have access to most of elements of a parallel MPIAIJ >> matrix only from 1 process (rank 0). >> > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. >> > > > > >> > > > > How can I have a local copy of a matrix which is distributed on >> multiple process? I don't want to update the matrix, and the read-only >> version of it would be enough. >> > > > > >> > > > > Best, >> > > > > Ehsan >> > > > > >> > > > > >> > > > >> > > > >> > > > >> > > > >> > > >> > > >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From it.sadr at gmail.com Wed Jul 6 08:12:28 2016 From: it.sadr at gmail.com (ehsan sadrfaridpour) Date: Wed, 6 Jul 2016 09:12:28 -0400 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> Message-ID: Thanks all, Sorry for lots of questions. Thanks to your advice, I didn't create the local matrix and it seems the problem is solved. I mean it seems that I shouldn't create the local matrix at all. And this is my final working code. Mat *m_WA_nt_local; IS set; if(rank ==0){ ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); }else{ MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); } if(rank ==0){ PetscInt m_WA_nt_local_start, m_WA_nt_local_end; MatGetOwnershipRange( (*m_WA_nt_local), &m_WA_nt_local_start, &m_WA_nt_local_end); PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, m_WA_nt_local start:%d, end:%d\n", rank, m_WA_nt_local_start,m_WA_nt_local_end); MatView((*m_WA_nt_local), PETSC_VIEWER_STDOUT_SELF); } It compiled and run without any problem. Best regards, Ehsan On Tue, Jul 5, 2016 at 9:02 PM, Matthew Knepley wrote: > On Tue, Jul 5, 2016 at 5:58 PM, ehsan sadrfaridpour > wrote: > >> Sorry, I think your suggestion needs something, since it doesn't compile. >> >> error: expected initializer before ?*? token >>> Mat m_WA_nt_local *m_WA_nt_local; >>> >> >> >> Yes, this is the same problem that compiled and worked but it has a bug. >> I faced this problem and I tried to define the array of Matrices to fix >> this 4 days ago. >> >> However, my first email today is the problem that array of matrices >> caused me. >> I get a little confused in the logic. >> >> Let me review what is happening: >> As this method is collective, all the processes needs to run it. >> Therefore, I need to define a local matrix and create it for all of the >> processes. >> > > No no no. Each process extracts a SET of SEQUENTIAL matrices. Each proc > choose how many > it will extract (could be 0). > > Thanks, > > Matt > > >> Only for the process I want to have the local matrix, I request a matrix >> (matrices) and for the rest of them I pass 0 in the MatGetSubMatrices. 
>> I am suspicious about creating only 1 matrix for any process, while I >> expect an array of matrices in the MatGetSubMatrices. >> >> >> >> >> >> On Tue, Jul 5, 2016 at 6:43 PM, Barry Smith wrote: >> >>> >>> > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: >>> > >>> > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour >>> wrote: >>> > Thanks for your prompt reply. Using & solve this problem, but then I >>> have another problem. >>> > >>> > Rest of the Code: >>> > Mat m_WA_nt_local; >>> > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> pre_init_size, NULL, &m_WA_nt_local); >>> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >>> num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); >>> > >>> > IS set; >>> > if(rank ==0){ >>> > // - - - - - create local matrix - - - - - >>> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >>> num_points:%d\n", rank, num_points); >>> > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >>> > ISView(set, PETSC_VIEWER_STDOUT_SELF); >>> > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > }else{ >>> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > } >>> > >>> > This returns an ARRAY of Mat objects, not just one. >>> >>> Didn't we just do this email a couple of days ago? >>> >>> You need >>> >>> Mat m_WA_nt_local *m_WA_nt_local; >>> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, >>> &m_WA_nt_local); >>> >>> >>> >>> >>> >>> > >>> > Matt >>> > >>> > >>> > Error in compile: >>> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* >>> Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, >>> cs_info&)?: >>> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert >>> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >>> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >>> _p_Mat***)? >>> > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > >>> ^ >>> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert >>> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >>> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >>> _p_Mat***)? >>> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > >>> ^ >>> > >>> > >>> > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith >>> wrote: >>> > >>> > It should be >>> > >>> > Mat m_WA_nt_local; >>> > >>> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> pre_init_size, NULL, &m_WA_nt_local); >>> > >>> ^^^^^^^^^^^^ >>> note the & >>> > >>> > >>> > >>> > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour >>> wrote: >>> > > >>> > > I faced a problem with my code. The problem is related to >>> MatCreateSeqAIJ(). >>> > > I comment the rest of my code and just keeping the below lines cause >>> me the error. 
>>> > > Code: >>> > > Mat * m_WA_nt_local; >>> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> pre_init_size, NULL, m_WA_nt_local); >>> > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >>> num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); >>> > > >>> > > exit(1); >>> > > >>> > > Error: >>> > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > [0]PETSC ERROR: Null argument, when expecting valid pointer >>> > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > [1]PETSC ERROR: Null argument, when expecting valid pointer >>> > > [1]PETSC ERROR: Null Pointer: Parameter # 2 >>> > > [1]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > [2]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > [2]PETSC ERROR: Null argument, when expecting valid pointer >>> > > [2]PETSC ERROR: Null Pointer: Parameter # 2 >>> > > [2]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >>> Tue Jul 5 18:05:15 2016 >>> > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >>> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >>> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >>> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >>> --download-superlu=1 --download-metis=1 --download-parmetis=1 >>> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >>> > > [2]PETSC ERROR: #1 MatCreate() line 79 in >>> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >>> > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >>> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >>> > > Null Pointer: Parameter # 2 >>> > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. 
>>> > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >>> Tue Jul 5 18:05:15 2016 >>> > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >>> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >>> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >>> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >>> --download-superlu=1 --download-metis=1 --download-parmetis=1 >>> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >>> > > [0]PETSC ERROR: #1 MatCreate() line 79 in >>> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >>> > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >>> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >>> > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >>> Tue Jul 5 18:05:15 2016 >>> > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >>> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >>> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >>> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >>> --download-superlu=1 --download-metis=1 --download-parmetis=1 >>> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >>> > > [1]PETSC ERROR: #1 MatCreate() line 79 in >>> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >>> > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >>> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >>> > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 >>> > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 >>> > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 >>> > > >>> > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. >>> > > >>> > > I tried to debug it with -start_in_debugger, but I got another error. >>> > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger >>> > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o >>> ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >>> -Wno-unknown-pragmas -g -O0 -fPIC >>> -I/home/esfp/tools/libraries/petsc/include >>> -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include >>> -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc >>> > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall >>> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
>>> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o >>> ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o >>> partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o >>> OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o >>> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >>> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc >>> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >>> -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis >>> -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib >>> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm >>> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >>> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >>> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >>> -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran >>> -lm -lquadmath -lm -lmpi_cxx -lstdc++ >>> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >>> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >>> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >>> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >>> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >>> -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu >>> -L/usr/lib/x86_64-linux-gnu -ldl >>> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi >>> -lgcc_s -lpthread -ldl -o ut_main >>> > > /bin/rm -f ut_main.o >>> > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on >>> display :0 on machine grappelli >>> > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on >>> display :0 on machine grappelli >>> > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on >>> display :0 on machine grappelli >>> > > >>> > > >>> > > And I got below error in gdb GUI: >>> > > >>> > > >>> > > I appreciate your support. >>> > > >>> > > Best regards, >>> > > Ehsan >>> > > >>> > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith >>> wrote: >>> > > >>> > > On all other processes don't pass in 1 pass in 0 since all other >>> processes want 0 sub matrices >>> > > >>> > > >>> > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour < >>> it.sadr at gmail.com> wrote: >>> > > > >>> > > > Thanks, the IS problem is solved. >>> > > > But now I have another problem to compile the code. >>> > > > >>> > > > I use below code: >>> > > > Mat m_WA_nt_local; >>> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >>> &m_WA_nt_local); >>> > > > IS set; >>> > > > if(rank ==0){ >>> > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >>> > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); >>> > > > } >>> > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > > > >>> > > > The error I get is : >>> > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? >>> to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* >>> const*, MatReuse, _p_Mat***)? >>> > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > > > >>> > > > >>> > > > I tried to go around it by define a array of Matrices using "Mat * >>> m_WA_nt_local" >>> > > > So, the first 2 lines changed to below and I can compile the code. 
>>> > > > Mat * m_WA_nt_local; >>> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >>> m_WA_nt_local); >>> > > > >>> > > > >>> > > > >>> > > > However, I get errors like below when I run the code with 2 mpi >>> process. >>> > > > --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > [1]PETSC ERROR: Invalid argument >>> > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 >>> > > > [1]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by >>> esfp Wed Jun 29 16:21:04 2016 >>> > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >>> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >>> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >>> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >>> --download-superlu=1 --download-metis=1 --download-parmetis=1 >>> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >>> > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in >>> /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c >>> > > > >>> > > > >>> > > > I think I need to do something for other processes, but I don't >>> know what I need to do. >>> > > > >>> > > > Best, >>> > > > Ehsan >>> > > > >>> > > > >>> > > > >>> > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May >>> wrote: >>> > > > >>> > > > >>> > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour >>> wrote: >>> > > > I faced the below error during compiling my code for using >>> MatGetSubMatrices. >>> > > > >>> > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for >>> argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* >>> const*, _p_IS* const*, MatReuse, _p_Mat***)? >>> > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, >>> MAT_INITIAL_MATRIX, &m_local_W); >>> > > > >>> > > > My code : >>> > > > PetscMPIInt rank; >>> > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); >>> > > > >>> > > > if(rank ==0){ >>> > > > Mat m_local_W; >>> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> num_nz, NULL,&m_local_W);// try to reserve space for only number of final >>> non zero entries for each fine node (e.g. 4) >>> > > > IS set; >>> > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, >>> &set_row); >>> > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, >>> MAT_INITIAL_MATRIX, &m_local_W); >>> > > > >>> > > > } >>> > > > >>> > > > I followed below example: >>> > > > >>> http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html >>> > > > >>> > > > This code won't work in parallel. >>> > > > The man page says this function is collective on Mat. You need to >>> move the call to MatGetSubMatrices outside of the if(rank==0) loop. >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < >>> it.sadr at gmail.com> wrote: >>> > > > Thanks a lot for great support. >>> > > > >>> > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith >>> wrote: >>> > > > >>> > > > MatGetSubmatrices() just have the first process request all the >>> rows and columns and the others request none. You can use ISCreateStride() >>> to create the ISs without having to make an array of all the indices. 
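A short aside on the declarations that caused the compile and runtime errors quoted above: MatCreateSeqAIJ() expects the address of a single Mat handle, while MatGetSubMatrices() expects the address of a Mat pointer that it will point at a freshly allocated array of sequential matrices. With MAT_INITIAL_MATRIX the routine allocates both the array and the matrices, so no MatCreateSeqAIJ() call is needed beforehand, as the working code at the top of this message also shows. The fragment below is only illustrative; A, n, nz_per_row and nsub are placeholders, and rowis/colis stand for arrays of index sets (for a single submatrix, simply the address of an IS).

/* Creating a sequential matrix yourself: pass &B, where B is a Mat.      */
Mat B;
MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,nz_per_row,NULL,&B);

/* Extracting submatrices: declare a Mat pointer and pass its address.
   MatGetSubMatrices() allocates the array (and the matrices) itself when
   MAT_INITIAL_MATRIX is used, so nothing has to be created in advance.   */
Mat *submats;
MatGetSubMatrices(A,nsub,rowis,colis,MAT_INITIAL_MATRIX,&submats);
/* submats[0] .. submats[nsub-1] are sequential matrices on this rank.    */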
>>> > > > >>> > > > >>> > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour < >>> it.sadr at gmail.com> wrote: >>> > > > > >>> > > > > Hi, >>> > > > > >>> > > > > I need to have access to most of elements of a parallel MPIAIJ >>> matrix only from 1 process (rank 0). >>> > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. >>> > > > > >>> > > > > How can I have a local copy of a matrix which is distributed on >>> multiple process? I don't want to update the matrix, and the read-only >>> version of it would be enough. >>> > > > > >>> > > > > Best, >>> > > > > Ehsan >>> > > > > >>> > > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > >>> > > >>> > >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Hassan.Raiesi at aero.bombardier.com Wed Jul 6 11:22:11 2016 From: Hassan.Raiesi at aero.bombardier.com (Hassan Raiesi) Date: Wed, 6 Jul 2016 16:22:11 +0000 Subject: [petsc-users] (edit GAMG) petsc 3.7.2 memory usage is much higher when compared to 3.6.1 Message-ID: Barry, Thank you for the detailed instructions, I'll try to figure out what change causes this problem, To answer your question, I re-ran using fgmres/bjacobi for a simple case and there was virtually no difference in memory footprint reported by PETSc (see the log files ends _basic). So it is safe to assume the extra memory was due to GAMG. I ran a series of tests with GAMG, I attached full logs here, but to summarize: PETSc 3.6.1: --- Event Stage 0: Main Stage Matrix 368 365 149426856 0 Matrix Coarsen 16 16 9920 0 Vector 1181 1181 218526896 0 Vector Scatter 99 99 115936 0 Krylov Solver 22 22 72976 0 Preconditioner 22 22 21648 0 Viewer 1 0 0 0 Index Set 267 267 821040 0 Star Forest Bipartite Graph 16 16 13440 0 Using same options, exactly same code (just linked it with petsc-3.7.2) PETSc 3.7.2: --- Event Stage 0: Main Stage Matrix 412 409 180705004 0. Matrix Coarsen 12 12 7536 0. Vector 923 923 214751960 0. Vector Scatter 79 79 95488 0. Krylov Solver 17 17 67152 0. Preconditioner 17 17 16936 0. PetscRandom 1 1 638 0. Viewer 1 0 0 0. Index Set 223 223 790676 0. Star Forest Bipartite Graph 12 12 10176 0. GAMG in 3.7.2 creates less levels, but needs more memory. For next test, I changed the "pc_gamg_square_graph" from 2 to 1, here 3.7.2 makes 19 levels now PETSc 3.7.2: --- Event Stage 0: Main Stage Matrix 601 598 188796452 0. Matrix Coarsen 19 19 11932 0. Vector 1358 1358 216798096 0. Vector Scatter 110 110 128920 0. Krylov Solver 24 24 76112 0. Preconditioner 24 24 23712 0. PetscRandom 1 1 638 0. Viewer 1 0 0 0. Index Set 284 284 857076 0. Star Forest Bipartite Graph 19 19 16112 0. with similar memory usage. If I limit the number of levels to 17, I would get same number of levels as in version 3.6.1, however the memory usage is still higher than version 3.6.1 PETSc 3.7.2: --- Event Stage 0: Main Stage Matrix 506 503 187749632 0. Matrix Coarsen 16 16 10048 0. Vector 1160 1160 216216344 0. Vector Scatter 92 92 100424 0. Krylov Solver 21 21 72272 0. Preconditioner 21 21 20808 0. PetscRandom 1 1 638 0. Viewer 1 0 0 0. Index Set 237 237 818260 0. 
Star Forest Bipartite Graph 16 16 13568 0. Now running version 3.6.1 with the options used for the above run PETSc 3.6.1: --- Event Stage 0: Main Stage Matrix 338 335 153296844 0 Matrix Coarsen 16 16 9920 0 Vector 1156 1156 219112832 0 Vector Scatter 89 89 94696 0 Krylov Solver 22 22 72976 0 Preconditioner 22 22 21648 0 Viewer 1 0 0 0 Index Set 223 223 791548 0 Star Forest Bipartite Graph 16 16 13440 0 It Looks like the GAMG in 3.7.2 makes a lot more matrices for same number of levels and requires about (187749632 - 153296844)/153296844 = 22.5% more memory. I hope the logs help fixing the issue. Best Regards PS: GAMG is great, and by far beats all other AMG libraries we have tried so far :-) -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Tuesday, July 05, 2016 6:19 PM To: Hassan Raiesi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 Hassan, This memory usage increase is not expected. How are you measuring memory usage? Since the problem occurs even with a simple solver you should debug with the simpler solver and only after resolving that move on to GAMG and see if the problem persists. Also do the test on the smallest case that clearly demonstrates the problem; if you have a 1 process run that shows a nontrivial memory usage increase then debug with that, don't run a huge problem unless you absolutely have to. How much code, if any, did you need to change in your application in going from 3.6.1 to 3.7.2 ? Here is the way to track down the problem. It may seem burdensome but requires no guesswork or speculation. Use the bisection capability of git. First obtain PETSc via git if you have not gotten that way http://www.mcs.anl.gov/petsc/download/index.html Then in the PETSc directory run git bisect start git bisect good v3.6.1 git bisect bad v3.7.2 It will then change to a new commit where you need to run configure and make on PETSc and then compile and run your application If the application uses the excessive memory then in the PETSc directory do git bisect bad otherwise type git bisect good if the code won't compile (if the PETSc API changes you may have to adjust your code slightly to get it to compile and you should do that; but if PETSc won't configure to build with the given commit then just do the skip) or crashes then type git bisect skip Now git will switch to another commit where you need again do the same process of configure make and run the application. After a few iterations git bisect will show the EXACT commit (code changes) that resulted in your very different memory usage and we can take a look at the code changes in PETSc and figure out how to reduce the memory usage. I realize this seems like a burdensome process but remember a great deal of changes took place in the PETSc code and this is the ONLY well defined way to figure out exactly which change caused the problem. Otherwise we can guess until the end of time. Barry > On Jul 5, 2016, at 3:42 PM, Hassan Raiesi wrote: > > Hi, > > PETSc 3.7.2 seems to have a much higher memory usage when compared with PETSc- 3.1.1 c, to a point that it crashes our code for large problems that we ran with version 3.6.1 in the past. 
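One note on how the memory in these tables is counted: the per-object totals printed by -log_summary are the sum over everything that was ever created during the run, not a high-water mark, so temporaries that GAMG builds and destroys during setup are included (Barry makes this point further down the thread). Below is a hedged sketch of how the actual peak usage could be queried from the application itself; it assumes the PetscMemory*/PetscMalloc* query routines as they exist in 3.6/3.7, and the PetscMalloc numbers may read zero unless PETSc's malloc tracking is active (for example in a debug build).

/* Right after PetscInitialize(): ask PETSc to keep track of the maximum
   resident set size it observes.                                         */
PetscMemorySetGetMaximumUsage();

/* After the solve (KSPSolve/PCSetUp): report the per-rank peaks, reduced
   to a maximum over all ranks.                                            */
PetscLogDouble rss_max,malloc_max;
double         loc[2],glob[2];
PetscMemoryGetMaximumUsage(&rss_max);      /* peak resident set, this rank */
PetscMallocGetMaximumUsage(&malloc_max);   /* peak PetscMalloc'd bytes     */
loc[0] = (double)rss_max; loc[1] = (double)malloc_max;
MPI_Allreduce(loc,glob,2,MPI_DOUBLE,MPI_MAX,PETSC_COMM_WORLD);
PetscPrintf(PETSC_COMM_WORLD,"peak RSS %g MiB, peak PetscMalloc %g MiB (max over ranks)\n",
            glob[0]/1048576.0,glob[1]/1048576.0);

Comparing these peaks between the 3.6.1 and 3.7.2 builds would show whether the extra matrices reported above actually translate into a larger footprint at run time, or are only transient during the GAMG setup.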
> I have re-compiled the code with same options, and ran the same code linked with the two versions, here are the log-summarie: > > -flow_ksp_max_it 20 > -flow_ksp_monitor_true_residual > -flow_ksp_rtol 0.1 > -flow_ksp_type fgmres > -flow_mg_coarse_pc_factor_mat_solver_package mumps > -flow_mg_coarse_pc_type lu -flow_mg_levels_ksp_type richardson > -flow_mg_levels_pc_type sor -flow_pc_gamg_agg_nsmooths 0 > -flow_pc_gamg_coarse_eq_limit 2000 -flow_pc_gamg_process_eq_limit 2500 > -flow_pc_gamg_repartition true -flow_pc_gamg_reuse_interpolation true > -flow_pc_gamg_square_graph 3 -flow_pc_gamg_sym_graph true > -flow_pc_gamg_type agg -flow_pc_mg_cycle v -flow_pc_mg_levels 20 > -flow_pc_mg_type kaskade -flow_pc_type gamg -log_summary > > Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). > > > > Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: > 2016-07-05 12:04:34 -0400 > > Max Max/Min Avg Total > Time (sec): 6.760e+02 1.00006 6.760e+02 > Objects: 1.284e+03 1.00469 1.279e+03 > Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 > Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 > MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 > MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 > MPI Reductions: 4.023e+03 1.00149 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 99.9% 7.674e+04 99.9% 4.010e+03 99.7% > > ---------------------------------------------------------------------- > -------------------------------------------------- > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ---------------------------------------------------------------------- > -------------------------------------------------- > > --- Event Stage 0: Main Stage > > MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 > MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 > MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 > MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 > MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 > MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 > MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 > MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 > MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 > MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 > MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 > MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 > MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 > MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 > MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 > VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 > VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 
0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 > VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 > VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 > VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 > VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 > VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 > VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 > VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 > VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 > KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 3.2e+03 22100 96 90 79 22100 96 90 79 91007 > PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 > PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 > PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 > Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 > SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 > GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 > repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 > Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 > Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 > Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 > PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 > PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 > PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 > SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > 
BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 > ---------------------------------------------------------------------- > -------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 302 299 1992700700 0. > Matrix Partitioning 6 6 3888 0. > Matrix Coarsen 6 6 3768 0. > Vector 600 600 1582204168 0. > Vector Scatter 87 87 5614432 0. > Krylov Solver 11 11 59472 0. > Preconditioner 11 11 11120 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 247 247 9008420 0. > Star Forest Bipartite Graph 12 12 10176 0. > ====================================================================== > ================================================== > > And for petsc 3.6.1: > > Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: > 2015-08-06 11:50:34 -0500 > > Max Max/Min Avg Total > Time (sec): 5.515e+02 1.00001 5.515e+02 > Objects: 1.231e+03 1.00490 1.226e+03 > Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 > Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 > MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 > MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 > MPI Reductions: 4.012e+03 1.00150 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 99.9% 5.020e+04 99.9% 3.999e+03 99.7% > > ---------------------------------------------------------------------- > -------------------------------------------------- > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ---------------------------------------------------------------------- > -------------------------------------------------- > > --- Event Stage 0: Main Stage > > MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 > MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 > MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 > MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 > MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 > MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 > MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 > MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 > MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 > MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 > MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 > MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 > MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 > MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 > MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 > MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 > MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 
0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 > VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 > VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 > VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 > VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 > VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 > VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 > VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 > VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 > VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 > KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 3.2e+03 25100 95 83 79 25100 95 83 79 94396 > PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 > PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 > PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 > Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 > SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 > GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 > repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 > Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 > Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 > Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 > PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 > PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 > PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 > SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 
0 > SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ---------------------------------------------------------------------- > -------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 246 243 1730595756 0 > Matrix Partitioning 6 6 3816 0 > Matrix Coarsen 6 6 3720 0 > Vector 602 602 1603749672 0 > Vector Scatter 87 87 4291136 0 > Krylov Solver 12 12 60416 0 > Preconditioner 12 12 12040 0 > Viewer 1 0 0 0 > Index Set 247 247 9018060 0 > Star Forest Bipartite Graph 12 12 10080 0 > ====================================================================== > ================================================== > > Any idea why there are more matrix created with version 3.7.2? I only have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others are internally created. > > > Thank you, > > > Hassan Raiesi, PhD > > Advanced Aerodynamics Department > Bombardier Aerospace > > hassan.raiesi at aero.bombardier.com > > 2351 boul. Alfred-Nobel (BAN1) > Ville Saint-Laurent, Qu?bec, H4S 2A9 > > > > T?l. > 514-855-5001 # 62204 > > > > > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. > If you are not the intended recipient or received this communication > by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.6.1_gamg.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.7.2_gamg_run_with_square_graph_1.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.7.2_gamg_square_graph_1_max_level_17.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.6.1_gamg_run_with_square_graph_1_max_level_17.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.7.2_gamg.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.7.2_basic.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.6.1_basic.txt URL: From epscodes at gmail.com Wed Jul 6 12:17:01 2016 From: epscodes at gmail.com (Xiangdong) Date: Wed, 6 Jul 2016 13:17:01 -0400 Subject: [petsc-users] snes true and preconditioned residuals for left npc Message-ID: Hello everyone, I am using snes_type aspin, which is actually newtonls + npc (nasm). After each newton iteration, if I call SNESGetFunction, the preconditioned residual is obtained. However, if I use SNESComputeFunction, I get the true (unpreconditioned) residual. If I want to know the preconditioned residual at a point different from current solution, which function should I call? Thanks. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Jul 6 14:35:24 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Jul 2016 14:35:24 -0500 Subject: [petsc-users] Duplicate cells when exporting a distributed dmplex In-Reply-To: References: Message-ID: On Tue, Jul 5, 2016 at 4:17 AM, Morten Nobel-J?rgensen wrote: > Hi all, > > I hope someone can help me with the following: > > I?m having some problems when exporting a distributed DMPlex ? the cells > (+cell types) seems to be duplicated. > > When I?m running the code on a non-distributed system it works as > expected, but when I run it on multiple processors (2 in my case) the > output is invalid. > > I have attached a simple example and the output for np=1 and np=2. > The problem here is VTK output with overlapped meshes. If you change to overlap = 0, it works as expected. I never fixed the VTK output for this, but the HDF5 output works correctly. I will put this on the list of things to do. I am attaching your code with some cleanup from me, including assigning values in parallel. Thanks, Matt > Abbreviated the code essentially does the following: > ' > > PetscInt dim = 3; > PetscInt cells[] = {1, 1, 2}; > PetscInt overlap = 1; > PetscInitialize(&argc, &argv, NULL, help); > DMPlexCreateHexBoxMesh(PETSC_COMM_WORLD, dim, cells, DM_BOUNDARY_NONE, > DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, &dm); > DMPlexDistribute(dm, overlap, NULL, &dist); > dm = dist; > SetupDOFs(dm); > Vec V; > DMCreateGlobalVector(dm, &V); > AssignSomeValues(V); > PetscViewer viewer; > const char* fn = "output.vtk"; > PetscViewerVTKOpen(PETSC_COMM_WORLD,fn,FILE_MODE_WRITE,&viewer); > VecView(V,viewer); > PetscViewerDestroy(&viewer); > > > Kind regards, > Morten > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex_vtk_export.c Type: text/x-csrc Size: 3336 bytes Desc: not available URL: From eduardojourdan92 at gmail.com Wed Jul 6 14:50:39 2016 From: eduardojourdan92 at gmail.com (Eduardo Jourdan) Date: Wed, 6 Jul 2016 16:50:39 -0300 Subject: [petsc-users] What block size means in amg aggregation type Message-ID: Hi, I am kind of new to algebraic multigrid methods. I tried to figure it on my own but I'm not be sure about it. How the block size (bs) of a blocked matrix affects the AMG AGG? I mean, if bs = 4, then in the coarsening phase and setup, blocks of 4x4 matrix elements are considered to remain in the coarse level and a certain quantity of block neighbors are restricted and remain in the finer level? Never a row inside a block matrix is selected and the other elements of this block aren't, am I right? The entire block is interpolated when it comes to the interpolation phase? If the original problem is not a system of equations, then bs=1? Thank you, Eduardo -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 6 16:17:27 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Jul 2016 16:17:27 -0500 Subject: [petsc-users] (edit GAMG) petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: Hassan, My statement "This memory usage increase is not expected." holds only for the fgmres/bjacobi. 
Mark continues to make refinements to the GAMG code that could easily result in more or less memory being used so I am not surprised by the numbers you report below. Do not bother running the git bisect that I suggested before; that was only to find a bug related to fgmres/bjacobi which does not seem to be a problem. In general I think for GAMG that "more memory usage" results mostly from "larger coarse grid problems" (or, of course bugs in the code) so if you could run old and new with -ksp_view and send the output (and look at it yourself) I am guessing we will see larger coarse grid problems with new. You can use the option -pc_gamg_threshold .1 (or something) to cause smaller coarse grid problems (of course this may make the convergence slower). Mark Adams knows much more about this and will hopefully be able to make other suggestions. Note also that the memory usage information for matrices in the -log_summary is NOT a high water mark, rather it is the sum of the memory of the all the matrices that were ever created. Since GAMG creates some temporary matrices and then destroys them during the set up process the actual high water mark is lower. You can run with -memory_view to see the hight water mark level of memory usage in the old and new cases. High water mark is what causes the problem to end prematurely with out of memory. Barry > On Jul 6, 2016, at 11:22 AM, Hassan Raiesi wrote: > > Barry, > > Thank you for the detailed instructions, I'll try to figure out what change causes this problem, > > To answer your question, I re-ran using fgmres/bjacobi for a simple case and there was virtually no difference in memory footprint reported by PETSc (see the log files ends _basic). So it is safe to assume the extra memory was due to GAMG. > > I ran a series of tests with GAMG, I attached full logs here, but to summarize: > > PETSc 3.6.1: > --- Event Stage 0: Main Stage > > Matrix 368 365 149426856 0 > Matrix Coarsen 16 16 9920 0 > Vector 1181 1181 218526896 0 > Vector Scatter 99 99 115936 0 > Krylov Solver 22 22 72976 0 > Preconditioner 22 22 21648 0 > Viewer 1 0 0 0 > Index Set 267 267 821040 0 > Star Forest Bipartite Graph 16 16 13440 0 > > > Using same options, exactly same code (just linked it with petsc-3.7.2) > > PETSc 3.7.2: > --- Event Stage 0: Main Stage > > Matrix 412 409 180705004 0. > Matrix Coarsen 12 12 7536 0. > Vector 923 923 214751960 0. > Vector Scatter 79 79 95488 0. > Krylov Solver 17 17 67152 0. > Preconditioner 17 17 16936 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 223 223 790676 0. > Star Forest Bipartite Graph 12 12 10176 0. > > GAMG in 3.7.2 creates less levels, but needs more memory. > > For next test, I changed the "pc_gamg_square_graph" from 2 to 1, here 3.7.2 makes 19 levels now > > PETSc 3.7.2: > --- Event Stage 0: Main Stage > > Matrix 601 598 188796452 0. > Matrix Coarsen 19 19 11932 0. > Vector 1358 1358 216798096 0. > Vector Scatter 110 110 128920 0. > Krylov Solver 24 24 76112 0. > Preconditioner 24 24 23712 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 284 284 857076 0. > Star Forest Bipartite Graph 19 19 16112 0. > > with similar memory usage. > > If I limit the number of levels to 17, I would get same number of levels as in version 3.6.1, however the memory usage is still higher than version 3.6.1 > > PETSc 3.7.2: > --- Event Stage 0: Main Stage > > Matrix 506 503 187749632 0. > Matrix Coarsen 16 16 10048 0. > Vector 1160 1160 216216344 0. > Vector Scatter 92 92 100424 0. > Krylov Solver 21 21 72272 0. 
> Preconditioner 21 21 20808 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 237 237 818260 0. > Star Forest Bipartite Graph 16 16 13568 0. > > Now running version 3.6.1 with the options used for the above run > > PETSc 3.6.1: > --- Event Stage 0: Main Stage > > Matrix 338 335 153296844 0 > Matrix Coarsen 16 16 9920 0 > Vector 1156 1156 219112832 0 > Vector Scatter 89 89 94696 0 > Krylov Solver 22 22 72976 0 > Preconditioner 22 22 21648 0 > Viewer 1 0 0 0 > Index Set 223 223 791548 0 > Star Forest Bipartite Graph 16 16 13440 0 > > > It Looks like the GAMG in 3.7.2 makes a lot more matrices for same number of levels and requires about (187749632 - 153296844)/153296844 = 22.5% more memory. > > I hope the logs help fixing the issue. > > Best Regards > > PS: GAMG is great, and by far beats all other AMG libraries we have tried so far :-) > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, July 05, 2016 6:19 PM > To: Hassan Raiesi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 > > > Hassan, > > This memory usage increase is not expected. How are you measuring memory usage? > > Since the problem occurs even with a simple solver you should debug with the simpler solver and only after resolving that move on to GAMG and see if the problem persists. Also do the test on the smallest case that clearly demonstrates the problem; if you have a 1 process run that shows a nontrivial memory usage increase then debug with that, don't run a huge problem unless you absolutely have to. > > How much code, if any, did you need to change in your application in going from 3.6.1 to 3.7.2 ? > > Here is the way to track down the problem. It may seem burdensome but requires no guesswork or speculation. Use the bisection capability of git. > > First obtain PETSc via git if you have not gotten that way http://www.mcs.anl.gov/petsc/download/index.html > > Then in the PETSc directory run > > git bisect start > > git bisect good v3.6.1 > > git bisect bad v3.7.2 > > It will then change to a new commit where you need to run configure and make on PETSc and then compile and run your application > > If the application uses the excessive memory then in the PETSc directory do > > git bisect bad > > otherwise type > > git bisect good > > if the code won't compile (if the PETSc API changes you may have to adjust your code slightly to get it to compile and you should do that; but if PETSc won't configure to build with the given commit then just do the skip) or crashes then type > > git bisect skip > > Now git will switch to another commit > > where you need again do the same process of configure make and run the application. > > After a few iterations git bisect will show the EXACT commit (code changes) that resulted in your very different memory usage and we can take a look at the code changes in PETSc and figure out how to reduce the memory usage. > > I realize this seems like a burdensome process but remember a great deal of changes took place in the PETSc code and this is the ONLY well defined way to figure out exactly which change caused the problem. Otherwise we can guess until the end of time. > > Barry > > > > > > > >> On Jul 5, 2016, at 3:42 PM, Hassan Raiesi wrote: >> >> Hi, >> >> PETSc 3.7.2 seems to have a much higher memory usage when compared with PETSc- 3.1.1 c, to a point that it crashes our code for large problems that we ran with version 3.6.1 in the past. 
>> I have re-compiled the code with same options, and ran the same code linked with the two versions, here are the log-summarie: >> >> -flow_ksp_max_it 20 >> -flow_ksp_monitor_true_residual >> -flow_ksp_rtol 0.1 >> -flow_ksp_type fgmres >> -flow_mg_coarse_pc_factor_mat_solver_package mumps >> -flow_mg_coarse_pc_type lu -flow_mg_levels_ksp_type richardson >> -flow_mg_levels_pc_type sor -flow_pc_gamg_agg_nsmooths 0 >> -flow_pc_gamg_coarse_eq_limit 2000 -flow_pc_gamg_process_eq_limit 2500 >> -flow_pc_gamg_repartition true -flow_pc_gamg_reuse_interpolation true >> -flow_pc_gamg_square_graph 3 -flow_pc_gamg_sym_graph true >> -flow_pc_gamg_type agg -flow_pc_mg_cycle v -flow_pc_mg_levels 20 >> -flow_pc_mg_type kaskade -flow_pc_type gamg -log_summary >> >> Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). >> >> >> >> Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: >> 2016-07-05 12:04:34 -0400 >> >> Max Max/Min Avg Total >> Time (sec): 6.760e+02 1.00006 6.760e+02 >> Objects: 1.284e+03 1.00469 1.279e+03 >> Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 >> Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 >> MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 >> MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 >> MPI Reductions: 4.023e+03 1.00149 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts %Total Avg %Total counts %Total >> 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 99.9% 7.674e+04 99.9% 4.010e+03 99.7% >> >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> Avg. len: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flops in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flops --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> >> --- Event Stage 0: Main Stage >> >> MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 >> MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 >> MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 >> MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 >> MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 >> MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 >> MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 >> MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 >> MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 >> MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 >> MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 >> MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 >> MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 >> MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 >> MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 >> MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 >> MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 >> MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 >> VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 >> VecMDot 
260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 >> VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 >> VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 >> VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 >> VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 >> VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 >> VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 >> VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 >> VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 >> KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 3.2e+03 22100 96 90 79 22100 96 90 79 91007 >> PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 >> PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 >> PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 >> PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 >> Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 >> MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 >> SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 >> SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 >> GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 >> repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 >> Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 >> Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 >> Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 >> PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 >> PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 >> PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 >> SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 >> SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 >> SFReduceEnd 6 
1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Matrix 302 299 1992700700 0. >> Matrix Partitioning 6 6 3888 0. >> Matrix Coarsen 6 6 3768 0. >> Vector 600 600 1582204168 0. >> Vector Scatter 87 87 5614432 0. >> Krylov Solver 11 11 59472 0. >> Preconditioner 11 11 11120 0. >> PetscRandom 1 1 638 0. >> Viewer 1 0 0 0. >> Index Set 247 247 9008420 0. >> Star Forest Bipartite Graph 12 12 10176 0. >> ====================================================================== >> ================================================== >> >> And for petsc 3.6.1: >> >> Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: >> 2015-08-06 11:50:34 -0500 >> >> Max Max/Min Avg Total >> Time (sec): 5.515e+02 1.00001 5.515e+02 >> Objects: 1.231e+03 1.00490 1.226e+03 >> Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 >> Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 >> MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 >> MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 >> MPI Reductions: 4.012e+03 1.00150 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts %Total Avg %Total counts %Total >> 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 99.9% 5.020e+04 99.9% 3.999e+03 99.7% >> >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> Avg. len: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flops in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flops --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> >> --- Event Stage 0: Main Stage >> >> MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 >> MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 >> MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 >> MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 >> MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 >> MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 >> MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 >> MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 >> MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 >> MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 >> MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 >> MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 >> MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 >> MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 >> MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 >> MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 >> MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 >> MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 >> MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 >> MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecDot 
320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 >> VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 >> VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 >> VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 >> VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 >> VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 >> VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 >> VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 >> VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 >> VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 >> KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 3.2e+03 25100 95 83 79 25100 95 83 79 94396 >> PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 >> PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 >> PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 >> PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 >> Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 >> MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 >> SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 >> SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 >> GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 >> repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 >> Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 >> Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 >> Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 >> PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 >> PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 >> PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 >> SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 >> SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> 
SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 >> SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Matrix 246 243 1730595756 0 >> Matrix Partitioning 6 6 3816 0 >> Matrix Coarsen 6 6 3720 0 >> Vector 602 602 1603749672 0 >> Vector Scatter 87 87 4291136 0 >> Krylov Solver 12 12 60416 0 >> Preconditioner 12 12 12040 0 >> Viewer 1 0 0 0 >> Index Set 247 247 9018060 0 >> Star Forest Bipartite Graph 12 12 10080 0 >> ====================================================================== >> ================================================== >> >> Any idea why there are more matrix created with version 3.7.2? I only have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others are internally created. >> >> >> Thank you, >> >> >> Hassan Raiesi, PhD >> >> Advanced Aerodynamics Department >> Bombardier Aerospace >> >> hassan.raiesi at aero.bombardier.com >> >> 2351 boul. Alfred-Nobel (BAN1) >> Ville Saint-Laurent, Qu?bec, H4S 2A9 >> >> >> >> T?l. >> 514-855-5001 # 62204 >> >> >> >> >> >> >> CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. >> If you are not the intended recipient or received this communication >> by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. > > > From hengjiew at uci.edu Wed Jul 6 16:19:15 2016 From: hengjiew at uci.edu (frank) Date: Wed, 6 Jul 2016 14:19:15 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> Message-ID: <577D75D3.8010703@uci.edu> Hi Barry, Thank you for you advice. I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. The system gives me the "Out of Memory" error before the linear system is completely solved. The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? In both tests the memory usage is not large. It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. Is there is a way to show how much memory it allocated? Frank On 07/05/2016 03:37 PM, Barry Smith wrote: > Frank, > > You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. 
> > Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. > > Barry > >> On Jul 5, 2016, at 5:23 PM, frank wrote: >> >> Hi, >> >> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >> The petsc options file is attached. >> >> The domain is a 3d box. >> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >> >> How can I diagnose what exactly cause the error? >> Thank you so much. >> >> Frank >> -------------- next part -------------- KSP Object: 18432 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg PC has not been set up so information may be incomplete MG: type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: redundant PC has not been set up so information may be incomplete Redundant preconditioner: Not yet setup Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. 
maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=603979776, cols=603979776 total: nonzeros=4223139840, allocated nonzeros=4223139840 total number of mallocs used during MatSetValues calls =0 has attached null space [NID 03157] 2016-07-05 18:53:01 Apid 45102172: initiated application termination [NID 00773] 2016-07-05 18:53:02 Apid 45102172: OOM killer terminated this process. [NID 09993] 2016-07-05 18:53:02 Apid 45102172: OOM killer terminated this process. -------------- next part -------------- Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 5 4 3328 0. Vector 384 383 8193712 0. Vector Scatter 27 23 61776 0. Matrix 103 103 11508688 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 4 20288 0. Star Forest Bipartite Graph 16 8 6784 0. Discrete System 8 4 3456 0. Index Set 55 55 277240 0. IS L to G Mapping 8 4 27136 0. Krylov Solver 10 10 12392 0. DMKSP interface 6 3 1944 0. Preconditioner 10 10 9952 0. -------------- next part -------------- Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 5 4 3328 0. Vector 384 383 1590520 0. Vector Scatter 27 23 28568 0. Matrix 103 103 3508664 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 4 20288 0. Star Forest Bipartite Graph 16 8 6784 0. Discrete System 8 4 3456 0. Index Set 55 55 80868 0. IS L to G Mapping 8 4 7080 0. Krylov Solver 10 10 12392 0. DMKSP interface 6 3 1944 0. Preconditioner 10 10 9952 0. 
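On the question above of how to see how much memory the telescope/MG setup actually allocated: besides options such as -memory_view and the per-object tables shown here, the same counters can be read directly from the code. A minimal sketch follows; the helper name is made up, PetscMemorySetGetMaximumUsage() has to be called right after PetscInitialize for the maximum-usage values to be tracked, and the PetscMalloc counters assume PETSc's logging malloc is active (e.g. a debug build or the -malloc option).

#include <petscsys.h>

/* Sketch: report PETSc memory counters, e.g. before and after KSPSolve
   or around PCSetUp, to see where the growth happens. */
static PetscErrorCode ReportMemory(const char *label)
{
  PetscLogDouble rss, rss_max, mal, mal_max;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);      /* resident set size now  */
  ierr = PetscMemoryGetMaximumUsage(&rss_max);CHKERRQ(ierr);  /* high-water mark        */
  ierr = PetscMallocGetCurrentUsage(&mal);CHKERRQ(ierr);      /* bytes from PetscMalloc */
  ierr = PetscMallocGetMaximumUsage(&mal_max);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,
                     "%s: rss %g (max %g)  PetscMalloc %g (max %g) bytes\n",
                     label, rss, rss_max, mal, mal_max);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Calling ReportMemory("before PCSetUp") and ReportMemory("after PCSetUp") around the solver setup would show how much of the total is attributable to the preconditioner hierarchy, independent of the object table from -log_summary.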
-------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left -log_summary # options for telescope -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type svd From bsmith at mcs.anl.gov Wed Jul 6 16:51:49 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Jul 2016 16:51:49 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <577D75D3.8010703@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> Message-ID: <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> > On Jul 6, 2016, at 4:19 PM, frank wrote: > > Hi Barry, > > Thank you for you advice. > I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. > The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. > The system gives me the "Out of Memory" error before the linear system is completely solved. > The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. > > The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. Are you sure this is right? The total matrix and vector memory usage goes from 2nd test Vector 384 383 8,193,712 0. Matrix 103 103 11,508,688 0. to 3rd test Vector 384 383 1,590,520 0. Matrix 103 103 3,508,664 0. that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. > The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. > I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? Sorry, my mistake the option is -memory_view Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. Barry > In both tests the memory usage is not large. > > It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. > Is there is a way to show how much memory it allocated? > > Frank > > On 07/05/2016 03:37 PM, Barry Smith wrote: >> Frank, >> >> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. 
>> >> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >> >> Barry >> >>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>> >>> Hi, >>> >>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>> The petsc options file is attached. >>> >>> The domain is a 3d box. >>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>> >>> How can I diagnose what exactly cause the error? >>> Thank you so much. >>> >>> Frank >>> > > From bsmith at mcs.anl.gov Wed Jul 6 16:53:44 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Jul 2016 16:53:44 -0500 Subject: [petsc-users] snes true and preconditioned residuals for left npc In-Reply-To: References: Message-ID: <3B9C8C0C-C09F-43DD-BA12-109583633C10@mcs.anl.gov> > On Jul 6, 2016, at 12:17 PM, Xiangdong wrote: > > Hello everyone, > > I am using snes_type aspin, which is actually newtonls + npc (nasm). After each newton iteration, if I call SNESGetFunction, the preconditioned residual is obtained. However, if I use SNESComputeFunction, I get the true (unpreconditioned) residual. > > If I want to know the preconditioned residual at a point different from current solution, which function should I call? I don't think it is possible. Barry > > Thanks. > > Best, > Xiangdong From jychang48 at gmail.com Wed Jul 6 16:56:15 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 6 Jul 2016 16:56:15 -0500 Subject: [petsc-users] Transient poisson example in petsc In-Reply-To: References: Message-ID: Julian, I hand wrote my own time stepping scheme (backward Euler) for SNES ex12.c because I had to enforce TAO's convex optimization solvers at every time level. I am sure Matt or one of the other PETSc developers can tell you how to make today's SNES ex12.c transient. Thanks, Justin On Wednesday, July 6, 2016, Julian Andrej wrote: > Hi, > > i've seen your question on the mailing list regarding a transient > poisson example using PetscFE and TS. Do you have any working > solution? I'm still confused about what to change from snes ex12 for > example to make it transient. > > I hope you could help me out there :) > > regards > Julian > -------------- next part -------------- An HTML attachment was scrubbed... 
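A minimal sketch of the kind of hand-rolled backward Euler loop described above, written for a toy ODE du/dt = -u instead of the ex12 Poisson residual; the vector size, time step, and step count are made up. Each step freezes u_old and asks SNES to solve F(u) = (u - u_old)/dt + u = 0, and swapping in the ex12 residual (or a TAO solve per step, which is what Justin needed) would follow the same structure. PETSc's TS with -ts_type beuler is the built-in alternative when no per-step optimization solve is required.

#include <petscsnes.h>

/* Backward Euler for du/dt = -u: each step solves
   F(u) = (u - u_old)/dt + u = 0 with SNES. */
typedef struct { Vec uold; PetscReal dt; } StepCtx;

static PetscErrorCode FormFunction(SNES snes, Vec u, Vec F, void *ptr)
{
  StepCtx        *ctx = (StepCtx*)ptr;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecCopy(u, F);CHKERRQ(ierr);              /* F = u              */
  ierr = VecAXPY(F, -1.0, ctx->uold);CHKERRQ(ierr);/* F = u - uold       */
  ierr = VecScale(F, 1.0/ctx->dt);CHKERRQ(ierr);   /* F = (u - uold)/dt  */
  ierr = VecAXPY(F, 1.0, u);CHKERRQ(ierr);         /* F += u             */
  PetscFunctionReturn(0);
}

static PetscErrorCode FormJacobian(SNES snes, Vec u, Mat J, Mat P, void *ptr)
{
  StepCtx        *ctx = (StepCtx*)ptr;
  PetscInt       i, rstart, rend;
  PetscScalar    v = 1.0/ctx->dt + 1.0;            /* dF/du = (1/dt + 1) I */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetOwnershipRange(P, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    ierr = MatSetValue(P, i, i, v, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  if (J != P) {
    ierr = MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  SNES           snes;
  Vec            u, r;
  Mat            J;
  StepCtx        ctx;
  PetscInt       n = 8, step, nsteps = 10;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ctx.dt = 0.1;

  ierr = VecCreate(PETSC_COMM_WORLD, &u);CHKERRQ(ierr);
  ierr = VecSetSizes(u, PETSC_DECIDE, n);CHKERRQ(ierr);
  ierr = VecSetFromOptions(u);CHKERRQ(ierr);
  ierr = VecDuplicate(u, &r);CHKERRQ(ierr);
  ierr = VecDuplicate(u, &ctx.uold);CHKERRQ(ierr);
  ierr = VecSet(u, 1.0);CHKERRQ(ierr);             /* initial condition */

  ierr = MatCreate(PETSC_COMM_WORLD, &J);CHKERRQ(ierr);
  ierr = MatSetSizes(J, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(J);CHKERRQ(ierr);
  ierr = MatSetUp(J);CHKERRQ(ierr);

  ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);
  ierr = SNESSetFunction(snes, r, FormFunction, &ctx);CHKERRQ(ierr);
  ierr = SNESSetJacobian(snes, J, J, FormJacobian, &ctx);CHKERRQ(ierr);
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);

  for (step = 0; step < nsteps; step++) {
    ierr = VecCopy(u, ctx.uold);CHKERRQ(ierr);     /* u_old <- u^n    */
    ierr = SNESSolve(snes, NULL, u);CHKERRQ(ierr); /* u     <- u^{n+1} */
  }

  ierr = VecDestroy(&u);CHKERRQ(ierr);
  ierr = VecDestroy(&r);CHKERRQ(ierr);
  ierr = VecDestroy(&ctx.uold);CHKERRQ(ierr);
  ierr = MatDestroy(&J);CHKERRQ(ierr);
  ierr = SNESDestroy(&snes);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}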
URL: From cyrill.von.planta at usi.ch Thu Jul 7 03:37:26 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Thu, 7 Jul 2016 08:37:26 +0000 Subject: [petsc-users] Reordering rows of parallel matrix across processors Message-ID: <4C660C5E-7326-45C2-82DD-3302E09490AA@usi.ch> Dear all, I would like to reorder the rows of a matrix across processors. Is this possible with MatPermute(?)? To illustrate here is how an index set would look like for a matrix with M=35 on 2 CPU?s. Amongst other things I intend to swap the first and last row here. [0] Number of indices in set 24 [0] 0 34 [0] 1 1 [0] 2 2 [0] 3 3 [0] 4 4 [0] 5 5 [0] 6 6 [0] 7 7 [0] 8 15 [0] 9 16 [0] 10 11 [0] 11 8 [0] 12 10 [0] 13 21 [0] 14 9 [0] 15 12 [0] 16 13 [0] 17 14 [0] 18 17 [0] 19 18 [0] 20 19 [0] 21 20 [0] 22 22 [0] 23 23 [1] Number of indices in set 11 [1] 0 24 [1] 1 25 [1] 2 26 [1] 3 27 [1] 4 28 [1] 5 29 [1] 6 30 [1] 7 31 [1] 8 32 [1] 9 33 [1] 10 0 Instead of exchanging the first and last row it seems to replace them with zeros only. If this can?t be done with MatPermute how could it be done? Thanks Cyrill From knepley at gmail.com Thu Jul 7 09:48:06 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 7 Jul 2016 09:48:06 -0500 Subject: [petsc-users] Reordering rows of parallel matrix across processors In-Reply-To: <4C660C5E-7326-45C2-82DD-3302E09490AA@usi.ch> References: <4C660C5E-7326-45C2-82DD-3302E09490AA@usi.ch> Message-ID: On Thu, Jul 7, 2016 at 3:37 AM, Cyrill Vonplanta wrote: > Dear all, > > I would like to reorder the rows of a matrix across processors. Is this > possible with MatPermute(?)? > Yes, this works with MatPermute(). Could you send this small example so I can reproduce it? > To illustrate here is how an index set would look like for a matrix with > M=35 on 2 CPU?s. Amongst other things I intend to swap the first and last > row here. > > [0] Number of indices in set 24 > [0] 0 34 > [0] 1 1 > [0] 2 2 > [0] 3 3 > [0] 4 4 > [0] 5 5 > [0] 6 6 > [0] 7 7 > [0] 8 15 > [0] 9 16 > [0] 10 11 > [0] 11 8 > [0] 12 10 > [0] 13 21 > [0] 14 9 > [0] 15 12 > [0] 16 13 > [0] 17 14 > [0] 18 17 > [0] 19 18 > [0] 20 19 > [0] 21 20 > [0] 22 22 > [0] 23 23 > [1] Number of indices in set 11 > [1] 0 24 > [1] 1 25 > [1] 2 26 > [1] 3 27 > [1] 4 28 > [1] 5 29 > [1] 6 30 > [1] 7 31 > [1] 8 32 > [1] 9 33 > [1] 10 0 > > Instead of exchanging the first and last row it seems to replace them with > zeros only. > If this can?t be done with MatPermute how could it be done? > You could also use MatGetSubMatrix(). Thanks, Matt > Thanks > Cyrill > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
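A minimal sketch of the pattern discussed above, using the same convention as the index set Cyrill printed (each process lists, for every row it will own in the permuted matrix, the global index of the original row that should land there) and the same global size M = 35. The test matrix is a diagonal whose entries equal the row number plus one, so swapping the first and last rows is easy to spot in the MatView output; the swap choice and the identity column permutation are made up for illustration.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A, B;
  IS             rowperm, colperm;
  PetscInt       i, rstart, rend, N = 35, *rows, *cols;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* Diagonal matrix with entry i+1 in row i, so permuted rows are visible. */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    ierr = MatSetValue(A, i, i, (PetscScalar)(i + 1), INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Row permutation: identity except that global rows 0 and N-1 swap.
     Each process supplies the entries for the rows it owns. */
  ierr = PetscMalloc1(rend - rstart, &rows);CHKERRQ(ierr);
  ierr = PetscMalloc1(rend - rstart, &cols);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    PetscInt src = i;
    if (i == 0)          src = N - 1;
    else if (i == N - 1) src = 0;
    rows[i - rstart] = src;   /* original row that moves to position i */
    cols[i - rstart] = i;     /* columns left untouched                */
  }
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, rows, PETSC_OWN_POINTER, &rowperm);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, cols, PETSC_OWN_POINTER, &colperm);CHKERRQ(ierr);
  ierr = ISSetPermutation(rowperm);CHKERRQ(ierr);
  ierr = ISSetPermutation(colperm);CHKERRQ(ierr);

  ierr = MatPermute(A, rowperm, colperm, &B);CHKERRQ(ierr);
  ierr = MatView(B, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = MatDestroy(&B);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

If the result still comes out zeroed, a small reproducer along these lines (as requested above) is the quickest way to pin down whether it is the index set convention or a MatPermute problem.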
URL: From mfadams at lbl.gov Thu Jul 7 13:30:32 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 7 Jul 2016 20:30:32 +0200 Subject: [petsc-users] (edit GAMG) petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: > > > > GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 > 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 > > Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > > MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 > 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > > SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 > 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 > > SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 > 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 > > GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 > 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 > > repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 > 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 > > Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 > > Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 > 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 > > Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 > 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 > THe number of these calls (eg, 6) is the number of grids that are setup. > > PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 > 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 > > PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 > > PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 > 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 > > SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 > 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > > SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 > 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > > SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 > 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 > > ---------------------------------------------------------------------- > > -------------------------------------------------- > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' > Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 302 299 1992700700 0. > > Matrix Partitioning 6 6 3888 0. > > Matrix Coarsen 6 6 3768 0. > > Vector 600 600 1582204168 0. > > Vector Scatter 87 87 5614432 0. > > Krylov Solver 11 11 59472 0. > > Preconditioner 11 11 11120 0. > > PetscRandom 1 1 638 0. > > Viewer 1 0 0 0. > > Index Set 247 247 9008420 0. > > Star Forest Bipartite Graph 12 12 10176 0. 
> > ====================================================================== > > ================================================== > > > > And for petsc 3.6.1: > > > > Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: > > 2015-08-06 11:50:34 -0500 > > > > Max Max/Min Avg Total > > Time (sec): 5.515e+02 1.00001 5.515e+02 > > Objects: 1.231e+03 1.00490 1.226e+03 > > Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 > > Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 > > MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 > > MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 > > MPI Reductions: 4.012e+03 1.00150 > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N > --> 2N flops > > and VecAXPY() for complex vectors of > > length N --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 > 99.9% 5.020e+04 99.9% 3.999e+03 99.7% > > > > ---------------------------------------------------------------------- > > -------------------------------------------------- > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all > processors > > Mess: number of messages sent > > Avg. len: average message length (bytes) > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this > phase > > %M - percent messages in this phase %L - percent message > lengths in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ---------------------------------------------------------------------- > > -------------------------------------------------- > > > > --- Event Stage 0: Main Stage > > > > MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 > 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 > > MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 > 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 > > MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 > 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 > > MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 > 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 > > MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 > > MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 > > MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 > 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 > > MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 > 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 > > MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 > 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 > > MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 > > MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 > 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > > MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 > 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > > MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 > 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 > > MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 > 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 > > MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 > 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 > > MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 > 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 > > MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 > 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > > MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 > 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 > > MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 > 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 > > 
MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 > 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 > > VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 > 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 > > VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 > 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 > > VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 > > VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 > > VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 > > VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 > > VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 > > VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 > 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 > > VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 > 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 > > KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 > 3.2e+03 25100 95 83 79 25100 95 83 79 94396 > > PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > > PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 > 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 > > PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 > 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 > > PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 > 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 > > Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > > MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 > 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > > SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 > 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 > > SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 > 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 > > GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 > 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 > > repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 > 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 > > Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 > > Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 > 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 > > Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 > 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 > > PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 > 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 > > PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 > > PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 > 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 > > SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 
0 > > SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 > 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > > SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 > 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > > SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > ---------------------------------------------------------------------- > > -------------------------------------------------- > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' > Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 246 243 1730595756 0 > > Matrix Partitioning 6 6 3816 0 > > Matrix Coarsen 6 6 3720 0 > > Vector 602 602 1603749672 0 > > Vector Scatter 87 87 4291136 0 > > Krylov Solver 12 12 60416 0 > > Preconditioner 12 12 12040 0 > > Viewer 1 0 0 0 > > Index Set 247 247 9018060 0 > > Star Forest Bipartite Graph 12 12 10080 0 > > ====================================================================== > > ================================================== > > > > Any idea why there are more matrix created with version 3.7.2? I only > have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the > others are internally created. > > > > > > Thank you, > > > > > > Hassan Raiesi, PhD > > > > Advanced Aerodynamics Department > > Bombardier Aerospace > > > > hassan.raiesi at aero.bombardier.com > > > > 2351 boul. Alfred-Nobel (BAN1) > > Ville Saint-Laurent, Qu?bec, H4S 2A9 > > > > > > > > T?l. > > 514-855-5001 # 62204 > > > > > > > > > > > > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or > confidential information. > > If you are not the intended recipient or received this communication > > by error, please notify the sender and delete the message without > copying, forwarding and/or disclosing it. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Jul 7 13:27:46 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 7 Jul 2016 20:27:46 +0200 Subject: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: On Tue, Jul 5, 2016 at 11:13 PM, Matthew Knepley wrote: > On Tue, Jul 5, 2016 at 3:42 PM, Hassan Raiesi < > Hassan.Raiesi at aero.bombardier.com> wrote: > >> Hi, >> >> >> >> PETSc 3.7.2 seems to have a much higher memory usage when compared with >> PETSc- 3.1.1 c, to a point that it crashes our code for large problems that >> we ran with version 3.6.1 in the past. >> >> I have re-compiled the code with same options, and ran the same code >> linked with the two versions, here are the log-summarie: >> > > According to the log_summary (which you NEED to send in full if we are to > understand anything), the memory usage is largely the same. > There are more matrices, which leads me to believe that GAMG is not > coarsening as quickly. You might consider a non-zero threshold for > it. > > FYI There are the same number of grids in these two outputs. > The best way to understand what is happening is to run Massif (from > valgrind) on both. 
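A quick way to complement the Massif comparison is to print the same numbers -memory_view reports from inside the application at matching points in the 3.6.1 and 3.7.2 builds. A minimal sketch follows; it is only an illustration, and the helper name, the stage label, and where you call it are placeholders rather than anything from the code being discussed:

#include <petscsys.h>

/* Sketch: report memory as seen by the first rank of comm at a chosen point,
   e.g. right after PCSetUp() or the first KSPSolve(), so the two PETSc builds
   can be compared line for line. */
static PetscErrorCode ReportMemory(MPI_Comm comm,const char stage[])
{
  PetscLogDouble rss,mal;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);  /* resident set size of this process */
  ierr = PetscMallocGetCurrentUsage(&mal);CHKERRQ(ierr);  /* bytes currently PetscMalloc()ed   */
  ierr = PetscPrintf(comm,"[%s] rank 0: rss %g bytes, PetscMalloc %g bytes\n",stage,(double)rss,(double)mal);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If the extra Matrix objects from slower coarsening turn out to account for the difference, the non-zero threshold Matt suggests is set with -pc_gamg_threshold, which becomes -flow_pc_gamg_threshold with the prefix used in this run.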
> > Thanks, > > Matt > > >> -flow_ksp_max_it 20 >> >> -flow_ksp_monitor_true_residual >> >> -flow_ksp_rtol 0.1 >> >> -flow_ksp_type fgmres >> >> -flow_mg_coarse_pc_factor_mat_solver_package mumps >> >> -flow_mg_coarse_pc_type lu >> >> -flow_mg_levels_ksp_type richardson >> >> -flow_mg_levels_pc_type sor >> >> -flow_pc_gamg_agg_nsmooths 0 >> >> -flow_pc_gamg_coarse_eq_limit 2000 >> >> -flow_pc_gamg_process_eq_limit 2500 >> >> -flow_pc_gamg_repartition true >> >> -flow_pc_gamg_reuse_interpolation true >> >> -flow_pc_gamg_square_graph 3 >> >> -flow_pc_gamg_sym_graph true >> >> -flow_pc_gamg_type agg >> >> -flow_pc_mg_cycle v >> >> -flow_pc_mg_levels 20 >> >> -flow_pc_mg_type kaskade >> >> -flow_pc_type gamg >> >> -log_summary >> >> >> >> Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more >> memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). >> >> >> >> >> >> >> >> Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: >> 2016-07-05 12:04:34 -0400 >> >> >> >> Max Max/Min Avg Total >> >> Time (sec): 6.760e+02 1.00006 6.760e+02 >> >> Objects: 1.284e+03 1.00469 1.279e+03 >> >> Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 >> >> Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 >> >> MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 >> >> MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 >> >> MPI Reductions: 4.023e+03 1.00149 >> >> >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> >> e.g., VecAXPY() for real vectors of length N >> --> 2N flops >> >> and VecAXPY() for complex vectors of length N >> --> 8N flops >> >> >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages >> --- -- Message Lengths -- -- Reductions -- >> >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> >> 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 >> 99.9% 7.674e+04 99.9% 4.010e+03 99.7% >> >> >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> >> Phase summary info: >> >> Count: number of times phase was executed >> >> Time and Flops: Max - maximum over all processors >> >> Ratio - ratio of maximum to minimum over all processors >> >> Mess: number of messages sent >> >> Avg. len: average message length (bytes) >> >> Reduct: number of global reductions >> >> Global: entire computation >> >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). 
>> >> %T - percent time in this phase %F - percent flops in this >> phase >> >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> >> %R - percent reductions in this phase >> >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> Event Count Time (sec) Flops >> --- Global --- --- Stage --- Total >> >> Max Ratio Max Ratio Max Ratio Mess Avg len >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> --- Event Stage 0: Main Stage >> >> >> >> MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 >> 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 >> >> MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 >> 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 >> >> MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 >> 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 >> >> MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 >> 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 >> >> MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 >> 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 >> >> MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 >> >> MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 >> 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 >> >> MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 >> 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 >> >> MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 >> 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 >> >> MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 >> 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 >> >> MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 >> 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 >> >> MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 >> 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 >> >> MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 >> 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 >> >> MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 >> 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 >> >> MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 >> 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 >> >> MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 >> 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 >> >> MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 >> 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 >> >> MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> 
>> MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 >> 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 >> >> VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 >> 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 >> >> VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 0.0e+00 >> 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 >> >> VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 >> 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 >> >> VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 >> >> VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 >> >> VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 >> >> VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 >> >> VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 >> >> VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 >> 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 >> >> VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 >> 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 >> >> KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 >> 3.2e+03 22100 96 90 79 22100 96 90 79 91007 >> >> PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 >> 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 >> >> PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 >> 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 >> >> PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 >> 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 >> >> PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 >> 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 >> >> Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 >> 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 >> >> MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 >> 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 >> >> SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 >> 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 >> >> SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 >> 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 >> >> GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 >> 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 >> >> repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 >> 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 >> >> Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 >> >> Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 >> 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 >> >> Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 >> 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 >> >> PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 >> 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 >> >> PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 >> >> PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 >> 6.0e+01 3 24 22 3 1 3 24 22 3 1 
161973 >> >> SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 >> 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 >> >> SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 >> 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 >> >> SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 >> 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> Memory usage is given in bytes: >> >> >> >> Object Type Creations Destructions Memory Descendants' >> Mem. >> >> Reports information only for process 0. >> >> >> >> --- Event Stage 0: Main Stage >> >> >> >> Matrix 302 299 1992700700 0. >> >> Matrix Partitioning 6 6 3888 0. >> >> Matrix Coarsen 6 6 3768 0. >> >> Vector 600 600 1582204168 0. >> >> Vector Scatter 87 87 5614432 0. >> >> Krylov Solver 11 11 59472 0. >> >> Preconditioner 11 11 11120 0. >> >> PetscRandom 1 1 638 0. >> >> Viewer 1 0 0 0. >> >> Index Set 247 247 9008420 0. >> >> Star Forest Bipartite Graph 12 12 10176 0. >> >> >> ======================================================================================================================== >> >> >> >> And for petsc 3.6.1: >> >> >> >> Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: >> 2015-08-06 11:50:34 -0500 >> >> >> >> Max Max/Min Avg Total >> >> Time (sec): 5.515e+02 1.00001 5.515e+02 >> >> Objects: 1.231e+03 1.00490 1.226e+03 >> >> Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 >> >> Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 >> >> MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 >> >> MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 >> >> MPI Reductions: 4.012e+03 1.00150 >> >> >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> >> e.g., VecAXPY() for real vectors of length N >> --> 2N flops >> >> and VecAXPY() for complex vectors of length N >> --> 8N flops >> >> >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages >> --- -- Message Lengths -- -- Reductions -- >> >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> >> 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 >> 99.9% 5.020e+04 99.9% 3.999e+03 99.7% >> >> >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> >> Phase summary info: >> >> Count: number of times phase was executed >> >> Time and Flops: Max - maximum over all processors >> >> Ratio - ratio of maximum to minimum over all processors >> >> Mess: number of messages sent >> >> Avg. len: average message length (bytes) >> >> Reduct: number of global reductions >> >> Global: entire computation >> >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). 
>> >> %T - percent time in this phase %F - percent flops in this >> phase >> >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> >> %R - percent reductions in this phase >> >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> Event Count Time (sec) >> Flops --- Global --- --- Stage --- Total >> >> Max Ratio Max Ratio Max Ratio Mess Avg len >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> --- Event Stage 0: Main Stage >> >> >> >> MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 >> 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 >> >> MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 >> 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 >> >> MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 >> 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 >> >> MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 >> 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 >> >> MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 >> 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 >> >> MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 >> >> MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 >> 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 >> >> MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 >> 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 >> >> MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 >> 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 >> >> MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 >> >> MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 >> 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 >> >> MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 >> 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 >> >> MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 >> 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 >> >> MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 >> 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 >> >> MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 >> 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 >> >> MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 >> 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 >> >> MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 >> 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 >> >> MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 >> 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 >> >> MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> 
MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 >> 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 >> >> MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 >> 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 >> >> VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 >> 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 >> >> VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 >> 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 >> >> VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 >> >> VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 >> >> VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 >> >> VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 >> >> VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 >> >> VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 >> 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 >> >> VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 >> 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 >> >> KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 >> 3.2e+03 25100 95 83 79 25100 95 83 79 94396 >> >> PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 >> 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 >> >> PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 >> 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 >> >> PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 >> 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 >> >> PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 >> 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 >> >> Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 >> 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 >> >> MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 >> 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 >> >> SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 >> 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 >> >> SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 >> 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 >> >> GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 >> 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 >> >> repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 >> 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 >> >> Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 >> >> Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 >> 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 >> >> Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 >> 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 >> >> PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 >> 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 >> >> PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 
64711 >> >> PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 >> 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 >> >> SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 >> 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 >> >> SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 >> 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 >> >> SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> Memory usage is given in bytes: >> >> >> >> Object Type Creations Destructions Memory Descendants' >> Mem. >> >> Reports information only for process 0. >> >> >> >> --- Event Stage 0: Main Stage >> >> >> >> Matrix 246 243 1730595756 0 >> >> Matrix Partitioning 6 6 3816 0 >> >> Matrix Coarsen 6 6 3720 0 >> >> Vector 602 602 1603749672 0 >> >> Vector Scatter 87 87 4291136 0 >> >> Krylov Solver 12 12 60416 0 >> >> Preconditioner 12 12 12040 0 >> >> Viewer 1 0 0 0 >> >> Index Set 247 247 9018060 0 >> >> Star Forest Bipartite Graph 12 12 10080 0 >> >> >> ======================================================================================================================== >> >> >> >> Any idea why there are more matrix created with version 3.7.2? I only >> have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the >> others are internally created. >> >> >> >> >> >> Thank you, >> >> >> >> >> >> *Hassan Raiesi, PhD* >> >> >> >> Advanced Aerodynamics Department >> >> Bombardier Aerospace >> >> >> >> hassan.raiesi at aero.bombardier.com >> >> >> >> *2351 boul. Alfred-Nobel (BAN1)* >> >> *Ville Saint-Laurent, Qu?bec, H4S 2A9* >> >> >> >> >> >> >> >> T?l. >> >> 514-855-5001 # 62204 >> >> >> >> >> >> >> >> >> >> >> >> *CONFIDENTIALITY NOTICE* - This communication may contain privileged or >> confidential information. >> If you are not the intended recipient or received this communication by >> error, please notify the sender >> and delete the message without copying, forwarding and/or disclosing it. >> >> >> >> >> >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6402 bytes Desc: not available URL: From dave.mayhem23 at gmail.com Thu Jul 7 15:57:56 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 7 Jul 2016 22:57:56 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <577C337B.60909@uci.edu> References: <577C337B.60909@uci.edu> Message-ID: Hi Frank, On 6 July 2016 at 00:23, frank wrote: > Hi, > > I am using the CG ksp solver and Multigrid preconditioner to solve a > linear system in parallel. > I chose to use the 'Telescope' as the preconditioner on the coarse mesh > for its good performance. > The petsc options file is attached. > > The domain is a 3d box. > It works well when the grid is 1536*128*384 and the process mesh is > 96*8*24. 
When I double the size of grid and keep the same process mesh and > petsc options, I get an "out of memory" error from the super-cluster I am > using. > When you increase the mesh resolution, did you also increasing the number of effective MG levels? If the number of levels was held constant, then your coarse grid is increasing in size. I notice that you coarsest grid solver is PCSVD. This can be become expensive as PCSVD will convert your coarse level operator into a dense matrix and could be the cause of your OOM error. Telescope does have to store a couple of temporary matrices, but generally when used in the context of multigrid coarse level solves these operators represent a very small fraction of the fine level operator. We need to isolate if it's these temporary matrices from telescope causing the OOM error, or if they are caused by something else (e.g. PCSVD). > Each process has access to at least 8G memory, which should be more than > enough for my application. I am sure that all the other parts of my code( > except the linear solver ) do not use much memory. So I doubt if there is > something wrong with the linear solver. > The error occurs before the linear system is completely solved so I don't > have the info from ksp view. I am not able to re-produce the error with a > smaller problem either. > In addition, I tried to use the block jacobi as the preconditioner with > the same grid and same decomposition. The linear solver runs extremely slow > but there is no memory error. > > How can I diagnose what exactly cause the error? > This going to be kinda hard as I notice your configuration uses nested calls to telescope. You need to debug the solver configuration. The only way I know to do this is by invoking telescope one step at a time. By this I mean, use telescope once, check the configuration is what you want. Then add the next instance of telescope. For solver debugging purposes, get rid of PCSVD. The constant null space is propagated with telescope so you can just use an iterative method. Furthermore, for debugging purposes, you don't care about the solve time or even convergence, so set -ksp_max_it 1 everywhere in your solver stack (e.g. outer most KSP and on the coarsest level). If one instance of telescope works, e.g. no OOM error occurs, add the next instance of telescope. If two instance of telescope also works (no OOM), revert back to PCSVD. If now you have an OOM error, you should consider adding more levels, or getting rid of PCSVD as your coarse grid solver. Lastly, the option -repart_da_processors_x 24 has been depreciated. It now inherits the prefix from the solver running on the sub-communicator. For your use case, it should this be something like -mg_coarse_telescope_repart_da_processors_x 24 Use -options_left 1 to verify the option is getting picked up (another useful tool for solver config debugging). Cheers Dave > Thank you so much. > > Frank > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 7 18:25:15 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 7 Jul 2016 18:25:15 -0500 Subject: [petsc-users] Are performance benchmarks available? 
In-Reply-To: <1780114846.4471656.1467296427576.JavaMail.yahoo@mail.yahoo.com> References: <1337334190.2752078.1467075579364.JavaMail.yahoo.ref@mail.yahoo.com> <1337334190.2752078.1467075579364.JavaMail.yahoo@mail.yahoo.com> <781294040.3740649.1467216467855.JavaMail.yahoo@mail.yahoo.com> <213496802.3843456.1467229612250.JavaMail.yahoo@mail.yahoo.com> <1780114846.4471656.1467296427576.JavaMail.yahoo@mail.yahoo.com> Message-ID: <5713E506-7FD6-40D7-9C3F-4D3848D34A18@mcs.anl.gov> While I agree that having this type of information available would be very useful it is surprisingly difficult to do this and keep it up to date, plus we have little time to do it, so unfortunately we don't having thing like this. We should do this! Perhaps pick one or two problems and run them with say a simple preconditioner like ASM and then GAMG on a large problem with a couple of different number of processes, say 1, 32 and 256 then run them once a month to confirm they remain the same performance wise and make the performance numbers available on the web. Maybe using Mark's ex56.c case. I'll try to set something up Barry Always a big pain to try to automate the running on those damn batch systems! > On Jun 30, 2016, at 9:20 AM, Faraz Hussain wrote: > > I am wondering if there are benchmarks available that I can solve on my cluster to compare performance? I want to compare how scaling up-to 240 cores compares to large models already solved on an optimized configuration and hardware. > From cyrill.von.planta at usi.ch Fri Jul 8 02:14:39 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Fri, 8 Jul 2016 07:14:39 +0000 Subject: [petsc-users] Reordering rows of parallel matrix across processors In-Reply-To: References: <4C660C5E-7326-45C2-82DD-3302E09490AA@usi.ch> Message-ID: <860F8A06-0C5C-447D-90A5-EB5481947BD6@usi.ch> Trying to make a small example for reproducing I could figure out my mistake in the code (totally unrelated to the question). MatPermute(..) just works fine. My apologies. Cyrill From: Matthew Knepley > Date: Donnerstag, 7. Juli 2016 um 16:48 To: von Planta Cyrill > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] Reordering rows of parallel matrix across processors On Thu, Jul 7, 2016 at 3:37 AM, Cyrill Vonplanta > wrote: Dear all, I would like to reorder the rows of a matrix across processors. Is this possible with MatPermute(?)? Yes, this works with MatPermute(). Could you send this small example so I can reproduce it? To illustrate here is how an index set would look like for a matrix with M=35 on 2 CPU?s. Amongst other things I intend to swap the first and last row here. [0] Number of indices in set 24 [0] 0 34 [0] 1 1 [0] 2 2 [0] 3 3 [0] 4 4 [0] 5 5 [0] 6 6 [0] 7 7 [0] 8 15 [0] 9 16 [0] 10 11 [0] 11 8 [0] 12 10 [0] 13 21 [0] 14 9 [0] 15 12 [0] 16 13 [0] 17 14 [0] 18 17 [0] 19 18 [0] 20 19 [0] 21 20 [0] 22 22 [0] 23 23 [1] Number of indices in set 11 [1] 0 24 [1] 1 25 [1] 2 26 [1] 3 27 [1] 4 28 [1] 5 29 [1] 6 30 [1] 7 31 [1] 8 32 [1] 9 33 [1] 10 0 Instead of exchanging the first and last row it seems to replace them with zeros only. If this can?t be done with MatPermute how could it be done? You could also use MatGetSubMatrix(). Thanks, Matt Thanks Cyrill -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From mfadams at lbl.gov Fri Jul 8 08:09:07 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 8 Jul 2016 09:09:07 -0400 Subject: [petsc-users] Are performance benchmarks available? In-Reply-To: <5713E506-7FD6-40D7-9C3F-4D3848D34A18@mcs.anl.gov> References: <1337334190.2752078.1467075579364.JavaMail.yahoo.ref@mail.yahoo.com> <1337334190.2752078.1467075579364.JavaMail.yahoo@mail.yahoo.com> <781294040.3740649.1467216467855.JavaMail.yahoo@mail.yahoo.com> <213496802.3843456.1467229612250.JavaMail.yahoo@mail.yahoo.com> <1780114846.4471656.1467296427576.JavaMail.yahoo@mail.yahoo.com> <5713E506-7FD6-40D7-9C3F-4D3848D34A18@mcs.anl.gov> Message-ID: This would be a good idea, Please use SNES ex56 and send me the '-info | grep GAMG' result, and -log_view, so that I can check that it looks OK. Thanks, Mark On Thu, Jul 7, 2016 at 7:25 PM, Barry Smith wrote: > > While I agree that having this type of information available would be > very useful it is surprisingly difficult to do this and keep it up to date, > plus we have little time to do it, so unfortunately we don't having thing > like this. > > We should do this! Perhaps pick one or two problems and run them with > say a simple preconditioner like ASM and then GAMG on a large problem with > a couple of different number of processes, say 1, 32 and 256 then run them > once a month to confirm they remain the same performance wise and make the > performance numbers available on the web. Maybe using Mark's ex56.c case. > > I'll try to set something up > > Barry > > > Always a big pain to try to automate the running on those damn batch > systems! > > > > > > > On Jun 30, 2016, at 9:20 AM, Faraz Hussain > wrote: > > > > I am wondering if there are benchmarks available that I can solve on my > cluster to compare performance? I want to compare how scaling up-to 240 > cores compares to large models already solved on an optimized > configuration and hardware. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From huyaoyu1986 at gmail.com Fri Jul 8 18:59:41 2016 From: huyaoyu1986 at gmail.com (Yaoyu Hu) Date: Sat, 9 Jul 2016 07:59:41 +0800 Subject: [petsc-users] Need help: Poisson's equation with complex number Message-ID: Hi everyone, I am now trying to solve a partial differential equation which is similar to the three dimensional Poisson?s equation but with complex numbers. The equation is the result of the transformation of a set of fluid dynamic equations from time domain to frequency domain. I have Dirichlet boundary conditions all over the boundaries. The coefficient matrix that obtained by finite volume method (with collocated grid) is made of complex numbers. I would like to know that, for my discretized equation which solver and PC are the most suitable to work with. And BTW, the solution should be done in parallel with about 10^4 - 10^6 unknowns. It is the first time for me to solve equations with complex numbers, however, I am not so good at mathematics involving complex number. I would like to know want should I bear in mind throughout the whole process? Any suggestions or comments are appreciated. Thanks! 
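For what it is worth, a minimal sketch of what such a solve looks like in PETSc, assuming a build configured with --with-scalar-type=complex so that PetscScalar is a complex type; the matrix size, the entries, and the GMRES/GAMG choice below are only placeholders, not a statement about what is best for this particular operator:

#include <petscksp.h>

int main(int argc,char **argv)
{
  Mat            A;
  Vec            x,b;
  KSP            ksp;
  PC             pc;
  PetscInt       i,Istart,Iend,n = 1000;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;

  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart; i<Iend; i++) {
    /* placeholder complex entry; the imaginary part is what the
       frequency-domain transform introduces (off-diagonals omitted) */
    ierr = MatSetValue(A,i,i,2.0 + 1.0*PETSC_i,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatCreateVecs(A,&x,&b);CHKERRQ(ierr);
  ierr = VecSet(b,1.0);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPGMRES);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCGAMG);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* lets -ksp_type/-pc_type override these choices */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

The assembly and solver code is the same as in the real-valued case; only the scalar type changes (real and complex scalars cannot be mixed in one PETSc build), and the choice of KSP and PC is then driven by the properties of the resulting complex operator.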
HU Yaoyu From hengjiew at uci.edu Fri Jul 8 20:05:45 2016 From: hengjiew at uci.edu (frank) Date: Fri, 8 Jul 2016 18:05:45 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> Message-ID: <57804DE9.707@uci.edu> Hi Barry and Dave, Thank both of you for the advice. @Barry I made a mistake in the file names in last email. I attached the correct files this time. For all the three tests, 'Telescope' is used as the coarse preconditioner. == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 Part of the memory usage: Vector 125 124 3971904 0. Matrix 101 101 9462372 0 == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 Part of the memory usage: Vector 125 124 681672 0. Matrix 101 101 1462180 0. In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 Here I get the out of memory error. I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? The linear solver didn't work in this case. Petsc output some errors. @Dave In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. On the last coarse mesh of 'Telescope', there is only one grid point per process. I still got the OOM error. The detailed petsc option file is attached. Thank you so much. Frank On 07/06/2016 02:51 PM, Barry Smith wrote: >> On Jul 6, 2016, at 4:19 PM, frank wrote: >> >> Hi Barry, >> >> Thank you for you advice. >> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >> The system gives me the "Out of Memory" error before the linear system is completely solved. >> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >> >> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. > Are you sure this is right? The total matrix and vector memory usage goes from 2nd test > Vector 384 383 8,193,712 0. > Matrix 103 103 11,508,688 0. > to 3rd test > Vector 384 383 1,590,520 0. > Matrix 103 103 3,508,664 0. > that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. > > >> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? 
> Sorry, my mistake the option is -memory_view > > Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. > > Barry > > > >> In both tests the memory usage is not large. >> >> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >> Is there is a way to show how much memory it allocated? >> >> Frank >> >> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> Frank, >>> >>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>> >>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>> >>> Barry >>> >>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>> >>>> Hi, >>>> >>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>>> The petsc options file is attached. >>>> >>>> The domain is a 3d box. >>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>>> >>>> How can I diagnose what exactly cause the error? >>>> Thank you so much. >>>> >>>> Frank >>>> >> -------------- next part -------------- Summary of Memory Usage in PETSc Maximum (over computational time) process memory: total 7.2576e+08 max 3.8216e+05 min 3.1394e+05 Current process memory: total 7.2576e+08 max 3.8216e+05 min 3.1394e+05 Maximum (over computational time) space PetscMalloc()ed: total 6.3903e+11 max 2.7842e+08 min 2.7724e+08 Current space PetscMalloc()ed: total 1.8043e+09 max 8.0275e+05 min 7.6352e+05 ======================================================================================================================== Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 5 4 3328 0. Vector 125 124 3971904 0. Vector Scatter 25 21 60464 0. Matrix 101 101 9462372 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 4 20288 0. Star Forest Bipartite Graph 16 8 6784 0. Discrete System 8 4 3456 0. Index Set 55 55 277272 0. IS L to G Mapping 8 4 27136 0. Krylov Solver 10 10 12392 0. DMKSP interface 6 3 1944 0. 
Preconditioner 10 10 10024 0. ======================================================================================================================== -------------- next part -------------- Summary of Memory Usage in PETSc Maximum (over computational time) process memory: total 5.7481e+09 max 4.5144e+05 min 3.0404e+05 Current process memory: total 5.7481e+09 max 4.5144e+05 min 3.0404e+05 Maximum (over computational time) space PetscMalloc()ed: total 4.9405e+12 max 2.6821e+08 min 2.6800e+08 Current space PetscMalloc()ed: total 5.5180e+09 max 3.0192e+05 min 2.9173e+05 ======================================================================================================================== Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 5 4 3328 0. Vector 125 124 681672 0. Vector Scatter 25 21 27256 0. Matrix 101 101 1462180 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 4 20288 0. Star Forest Bipartite Graph 16 8 6784 0. Discrete System 8 4 3456 0. Index Set 55 55 80872 0. IS L to G Mapping 8 4 7080 0. Krylov Solver 10 10 12392 0. DMKSP interface 6 3 1944 0. Preconditioner 10 10 10024 0. ======================================================================================================================== -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left 1 -log_view -memory_view # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 12 -mg_coarse_telescope_repart_da_processors_y 1 -mg_coarse_telescope_repart_da_processors_z 3 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type lu -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left -log_view -memory_view # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 24 -mg_coarse_telescope_repart_da_processors_y 2 -mg_coarse_telescope_repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type lu -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -ksp_max_it 1 -mg_coarse_ksp_type preonly 
-mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left 1 -log_view -memory_view -ksp_view_pre # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 24 -mg_coarse_telescope_repart_da_processors_y 2 -mg_coarse_telescope_repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type lu From bsmith at mcs.anl.gov Fri Jul 8 21:07:40 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 8 Jul 2016 21:07:40 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <57804DE9.707@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> Message-ID: Frank, I don't think we yet have enough information to figure out what is going on. Can you please run the test1 but on the larger number of processes? Our goal is to determine the memory usage scaling as you increase the mesh size with a fixed number of processes, from test 2 to test 3 so it is better to see the memory usage in test 1 with the same number of processes as test 2. > On Jul 8, 2016, at 8:05 PM, frank wrote: > > Hi Barry and Dave, > > Thank both of you for the advice. > > @Barry > I made a mistake in the file names in last email. I attached the correct files this time. > For all the three tests, 'Telescope' is used as the coarse preconditioner. > > == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > Part of the memory usage: Vector 125 124 3971904 0. > Matrix 101 101 9462372 0 > > == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > Part of the memory usage: Vector 125 124 681672 0. > Matrix 101 101 1462180 0. > > In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. > > == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 > Here I get the out of memory error. Please re-send us all the output from this failed case. > > I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > The linear solver didn't work in this case. Petsc output some errors. You better set the options you want because the default options may not be want you want. But it is possible that using jacobi on the coarse grid will result in failed failed convergence so I don't recommend it, better to use the defaults. The one thing I noted is that PETSc requests allocations much larger than are actually used (compare the maximum process memory to the maximum petscmalloc memory) in the test 1 and test 2 cases (likely because in the Galerkin RAR' process it doesn't know how much memory it will actually need). Normally these large requested allocations due no harm because it never actually needs to allocate all the memory pages for the full request. Barry > > @Dave > In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. > If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. > On the last coarse mesh of 'Telescope', there is only one grid point per process. 
> I still got the OOM error. The detailed petsc option file is attached. > > > Thank you so much. > > Frank > > > > On 07/06/2016 02:51 PM, Barry Smith wrote: >>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>> >>> Hi Barry, >>> >>> Thank you for you advice. >>> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >>> The system gives me the "Out of Memory" error before the linear system is completely solved. >>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >>> >>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >> Vector 384 383 8,193,712 0. >> Matrix 103 103 11,508,688 0. >> to 3rd test >> Vector 384 383 1,590,520 0. >> Matrix 103 103 3,508,664 0. >> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >> >> >>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >> Sorry, my mistake the option is -memory_view >> >> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >> >> Barry >> >> >> >>> In both tests the memory usage is not large. >>> >>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >>> Is there is a way to show how much memory it allocated? >>> >>> Frank >>> >>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>> Frank, >>>> >>>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>>> >>>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>>> >>>> Barry >>>> >>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>>> >>>>> Hi, >>>>> >>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>>>> The petsc options file is attached. >>>>> >>>>> The domain is a 3d box. >>>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. 
When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>>>> >>>>> How can I diagnose what exactly cause the error? >>>>> Thank you so much. >>>>> >>>>> Frank >>>>> >>> > > From bsmith at mcs.anl.gov Fri Jul 8 21:11:26 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 8 Jul 2016 21:11:26 -0500 Subject: [petsc-users] Need help: Poisson's equation with complex number In-Reply-To: References: Message-ID: <91BACFC0-05BC-4CA6-99DB-75CA18F56302@mcs.anl.gov> I would start with -pc_type gamg and -ksp_type gmres see how many iterations it requires and how the number of iterations grows when you refine the mesh (if life is good then the iterations will grow only moderately as you refine the mesh). If these options result in very bad convergence then send us the output with -ksp_monitor_true_residual and we'll have to consider other options. Barry > On Jul 8, 2016, at 6:59 PM, Yaoyu Hu wrote: > > Hi everyone, > > I am now trying to solve a partial differential equation which is > similar to the three dimensional Poisson?s equation but with complex > numbers. The equation is the result of the transformation of a set of > fluid dynamic equations from time domain to frequency domain. I have > Dirichlet boundary conditions all over the boundaries. The coefficient > matrix that obtained by finite volume method (with collocated grid) is > made of complex numbers. I would like to know that, for my discretized > equation which solver and PC are the most suitable to work with. And > BTW, the solution should be done in parallel with about 10^4 - 10^6 > unknowns. > > It is the first time for me to solve equations with complex numbers, > however, I am not so good at mathematics involving complex number. I > would like to know want should I bear in mind throughout the whole > process? Any suggestions or comments are appreciated. > > Thanks! > > HU Yaoyu From dave.mayhem23 at gmail.com Sat Jul 9 00:38:12 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Sat, 9 Jul 2016 07:38:12 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <57804DE9.707@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> Message-ID: On Saturday, 9 July 2016, frank > wrote: > Hi Barry and Dave, > > Thank both of you for the advice. > > @Barry > I made a mistake in the file names in last email. I attached the correct > files this time. > For all the three tests, 'Telescope' is used as the coarse preconditioner. > > == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > Part of the memory usage: Vector 125 124 3971904 0. 
> Matrix 101 101 > 9462372 0 > > == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > Part of the memory usage: Vector 125 124 681672 0. > Matrix 101 101 > 1462180 0. > > In theory, the memory usage in Test1 should be 8 times of Test2. In my > case, it is about 6 times. > > == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per > process: 32*32*32 > Here I get the out of memory error. > > I tried to use -mg_coarse jacobi. In this way, I don't need to set > -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > The linear solver didn't work in this case. Petsc output some errors. > > @Dave > In test3, I use only one instance of 'Telescope'. On the coarse mesh of > 'Telescope', I used LU as the preconditioner instead of SVD. > If my set the levels correctly, then on the last coarse mesh of MG where > it calls 'Telescope', the sub-domain per process is 2*2*2. > On the last coarse mesh of 'Telescope', there is only one grid point per > process. > I still got the OOM error. The detailed petsc option file is attached. Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. And please send the result of KSPView so we can see what is actually used in the computations Thanks Dave > > > Thank you so much. > > Frank > > > > On 07/06/2016 02:51 PM, Barry Smith wrote: > >> On Jul 6, 2016, at 4:19 PM, frank wrote: >>> >>> Hi Barry, >>> >>> Thank you for you advice. >>> I tried three test. In the 1st test, the grid is 3072*256*768 and the >>> process mesh is 96*8*24. >>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is >>> used as the preconditioner at the coarse mesh. >>> The system gives me the "Out of Memory" error before the linear system >>> is completely solved. >>> The info from '-ksp_view_pre' is attached. I seems to me that the error >>> occurs when it reaches the coarse mesh. >>> >>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. >>> The 3rd test uses the same grid but a different process mesh 48*4*12. >>> >> Are you sure this is right? The total matrix and vector memory usage >> goes from 2nd test >> Vector 384 383 8,193,712 0. >> Matrix 103 103 11,508,688 0. >> to 3rd test >> Vector 384 383 1,590,520 0. >> Matrix 103 103 3,508,664 0. >> that is the memory usage got smaller but if you have only 1/8th the >> processes and the same grid it should have gotten about 8 times bigger. Did >> you maybe cut the grid by a factor of 8 also? If so that still doesn't >> explain it because the memory usage changed by a factor of 5 something for >> the vectors and 3 something for the matrices. >> >> >> The linear solver and petsc options in 2nd and 3rd tests are the same in >>> 1st test. The linear solver works fine in both test. >>> I attached the memory usage of the 2nd and 3rd tests. The memory info is >>> from the option '-log_summary'. I tried to use '-momery_info' as you >>> suggested, but in my case petsc treated it as an unused option. It output >>> nothing about the memory. Do I need to add sth to my code so I can use >>> '-memory_info'? >>> >> Sorry, my mistake the option is -memory_view >> >> Can you run the one case with -memory_view and -mg_coarse jacobi >> -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory >> is used without the telescope? Also run case 2 the same way. >> >> Barry >> >> >> >> In both tests the memory usage is not large. 
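To make the "about 6 times" above explicit, re-doing only the arithmetic on the numbers quoted in this message: the grid is the same in both tests, so the per-process usage would ideally scale with the ratio of the process counts,

$$ \frac{96\cdot 8\cdot 24}{48\cdot 4\cdot 12} = 8, \qquad \frac{3971904}{681672} \approx 5.8 \;\text{(vectors)}, \qquad \frac{9462372}{1462180} \approx 6.5 \;\text{(matrices)}. $$

One plausible reading (a guess, not something established in this thread) is that the gap to the ideal factor of 8 comes from per-rank overhead such as ghost/halo points and coarse-level objects, which does not shrink in proportion to the local subdomain size.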
>>> >>> It seems to me that it might be the 'telescope' preconditioner that >>> allocated a lot of memory and caused the error in the 1st test. >>> Is there is a way to show how much memory it allocated? >>> >>> Frank >>> >>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> >>>> Frank, >>>> >>>> You can run with -ksp_view_pre to have it "view" the KSP before >>>> the solve so hopefully it gets that far. >>>> >>>> Please run the problem that does fit with -memory_info when the >>>> problem completes it will show the "high water mark" for PETSc allocated >>>> memory and total memory used. We first want to look at these numbers to see >>>> if it is using more memory than you expect. You could also run with say >>>> half the grid spacing to see how the memory usage scaled with the increase >>>> in grid points. Make the runs also with -log_view and send all the output >>>> from these options. >>>> >>>> Barry >>>> >>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>>> >>>>> Hi, >>>>> >>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a >>>>> linear system in parallel. >>>>> I chose to use the 'Telescope' as the preconditioner on the coarse >>>>> mesh for its good performance. >>>>> The petsc options file is attached. >>>>> >>>>> The domain is a 3d box. >>>>> It works well when the grid is 1536*128*384 and the process mesh is >>>>> 96*8*24. When I double the size of grid and keep the same process mesh and >>>>> petsc options, I get an "out of memory" error from the super-cluster I am >>>>> using. >>>>> Each process has access to at least 8G memory, which should be more >>>>> than enough for my application. I am sure that all the other parts of my >>>>> code( except the linear solver ) do not use much memory. So I doubt if >>>>> there is something wrong with the linear solver. >>>>> The error occurs before the linear system is completely solved so I >>>>> don't have the info from ksp view. I am not able to re-produce the error >>>>> with a smaller problem either. >>>>> In addition, I tried to use the block jacobi as the preconditioner >>>>> with the same grid and same decomposition. The linear solver runs extremely >>>>> slow but there is no memory error. >>>>> >>>>> How can I diagnose what exactly cause the error? >>>>> Thank you so much. >>>>> >>>>> Frank >>>>> >>>>> >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Sun Jul 10 04:31:28 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 10 Jul 2016 12:31:28 +0300 Subject: [petsc-users] How to determine the type of SNESLineSearch? In-Reply-To: References: Message-ID: PetscBool match; ierr = PetscObjectTypeCompare((PetscObject)linesearch, SNESLINESEARCHBASIC,&match);CHKERRQ(ierr); if (!match) { ... } On 6 July 2016 at 00:07, Gautam Bisht wrote: > Hi PETSc, > > After SNESSolve converges, I want to perform few additional operations only > when SNESLineSearchType is not SNESLINESEARCHBASIC. But, there is no > SNESLineSearchGetType routine. Any idea on how I can determine the type of > LineSearch set by a user using command line option? > > Thanks, > -Gautam. 
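For completeness, a slightly fuller sketch of the check suggested above, assuming a SNES object named snes that has already been solved; the SNESGetLineSearch() call and the surrounding declarations are illustrative additions, not part of the original reply:

    SNESLineSearch linesearch;
    PetscBool      isbasic;
    PetscErrorCode ierr;

    /* after SNESSolve() has returned */
    ierr = SNESGetLineSearch(snes,&linesearch);CHKERRQ(ierr);
    ierr = PetscObjectTypeCompare((PetscObject)linesearch,SNESLINESEARCHBASIC,&isbasic);CHKERRQ(ierr);
    if (!isbasic) {
      /* extra post-solve operations, needed only when a non-basic line search
         (e.g. -snes_linesearch_type bt) was selected at run time */
    }

Since PetscObjectTypeCompare() works on any PETSc object, the same pattern can be used wherever a dedicated Get...Type() routine is not provided.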
-- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 0109 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 From davydden at gmail.com Sun Jul 10 11:36:09 2016 From: davydden at gmail.com (Denis Davydov) Date: Sun, 10 Jul 2016 18:36:09 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix Message-ID: Dear developers, Slepc 3.6.3 used to produce the following result of install names: $ otool -lv libslepc.dylib | grep slepc libslepc.dylib: name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) same for libslepc.3.6.dylib and libslepc.3.6.3.dylib Since [3.7.1] the installed libraries have $ otool -lv libslepc.dylib | grep slepc libslepc.dylib: name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. Kind regards, Denis From jroman at dsic.upv.es Sun Jul 10 11:47:58 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 10 Jul 2016 18:47:58 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: References: Message-ID: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> I think this is already fixed in this commit: https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 Version 7.3.2 containing this patch will be released in a week or so. Thanks for reporting this. 
Jose > El 10 jul 2016, a las 18:36, Denis Davydov escribi?: > > Dear developers, > > Slepc 3.6.3 used to produce the following result of install names: > > $ otool -lv libslepc.dylib | grep slepc > libslepc.dylib: > name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) > path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) > path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) > > same for libslepc.3.6.dylib and libslepc.3.6.3.dylib > > > Since [3.7.1] the installed libraries have > > $ otool -lv libslepc.dylib | grep slepc > libslepc.dylib: > name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) > path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) > path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) > > > That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. > > Kind regards, > Denis > From davydden at gmail.com Sun Jul 10 11:56:50 2016 From: davydden at gmail.com (Denis Davydov) Date: Sun, 10 Jul 2016 18:56:50 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> Message-ID: Hi Jose, the patch you mentioned does not solve the problem (i tried it): $ otool -D libslepc.dylib libslepc.dylib: /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jqcVVv/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib Kind regards, Denis > On 10 Jul 2016, at 18:47, Jose E. Roman wrote: > > I think this is already fixed in this commit: > https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 > Version 7.3.2 containing this patch will be released in a week or so. > Thanks for reporting this. 
> Jose > > >> El 10 jul 2016, a las 18:36, Denis Davydov escribi?: >> >> Dear developers, >> >> Slepc 3.6.3 used to produce the following result of install names: >> >> $ otool -lv libslepc.dylib | grep slepc >> libslepc.dylib: >> name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) >> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) >> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) >> >> same for libslepc.3.6.dylib and libslepc.3.6.3.dylib >> >> >> Since [3.7.1] the installed libraries have >> >> $ otool -lv libslepc.dylib | grep slepc >> libslepc.dylib: >> name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) >> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) >> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) >> >> >> That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. >> >> Kind regards, >> Denis >> > From davydden at gmail.com Sun Jul 10 15:26:03 2016 From: davydden at gmail.com (Denis Davydov) Date: Sun, 10 Jul 2016 22:26:03 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> Message-ID: <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> I debuged a bit your code, install_name should be used as follows: install_name_tool -id That is, you need to change around ?installName? variable and ?dst? and then it works as expected. Kind regards, Denis > On 10 Jul 2016, at 18:56, Denis Davydov wrote: > > Hi Jose, > > the patch you mentioned does not solve the problem (i tried it): > > $ otool -D libslepc.dylib > libslepc.dylib: > /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jqcVVv/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib > > Kind regards, > Denis > >> On 10 Jul 2016, at 18:47, Jose E. Roman wrote: >> >> I think this is already fixed in this commit: >> https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 >> Version 7.3.2 containing this patch will be released in a week or so. >> Thanks for reporting this. 
>> Jose >> >> >>> El 10 jul 2016, a las 18:36, Denis Davydov escribi?: >>> >>> Dear developers, >>> >>> Slepc 3.6.3 used to produce the following result of install names: >>> >>> $ otool -lv libslepc.dylib | grep slepc >>> libslepc.dylib: >>> name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) >>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) >>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) >>> >>> same for libslepc.3.6.dylib and libslepc.3.6.3.dylib >>> >>> >>> Since [3.7.1] the installed libraries have >>> >>> $ otool -lv libslepc.dylib | grep slepc >>> libslepc.dylib: >>> name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) >>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) >>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) >>> >>> >>> That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. >>> >>> Kind regards, >>> Denis >>> >> > From davydden at gmail.com Sun Jul 10 17:29:02 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 11 Jul 2016 00:29:02 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> Message-ID: <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> Hi Jose, Please, disregard my last email. The order of arguments is correct. I still have an issue, though. I will debug it further and try to find what?s the cause... Kind regards, Denis > On 10 Jul 2016, at 22:26, Denis Davydov wrote: > > I debuged a bit your code, install_name should be used as follows: > > install_name_tool -id > > That is, you need to change around ?installName? variable and ?dst? and then it works as expected. > > Kind regards, > Denis > >> On 10 Jul 2016, at 18:56, Denis Davydov wrote: >> >> Hi Jose, >> >> the patch you mentioned does not solve the problem (i tried it): >> >> $ otool -D libslepc.dylib >> libslepc.dylib: >> /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jqcVVv/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >> >> Kind regards, >> Denis >> >>> On 10 Jul 2016, at 18:47, Jose E. Roman wrote: >>> >>> I think this is already fixed in this commit: >>> https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 >>> Version 7.3.2 containing this patch will be released in a week or so. >>> Thanks for reporting this. 
>>> Jose >>> >>> >>>> El 10 jul 2016, a las 18:36, Denis Davydov escribi?: >>>> >>>> Dear developers, >>>> >>>> Slepc 3.6.3 used to produce the following result of install names: >>>> >>>> $ otool -lv libslepc.dylib | grep slepc >>>> libslepc.dylib: >>>> name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) >>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) >>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) >>>> >>>> same for libslepc.3.6.dylib and libslepc.3.6.3.dylib >>>> >>>> >>>> Since [3.7.1] the installed libraries have >>>> >>>> $ otool -lv libslepc.dylib | grep slepc >>>> libslepc.dylib: >>>> name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) >>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) >>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) >>>> >>>> >>>> That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. >>>> >>>> Kind regards, >>>> Denis >>>> >>> >> > From davydden at gmail.com Sun Jul 10 17:48:26 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 11 Jul 2016 00:48:26 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> Message-ID: Hi Jose, so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib As you see, installName wasn?t changed from oldname. Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. When SLEPC_DIR==pwd() the patch you referred works. 
Kind regards, Denis > On 11 Jul 2016, at 00:29, Denis Davydov wrote: > > Hi Jose, > > Please, disregard my last email. The order of arguments is correct. > I still have an issue, though. I will debug it further and try to find what?s the cause... > > Kind regards, > Denis > >> On 10 Jul 2016, at 22:26, Denis Davydov wrote: >> >> I debuged a bit your code, install_name should be used as follows: >> >> install_name_tool -id >> >> That is, you need to change around ?installName? variable and ?dst? and then it works as expected. >> >> Kind regards, >> Denis >> >>> On 10 Jul 2016, at 18:56, Denis Davydov wrote: >>> >>> Hi Jose, >>> >>> the patch you mentioned does not solve the problem (i tried it): >>> >>> $ otool -D libslepc.dylib >>> libslepc.dylib: >>> /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jqcVVv/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>> >>> Kind regards, >>> Denis >>> >>>> On 10 Jul 2016, at 18:47, Jose E. Roman wrote: >>>> >>>> I think this is already fixed in this commit: >>>> https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 >>>> Version 7.3.2 containing this patch will be released in a week or so. >>>> Thanks for reporting this. >>>> Jose >>>> >>>> >>>>> El 10 jul 2016, a las 18:36, Denis Davydov escribi?: >>>>> >>>>> Dear developers, >>>>> >>>>> Slepc 3.6.3 used to produce the following result of install names: >>>>> >>>>> $ otool -lv libslepc.dylib | grep slepc >>>>> libslepc.dylib: >>>>> name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) >>>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) >>>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) >>>>> >>>>> same for libslepc.3.6.dylib and libslepc.3.6.3.dylib >>>>> >>>>> >>>>> Since [3.7.1] the installed libraries have >>>>> >>>>> $ otool -lv libslepc.dylib | grep slepc >>>>> libslepc.dylib: >>>>> name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) >>>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) >>>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) >>>>> >>>>> >>>>> That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. >>>>> >>>>> Kind regards, >>>>> Denis >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zocca.marco at gmail.com Mon Jul 11 02:57:59 2016 From: zocca.marco at gmail.com (Marco Zocca) Date: Mon, 11 Jul 2016 09:57:59 +0200 Subject: [petsc-users] HDF5 and PETSc Message-ID: Good morning, Does the HDF5 functionality need to be explicitly requested at configure time? I just noticed that my default configuration on a single-node machine does not compile any relevant symbol. I do not have HDF5 installed on my system yet, but I assumed PETSc includes it by default, or automagically pulls the dependency in at config time, since the manual doesn't mention anything about it. 
Do I have to install HDF5 from source and rebuild PETSc then? Thanks in advance, Marco --- config options and architecture : Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-mpich Working directory: /Users/ocramz/petsc-3.7.2 Machine platform: ('Darwin', 'fermi.local', '13.4.0', 'Darwin Kernel Version 13.4.0: Sun Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64', 'x86_64', 'i386') Python version: 2.7.5 (default, Mar 9 2014, 22:15:05) [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] From zocca.marco at gmail.com Mon Jul 11 03:13:31 2016 From: zocca.marco at gmail.com (Marco Zocca) Date: Mon, 11 Jul 2016 10:13:31 +0200 Subject: [petsc-users] HDF5 and PETSc In-Reply-To: References: Message-ID: Sorry for the previous mail, I hadn't fully read ./configure --help : all external package options are listed there, including HDF5 As far as I can see in https://www.mcs.anl.gov/petsc/miscellaneous/external.html and on the PDF manual, not all external packages are mentioned, and this tripped me initially. So my question becomes: please synchronize the output of ./configure --help with manpages and pdf manual :) Thanks again, Marco On 11 July 2016 at 09:57, Marco Zocca wrote: > Good morning, > > Does the HDF5 functionality need to be explicitly requested at > configure time? I just noticed that my default configuration on a > single-node machine does not compile any relevant symbol. > > I do not have HDF5 installed on my system yet, but I assumed PETSc > includes it by default, or automagically pulls the dependency in at > config time, since the manual doesn't mention anything about it. Do I > have to install HDF5 from source and rebuild PETSc then? > > Thanks in advance, > Marco > > > > --- config options and architecture : > > Configure Options: --configModules=PETSc.Configure > --optionsModule=config.compilerOptions --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack --download-mpich > Working directory: /Users/ocramz/petsc-3.7.2 > Machine platform: > ('Darwin', 'fermi.local', '13.4.0', 'Darwin Kernel Version 13.4.0: Sun > Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64', > 'x86_64', 'i386') > Python version: > 2.7.5 (default, Mar 9 2014, 22:15:05) > [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] From jroman at dsic.upv.es Mon Jul 11 09:53:10 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 11 Jul 2016 16:53:10 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> Message-ID: I cannot reproduce this behaviour. If I do for instance this (on OS X El Capitan): $ cd ~/tmp $ ln -s $SLEPC_DIR . $ cd slepc-3.7.1 $ ./configure $ make $ otool -lv $PETSC_ARCH/lib/libslepc.dylib | grep slepc I don't get a warning, and the output of otool is the same that would result if done on $SLEPC_DIR. Which warning are you getting? Jose > El 11 jul 2016, a las 0:48, Denis Davydov escribi?: > > Hi Jose, > > so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). 
> During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), > but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: > > oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib > > installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib > > archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt > > installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr > > dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib > > As you see, installName wasn?t changed from oldname. > > Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make > sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. > > When SLEPC_DIR==pwd() the patch you referred works. > > Kind regards, > Denis > From knepley at gmail.com Mon Jul 11 09:58:49 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Jul 2016 09:58:49 -0500 Subject: [petsc-users] HDF5 and PETSc In-Reply-To: References: Message-ID: On Mon, Jul 11, 2016 at 3:13 AM, Marco Zocca wrote: > Sorry for the previous mail, I hadn't fully read ./configure --help : > all external package options are listed there, including HDF5 > > As far as I can see in > https://www.mcs.anl.gov/petsc/miscellaneous/external.html and on the > PDF manual, not all external packages are mentioned, and this tripped > me initially. > > So my question becomes: please synchronize the output of ./configure > --help with manpages and pdf manual :) > Done. https://bitbucket.org/petsc/petsc/commits/b6541ed63645a657daaf31a0efc9fb29a825bfaf Matt > Thanks again, > Marco > > > On 11 July 2016 at 09:57, Marco Zocca wrote: > > Good morning, > > > > Does the HDF5 functionality need to be explicitly requested at > > configure time? I just noticed that my default configuration on a > > single-node machine does not compile any relevant symbol. > > > > I do not have HDF5 installed on my system yet, but I assumed PETSc > > includes it by default, or automagically pulls the dependency in at > > config time, since the manual doesn't mention anything about it. Do I > > have to install HDF5 from source and rebuild PETSc then? > > > > Thanks in advance, > > Marco > > > > > > > > --- config options and architecture : > > > > Configure Options: --configModules=PETSc.Configure > > --optionsModule=config.compilerOptions --with-cc=gcc --with-cxx=g++ > > --with-fc=gfortran --download-fblaslapack --download-mpich > > Working directory: /Users/ocramz/petsc-3.7.2 > > Machine platform: > > ('Darwin', 'fermi.local', '13.4.0', 'Darwin Kernel Version 13.4.0: Sun > > Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64', > > 'x86_64', 'i386') > > Python version: > > 2.7.5 (default, Mar 9 2014, 22:15:05) > > [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From davydden at gmail.com Mon Jul 11 10:06:47 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 11 Jul 2016 17:06:47 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> Message-ID: <44EF3239-8AA0-4157-B04A-BC3437409215@gmail.com> Here is the warning: Your SLEPC_DIR may not match the directory you are in SLEPC_DIR /Users/davydden/spack/var/spack/stage/slepc-3.7.1-p7hqqclwqvbvra6j44lka3xuc4eycvdg/slepc-3.7.1 Current directory /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-m7Xg8I/slepc-3.7.1 p.s. this is done within Spack, for a fix see: https://github.com/LLNL/spack/pull/1206 > On 11 Jul 2016, at 16:53, Jose E. Roman wrote: > > I cannot reproduce this behaviour. If I do for instance this (on OS X El Capitan): > > $ cd ~/tmp > $ ln -s $SLEPC_DIR . > $ cd slepc-3.7.1 > $ ./configure > $ make > $ otool -lv $PETSC_ARCH/lib/libslepc.dylib | grep slepc > > I don't get a warning, and the output of otool is the same that would result if done on $SLEPC_DIR. > Which warning are you getting? > > Jose > > >> El 11 jul 2016, a las 0:48, Denis Davydov escribi?: >> >> Hi Jose, >> >> so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). >> During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), >> but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: >> >> oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >> >> installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >> >> archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt >> >> installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr >> >> dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib >> >> As you see, installName wasn?t changed from oldname. >> >> Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make >> sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. >> >> When SLEPC_DIR==pwd() the patch you referred works. >> >> Kind regards, >> Denis >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ketancmaheshwari at gmail.com Mon Jul 11 12:05:52 2016 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 11 Jul 2016 13:05:52 -0400 Subject: [petsc-users] Diagonalization of a 3D dense matrix Message-ID: Hello PETSC-ers, I am a research faculty at Univ of Pittsburgh trying to use PETSC/SLEPC to obtain the diagonalization of a large matrix using Lanczos or Davidson method. 
The matrix is a 3 dimensional dense matrix with a total of 216000 elements. After looking into some of the examples in PETSC as well SLEPC implementations it seems like most of the implementations are with 2 dimensional matrices. So, I was wondering if it is possible to express a 3 dimensional matrix object compatible to PETSC so that the SLEPC API could be used to obtain diagonalization. Any suggestions or pointers to documentation or examples would be of great help. Best, -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 11 12:15:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Jul 2016 12:15:01 -0500 Subject: [petsc-users] Diagonalization of a 3D dense matrix In-Reply-To: References: Message-ID: On Mon, Jul 11, 2016 at 12:05 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Hello PETSC-ers, > > I am a research faculty at Univ of Pittsburgh trying to use PETSC/SLEPC to > obtain the diagonalization of a large matrix using Lanczos or Davidson > method. > > The matrix is a 3 dimensional dense matrix with a total of 216000 elements. > > After looking into some of the examples in PETSC as well SLEPC > implementations > it seems like most of the implementations are with 2 dimensional matrices. > You will have to explain what you mean by a "3D matrix". A matrix, by definition, has only rows and columns. You may mean a matrix generated from a 3D problem. That should pose no extra difficulty. You may mean a 3-index tensor, in which case diagonalization is not a clear concept. Thanks, Matt > So, I was wondering if it is possible to express a 3 dimensional matrix > object > compatible to PETSC so that the SLEPC API could be used to obtain > diagonalization. > > Any suggestions or pointers to documentation or examples would be of great > help. > > Best, > -- > Ketan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hengjiew at uci.edu Mon Jul 11 12:14:12 2016 From: hengjiew at uci.edu (frank) Date: Mon, 11 Jul 2016 10:14:12 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> Message-ID: <5783D3E4.4020004@uci.edu> Hi Dave, I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. It seems to me that the error occurred when the decomposition was going to be changed. I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. Thank you. Frank On 07/08/2016 10:38 PM, Dave May wrote: > > > On Saturday, 9 July 2016, frank > wrote: > > Hi Barry and Dave, > > Thank both of you for the advice. > > @Barry > I made a mistake in the file names in last email. I attached the > correct files this time. > For all the three tests, 'Telescope' is used as the coarse > preconditioner. 
> > == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > Part of the memory usage: Vector 125 124 3971904 0. > Matrix 101 101 > 9462372 0 > > == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > Part of the memory usage: Vector 125 124 681672 0. > Matrix 101 101 > 1462180 0. > > In theory, the memory usage in Test1 should be 8 times of Test2. > In my case, it is about 6 times. > > == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain > per process: 32*32*32 > Here I get the out of memory error. > > I tried to use -mg_coarse jacobi. In this way, I don't need to set > -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > The linear solver didn't work in this case. Petsc output some errors. > > @Dave > In test3, I use only one instance of 'Telescope'. On the coarse > mesh of 'Telescope', I used LU as the preconditioner instead of SVD. > If my set the levels correctly, then on the last coarse mesh of MG > where it calls 'Telescope', the sub-domain per process is 2*2*2. > On the last coarse mesh of 'Telescope', there is only one grid > point per process. > I still got the OOM error. The detailed petsc option file is attached. > > > Do you understand the expected memory usage for the > particular parallel LU implementation you are using? I don't > (seriously). Replace LU with bjacobi and re-run this test. My point > about solver debugging is still valid. > > And please send the result of KSPView so we can see what is actually > used in the computations > > Thanks > Dave > > > > Thank you so much. > > Frank > > > > On 07/06/2016 02:51 PM, Barry Smith wrote: > > On Jul 6, 2016, at 4:19 PM, frank wrote: > > Hi Barry, > > Thank you for you advice. > I tried three test. In the 1st test, the grid is > 3072*256*768 and the process mesh is 96*8*24. > The linear solver is 'cg' the preconditioner is 'mg' and > 'telescope' is used as the preconditioner at the coarse mesh. > The system gives me the "Out of Memory" error before the > linear system is completely solved. > The info from '-ksp_view_pre' is attached. I seems to me > that the error occurs when it reaches the coarse mesh. > > The 2nd test uses a grid of 1536*128*384 and process mesh > is 96*8*24. The 3rd test uses the same grid but a > different process mesh 48*4*12. > > Are you sure this is right? The total matrix and vector > memory usage goes from 2nd test > Vector 384 383 8,193,712 0. > Matrix 103 103 11,508,688 0. > to 3rd test > Vector 384 383 1,590,520 0. > Matrix 103 103 3,508,664 0. > that is the memory usage got smaller but if you have only > 1/8th the processes and the same grid it should have gotten > about 8 times bigger. Did you maybe cut the grid by a factor > of 8 also? If so that still doesn't explain it because the > memory usage changed by a factor of 5 something for the > vectors and 3 something for the matrices. > > > The linear solver and petsc options in 2nd and 3rd tests > are the same in 1st test. The linear solver works fine in > both test. > I attached the memory usage of the 2nd and 3rd tests. The > memory info is from the option '-log_summary'. I tried to > use '-momery_info' as you suggested, but in my case petsc > treated it as an unused option. It output nothing about > the memory. Do I need to add sth to my code so I can use > '-memory_info'? > > Sorry, my mistake the option is -memory_view > > Can you run the one case with -memory_view and -mg_coarse > jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to > see how much memory is used without the telescope? 
Also run > case 2 the same way. > > Barry > > > > In both tests the memory usage is not large. > > It seems to me that it might be the 'telescope' > preconditioner that allocated a lot of memory and caused > the error in the 1st test. > Is there is a way to show how much memory it allocated? > > Frank > > On 07/05/2016 03:37 PM, Barry Smith wrote: > > Frank, > > You can run with -ksp_view_pre to have it "view" > the KSP before the solve so hopefully it gets that far. > > Please run the problem that does fit with > -memory_info when the problem completes it will show > the "high water mark" for PETSc allocated memory and > total memory used. We first want to look at these > numbers to see if it is using more memory than you > expect. You could also run with say half the grid > spacing to see how the memory usage scaled with the > increase in grid points. Make the runs also with > -log_view and send all the output from these options. > > Barry > > On Jul 5, 2016, at 5:23 PM, frank > wrote: > > Hi, > > I am using the CG ksp solver and Multigrid > preconditioner to solve a linear system in parallel. > I chose to use the 'Telescope' as the > preconditioner on the coarse mesh for its good > performance. > The petsc options file is attached. > > The domain is a 3d box. > It works well when the grid is 1536*128*384 and > the process mesh is 96*8*24. When I double the > size of grid and keep the same process mesh and > petsc options, I get an "out of memory" error from > the super-cluster I am using. > Each process has access to at least 8G memory, > which should be more than enough for my > application. I am sure that all the other parts of > my code( except the linear solver ) do not use > much memory. So I doubt if there is something > wrong with the linear solver. > The error occurs before the linear system is > completely solved so I don't have the info from > ksp view. I am not able to re-produce the error > with a smaller problem either. > In addition, I tried to use the block jacobi as > the preconditioner with the same grid and same > decomposition. The linear solver runs extremely > slow but there is no memory error. > > How can I diagnose what exactly cause the error? > Thank you so much. > > Frank > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- KSP Object: 18432 MPI processes type: cg maximum iterations=1 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg PC has not been set up so information may be incomplete MG: type is MULTIPLICATIVE, levels=5 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: redundant PC has not been set up so information may be incomplete Redundant preconditioner: Not yet setup Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (mg_levels_4_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_4_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=603979776, cols=603979776 total: nonzeros=4223139840, allocated nonzeros=4223139840 total number of mallocs used during MatSetValues calls =0 has attached null space [NID 00631] 2016-07-10 06:22:58 Apid 45768056: initiated application termination [NID 06277] 2016-07-10 06:23:00 Apid 45768056: OOM killer terminated this process. [NID 06235] 2016-07-10 06:23:00 Apid 45768056: OOM killer terminated this process. 
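As a side note on diagnosing where the memory goes: besides the -memory_view option mentioned earlier in the thread, the high-water marks can be queried directly from the application. The fragment below is only a rough sketch, not taken from the code discussed here; it assumes a KSP named ksp and vectors b and x already exist:

    PetscLogDouble rss,mal;
    PetscErrorCode ierr;

    ierr = PetscMemorySetGetMaximumUsage();CHKERRQ(ierr);  /* call once, right after PetscInitialize() */
    /* ... build the DMDA, the operator and the KSP, then solve ... */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = PetscMemoryGetMaximumUsage(&rss);CHKERRQ(ierr); /* resident-set high-water mark on this rank */
    ierr = PetscMallocGetMaximumUsage(&mal);CHKERRQ(ierr); /* PETSc-allocated high-water mark on this rank */
    ierr = MPI_Allreduce(MPI_IN_PLACE,&rss,1,MPI_DOUBLE,MPI_MAX,PETSC_COMM_WORLD);CHKERRQ(ierr);
    ierr = MPI_Allreduce(MPI_IN_PLACE,&mal,1,MPI_DOUBLE,MPI_MAX,PETSC_COMM_WORLD);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,"max RSS %g bytes/rank, max PetscMalloc %g bytes/rank\n",rss,mal);CHKERRQ(ierr);

Printing these once with the telescope coarse solver and once with, say, -mg_coarse_pc_type bjacobi should make it clearer which component is responsible for the growth.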
-------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left 1 -log_view -memory_view -ksp_view_pre # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 24 -mg_coarse_telescope_repart_da_processors_y 2 -mg_coarse_telescope_repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type bjacobi -------------- next part -------------- KSP Object: 18432 MPI processes type: cg maximum iterations=1 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: telescope Telescope: parent comm size reduction factor = 64 Telescope: comm_size = 18432 , subcomm_size = 288 Telescope: DMDA detected DMDA Object: (mg_coarse_telescope_repart_) 288 MPI processes M 192 N 16 P 48 m 24 n 2 p 6 dof 1 overlap 1 KSP Object: (mg_coarse_telescope_) 288 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_) 288 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_telescope_mg_coarse_) 288 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_coarse_) 288 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 0., needed 0. 
Factored matrix follows: Mat Object: 288 MPI processes type: mpiaij rows=288, cols=288 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 18 x npcol 16 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots FALSE Use iterative refinement FALSE Processors in row 18 col partition 16 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=288, cols=288 total: nonzeros=1728, allocated nonzeros=1728 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_1_) 288 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_1_) 288 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=2304, cols=2304 total: nonzeros=14976, allocated nonzeros=14976 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_2_) 288 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_2_) 288 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=18432, cols=18432 total: nonzeros=124416, allocated nonzeros=124416 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_3_) 288 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_3_) 288 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=1179648, cols=1179648 total: nonzeros=8183808, allocated nonzeros=8183808 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=9437184, cols=9437184 total: nonzeros=65765376, allocated nonzeros=65765376 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 18432 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_3_) 18432 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space From ketancmaheshwari at gmail.com Mon Jul 11 13:22:09 2016 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 11 Jul 2016 14:22:09 -0400 Subject: [petsc-users] Diagonalization of a 3D dense matrix In-Reply-To: References: Message-ID: Matthew, I am probably not using the right language but I meant that each element has three indices associated with it: x, y, z. Here is a snapshot: 1 10 55 5.7113635929515209e-03 1 10 56 4.2977490038287334e-03 1 10 57 2.8719519782193204e-03 1 10 58 1.4380140927001712e-03 1 10 59 9.9299930690365083e-17 1 11 0 0.0000000000000000e+00 1 11 1 1.5658614070601917e-03 1 11 2 3.1272842098367562e-03 1 11 3 4.6798423857521204e-03 Where the first three columns are the coordinates and the last one is value. Could you clarify the meaning of "diagonalization is not a clear concept" if it is applicable to this case. Thank you, -- Ketan On Mon, Jul 11, 2016 at 1:15 PM, Matthew Knepley wrote: > On Mon, Jul 11, 2016 at 12:05 PM, Ketan Maheshwari < > ketancmaheshwari at gmail.com> wrote: > >> Hello PETSC-ers, >> >> I am a research faculty at Univ of Pittsburgh trying to use PETSC/SLEPC >> to >> obtain the diagonalization of a large matrix using Lanczos or Davidson >> method. >> >> The matrix is a 3 dimensional dense matrix with a total of 216000 >> elements. >> >> After looking into some of the examples in PETSC as well SLEPC >> implementations >> it seems like most of the implementations are with 2 dimensional matrices. >> > > You will have to explain what you mean by a "3D matrix". A matrix, by > definition, has only > rows and columns. You may mean a matrix generated from a 3D problem. That > should pose > no extra difficulty. You may mean a 3-index tensor, in which case > diagonalization is not a clear > concept. > > Thanks, > > Matt > > >> So, I was wondering if it is possible to express a 3 dimensional matrix >> object >> compatible to PETSC so that the SLEPC API could be used to obtain >> diagonalization. >> >> Any suggestions or pointers to documentation or examples would be of great >> help. >> >> Best, >> -- >> Ketan >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 11 13:24:51 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Jul 2016 13:24:51 -0500 Subject: [petsc-users] Diagonalization of a 3D dense matrix In-Reply-To: References: Message-ID: On Mon, Jul 11, 2016 at 1:22 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Matthew, > > I am probably not using the right language but I meant that each element > has three indices associated with it: x, y, z. 
> > Here is a snapshot: > > 1 10 55 5.7113635929515209e-03 > 1 10 56 4.2977490038287334e-03 > 1 10 57 2.8719519782193204e-03 > 1 10 58 1.4380140927001712e-03 > 1 10 59 9.9299930690365083e-17 > 1 11 0 0.0000000000000000e+00 > 1 11 1 1.5658614070601917e-03 > 1 11 2 3.1272842098367562e-03 > 1 11 3 4.6798423857521204e-03 > > Where the first three columns are the coordinates and the last one is > value. > This is not a matrix. A matrix is a linear operator on some space with a finite basis: https://en.wikipedia.org/wiki/Matrix_(mathematics) This is just a set of data points. Most people would call this a vector, since you have an index I (which consists of each independent triple) and a value V. > Could you clarify the meaning of "diagonalization is not a clear concept" > if it is applicable to this case. > There is no one definition of tensor diagonalization. Matt > Thank you, > -- > Ketan > > > On Mon, Jul 11, 2016 at 1:15 PM, Matthew Knepley > wrote: > >> On Mon, Jul 11, 2016 at 12:05 PM, Ketan Maheshwari < >> ketancmaheshwari at gmail.com> wrote: >> >>> Hello PETSC-ers, >>> >>> I am a research faculty at Univ of Pittsburgh trying to use PETSC/SLEPC >>> to >>> obtain the diagonalization of a large matrix using Lanczos or Davidson >>> method. >>> >>> The matrix is a 3 dimensional dense matrix with a total of 216000 >>> elements. >>> >>> After looking into some of the examples in PETSC as well SLEPC >>> implementations >>> it seems like most of the implementations are with 2 dimensional >>> matrices. >>> >> >> You will have to explain what you mean by a "3D matrix". A matrix, by >> definition, has only >> rows and columns. You may mean a matrix generated from a 3D problem. That >> should pose >> no extra difficulty. You may mean a 3-index tensor, in which case >> diagonalization is not a clear >> concept. >> >> Thanks, >> >> Matt >> >> >>> So, I was wondering if it is possible to express a 3 dimensional matrix >>> object >>> compatible to PETSC so that the SLEPC API could be used to obtain >>> diagonalization. >>> >>> Any suggestions or pointers to documentation or examples would be of >>> great >>> help. >>> >>> Best, >>> -- >>> Ketan >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > Ketan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Jul 11 14:06:14 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 11 Jul 2016 21:06:14 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <44EF3239-8AA0-4157-B04A-BC3437409215@gmail.com> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> <44EF3239-8AA0-4157-B04A-BC3437409215@gmail.com> Message-ID: <06F337FC-9F59-4633-9A07-A253C33080EE@dsic.upv.es> I don't understand why I don't get this warning. Still I don't see where the problem is. Please tell me exactly what you want me to change, or better make a pull request. Thanks. 
Jose > El 11 jul 2016, a las 17:06, Denis Davydov escribi?: > > Here is the warning: > > Your SLEPC_DIR may not match the directory you are in > SLEPC_DIR /Users/davydden/spack/var/spack/stage/slepc-3.7.1-p7hqqclwqvbvra6j44lka3xuc4eycvdg/slepc-3.7.1 Current directory /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-m7Xg8I/slepc-3.7.1 > > p.s. this is done within Spack, for a fix see: https://github.com/LLNL/spack/pull/1206 > >> On 11 Jul 2016, at 16:53, Jose E. Roman wrote: >> >> I cannot reproduce this behaviour. If I do for instance this (on OS X El Capitan): >> >> $ cd ~/tmp >> $ ln -s $SLEPC_DIR . >> $ cd slepc-3.7.1 >> $ ./configure >> $ make >> $ otool -lv $PETSC_ARCH/lib/libslepc.dylib | grep slepc >> >> I don't get a warning, and the output of otool is the same that would result if done on $SLEPC_DIR. >> Which warning are you getting? >> >> Jose >> >> >>> El 11 jul 2016, a las 0:48, Denis Davydov escribi?: >>> >>> Hi Jose, >>> >>> so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). >>> During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), >>> but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: >>> >>> oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>> >>> installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>> >>> archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt >>> >>> installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr >>> >>> dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib >>> >>> As you see, installName wasn?t changed from oldname. >>> >>> Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make >>> sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. >>> >>> When SLEPC_DIR==pwd() the patch you referred works. >>> >>> Kind regards, >>> Denis >>> >> > From davydden at gmail.com Mon Jul 11 14:43:47 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 11 Jul 2016 21:43:47 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <06F337FC-9F59-4633-9A07-A253C33080EE@dsic.upv.es> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> <44EF3239-8AA0-4157-B04A-BC3437409215@gmail.com> <06F337FC-9F59-4633-9A07-A253C33080EE@dsic.upv.es> Message-ID: > On 11 Jul 2016, at 21:06, Jose E. Roman wrote: > > I don't understand why I don't get this warning. > Still I don't see where the problem is. Please tell me exactly what you want me to change, or better make a pull request. The problem has to do with the assumptions in python scripts. See below values of variables which will not work as expected, i.e. 
installName = oldname.replace(self.archDir, self.installDir) will not do any replace. Why you can?t reproduce it ? i don?t know. In any case, i have a working solution, so it?s not an issue for me and it is up to you if you want to further investigate it. I just wanted to point out that this part of the python code does not work in all circumstances. Regards, Denis. >>>> so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). >>>> During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), >>>> but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: >>>> >>>> oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>>> >>>> installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>>> >>>> archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt >>>> >>>> installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr >>>> >>>> dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib >>>> >>>> As you see, installName wasn?t changed from oldname. >>>> >>>> Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make >>>> sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. >>>> >>>> When SLEPC_DIR==pwd() the patch you referred works. From dave.mayhem23 at gmail.com Mon Jul 11 15:18:01 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 11 Jul 2016 22:18:01 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <5783D3E4.4020004@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: Hi Frank, On 11 July 2016 at 19:14, frank wrote: > Hi Dave, > > I re-run the test using bjacobi as the preconditioner on the coarse mesh > of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The > petsc option file is attached. > I still got the "Out Of Memory" error. The error occurred before the > linear solver finished one step. So I don't have the full info from > ksp_view. The info from ksp_view_pre is attached. > Okay - that is essentially useless (sorry) > > It seems to me that the error occurred when the decomposition was going to > be changed. > Based on what information? Running with -info would give us more clues, but will create a ton of output. Please try running the case which failed with -info > I had another test with a grid of 1536*128*384 and the same process mesh > as above. There was no error. The ksp_view info is attached for comparison. > Thank you. > [3] Here is my crude estimate of your memory usage. 
I'll target the biggest memory hogs only to get an order of magnitude estimate * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) * You use 5 levels of coarsening, so the other operators should represent (collectively) 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. The temporary matrix is now destroyed. * Because a DMDA is detected, a permutation matrix is assembled. This requires 2 doubles per point in the DMDA. Your coarse DMDA contains 92 x 16 x 48 points. Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. >From my rough estimates, the worst case memory foot print for any given core, given your options is approximately 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB This is way below 8 GB. Note this estimate completely ignores: (1) the memory required for the restriction operator, (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) (3) all temporary vectors required by the CG solver, and those required by the smoothers. (4) internal memory allocated by MatPtAP (5) memory associated with IS's used within PCTelescope So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates Since I don't have your code I cannot access the latter. Since I don't have access to the same machine you are running on, I think we need to take a step back. [1] What machine are you running on? Send me a URL if its available [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. Thanks, Dave > > > Frank > > > > > On 07/08/2016 10:38 PM, Dave May wrote: > > > > On Saturday, 9 July 2016, frank wrote: > >> Hi Barry and Dave, >> >> Thank both of you for the advice. >> >> @Barry >> I made a mistake in the file names in last email. I attached the correct >> files this time. >> For all the three tests, 'Telescope' is used as the coarse preconditioner. >> >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >> Part of the memory usage: Vector 125 124 3971904 0. 
>> Matrix 101 101 >> 9462372 0 >> >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> Part of the memory usage: Vector 125 124 681672 0. >> Matrix 101 101 >> 1462180 0. >> >> In theory, the memory usage in Test1 should be 8 times of Test2. In my >> case, it is about 6 times. >> >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per >> process: 32*32*32 >> Here I get the out of memory error. >> >> I tried to use -mg_coarse jacobi. In this way, I don't need to set >> -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >> The linear solver didn't work in this case. Petsc output some errors. >> >> @Dave >> In test3, I use only one instance of 'Telescope'. On the coarse mesh of >> 'Telescope', I used LU as the preconditioner instead of SVD. >> If my set the levels correctly, then on the last coarse mesh of MG where >> it calls 'Telescope', the sub-domain per process is 2*2*2. >> On the last coarse mesh of 'Telescope', there is only one grid point per >> process. >> I still got the OOM error. The detailed petsc option file is attached. > > > Do you understand the expected memory usage for the particular parallel > LU implementation you are using? I don't (seriously). Replace LU with > bjacobi and re-run this test. My point about solver debugging is still > valid. > > And please send the result of KSPView so we can see what is actually used > in the computations > > Thanks > Dave > > >> >> >> Thank you so much. >> >> Frank >> >> >> >> On 07/06/2016 02:51 PM, Barry Smith wrote: >> >>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>>> >>>> Hi Barry, >>>> >>>> Thank you for you advice. >>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the >>>> process mesh is 96*8*24. >>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is >>>> used as the preconditioner at the coarse mesh. >>>> The system gives me the "Out of Memory" error before the linear system >>>> is completely solved. >>>> The info from '-ksp_view_pre' is attached. I seems to me that the error >>>> occurs when it reaches the coarse mesh. >>>> >>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. >>>> The 3rd test uses the same grid but a different process mesh 48*4*12. >>>> >>> Are you sure this is right? The total matrix and vector memory usage >>> goes from 2nd test >>> Vector 384 383 8,193,712 0. >>> Matrix 103 103 11,508,688 0. >>> to 3rd test >>> Vector 384 383 1,590,520 0. >>> Matrix 103 103 3,508,664 0. >>> that is the memory usage got smaller but if you have only 1/8th the >>> processes and the same grid it should have gotten about 8 times bigger. Did >>> you maybe cut the grid by a factor of 8 also? If so that still doesn't >>> explain it because the memory usage changed by a factor of 5 something for >>> the vectors and 3 something for the matrices. >>> >>> >>> The linear solver and petsc options in 2nd and 3rd tests are the same in >>>> 1st test. The linear solver works fine in both test. >>>> I attached the memory usage of the 2nd and 3rd tests. The memory info >>>> is from the option '-log_summary'. I tried to use '-momery_info' as you >>>> suggested, but in my case petsc treated it as an unused option. It output >>>> nothing about the memory. Do I need to add sth to my code so I can use >>>> '-memory_info'? 
>>>> >>> Sorry, my mistake the option is -memory_view >>> >>> Can you run the one case with -memory_view and -mg_coarse jacobi >>> -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory >>> is used without the telescope? Also run case 2 the same way. >>> >>> Barry >>> >>> >>> >>> In both tests the memory usage is not large. >>>> >>>> It seems to me that it might be the 'telescope' preconditioner that >>>> allocated a lot of memory and caused the error in the 1st test. >>>> Is there is a way to show how much memory it allocated? >>>> >>>> Frank >>>> >>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>> >>>>> Frank, >>>>> >>>>> You can run with -ksp_view_pre to have it "view" the KSP before >>>>> the solve so hopefully it gets that far. >>>>> >>>>> Please run the problem that does fit with -memory_info when the >>>>> problem completes it will show the "high water mark" for PETSc allocated >>>>> memory and total memory used. We first want to look at these numbers to see >>>>> if it is using more memory than you expect. You could also run with say >>>>> half the grid spacing to see how the memory usage scaled with the increase >>>>> in grid points. Make the runs also with -log_view and send all the output >>>>> from these options. >>>>> >>>>> Barry >>>>> >>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a >>>>>> linear system in parallel. >>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse >>>>>> mesh for its good performance. >>>>>> The petsc options file is attached. >>>>>> >>>>>> The domain is a 3d box. >>>>>> It works well when the grid is 1536*128*384 and the process mesh is >>>>>> 96*8*24. When I double the size of grid and keep the same process mesh and >>>>>> petsc options, I get an "out of memory" error from the super-cluster I am >>>>>> using. >>>>>> Each process has access to at least 8G memory, which should be more >>>>>> than enough for my application. I am sure that all the other parts of my >>>>>> code( except the linear solver ) do not use much memory. So I doubt if >>>>>> there is something wrong with the linear solver. >>>>>> The error occurs before the linear system is completely solved so I >>>>>> don't have the info from ksp view. I am not able to re-produce the error >>>>>> with a smaller problem either. >>>>>> In addition, I tried to use the block jacobi as the preconditioner >>>>>> with the same grid and same decomposition. The linear solver runs extremely >>>>>> slow but there is no memory error. >>>>>> >>>>>> How can I diagnose what exactly cause the error? >>>>>> Thank you so much. >>>>>> >>>>>> Frank >>>>>> >>>>>> >>>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ibarletta at inogs.it Tue Jul 12 03:35:01 2016 From: ibarletta at inogs.it (Ivano Barletta) Date: Tue, 12 Jul 2016 10:35:01 +0200 Subject: [petsc-users] Using Petsc with Finite Elements Domain Decomposition Message-ID: Dear Petsc users my aim is to parallelize the solution of a linear system into a finite elements ocean model. The model has been almost entirely parallelized, with a partitioning of the domain made element-wise through the use of Zoltan libraries, so the subdomains share the nodes lying on the edges. 
The linear system includes node-to-node dependencies so my guess is that I need to create an halo surrounding each subdomain, to allow connections of edge nodes with neighbour subdomains ones Apart from that, my question is if Petsc accept a previously made partitioning (maybe taking into account of halo) using the data structures coming out of it Has anybody of you ever faced a similar problem? Thanks in advance Ivano -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 12 04:13:26 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Jul 2016 04:13:26 -0500 Subject: [petsc-users] Using Petsc with Finite Elements Domain Decomposition In-Reply-To: References: Message-ID: On Tue, Jul 12, 2016 at 3:35 AM, Ivano Barletta wrote: > Dear Petsc users > > my aim is to parallelize the solution of a linear > system into a finite elements > ocean model. > > The model has been almost entirely parallelized, with > a partitioning of the domain made element-wise through > the use of Zoltan libraries, so the subdomains > share the nodes lying on the edges. > > The linear system includes node-to-node dependencies > so my guess is that I need to create an halo surrounding > each subdomain, to allow connections of edge nodes with > neighbour subdomains ones > > Apart from that, my question is if Petsc accept a > previously made partitioning (maybe taking into account of halo) > using the data structures coming out of it > > Has anybody of you ever faced a similar problem? > If all you want to do is construct a PETSc Mat and Vec for the linear system, just give PETSc the non-overlapping partition to create those objects. You can input values on off-process partitions automatically using MatSetValues() and VecSetValues(). Thanks, Matt > Thanks in advance > Ivano > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Tue Jul 12 07:42:02 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Tue, 12 Jul 2016 14:42:02 +0200 Subject: [petsc-users] different convergence behaviour Message-ID: Hello I encountered different convergence behaviour of Newton Raphson when using different solver settings with PETSc For the first solver configuration, I used direct solver -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_icntl_1 6 -mat_mumps_icntl_4 3 -mat_mumps_icntl_7 4 -mat_mumps_icntl_14 40 -mat_mumps_icntl_23 0 The simulation can run completely and the NR typically converged after 6/7 iterations. Of course, it's very slow. For the second solver configuration: -ksp_type gmres -ksp_max_it 300 -ksp_gmres_restart 300 -ksp_gmres_modifiedgramschmidt -pc_view -pc_fieldsplit_type multiplicative -fieldsplit_u_pc_type hypre -fieldsplit_u_pc_hypre_type boomeramg -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 -fieldsplit_wp_ksp_rtol 1.0e-8 -fieldsplit_wp_pc_type hypre -fieldsplit_wp_pc_hypre_type boomeramg -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 The solver runs much faster, but the NR does not converge in 30 iterations after some time steps. 
I thought setting the solver tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate with tolerance 1e-30 (see sample log file). Can we set the tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it doesn't work. Sorry this problem is run with many time steps and is quite big so I cannot reproduce in a simple test case. Giang -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sample_log_iteration Type: application/octet-stream Size: 14045 bytes Desc: not available URL: From knepley at gmail.com Tue Jul 12 07:49:16 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Jul 2016 07:49:16 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: Message-ID: On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui wrote: > Hello > > I encountered different convergence behaviour of Newton Raphson when using > different solver settings with PETSc > > For the first solver configuration, I used direct solver > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package mumps > -mat_mumps_icntl_1 6 > -mat_mumps_icntl_4 3 > -mat_mumps_icntl_7 4 > -mat_mumps_icntl_14 40 > -mat_mumps_icntl_23 0 > > The simulation can run completely and the NR typically converged after 6/7 > iterations. Of course, it's very slow. For the second solver configuration: > -ksp_type gmres > -ksp_max_it 300 > -ksp_gmres_restart 300 > -ksp_gmres_modifiedgramschmidt > -pc_view > -pc_fieldsplit_type multiplicative > -fieldsplit_u_pc_type hypre > -fieldsplit_u_pc_hypre_type boomeramg > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 > -fieldsplit_wp_ksp_rtol 1.0e-8 > -fieldsplit_wp_pc_type hypre > -fieldsplit_wp_pc_hypre_type boomeramg > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 > > The solver runs much faster, but the NR does not converge in 30 iterations > after some time steps. I thought setting the solver tolerance -ksp_rtol > 1.0e-12 but it doesn't help much because GMRES already terminate with > tolerance 1e-30 (see sample log file). Can we set the tolerance of the > sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it > doesn't work. > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. Thanks, Matt > Sorry this problem is run with many time steps and is quite big so I > cannot reproduce in a simple test case. > > Giang > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
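(For reference, a minimal C sketch of the two things discussed in this thread: setting the tolerance of a fieldsplit sub-KSP programmatically rather than through the options database, and checking inside a hand-written NR loop whether the linear solve actually converged. This is not Giang's code; the outer KSP "ksp", the vectors b and x, the error variable ierr, and the assumption that the wp block is the second split (index 1) are all placeholders.)

PC                 pc;
PetscInt           nsplits;
KSP                *subksp;
KSPConvergedReason reason;

ierr = KSPSetUp(ksp);CHKERRQ(ierr);                   /* the fieldsplit sub-KSPs do not exist before setup */
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCFieldSplitGetSubKSP(pc,&nsplits,&subksp);CHKERRQ(ierr);
ierr = KSPSetType(subksp[1],KSPGMRES);CHKERRQ(ierr);  /* preonly never iterates, so its rtol is ignored */
ierr = KSPSetTolerances(subksp[1],1.0e-8,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);
ierr = PetscFree(subksp);CHKERRQ(ierr);               /* free the array returned above, not the KSPs themselves */

/* inside the NR loop */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
ierr = KSPGetConvergedReason(ksp,&reason);CHKERRQ(ierr);
if (reason < 0) {
  /* the linear solve failed; stop or recover instead of continuing the NR iteration */
}

The command-line equivalent of the tolerance part is -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8, which comes up again later in this thread.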
URL: From marius.hensgens at rwth-aachen.de Tue Jul 12 08:37:44 2016 From: marius.hensgens at rwth-aachen.de (Hensgens, Marius) Date: Tue, 12 Jul 2016 13:37:44 +0000 Subject: [petsc-users] petsc4py - Change default line search for SNES Newton Line Search Message-ID: <1468330674098.92345@rwth-aachen.de> Dear all, how can I change the used line search method after setting the SNES Type to 'newtonls' using petsc4py ? In the official PETSc documentation there is a function called SNESLineSearchSetType, however in petsc4py I can't find an equivalent function. Best regards, Marius -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Tue Jul 12 08:44:52 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Tue, 12 Jul 2016 15:44:52 +0200 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: Message-ID: Hi Matt 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES affect the convergence, and it took more iterations. But the simulation still failed at a definite time step. 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. In the log file I sent previously, the line KSP Object: (fieldsplit_wp_) 8 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test impressed me that the rtol for fieldsplit_wp is still 1.0e-5 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view I did not yet use SNES, instead using my NR iterator so I have no view for SNES. 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. Yes it is ill-conditioned Giang On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley wrote: > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui > wrote: > >> Hello >> >> I encountered different convergence behaviour of Newton Raphson when >> using different solver settings with PETSc >> >> For the first solver configuration, I used direct solver >> -ksp_type preonly >> -pc_type lu >> -pc_factor_mat_solver_package mumps >> -mat_mumps_icntl_1 6 >> -mat_mumps_icntl_4 3 >> -mat_mumps_icntl_7 4 >> -mat_mumps_icntl_14 40 >> -mat_mumps_icntl_23 0 >> >> The simulation can run completely and the NR typically converged after >> 6/7 iterations. Of course, it's very slow. For the second solver >> configuration: >> -ksp_type gmres >> -ksp_max_it 300 >> -ksp_gmres_restart 300 >> -ksp_gmres_modifiedgramschmidt >> -pc_view >> -pc_fieldsplit_type multiplicative >> -fieldsplit_u_pc_type hypre >> -fieldsplit_u_pc_hypre_type boomeramg >> -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >> -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 >> -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 >> -fieldsplit_wp_ksp_rtol 1.0e-8 >> -fieldsplit_wp_pc_type hypre >> -fieldsplit_wp_pc_hypre_type boomeramg >> -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS >> -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 >> -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 >> >> The solver runs much faster, but the NR does not converge in 30 >> iterations after some time steps. 
I thought setting the solver >> tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already >> terminate with tolerance 1e-30 (see sample log file). Can we set the >> tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol >> 1.0e-8 but it doesn't work. >> > > 1) In the log you sent, the linear solver converges due to the Relative > Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will > affect the convergence. > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS > send the view output. > > 3) I can't tell you anything about Newton convergence if you do not send > the output, -snes_monitor -snes_view > > 4) If there is a difference between LU and an iterative solver with > residual 1e-9, then your system is very ill-conditioned. > > Thanks, > > Matt > > >> Sorry this problem is run with many time steps and is quite big so I >> cannot reproduce in a simple test case. >> >> Giang >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Tue Jul 12 09:06:15 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 12 Jul 2016 17:06:15 +0300 Subject: [petsc-users] petsc4py - Change default line search for SNES Newton Line Search In-Reply-To: <1468330674098.92345@rwth-aachen.de> References: <1468330674098.92345@rwth-aachen.de> Message-ID: On 12 July 2016 at 16:37, Hensgens, Marius wrote: > how can I change the used line search method after setting the SNES Type to > 'newtonls' using petsc4py ? > Right now, you can either use the command line or programatically insert an option in the database opts = PETSc.Options() opts['snes_linesearch_type'] = lstype ... snes.setFromOptions() > In the official PETSc documentation there is a function called > SNESLineSearchSetType, however in petsc4py I can't find an equivalent > function. The SNESLineSearch type and related routines are not wrapped yet. -- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 0109 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 From knepley at gmail.com Tue Jul 12 09:52:27 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Jul 2016 09:52:27 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: Message-ID: On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui wrote: > Hi Matt > > 1) In the log you sent, the linear solver converges due to the Relative > Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will > affect the convergence. > > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES > affect the convergence, and it took more iterations. But the simulation > still failed at a definite time step. > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS > send the view output. 
> > In the log file I sent previously, the line > > KSP Object: (fieldsplit_wp_) 8 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 > KSP "preonly" does no iterations, so it does not read the tolerance. If you want to lower the tolerance, choose a solver like GMRES -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 > 3) I can't tell you anything about Newton convergence if you do not send > the output, -snes_monitor -snes_view > > I did not yet use SNES, instead using my NR iterator so I have no view for > SNES. > It is hard to debug an iteration which we did not code. It could be you have a bug. If not, then very small changes in the iterates are making a difference, which means your Jacobians are close to singular. A problem reformulation would probably help more than solver tweaking. Thanks, Matt > 4) If there is a difference between LU and an iterative solver with > residual 1e-9, then your system is very ill-conditioned. > Yes it is ill-conditioned > > > > > > > > Giang > > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley > wrote: > >> On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui >> wrote: >> >>> Hello >>> >>> I encountered different convergence behaviour of Newton Raphson when >>> using different solver settings with PETSc >>> >>> For the first solver configuration, I used direct solver >>> -ksp_type preonly >>> -pc_type lu >>> -pc_factor_mat_solver_package mumps >>> -mat_mumps_icntl_1 6 >>> -mat_mumps_icntl_4 3 >>> -mat_mumps_icntl_7 4 >>> -mat_mumps_icntl_14 40 >>> -mat_mumps_icntl_23 0 >>> >>> The simulation can run completely and the NR typically converged after >>> 6/7 iterations. Of course, it's very slow. For the second solver >>> configuration: >>> -ksp_type gmres >>> -ksp_max_it 300 >>> -ksp_gmres_restart 300 >>> -ksp_gmres_modifiedgramschmidt >>> -pc_view >>> -pc_fieldsplit_type multiplicative >>> -fieldsplit_u_pc_type hypre >>> -fieldsplit_u_pc_hypre_type boomeramg >>> -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >>> -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 >>> -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 >>> -fieldsplit_wp_ksp_rtol 1.0e-8 >>> -fieldsplit_wp_pc_type hypre >>> -fieldsplit_wp_pc_hypre_type boomeramg >>> -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS >>> -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 >>> -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 >>> >>> The solver runs much faster, but the NR does not converge in 30 >>> iterations after some time steps. I thought setting the solver >>> tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already >>> terminate with tolerance 1e-30 (see sample log file). Can we set the >>> tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol >>> 1.0e-8 but it doesn't work. >>> >> >> 1) In the log you sent, the linear solver converges due to the Relative >> Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will >> affect the convergence. >> >> 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS >> send the view output. >> >> 3) I can't tell you anything about Newton convergence if you do not send >> the output, -snes_monitor -snes_view >> >> 4) If there is a difference between LU and an iterative solver with >> residual 1e-9, then your system is very ill-conditioned. 
>> >> Thanks, >> >> Matt >> >> >>> Sorry this problem is run with many time steps and is quite big so I >>> cannot reproduce in a simple test case. >>> >>> Giang >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 12 21:33:08 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Jul 2016 21:33:08 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: > On Jul 11, 2016, at 3:18 PM, Dave May wrote: > > Hi Frank, > > > On 11 July 2016 at 19:14, frank wrote: > Hi Dave, > > I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. > I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. > > Okay - that is essentially useless (sorry) > > > It seems to me that the error occurred when the decomposition was going to be changed. > > Based on what information? > Running with -info would give us more clues, but will create a ton of output. > Please try running the case which failed with -info > > I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. > Thank you. > > > [3] Here is my crude estimate of your memory usage. > I'll target the biggest memory hogs only to get an order of magnitude estimate > > * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. > The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) > > * You use 5 levels of coarsening, so the other operators should represent (collectively) > 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. > The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. > > * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. > PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. > This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. > This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. > The temporary matrix is now destroyed. > > * Because a DMDA is detected, a permutation matrix is assembled. > This requires 2 doubles per point in the DMDA. > Your coarse DMDA contains 92 x 16 x 48 points. > Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. > > * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). 
Dave, MatPtAP has to generate some work space. Is it possible the "guess" it uses for needed work space is so absurdly (and unnecessarily) large that it triggers a memory issue? It is possible that other places that require "guesses" for work space produce a problem? Also are all the "guesses" properly -info logged so that we can detected them before the program is killed? Barry > At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. > > From my rough estimates, the worst case memory foot print for any given core, given your options is approximately > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > This is way below 8 GB. > > Note this estimate completely ignores: > (1) the memory required for the restriction operator, > (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) > (3) all temporary vectors required by the CG solver, and those required by the smoothers. > (4) internal memory allocated by MatPtAP > (5) memory associated with IS's used within PCTelescope > > So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates > > Since I don't have your code I cannot access the latter. > Since I don't have access to the same machine you are running on, I think we need to take a step back. > > [1] What machine are you running on? Send me a URL if its available > > [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) > If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. > This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. > > Thanks, > Dave > > > > Frank > > > > > On 07/08/2016 10:38 PM, Dave May wrote: >> >> >> On Saturday, 9 July 2016, frank wrote: >> Hi Barry and Dave, >> >> Thank both of you for the advice. >> >> @Barry >> I made a mistake in the file names in last email. I attached the correct files this time. >> For all the three tests, 'Telescope' is used as the coarse preconditioner. >> >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >> Part of the memory usage: Vector 125 124 3971904 0. >> Matrix 101 101 9462372 0 >> >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> Part of the memory usage: Vector 125 124 681672 0. >> Matrix 101 101 1462180 0. >> >> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. >> >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 >> Here I get the out of memory error. >> >> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >> The linear solver didn't work in this case. Petsc output some errors. >> >> @Dave >> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >> On the last coarse mesh of 'Telescope', there is only one grid point per process. >> I still got the OOM error. 
The detailed petsc option file is attached. >> >> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. >> >> And please send the result of KSPView so we can see what is actually used in the computations >> >> Thanks >> Dave >> >> >> >> Thank you so much. >> >> Frank >> >> >> >> On 07/06/2016 02:51 PM, Barry Smith wrote: >> On Jul 6, 2016, at 4:19 PM, frank wrote: >> >> Hi Barry, >> >> Thank you for you advice. >> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >> The system gives me the "Out of Memory" error before the linear system is completely solved. >> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >> >> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >> Vector 384 383 8,193,712 0. >> Matrix 103 103 11,508,688 0. >> to 3rd test >> Vector 384 383 1,590,520 0. >> Matrix 103 103 3,508,664 0. >> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >> >> >> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >> Sorry, my mistake the option is -memory_view >> >> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >> >> Barry >> >> >> >> In both tests the memory usage is not large. >> >> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >> Is there is a way to show how much memory it allocated? >> >> Frank >> >> On 07/05/2016 03:37 PM, Barry Smith wrote: >> Frank, >> >> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >> >> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >> >> Barry >> >> On Jul 5, 2016, at 5:23 PM, frank wrote: >> >> Hi, >> >> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. 
>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >> The petsc options file is attached. >> >> The domain is a 3d box. >> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >> >> How can I diagnose what exactly cause the error? >> Thank you so much. >> >> Frank >> >> >> > > From bsmith at mcs.anl.gov Tue Jul 12 22:16:35 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Jul 2016 22:16:35 -0500 Subject: [petsc-users] Using Petsc with Finite Elements Domain Decomposition In-Reply-To: References: Message-ID: > On Jul 12, 2016, at 4:13 AM, Matthew Knepley wrote: > > On Tue, Jul 12, 2016 at 3:35 AM, Ivano Barletta wrote: > Dear Petsc users > > my aim is to parallelize the solution of a linear > system into a finite elements > ocean model. > > The model has been almost entirely parallelized, with > a partitioning of the domain made element-wise through > the use of Zoltan libraries, so the subdomains > share the nodes lying on the edges. > > The linear system includes node-to-node dependencies > so my guess is that I need to create an halo surrounding > each subdomain, to allow connections of edge nodes with > neighbour subdomains ones > > Apart from that, my question is if Petsc accept a > previously made partitioning (maybe taking into account of halo) > using the data structures coming out of it > > Has anybody of you ever faced a similar problem? > > If all you want to do is construct a PETSc Mat and Vec for the linear system, > just give PETSc the non-overlapping partition to create those objects. You > can input values on off-process partitions automatically using MatSetValues() > and VecSetValues(). Note that by just using the VecSetValues() and MatSetValues() PETSc will manage all the halo business needed by the linear algebra system solver automatically. You don't need to provide any halo information to PETSc. It is really straightforward. Barry > > Thanks, > > Matt > > Thanks in advance > Ivano > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Tue Jul 12 22:43:59 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Jul 2016 22:43:59 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: Message-ID: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> It is not uncommon for an iterative linear solver to work fine for some time steps but then start to perform poorly at a later timestep because the physics (mathematically the conditioning or eigenstructure of the Jacobian) changes over time; perhaps becomes singular. 
Another possibility is that the trajectory of the solution is very sensitive to the solution of the nonlinear problem at each time step, so that an iterative linear solver and a direct linear solver result in very different physical solutions after many time steps. In other words, after many time-steps the computed solutions can be very different, and if the computed solution for the iterative linear solver is eventually "non-physical" or ill-conditioned the nonlinear solver could break down. Please run with the iterative solver (that eventually breaks) with the options -ksp_monitor_true_solution -ksp_converged_reason and send ALL the output (it will be very large, don't worry about it). Then we can see if the linear solver is breaking down. Note that by default PETSc linear solvers do not generate an error that stops the program if the linear solve fails, hence your NR code should call KSPGetConvergedReason() after EVERY linear solve and if the reason is negative your code needs to do something different since the linear solve failed and your code should not just keep on running NR. Barry > On Jul 12, 2016, at 9:52 AM, Matthew Knepley wrote: > > On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui wrote: > Hi Matt > > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. > > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES affect the convergence, and it took more iterations. But the simulation still failed at a definite time step. > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. > > In the log file I sent previously, the line > > KSP Object: (fieldsplit_wp_) 8 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 > > KSP "preonly" does no iterations, so it does not read the tolerance. If you want to lower the tolerance, choose a solver like GMRES > > -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 > > 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view > > I did not yet use SNES, instead using my NR iterator so I have no view for SNES. > > It is hard to debug an iteration which we did not code. It could be you have a bug. If not, then very small changes in the iterates are making a difference, which means your Jacobians are close to singular. A problem reformulation would probably help more than solver tweaking. > > Thanks, > > Matt > > 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. > Yes it is ill-conditioned > > > > > > > > Giang > > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley wrote: > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui wrote: > Hello > > I encountered different convergence behaviour of Newton Raphson when using different solver settings with PETSc > > For the first solver configuration, I used direct solver > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package mumps > -mat_mumps_icntl_1 6 > -mat_mumps_icntl_4 3 > -mat_mumps_icntl_7 4 > -mat_mumps_icntl_14 40 > -mat_mumps_icntl_23 0 > > The simulation can run completely and the NR typically converged after 6/7 iterations. Of course, it's very slow.
For the second solver configuration: > -ksp_type gmres > -ksp_max_it 300 > -ksp_gmres_restart 300 > -ksp_gmres_modifiedgramschmidt > -pc_view > -pc_fieldsplit_type multiplicative > -fieldsplit_u_pc_type hypre > -fieldsplit_u_pc_hypre_type boomeramg > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 > -fieldsplit_wp_ksp_rtol 1.0e-8 > -fieldsplit_wp_pc_type hypre > -fieldsplit_wp_pc_hypre_type boomeramg > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 > > The solver runs much faster, but the NR does not converge in 30 iterations after some time steps. I thought setting the solver tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate with tolerance 1e-30 (see sample log file). Can we set the tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it doesn't work. > > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. > > 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view > > 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. > > Thanks, > > Matt > > Sorry this problem is run with many time steps and is quite big so I cannot reproduce in a simple test case. > > Giang > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From mono at dtu.dk Wed Jul 13 03:57:06 2016 From: mono at dtu.dk (=?Windows-1252?Q?Morten_Nobel-J=F8rgensen?=) Date: Wed, 13 Jul 2016 08:57:06 +0000 Subject: [petsc-users] Distribution of DMPlex for FEM Message-ID: I'm having problems distributing a simple FEM model using DMPlex. For the test case I use 1x1x2 hex box elements (/cells) with 12 vertices. Each vertex has one DOF. When I distribute the system to two processors, each gets a single element and the local vector has size 8 (one DOF for each vertex of a hex box) as expected. My problem is that when I manually assemble the global stiffness matrix (a 12x12 matrix) it seems like my ghost values are ignored. I'm sure that I'm missing something obvious but cannot see what it is. In the attached example, I'm assembling the global stiffness matrix using a simple local stiffness matrix of ones. This makes it very easy to see if the matrix is assembled correctly. If I run it on one process, then the global stiffness matrix consists of 0's, 1's and 2's and its trace is 16.0. But if I run it distributed on two processes, then it consists only of 0's and 1's and its trace is 12.0. I hope that somebody can spot my mistake and help me in the right direction :) Kind regards, Morten -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
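(Without seeing ex18.cc it is hard to say where the ghost contributions are lost, but for reference here is a minimal C sketch of an assembly loop that does sum contributions into rows owned by the other rank. It is not Morten's code: it assumes the distributed DMPlex "dm" already has a section with 1 dof per vertex attached, that K was obtained from DMCreateMatrix(dm,&K), and that ierr is declared; the 8x8 element matrix of ones mirrors the test described above.)

PetscInt    cStart, cEnd, c, i;
PetscScalar Ke[8*8];

for (i = 0; i < 8*8; ++i) Ke[i] = 1.0;                            /* dummy element stiffness matrix of ones */
ierr = DMPlexGetHeightStratum(dm,0,&cStart,&cEnd);CHKERRQ(ierr);  /* the cells of this rank */
for (c = cStart; c < cEnd; ++c) {
  /* ADD_VALUES accumulates contributions; entries destined for rows owned by the
     other rank are stashed here and communicated during assembly.
     NULL, NULL means: use the DM's default local and global sections. */
  ierr = DMPlexMatSetClosure(dm,NULL,NULL,K,c,Ke,ADD_VALUES);CHKERRQ(ierr);
}
ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);      /* off-process (ghost) sums happen here */
ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

If the trace still comes out as 12 instead of 16 on two ranks, the usual suspects are inserting with INSERT_VALUES (which overwrites instead of adding) or inspecting the matrix before MatAssemblyEnd() has completed.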
Name: ex18.cc Type: application/octet-stream Size: 4631 bytes Desc: ex18.cc URL: From dave.mayhem23 at gmail.com Wed Jul 13 04:17:06 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 13 Jul 2016 11:17:06 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: Hi Barry, > Dave, > > MatPtAP has to generate some work space. Is it possible the "guess" it > uses for needed work space is so absurdly (and unnecessarily) large that it > triggers a memory issue? It is possible that other places that require > "guesses" for work space produce a problem? This is entirely possible. I've never ever used PtAP at the scale of Franks simulation. I poked around in src/mat/impls/aij/mpi/mpiptap.c In this function, MatPtAPSymbolic_MPIAIJ_MPIAIJ_ptap() I see the following code /* set default scalable */ ptap->scalable = PETSC_FALSE; /* PETSC_TRUE; */ ierr = PetscOptionsGetBool(((PetscObject)Cmpi)->options,((PetscObject)Cmpi)->prefix, "-matptap_scalable",&ptap-> This indicates that the default choice being used (despite the comment,) is to use the faster, but also the more memory hungry variant of MatPtAP for MPIAIJ matrices. Looks like someone has changed the default. The following comment is off topic from the email thread but... This particular file is littered with #ifdefs related to profiling (PTAP_PROFILE). This variable is not defined by default. I would much prefer be if this kind of thing was available all the time via a run time flag rather than a configure flag. Also, it would be great to augment the profiling for PtAP with memory usage as currently only CPU time is logged. Awhile back I proposed a PR for an "operation logger" object (which you absolutely hated). The functionality of this logger would be useful to get rid of the #if defined stuff for PtAP and be able to report meaningful details about both the memory and CPU time. I used this logger for the pctelescope paper and found it immensely useful. But to the topic. Frank, you might want to try running your job with the command line option -matptap_scalable (or -XXX_matptap_scalable if you have given assigned a name to your operator.) As always, run a small job first with -options_left 1 to ensure the option name is spelled correctly and being used. Let us know if this helps. Cheers, Dave Also are all the "guesses" properly -info logged so that we can detected > them before the program is killed? > > > Barry > > > > At any stage in PCTelescope, only 2 operators of size 32 MB are held in > memory when the DMDA is provided. > > > > From my rough estimates, the worst case memory foot print for any given > core, given your options is approximately > > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > > This is way below 8 GB. > > > > Note this estimate completely ignores: > > (1) the memory required for the restriction operator, > > (2) the potential growth in the number of non-zeros per row due to > Galerkin coarsening (I wished -ksp_view_pre reported the output from > MatView so we could see the number of non-zeros required by the coarse > level operators) > > (3) all temporary vectors required by the CG solver, and those required > by the smoothers. 
> > (4) internal memory allocated by MatPtAP > > (5) memory associated with IS's used within PCTelescope > > > > So either I am completely off in my estimates, or you have not carefully > estimated the memory usage of your application code. Hopefully others might > examine/correct my rough estimates > > > > Since I don't have your code I cannot access the latter. > > Since I don't have access to the same machine you are running on, I > think we need to take a step back. > > > > [1] What machine are you running on? Send me a URL if its available > > > > [2] What discretization are you using? (I am guessing a scalar 7 point > FD stencil) > > If it's a 7 point FD stencil, we should be able to examine the memory > usage of your solver configuration using a standard, light weight existing > PETSc example, run on your machine at the same scale. > > This would hopefully enable us to correctly evaluate the actual memory > usage required by the solver configuration you are using. > > > > Thanks, > > Dave > > > > > > > > Frank > > > > > > > > > > On 07/08/2016 10:38 PM, Dave May wrote: > >> > >> > >> On Saturday, 9 July 2016, frank wrote: > >> Hi Barry and Dave, > >> > >> Thank both of you for the advice. > >> > >> @Barry > >> I made a mistake in the file names in last email. I attached the > correct files this time. > >> For all the three tests, 'Telescope' is used as the coarse > preconditioner. > >> > >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > >> Part of the memory usage: Vector 125 124 3971904 0. > >> Matrix 101 101 > 9462372 0 > >> > >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > >> Part of the memory usage: Vector 125 124 681672 0. > >> Matrix 101 101 > 1462180 0. > >> > >> In theory, the memory usage in Test1 should be 8 times of Test2. In my > case, it is about 6 times. > >> > >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per > process: 32*32*32 > >> Here I get the out of memory error. > >> > >> I tried to use -mg_coarse jacobi. In this way, I don't need to set > -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > >> The linear solver didn't work in this case. Petsc output some errors. > >> > >> @Dave > >> In test3, I use only one instance of 'Telescope'. On the coarse mesh of > 'Telescope', I used LU as the preconditioner instead of SVD. > >> If my set the levels correctly, then on the last coarse mesh of MG > where it calls 'Telescope', the sub-domain per process is 2*2*2. > >> On the last coarse mesh of 'Telescope', there is only one grid point > per process. > >> I still got the OOM error. The detailed petsc option file is attached. > >> > >> Do you understand the expected memory usage for the particular parallel > LU implementation you are using? I don't (seriously). Replace LU with > bjacobi and re-run this test. My point about solver debugging is still > valid. > >> > >> And please send the result of KSPView so we can see what is actually > used in the computations > >> > >> Thanks > >> Dave > >> > >> > >> > >> Thank you so much. > >> > >> Frank > >> > >> > >> > >> On 07/06/2016 02:51 PM, Barry Smith wrote: > >> On Jul 6, 2016, at 4:19 PM, frank wrote: > >> > >> Hi Barry, > >> > >> Thank you for you advice. > >> I tried three test. In the 1st test, the grid is 3072*256*768 and the > process mesh is 96*8*24. > >> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is > used as the preconditioner at the coarse mesh. > >> The system gives me the "Out of Memory" error before the linear system > is completely solved. 
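As an aside on Dave's -matptap_scalable suggestion above: if changing the batch script is inconvenient, the same option can be pushed into the options database from the application before the preconditioner is set up. A sketch only, assuming the PETSc 3.7 options API where the first argument is a PetscOptions object (NULL selects the global database):

  PetscErrorCode ierr;
  /* Same effect as passing -matptap_scalable on the command line; it must be set
     before the Galerkin coarse operators are formed, i.e. before KSPSetUp/PCSetUp. */
  ierr = PetscOptionsSetValue(NULL, "-matptap_scalable", "true");CHKERRQ(ierr);

As Dave says, a quick run with -options_left 1 confirms whether the option (with or without an operator prefix) is actually being consumed.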
> >> The info from '-ksp_view_pre' is attached. I seems to me that the error > occurs when it reaches the coarse mesh. > >> > >> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. > The 3rd test uses the same grid but a different process mesh 48*4*12. > >> Are you sure this is right? The total matrix and vector memory > usage goes from 2nd test > >> Vector 384 383 8,193,712 0. > >> Matrix 103 103 11,508,688 0. > >> to 3rd test > >> Vector 384 383 1,590,520 0. > >> Matrix 103 103 3,508,664 0. > >> that is the memory usage got smaller but if you have only 1/8th the > processes and the same grid it should have gotten about 8 times bigger. Did > you maybe cut the grid by a factor of 8 also? If so that still doesn't > explain it because the memory usage changed by a factor of 5 something for > the vectors and 3 something for the matrices. > >> > >> > >> The linear solver and petsc options in 2nd and 3rd tests are the same > in 1st test. The linear solver works fine in both test. > >> I attached the memory usage of the 2nd and 3rd tests. The memory info > is from the option '-log_summary'. I tried to use '-momery_info' as you > suggested, but in my case petsc treated it as an unused option. It output > nothing about the memory. Do I need to add sth to my code so I can use > '-memory_info'? > >> Sorry, my mistake the option is -memory_view > >> > >> Can you run the one case with -memory_view and -mg_coarse jacobi > -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory > is used without the telescope? Also run case 2 the same way. > >> > >> Barry > >> > >> > >> > >> In both tests the memory usage is not large. > >> > >> It seems to me that it might be the 'telescope' preconditioner that > allocated a lot of memory and caused the error in the 1st test. > >> Is there is a way to show how much memory it allocated? > >> > >> Frank > >> > >> On 07/05/2016 03:37 PM, Barry Smith wrote: > >> Frank, > >> > >> You can run with -ksp_view_pre to have it "view" the KSP before > the solve so hopefully it gets that far. > >> > >> Please run the problem that does fit with -memory_info when the > problem completes it will show the "high water mark" for PETSc allocated > memory and total memory used. We first want to look at these numbers to see > if it is using more memory than you expect. You could also run with say > half the grid spacing to see how the memory usage scaled with the increase > in grid points. Make the runs also with -log_view and send all the output > from these options. > >> > >> Barry > >> > >> On Jul 5, 2016, at 5:23 PM, frank wrote: > >> > >> Hi, > >> > >> I am using the CG ksp solver and Multigrid preconditioner to solve a > linear system in parallel. > >> I chose to use the 'Telescope' as the preconditioner on the coarse mesh > for its good performance. > >> The petsc options file is attached. > >> > >> The domain is a 3d box. > >> It works well when the grid is 1536*128*384 and the process mesh is > 96*8*24. When I double the size of grid and keep the same process mesh and > petsc options, I get an "out of memory" error from the super-cluster I am > using. > >> Each process has access to at least 8G memory, which should be more > than enough for my application. I am sure that all the other parts of my > code( except the linear solver ) do not use much memory. So I doubt if > there is something wrong with the linear solver. > >> The error occurs before the linear system is completely solved so I > don't have the info from ksp view. 
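Spelling out the shorthand in the quoted suggestion above, the memory-comparison run could look roughly like the following; the executable name and process count are placeholders, and -mg_coarse_pc_type jacobi (with the coarse KSP left at its preonly default) is one way to write the "-mg_coarse jacobi" shorthand:

  mpiexec -n <np> ./app -memory_view -log_view -ksp_max_it 1 -mg_coarse_pc_type jacobi

The point is only to make the coarse "solve" essentially free so that the -memory_view numbers isolate the memory taken by the rest of the hierarchy.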
I am not able to re-produce the error > with a smaller problem either. > >> In addition, I tried to use the block jacobi as the preconditioner > with the same grid and same decomposition. The linear solver runs extremely > slow but there is no memory error. > >> > >> How can I diagnose what exactly cause the error? > >> Thank you so much. > >> > >> Frank > >> > >> > > >> > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 13 08:16:07 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 08:16:07 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: <0903F73E-E332-48C6-931E-F89B2D7C6676@mcs.anl.gov> > On Jul 13, 2016, at 4:17 AM, Dave May wrote: > > Hi Barry, > > > Dave, > > MatPtAP has to generate some work space. Is it possible the "guess" it uses for needed work space is so absurdly (and unnecessarily) large that it triggers a memory issue? It is possible that other places that require "guesses" for work space produce a problem? > > This is entirely possible. I've never ever used PtAP at the scale of Franks simulation. > I poked around in > src/mat/impls/aij/mpi/mpiptap.c > > In this function, MatPtAPSymbolic_MPIAIJ_MPIAIJ_ptap() > I see the following code > /* set default scalable */ > > ptap->scalable = PETSC_FALSE; /* PETSC_TRUE; */ > > ierr = PetscOptionsGetBool(((PetscObject)Cmpi)->options,((PetscObject)Cmpi)->prefix,"-matptap_scalable",&ptap-> > > This indicates that the default choice being used (despite the comment,) is to use the faster, but also the more memory hungry variant of MatPtAP for MPIAIJ matrices. > Looks like someone has changed the default. > > The following comment is off topic from the email thread but... > > This particular file is littered with #ifdefs related to profiling (PTAP_PROFILE). > This variable is not defined by default. I would much prefer be if this kind of thing was available all the time via a run time flag rather than a configure flag. Dave, I agree the #if def stuff is horrible. I would want this handled with the regular PetscLogEvent() calls; and if they for some reason are not suitable for the task then we should improve them somehow. Note that it is possible to turn off logging of certain events by default via /* Turn off high traffic events by default */ ierr = PetscLogEventSetActiveAll(MAT_SetValues, PETSC_FALSE);CHKERRQ(ierr); so this horrible custom stuff in in mpiptap.c and also in gamg.c doesn't need to exist. Better eyes doing pull requests would have stopped this nonsense from ever getting in the master branch.. > Also, it would be great to augment the profiling for PtAP with memory usage as currently only CPU time is logged. Hmm, is there some generic way we can support this via the PetscLogEvent stuff (but not with your absolutely horrible operation logger event :-). Perhaps at each event begin we could record the memory high water mark and current usage and then at the event end compute the increase in the high water mark and current usage and record those with the event (and stage). Then in the -log_view we could see for example MatPtAP 1 100 secs ...... the usual columns of time etc information .... 
1 G (temp real process memory usage) 5 G (temp malloced) .5 (permanent real process memory usage) 1 (perm malloced). In other words just add more columns of data for each event related to memory usage? Perhaps it can be done better than I suggest above? > > Awhile back I proposed a PR for an "operation logger" object (which you absolutely hated). The functionality of this logger would be useful to get rid of the #if defined stuff for PtAP and be able to report meaningful details about both the memory and CPU time. I used this logger for the pctelescope paper and found it immensely useful. > > But to the topic. Frank, you might want to try running your job with the command line option > -matptap_scalable > (or -XXX_matptap_scalable if you have given assigned a name to your operator.) > As always, run a small job first with -options_left 1 to ensure the option name is spelled correctly and being used. > > Let us know if this helps. > > > Cheers, > Dave > > > Also are all the "guesses" properly -info logged so that we can detected them before the program is killed? > > > Barry > > > > At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. > > > > From my rough estimates, the worst case memory foot print for any given core, given your options is approximately > > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > > This is way below 8 GB. > > > > Note this estimate completely ignores: > > (1) the memory required for the restriction operator, > > (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) > > (3) all temporary vectors required by the CG solver, and those required by the smoothers. > > (4) internal memory allocated by MatPtAP > > (5) memory associated with IS's used within PCTelescope > > > > So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates > > > > Since I don't have your code I cannot access the latter. > > Since I don't have access to the same machine you are running on, I think we need to take a step back. > > > > [1] What machine are you running on? Send me a URL if its available > > > > [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) > > If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. > > This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. > > > > Thanks, > > Dave > > > > > > > > Frank > > > > > > > > > > On 07/08/2016 10:38 PM, Dave May wrote: > >> > >> > >> On Saturday, 9 July 2016, frank wrote: > >> Hi Barry and Dave, > >> > >> Thank both of you for the advice. > >> > >> @Barry > >> I made a mistake in the file names in last email. I attached the correct files this time. > >> For all the three tests, 'Telescope' is used as the coarse preconditioner. > >> > >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > >> Part of the memory usage: Vector 125 124 3971904 0. > >> Matrix 101 101 9462372 0 > >> > >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > >> Part of the memory usage: Vector 125 124 681672 0. > >> Matrix 101 101 1462180 0. 
> >> > >> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. > >> > >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 > >> Here I get the out of memory error. > >> > >> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > >> The linear solver didn't work in this case. Petsc output some errors. > >> > >> @Dave > >> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. > >> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. > >> On the last coarse mesh of 'Telescope', there is only one grid point per process. > >> I still got the OOM error. The detailed petsc option file is attached. > >> > >> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. > >> > >> And please send the result of KSPView so we can see what is actually used in the computations > >> > >> Thanks > >> Dave > >> > >> > >> > >> Thank you so much. > >> > >> Frank > >> > >> > >> > >> On 07/06/2016 02:51 PM, Barry Smith wrote: > >> On Jul 6, 2016, at 4:19 PM, frank wrote: > >> > >> Hi Barry, > >> > >> Thank you for you advice. > >> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. > >> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. > >> The system gives me the "Out of Memory" error before the linear system is completely solved. > >> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. > >> > >> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. > >> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test > >> Vector 384 383 8,193,712 0. > >> Matrix 103 103 11,508,688 0. > >> to 3rd test > >> Vector 384 383 1,590,520 0. > >> Matrix 103 103 3,508,664 0. > >> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. > >> > >> > >> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. > >> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? > >> Sorry, my mistake the option is -memory_view > >> > >> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. > >> > >> Barry > >> > >> > >> > >> In both tests the memory usage is not large. 
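Until -log_view grows the per-event memory columns Barry sketches above, the same numbers can be collected by hand around a suspect operation. A rough sketch, assuming A and P are already-assembled parallel matrices and that the operation of interest is the explicit Galerkin product:

  Mat            C;
  PetscLogDouble rss0, rss1, mal0, mal1;
  PetscErrorCode ierr;

  ierr = PetscMemoryGetCurrentUsage(&rss0);CHKERRQ(ierr);   /* process resident size    */
  ierr = PetscMallocGetCurrentUsage(&mal0);CHKERRQ(ierr);   /* bytes from PetscMalloc() */
  ierr = MatPtAP(A, P, MAT_INITIAL_MATRIX, 1.0, &C);CHKERRQ(ierr);
  ierr = PetscMemoryGetCurrentUsage(&rss1);CHKERRQ(ierr);
  ierr = PetscMallocGetCurrentUsage(&mal1);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "MatPtAP: process memory +%g bytes, PetscMalloc +%g bytes\n",
                     rss1 - rss0, mal1 - mal0);CHKERRQ(ierr);

Note the PetscMalloc counters are only filled in when PETSc's malloc tracing is active (the default in a debug build; an optimized build may need the -malloc option in the 3.7 era).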
> >> > >> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. > >> Is there is a way to show how much memory it allocated? > >> > >> Frank > >> > >> On 07/05/2016 03:37 PM, Barry Smith wrote: > >> Frank, > >> > >> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. > >> > >> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. > >> > >> Barry > >> > >> On Jul 5, 2016, at 5:23 PM, frank wrote: > >> > >> Hi, > >> > >> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. > >> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. > >> The petsc options file is attached. > >> > >> The domain is a 3d box. > >> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. > >> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. > >> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. > >> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. > >> > >> How can I diagnose what exactly cause the error? > >> Thank you so much. > >> > >> Frank > >> > >> > >> > > > > From hgbk2008 at gmail.com Wed Jul 13 10:34:58 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Wed, 13 Jul 2016 17:34:58 +0200 Subject: [petsc-users] different convergence behaviour In-Reply-To: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: Thanks Barry This is a good comment. Since material behaviour depends very much on the trajectory of the solution. I suspect that the error may concatenate during time stepping. I have re-run the simulation as you suggested and post the log file here: https://www.dropbox.com/s/d6l8ixme37uh47a/log13Jul16?dl=0 However, I did not get what -ksp_monitor_true_solution used for? I see that I have the same log that I had before. Giang On Wed, Jul 13, 2016 at 5:43 AM, Barry Smith wrote: > > It is not uncommon for an iterative linear solver to work fine for some > time steps but then start to perform poorly at a later timestep because the > physics (mathematically the conditioning or eigenstructure of the Jacobian) > changes over time; perhaps becomes singular. 
Another possibility is the > trajectory of the solution is very sensitive to the solution of the > nonlinear problem at each time step so that an iterative linear solver and > a direct linear solver result in very difficult physical solutions after > many time steps. In other words after many time-steps the computed > solutions can be very different and if the computed solution for the > iterative linear solver is eventually "non-physical" or ill-conditioned the > nonlinear solver could break down. > > Please run with the iterative solver (that eventually breaks) with the > option -ksp_monitor_true_solution -ksp_converged_reason and and send ALL > the output (it will be very large, don't worry about it). Then we can see > if the linear solver is breaking down. Note that by default PETSc linear > solvers do not generate an error that stops the program if the linear solve > fails, hence your NR code should call KSPGetConvergedReason() after EVERY > linear solve and if the reason is negative your code needs to do something > different since the linear solve failed and your code should not just keep > on running NR. > > Barry > > > > On Jul 12, 2016, at 9:52 AM, Matthew Knepley wrote: > > > > On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui > wrote: > > Hi Matt > > > > 1) In the log you sent, the linear solver converges due to the Relative > Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will > affect the convergence. > > > > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES > affect the convergence, and it took more iterations. But the simulation > still failed at a definite time step. > > > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? > ALWAYS send the view output. > > > > In the log file I sent previously, the line > > > > KSP Object: (fieldsplit_wp_) 8 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > > > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 > > > > KSP "preonly" does no iterations, so it does not read the tolerance. If > you want to lower the tolerance, > > choose a solver like GMRES > > > > -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 > > > > 3) I can't tell you anything about Newton convergence if you do not send > the output, -snes_monitor -snes_view > > > > I did not yet use SNES, instead using my NR iterator so I have no view > for SNES. > > > > It is hard to debug an iteration which we did not code. It could be you > have a bug. If not, then very small changes in > > the iterates are making a difference, which means your Jacobians are > close to singular. A problem reformulation would > > probably help more than solver tweaking. > > > > Thanks, > > > > Matt > > > > 4) If there is a difference between LU and an iterative solver with > residual 1e-9, then your system is very ill-conditioned. 
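To make the quoted point about preonly concrete: the tolerance for that split only becomes meaningful once the split has an iterative inner solver. A minimal options sketch (the converged_reason option is there only to verify the settings are being used):

  -fieldsplit_wp_ksp_type gmres
  -fieldsplit_wp_ksp_rtol 1.0e-8
  -fieldsplit_wp_ksp_max_it 100
  -fieldsplit_wp_ksp_converged_reason

With these, the -ksp_view output should show a gmres object with relative=1e-08 under the (fieldsplit_wp_) prefix instead of preonly.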
> > Yes it is ill-conditioned > > > > > > > > > > > > > > > > Giang > > > > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley > wrote: > > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui > wrote: > > Hello > > > > I encountered different convergence behaviour of Newton Raphson when > using different solver settings with PETSc > > > > For the first solver configuration, I used direct solver > > -ksp_type preonly > > -pc_type lu > > -pc_factor_mat_solver_package mumps > > -mat_mumps_icntl_1 6 > > -mat_mumps_icntl_4 3 > > -mat_mumps_icntl_7 4 > > -mat_mumps_icntl_14 40 > > -mat_mumps_icntl_23 0 > > > > The simulation can run completely and the NR typically converged after > 6/7 iterations. Of course, it's very slow. For the second solver > configuration: > > -ksp_type gmres > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_view > > -pc_fieldsplit_type multiplicative > > -fieldsplit_u_pc_type hypre > > -fieldsplit_u_pc_hypre_type boomeramg > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 > > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 > > -fieldsplit_wp_ksp_rtol 1.0e-8 > > -fieldsplit_wp_pc_type hypre > > -fieldsplit_wp_pc_hypre_type boomeramg > > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 > > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 > > > > The solver runs much faster, but the NR does not converge in 30 > iterations after some time steps. I thought setting the solver tolerance > -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate > with tolerance 1e-30 (see sample log file). Can we set the tolerance of the > sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it > doesn't work. > > > > 1) In the log you sent, the linear solver converges due to the Relative > Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will > affect the convergence. > > > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? > ALWAYS send the view output. > > > > 3) I can't tell you anything about Newton convergence if you do not send > the output, -snes_monitor -snes_view > > > > 4) If there is a difference between LU and an iterative solver with > residual 1e-9, then your system is very ill-conditioned. > > > > Thanks, > > > > Matt > > > > Sorry this problem is run with many time steps and is quite big so I > cannot reproduce in a simple test case. > > > > Giang > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jul 13 11:05:16 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Jul 2016 11:05:16 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: On Wed, Jul 13, 2016 at 10:34 AM, Hoang Giang Bui wrote: > Thanks Barry > > This is a good comment. Since material behaviour depends very much on the > trajectory of the solution. I suspect that the error may concatenate during > time stepping. 
> > I have re-run the simulation as you suggested and post the log file here: > https://www.dropbox.com/s/d6l8ixme37uh47a/log13Jul16?dl=0 > > However, I did not get what -ksp_monitor_true_solution used for? I see > that I have the same log that I had before. > That option is showing the last two numbers in these lines 0 KSP preconditioned resid norm 1.150038785083e+00 true resid norm 8.673040929526e+07 ||r(i)||/||b|| 1.000000000000e+00 Notice that there are 7 orders of magnitude between the apparent residual (using the preconditioner), and the actual residual, Ax - b. You are using Hypre, and this generally means the Hypre coarse grid operator is crap. Please a) Try ML or GAMG and look at the output again b) Try MUMPS, although you have 200 nonzeros/row so that fill-in might be extreme. The consequence is that you solve to what you think is machine precision (1e-13), but all you really get is (1e-4), so I can understand why the trajectory is completely different. Matt 1 KSP preconditioned resid norm 5.202876635759e-01 true resid norm 2.037005052213e+08 ||r(i)||/||b|| 2.348663022307e+00 2 KSP preconditioned resid norm 3.386127782775e-01 true resid norm 1.762196838305e+08 ||r(i)||/||b|| 2.031809664712e+00 3 KSP preconditioned resid norm 2.334102526025e-01 true resid norm 1.027451552306e+08 ||r(i)||/||b|| 1.184649721655e+00 4 KSP preconditioned resid norm 1.791251896569e-01 true resid norm 7.709961160729e+07 ||r(i)||/||b|| 8.889570824556e-01 5 KSP preconditioned resid norm 1.338763110903e-01 true resid norm 7.416954924746e+07 ||r(i)||/||b|| 8.551735181482e-01 6 KSP preconditioned resid norm 8.064262880339e-02 true resid norm 5.164444100149e+07 ||r(i)||/||b|| 5.954594405945e-01 7 KSP preconditioned resid norm 4.635705318709e-02 true resid norm 2.934800965373e+07 ||r(i)||/||b|| 3.383820034081e-01 8 KSP preconditioned resid norm 2.772133866748e-02 true resid norm 1.528356929458e+07 ||r(i)||/||b|| 1.762192686368e-01 9 KSP preconditioned resid norm 1.746753670007e-02 true resid norm 1.011788107951e+07 ||r(i)||/||b|| 1.166589799555e-01 10 KSP preconditioned resid norm 1.090702407895e-02 true resid norm 5.487922954253e+06 ||r(i)||/||b|| 6.327564920823e-02 11 KSP preconditioned resid norm 7.298748576067e-03 true resid norm 3.635843038640e+06 ||r(i)||/||b|| 4.192120235779e-02 12 KSP preconditioned resid norm 5.263606789063e-03 true resid norm 2.556946903793e+06 ||r(i)||/||b|| 2.948155006496e-02 13 KSP preconditioned resid norm 3.653208280595e-03 true resid norm 1.955721190606e+06 ||r(i)||/||b|| 2.254942881623e-02 14 KSP preconditioned resid norm 2.344759624903e-03 true resid norm 1.161259621408e+06 ||r(i)||/||b|| 1.338930175522e-02 15 KSP preconditioned resid norm 1.394564491254e-03 true resid norm 7.455856541894e+05 ||r(i)||/||b|| 8.596588673428e-03 16 KSP preconditioned resid norm 9.523395328600e-04 true resid norm 4.383808867461e+05 ||r(i)||/||b|| 5.054523440028e-03 17 KSP preconditioned resid norm 7.226014371144e-04 true resid norm 2.463564216053e+05 ||r(i)||/||b|| 2.840484941869e-03 18 KSP preconditioned resid norm 5.312593384754e-04 true resid norm 2.332075376781e+05 ||r(i)||/||b|| 2.688878555665e-03 19 KSP preconditioned resid norm 3.987403871945e-04 true resid norm 1.524236218549e+05 ||r(i)||/||b|| 1.757441514383e-03 20 KSP preconditioned resid norm 3.024350484979e-04 true resid norm 1.113568566173e+05 ||r(i)||/||b|| 1.283942477870e-03 21 KSP preconditioned resid norm 2.181724540430e-04 true resid norm 9.095158030900e+04 ||r(i)||/||b|| 1.048670022983e-03 22 KSP preconditioned resid norm 
1.497651066688e-04 true resid norm 7.045647741653e+04 ||r(i)||/||b|| 8.123618692570e-04 23 KSP preconditioned resid norm 1.067332245914e-04 true resid norm 4.317487154207e+04 ||r(i)||/||b|| 4.978054628463e-04 24 KSP preconditioned resid norm 8.206743871631e-05 true resid norm 3.328488127932e+04 ||r(i)||/||b|| 3.837740597534e-04 25 KSP preconditioned resid norm 6.446633932980e-05 true resid norm 2.816657573261e+04 ||r(i)||/||b|| 3.247600923538e-04 26 KSP preconditioned resid norm 5.068725017435e-05 true resid norm 2.427030232896e+04 ||r(i)||/||b|| 2.798361327495e-04 27 KSP preconditioned resid norm 4.056292508453e-05 true resid norm 1.963628903861e+04 ||r(i)||/||b|| 2.264060460243e-04 28 KSP preconditioned resid norm 3.278196251068e-05 true resid norm 1.710046122873e+04 ||r(i)||/||b|| 1.971679987179e-04 29 KSP preconditioned resid norm 2.796514916728e-05 true resid norm 1.500292999274e+04 ||r(i)||/||b|| 1.729835027259e-04 30 KSP preconditioned resid norm 2.469882695602e-05 true resid norm 1.317997814765e+04 ||r(i)||/||b|| 1.519649019847e-04 31 KSP preconditioned resid norm 2.175528107880e-05 true resid norm 1.158572445412e+04 ||r(i)||/||b|| 1.335831866616e-04 32 KSP preconditioned resid norm 1.912573933887e-05 true resid norm 1.001695718951e+04 ||r(i)||/||b|| 1.154953293880e-04 33 KSP preconditioned resid norm 1.647102125210e-05 true resid norm 8.271485921360e+03 ||r(i)||/||b|| 9.537007825249e-05 34 KSP preconditioned resid norm 1.337436641169e-05 true resid norm 6.611637805300e+03 ||r(i)||/||b|| 7.623206046211e-05 35 KSP preconditioned resid norm 9.896966695703e-06 true resid norm 4.752788536204e+03 ||r(i)||/||b|| 5.479956309238e-05 36 KSP preconditioned resid norm 6.766260764791e-06 true resid norm 3.239548441802e+03 ||r(i)||/||b|| 3.735193305468e-05 37 KSP preconditioned resid norm 4.835158711776e-06 true resid norm 2.113941262442e+03 ||r(i)||/||b|| 2.437370329068e-05 38 KSP preconditioned resid norm 3.598894380040e-06 true resid norm 1.653467554688e+03 ||r(i)||/||b|| 1.906445003688e-05 39 KSP preconditioned resid norm 2.522642742745e-06 true resid norm 1.344572919946e+03 ||r(i)||/||b|| 1.550290066507e-05 40 KSP preconditioned resid norm 1.750002168280e-06 true resid norm 1.015690774521e+03 ||r(i)||/||b|| 1.171089566825e-05 41 KSP preconditioned resid norm 1.371380245282e-06 true resid norm 8.480814540622e+02 ||r(i)||/||b|| 9.778363332462e-06 42 KSP preconditioned resid norm 1.174063380270e-06 true resid norm 7.575955225454e+02 ||r(i)||/||b|| 8.735062231359e-06 43 KSP preconditioned resid norm 1.022078284946e-06 true resid norm 6.758159410670e+02 ||r(i)||/||b|| 7.792145183661e-06 44 KSP preconditioned resid norm 8.861345665105e-07 true resid norm 5.913685641420e+02 ||r(i)||/||b|| 6.818468504268e-06 45 KSP preconditioned resid norm 7.574040382433e-07 true resid norm 4.958820201473e+02 ||r(i)||/||b|| 5.717510434653e-06 46 KSP preconditioned resid norm 6.331382122180e-07 true resid norm 3.988451175342e+02 ||r(i)||/||b|| 4.598676759110e-06 47 KSP preconditioned resid norm 5.210644796074e-07 true resid norm 3.077459761874e+02 ||r(i)||/||b|| 3.548305360116e-06 48 KSP preconditioned resid norm 4.285762531134e-07 true resid norm 2.383304155333e+02 ||r(i)||/||b|| 2.747945241696e-06 49 KSP preconditioned resid norm 3.365753654637e-07 true resid norm 1.802176480688e+02 ||r(i)||/||b|| 2.077906117741e-06 50 KSP preconditioned resid norm 2.556504175739e-07 true resid norm 1.322207275993e+02 ||r(i)||/||b|| 1.524502520785e-06 51 KSP preconditioned resid norm 1.929395464892e-07 true resid norm 
1.007938656038e+02 ||r(i)||/||b|| 1.162151388686e-06 52 KSP preconditioned resid norm 1.518353128559e-07 true resid norm 7.979486270816e+01 ||r(i)||/||b|| 9.200332773308e-07 53 KSP preconditioned resid norm 1.206065500213e-07 true resid norm 6.580266981926e+01 ||r(i)||/||b|| 7.587035545427e-07 54 KSP preconditioned resid norm 9.426597887251e-08 true resid norm 5.333098459078e+01 ||r(i)||/||b|| 6.149052566928e-07 55 KSP preconditioned resid norm 7.613592162567e-08 true resid norm 4.265349984159e+01 ||r(i)||/||b|| 4.917940568733e-07 56 KSP preconditioned resid norm 6.268355987149e-08 true resid norm 3.467681120568e+01 ||r(i)||/||b|| 3.998229858184e-07 57 KSP preconditioned resid norm 5.012883291890e-08 true resid norm 2.749870530323e+01 ||r(i)||/||b|| 3.170595587716e-07 58 KSP preconditioned resid norm 3.875711489918e-08 true resid norm 2.037239239206e+01 ||r(i)||/||b|| 2.348933039472e-07 59 KSP preconditioned resid norm 2.803879910778e-08 true resid norm 1.495957468476e+01 ||r(i)||/||b|| 1.724836168342e-07 60 KSP preconditioned resid norm 1.925214804831e-08 true resid norm 1.036952152845e+01 ||r(i)||/||b|| 1.195603896339e-07 61 KSP preconditioned resid norm 1.316807047769e-08 true resid norm 7.239457203086e+00 ||r(i)||/||b|| 8.347080639779e-08 62 KSP preconditioned resid norm 9.095263534284e-09 true resid norm 5.546725364022e+00 ||r(i)||/||b|| 6.395363989508e-08 63 KSP preconditioned resid norm 6.520024982652e-09 true resid norm 4.395022539849e+00 ||r(i)||/||b|| 5.067452783356e-08 64 KSP preconditioned resid norm 5.077084953418e-09 true resid norm 3.613138054874e+00 ||r(i)||/||b|| 4.165941431885e-08 65 KSP preconditioned resid norm 4.181478103167e-09 true resid norm 3.038027368880e+00 ||r(i)||/||b|| 3.502839884610e-08 66 KSP preconditioned resid norm 3.474545560062e-09 true resid norm 2.484725611092e+00 ||r(i)||/||b|| 2.864883990842e-08 67 KSP preconditioned resid norm 2.726294735157e-09 true resid norm 1.845741997810e+00 ||r(i)||/||b|| 2.128137077650e-08 68 KSP preconditioned resid norm 2.081101207644e-09 true resid norm 1.271838867185e+00 ||r(i)||/||b|| 1.466427839462e-08 69 KSP preconditioned resid norm 1.574053677511e-09 true resid norm 8.732579381622e-01 ||r(i)||/||b|| 1.006864772411e-08 70 KSP preconditioned resid norm 1.202717674216e-09 true resid norm 5.849220507056e-01 ||r(i)||/||b|| 6.744140324696e-09 71 KSP preconditioned resid norm 9.075713740333e-10 true resid norm 4.120181311262e-01 ||r(i)||/||b|| 4.750561359898e-09 72 KSP preconditioned resid norm 6.365151508838e-10 true resid norm 3.065749731760e-01 ||r(i)||/||b|| 3.534803717256e-09 73 KSP preconditioned resid norm 4.005974496315e-10 true resid norm 2.122086214944e-01 ||r(i)||/||b|| 2.446761444097e-09 74 KSP preconditioned resid norm 2.374916890000e-10 true resid norm 1.567794082480e-01 ||r(i)||/||b|| 1.807663650177e-09 75 KSP preconditioned resid norm 1.481096397633e-10 true resid norm 1.235242757193e-01 ||r(i)||/||b|| 1.424232592963e-09 76 KSP preconditioned resid norm 1.085014154415e-10 true resid norm 1.047268461651e-01 ||r(i)||/||b|| 1.207498581132e-09 77 KSP preconditioned resid norm 8.764582618532e-11 true resid norm 8.962364559579e-02 ||r(i)||/||b|| 1.033358960531e-09 78 KSP preconditioned resid norm 7.109092680274e-11 true resid norm 7.176047852904e-02 ||r(i)||/||b|| 8.273969777399e-10 79 KSP preconditioned resid norm 5.460763497752e-11 true resid norm 5.069849340150e-02 ||r(i)||/||b|| 5.845526824266e-10 80 KSP preconditioned resid norm 3.799942459039e-11 true resid norm 3.044234442091e-02 ||r(i)||/||b|| 
3.509996628435e-10 81 KSP preconditioned resid norm 2.481109284531e-11 true resid norm 1.726059230919e-02 ||r(i)||/||b|| 1.990143070861e-10 82 KSP preconditioned resid norm 1.569622532234e-11 true resid norm 1.070220060596e-02 ||r(i)||/||b|| 1.233961731867e-10 83 KSP preconditioned resid norm 1.022582071414e-11 true resid norm 7.402265790954e-03 ||r(i)||/||b|| 8.534798637643e-11 84 KSP preconditioned resid norm 7.284827374238e-12 true resid norm 5.658340974708e-03 ||r(i)||/||b|| 6.524056580253e-11 85 KSP preconditioned resid norm 5.402886839508e-12 true resid norm 4.464802757767e-03 ||r(i)||/||b|| 5.147909244343e-11 86 KSP preconditioned resid norm 3.933784995327e-12 true resid norm 3.350654653931e-03 ||r(i)||/||b|| 3.863298560628e-11 87 KSP preconditioned resid norm 2.792049995877e-12 true resid norm 2.402140873006e-03 ||r(i)||/||b|| 2.769663942007e-11 88 KSP preconditioned resid norm 2.058524741199e-12 true resid norm 1.747330249674e-03 ||r(i)||/||b|| 2.014668515774e-11 89 KSP preconditioned resid norm 1.568241303093e-12 true resid norm 1.266336540932e-03 ||r(i)||/||b|| 1.460083667564e-11 90 KSP preconditioned resid norm 1.164779378453e-12 true resid norm 8.484550691359e-04 ||r(i)||/||b|| 9.782671107287e-12 91 KSP preconditioned resid norm 7.995560038101e-13 true resid norm 5.065061038629e-04 ||r(i)||/||b|| 5.840005921551e-12 Linear solve converged due to CONVERGED_RTOL iterations 91 KSP Object: 8 MPI processes type: gmres GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization GMRES: happy breakdown tolerance 1e-30 maximum iterations=300, initial guess is zero tolerances: relative=1e-12, absolute=1e-20, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 8 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (fieldsplit_u_) 8 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_u_) 8 MPI processes type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.6 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type PMIS HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_u_) 8 MPI processes type: mpiaij rows=438420, cols=438420, bs=3 total: nonzeros=7.95766e+07, allocated nonzeros=7.95766e+07 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 17349 nodes, limit used is 
5 Split number 1 Defined by IS KSP Object: (fieldsplit_wp_) 8 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_wp_) 8 MPI processes type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.6 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type PMIS HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_wp_) 8 MPI processes type: mpiaij rows=146140, cols=146140 total: nonzeros=596012, allocated nonzeros=596012 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 8 MPI processes type: mpiaij rows=584560, cols=584560, bs=4 total: nonzeros=9.29667e+07, allocated nonzeros=9.29667e+07 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 32431 nodes, limit used is 5 KSPSolve completed > Giang > > On Wed, Jul 13, 2016 at 5:43 AM, Barry Smith wrote: > >> >> It is not uncommon for an iterative linear solver to work fine for some >> time steps but then start to perform poorly at a later timestep because the >> physics (mathematically the conditioning or eigenstructure of the Jacobian) >> changes over time; perhaps becomes singular. Another possibility is the >> trajectory of the solution is very sensitive to the solution of the >> nonlinear problem at each time step so that an iterative linear solver and >> a direct linear solver result in very difficult physical solutions after >> many time steps. In other words after many time-steps the computed >> solutions can be very different and if the computed solution for the >> iterative linear solver is eventually "non-physical" or ill-conditioned the >> nonlinear solver could break down. >> >> Please run with the iterative solver (that eventually breaks) with the >> option -ksp_monitor_true_solution -ksp_converged_reason and and send ALL >> the output (it will be very large, don't worry about it). Then we can see >> if the linear solver is breaking down. Note that by default PETSc linear >> solvers do not generate an error that stops the program if the linear solve >> fails, hence your NR code should call KSPGetConvergedReason() after EVERY >> linear solve and if the reason is negative your code needs to do something >> different since the linear solve failed and your code should not just keep >> on running NR. 
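One way to act on the suggestion above to try an alternative to hypre, while also making the outer GMRES judge convergence by the true residual rather than the left-preconditioned one. The options below are only a sketch, not a tested recipe for this particular problem:

  -ksp_pc_side right
  -ksp_monitor_true_residual
  -ksp_converged_reason
  -fieldsplit_u_pc_type gamg
  -fieldsplit_u_pc_gamg_agg_nsmooths 1
  -fieldsplit_wp_pc_type gamg

With right preconditioning, GMRES bases its convergence test on the unpreconditioned residual, so a large gap between the preconditioned and true norms can no longer hide a poor preconditioner. For the displacement block (block size 3), GAMG usually also benefits from a rigid-body near-null space, set with MatNullSpaceCreateRigidBody() and MatSetNearNullSpace().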
>> >> Barry >> >> >> > On Jul 12, 2016, at 9:52 AM, Matthew Knepley wrote: >> > >> > On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui >> wrote: >> > Hi Matt >> > >> > 1) In the log you sent, the linear solver converges due to the Relative >> Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will >> affect the convergence. >> > >> > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES >> affect the convergence, and it took more iterations. But the simulation >> still failed at a definite time step. >> > >> > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? >> ALWAYS send the view output. >> > >> > In the log file I sent previously, the line >> > >> > KSP Object: (fieldsplit_wp_) 8 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> > left preconditioning >> > using NONE norm type for convergence test >> > >> > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 >> > >> > KSP "preonly" does no iterations, so it does not read the tolerance. If >> you want to lower the tolerance, >> > choose a solver like GMRES >> > >> > -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 >> > >> > 3) I can't tell you anything about Newton convergence if you do not >> send the output, -snes_monitor -snes_view >> > >> > I did not yet use SNES, instead using my NR iterator so I have no view >> for SNES. >> > >> > It is hard to debug an iteration which we did not code. It could be you >> have a bug. If not, then very small changes in >> > the iterates are making a difference, which means your Jacobians are >> close to singular. A problem reformulation would >> > probably help more than solver tweaking. >> > >> > Thanks, >> > >> > Matt >> > >> > 4) If there is a difference between LU and an iterative solver with >> residual 1e-9, then your system is very ill-conditioned. >> > Yes it is ill-conditioned >> > >> > >> > >> > >> > >> > >> > >> > Giang >> > >> > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley >> wrote: >> > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui >> wrote: >> > Hello >> > >> > I encountered different convergence behaviour of Newton Raphson when >> using different solver settings with PETSc >> > >> > For the first solver configuration, I used direct solver >> > -ksp_type preonly >> > -pc_type lu >> > -pc_factor_mat_solver_package mumps >> > -mat_mumps_icntl_1 6 >> > -mat_mumps_icntl_4 3 >> > -mat_mumps_icntl_7 4 >> > -mat_mumps_icntl_14 40 >> > -mat_mumps_icntl_23 0 >> > >> > The simulation can run completely and the NR typically converged after >> 6/7 iterations. Of course, it's very slow. For the second solver >> configuration: >> > -ksp_type gmres >> > -ksp_max_it 300 >> > -ksp_gmres_restart 300 >> > -ksp_gmres_modifiedgramschmidt >> > -pc_view >> > -pc_fieldsplit_type multiplicative >> > -fieldsplit_u_pc_type hypre >> > -fieldsplit_u_pc_hypre_type boomeramg >> > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >> > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 >> > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 >> > -fieldsplit_wp_ksp_rtol 1.0e-8 >> > -fieldsplit_wp_pc_type hypre >> > -fieldsplit_wp_pc_hypre_type boomeramg >> > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS >> > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 >> > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 >> > >> > The solver runs much faster, but the NR does not converge in 30 >> iterations after some time steps. 
I thought setting the solver tolerance >> -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate >> with tolerance 1e-30 (see sample log file). Can we set the tolerance of the >> sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it >> doesn't work. >> > >> > 1) In the log you sent, the linear solver converges due to the Relative >> Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will >> affect the convergence. >> > >> > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? >> ALWAYS send the view output. >> > >> > 3) I can't tell you anything about Newton convergence if you do not >> send the output, -snes_monitor -snes_view >> > >> > 4) If there is a difference between LU and an iterative solver with >> residual 1e-9, then your system is very ill-conditioned. >> > >> > Thanks, >> > >> > Matt >> > >> > Sorry this problem is run with many time steps and is quite big so I >> cannot reproduce in a simple test case. >> > >> > Giang >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 13 12:09:00 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 12:09:00 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: <329C1D8D-2EEA-4A1F-AEF2-47F02A2245E9@mcs.anl.gov> > On Jul 13, 2016, at 11:05 AM, Matthew Knepley wrote: > > On Wed, Jul 13, 2016 at 10:34 AM, Hoang Giang Bui wrote: > Thanks Barry > > This is a good comment. Since material behaviour depends very much on the trajectory of the solution. I suspect that the error may concatenate during time stepping. > > I have re-run the simulation as you suggested and post the log file here: https://www.dropbox.com/s/d6l8ixme37uh47a/log13Jul16?dl=0 > > However, I did not get what -ksp_monitor_true_solution used for? I see that I have the same log that I had before. My mistake. I didn't mean that option. > > That option is showing the last two numbers in these lines > > 0 KSP preconditioned resid norm 1.150038785083e+00 true resid norm 8.673040929526e+07 ||r(i)||/||b|| 1.000000000000e+00 > > Notice that there are 7 orders of magnitude between the apparent residual (using the preconditioner), and the actual residual, Ax - b. > You are using Hypre, and this generally means the Hypre coarse grid operator is crap. Please > > a) Try ML or GAMG and look at the output again > > b) Try MUMPS, although you have 200 nonzeros/row so that fill-in might be extreme. > > The consequence is that you solve to what you think is machine precision (1e-13), but all you really get is (1e-4), so I can understand > why the trajectory is completely different. > You can compare the final true residual norm at each iteration when using MUMPS with what you get with hypre to see if MUMPS is able to give you a smaller residual. 
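A sketch of the comparison suggested here: re-run the failing step with MUMPS behind the same outer GMRES so that the true residual is reported in the same form as in the hypre log, for example

  -ksp_type gmres
  -ksp_rtol 1.0e-12
  -ksp_monitor_true_residual
  -ksp_converged_reason
  -pc_type lu
  -pc_factor_mat_solver_package mumps
  -mat_mumps_icntl_14 40

Comparing the final "true resid norm" of the two runs then shows directly whether MUMPS reaches a genuinely smaller residual than the fieldsplit/hypre combination.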
Barry > Matt > > 1 KSP preconditioned resid norm 5.202876635759e-01 true resid norm 2.037005052213e+08 ||r(i)||/||b|| 2.348663022307e+00 > 2 KSP preconditioned resid norm 3.386127782775e-01 true resid norm 1.762196838305e+08 ||r(i)||/||b|| 2.031809664712e+00 > 3 KSP preconditioned resid norm 2.334102526025e-01 true resid norm 1.027451552306e+08 ||r(i)||/||b|| 1.184649721655e+00 > 4 KSP preconditioned resid norm 1.791251896569e-01 true resid norm 7.709961160729e+07 ||r(i)||/||b|| 8.889570824556e-01 > 5 KSP preconditioned resid norm 1.338763110903e-01 true resid norm 7.416954924746e+07 ||r(i)||/||b|| 8.551735181482e-01 > 6 KSP preconditioned resid norm 8.064262880339e-02 true resid norm 5.164444100149e+07 ||r(i)||/||b|| 5.954594405945e-01 > 7 KSP preconditioned resid norm 4.635705318709e-02 true resid norm 2.934800965373e+07 ||r(i)||/||b|| 3.383820034081e-01 > 8 KSP preconditioned resid norm 2.772133866748e-02 true resid norm 1.528356929458e+07 ||r(i)||/||b|| 1.762192686368e-01 > 9 KSP preconditioned resid norm 1.746753670007e-02 true resid norm 1.011788107951e+07 ||r(i)||/||b|| 1.166589799555e-01 > 10 KSP preconditioned resid norm 1.090702407895e-02 true resid norm 5.487922954253e+06 ||r(i)||/||b|| 6.327564920823e-02 > 11 KSP preconditioned resid norm 7.298748576067e-03 true resid norm 3.635843038640e+06 ||r(i)||/||b|| 4.192120235779e-02 > 12 KSP preconditioned resid norm 5.263606789063e-03 true resid norm 2.556946903793e+06 ||r(i)||/||b|| 2.948155006496e-02 > 13 KSP preconditioned resid norm 3.653208280595e-03 true resid norm 1.955721190606e+06 ||r(i)||/||b|| 2.254942881623e-02 > 14 KSP preconditioned resid norm 2.344759624903e-03 true resid norm 1.161259621408e+06 ||r(i)||/||b|| 1.338930175522e-02 > 15 KSP preconditioned resid norm 1.394564491254e-03 true resid norm 7.455856541894e+05 ||r(i)||/||b|| 8.596588673428e-03 > 16 KSP preconditioned resid norm 9.523395328600e-04 true resid norm 4.383808867461e+05 ||r(i)||/||b|| 5.054523440028e-03 > 17 KSP preconditioned resid norm 7.226014371144e-04 true resid norm 2.463564216053e+05 ||r(i)||/||b|| 2.840484941869e-03 > 18 KSP preconditioned resid norm 5.312593384754e-04 true resid norm 2.332075376781e+05 ||r(i)||/||b|| 2.688878555665e-03 > 19 KSP preconditioned resid norm 3.987403871945e-04 true resid norm 1.524236218549e+05 ||r(i)||/||b|| 1.757441514383e-03 > 20 KSP preconditioned resid norm 3.024350484979e-04 true resid norm 1.113568566173e+05 ||r(i)||/||b|| 1.283942477870e-03 > 21 KSP preconditioned resid norm 2.181724540430e-04 true resid norm 9.095158030900e+04 ||r(i)||/||b|| 1.048670022983e-03 > 22 KSP preconditioned resid norm 1.497651066688e-04 true resid norm 7.045647741653e+04 ||r(i)||/||b|| 8.123618692570e-04 > 23 KSP preconditioned resid norm 1.067332245914e-04 true resid norm 4.317487154207e+04 ||r(i)||/||b|| 4.978054628463e-04 > 24 KSP preconditioned resid norm 8.206743871631e-05 true resid norm 3.328488127932e+04 ||r(i)||/||b|| 3.837740597534e-04 > 25 KSP preconditioned resid norm 6.446633932980e-05 true resid norm 2.816657573261e+04 ||r(i)||/||b|| 3.247600923538e-04 > 26 KSP preconditioned resid norm 5.068725017435e-05 true resid norm 2.427030232896e+04 ||r(i)||/||b|| 2.798361327495e-04 > 27 KSP preconditioned resid norm 4.056292508453e-05 true resid norm 1.963628903861e+04 ||r(i)||/||b|| 2.264060460243e-04 > 28 KSP preconditioned resid norm 3.278196251068e-05 true resid norm 1.710046122873e+04 ||r(i)||/||b|| 1.971679987179e-04 > 29 KSP preconditioned resid norm 2.796514916728e-05 true resid norm 1.500292999274e+04 ||r(i)||/||b|| 
1.729835027259e-04 > 30 KSP preconditioned resid norm 2.469882695602e-05 true resid norm 1.317997814765e+04 ||r(i)||/||b|| 1.519649019847e-04 > 31 KSP preconditioned resid norm 2.175528107880e-05 true resid norm 1.158572445412e+04 ||r(i)||/||b|| 1.335831866616e-04 > 32 KSP preconditioned resid norm 1.912573933887e-05 true resid norm 1.001695718951e+04 ||r(i)||/||b|| 1.154953293880e-04 > 33 KSP preconditioned resid norm 1.647102125210e-05 true resid norm 8.271485921360e+03 ||r(i)||/||b|| 9.537007825249e-05 > 34 KSP preconditioned resid norm 1.337436641169e-05 true resid norm 6.611637805300e+03 ||r(i)||/||b|| 7.623206046211e-05 > 35 KSP preconditioned resid norm 9.896966695703e-06 true resid norm 4.752788536204e+03 ||r(i)||/||b|| 5.479956309238e-05 > 36 KSP preconditioned resid norm 6.766260764791e-06 true resid norm 3.239548441802e+03 ||r(i)||/||b|| 3.735193305468e-05 > 37 KSP preconditioned resid norm 4.835158711776e-06 true resid norm 2.113941262442e+03 ||r(i)||/||b|| 2.437370329068e-05 > 38 KSP preconditioned resid norm 3.598894380040e-06 true resid norm 1.653467554688e+03 ||r(i)||/||b|| 1.906445003688e-05 > 39 KSP preconditioned resid norm 2.522642742745e-06 true resid norm 1.344572919946e+03 ||r(i)||/||b|| 1.550290066507e-05 > 40 KSP preconditioned resid norm 1.750002168280e-06 true resid norm 1.015690774521e+03 ||r(i)||/||b|| 1.171089566825e-05 > 41 KSP preconditioned resid norm 1.371380245282e-06 true resid norm 8.480814540622e+02 ||r(i)||/||b|| 9.778363332462e-06 > 42 KSP preconditioned resid norm 1.174063380270e-06 true resid norm 7.575955225454e+02 ||r(i)||/||b|| 8.735062231359e-06 > 43 KSP preconditioned resid norm 1.022078284946e-06 true resid norm 6.758159410670e+02 ||r(i)||/||b|| 7.792145183661e-06 > 44 KSP preconditioned resid norm 8.861345665105e-07 true resid norm 5.913685641420e+02 ||r(i)||/||b|| 6.818468504268e-06 > 45 KSP preconditioned resid norm 7.574040382433e-07 true resid norm 4.958820201473e+02 ||r(i)||/||b|| 5.717510434653e-06 > 46 KSP preconditioned resid norm 6.331382122180e-07 true resid norm 3.988451175342e+02 ||r(i)||/||b|| 4.598676759110e-06 > 47 KSP preconditioned resid norm 5.210644796074e-07 true resid norm 3.077459761874e+02 ||r(i)||/||b|| 3.548305360116e-06 > 48 KSP preconditioned resid norm 4.285762531134e-07 true resid norm 2.383304155333e+02 ||r(i)||/||b|| 2.747945241696e-06 > 49 KSP preconditioned resid norm 3.365753654637e-07 true resid norm 1.802176480688e+02 ||r(i)||/||b|| 2.077906117741e-06 > 50 KSP preconditioned resid norm 2.556504175739e-07 true resid norm 1.322207275993e+02 ||r(i)||/||b|| 1.524502520785e-06 > 51 KSP preconditioned resid norm 1.929395464892e-07 true resid norm 1.007938656038e+02 ||r(i)||/||b|| 1.162151388686e-06 > 52 KSP preconditioned resid norm 1.518353128559e-07 true resid norm 7.979486270816e+01 ||r(i)||/||b|| 9.200332773308e-07 > 53 KSP preconditioned resid norm 1.206065500213e-07 true resid norm 6.580266981926e+01 ||r(i)||/||b|| 7.587035545427e-07 > 54 KSP preconditioned resid norm 9.426597887251e-08 true resid norm 5.333098459078e+01 ||r(i)||/||b|| 6.149052566928e-07 > 55 KSP preconditioned resid norm 7.613592162567e-08 true resid norm 4.265349984159e+01 ||r(i)||/||b|| 4.917940568733e-07 > 56 KSP preconditioned resid norm 6.268355987149e-08 true resid norm 3.467681120568e+01 ||r(i)||/||b|| 3.998229858184e-07 > 57 KSP preconditioned resid norm 5.012883291890e-08 true resid norm 2.749870530323e+01 ||r(i)||/||b|| 3.170595587716e-07 > 58 KSP preconditioned resid norm 3.875711489918e-08 true resid norm 2.037239239206e+01 
||r(i)||/||b|| 2.348933039472e-07 > 59 KSP preconditioned resid norm 2.803879910778e-08 true resid norm 1.495957468476e+01 ||r(i)||/||b|| 1.724836168342e-07 > 60 KSP preconditioned resid norm 1.925214804831e-08 true resid norm 1.036952152845e+01 ||r(i)||/||b|| 1.195603896339e-07 > 61 KSP preconditioned resid norm 1.316807047769e-08 true resid norm 7.239457203086e+00 ||r(i)||/||b|| 8.347080639779e-08 > 62 KSP preconditioned resid norm 9.095263534284e-09 true resid norm 5.546725364022e+00 ||r(i)||/||b|| 6.395363989508e-08 > 63 KSP preconditioned resid norm 6.520024982652e-09 true resid norm 4.395022539849e+00 ||r(i)||/||b|| 5.067452783356e-08 > 64 KSP preconditioned resid norm 5.077084953418e-09 true resid norm 3.613138054874e+00 ||r(i)||/||b|| 4.165941431885e-08 > 65 KSP preconditioned resid norm 4.181478103167e-09 true resid norm 3.038027368880e+00 ||r(i)||/||b|| 3.502839884610e-08 > 66 KSP preconditioned resid norm 3.474545560062e-09 true resid norm 2.484725611092e+00 ||r(i)||/||b|| 2.864883990842e-08 > 67 KSP preconditioned resid norm 2.726294735157e-09 true resid norm 1.845741997810e+00 ||r(i)||/||b|| 2.128137077650e-08 > 68 KSP preconditioned resid norm 2.081101207644e-09 true resid norm 1.271838867185e+00 ||r(i)||/||b|| 1.466427839462e-08 > 69 KSP preconditioned resid norm 1.574053677511e-09 true resid norm 8.732579381622e-01 ||r(i)||/||b|| 1.006864772411e-08 > 70 KSP preconditioned resid norm 1.202717674216e-09 true resid norm 5.849220507056e-01 ||r(i)||/||b|| 6.744140324696e-09 > 71 KSP preconditioned resid norm 9.075713740333e-10 true resid norm 4.120181311262e-01 ||r(i)||/||b|| 4.750561359898e-09 > 72 KSP preconditioned resid norm 6.365151508838e-10 true resid norm 3.065749731760e-01 ||r(i)||/||b|| 3.534803717256e-09 > 73 KSP preconditioned resid norm 4.005974496315e-10 true resid norm 2.122086214944e-01 ||r(i)||/||b|| 2.446761444097e-09 > 74 KSP preconditioned resid norm 2.374916890000e-10 true resid norm 1.567794082480e-01 ||r(i)||/||b|| 1.807663650177e-09 > 75 KSP preconditioned resid norm 1.481096397633e-10 true resid norm 1.235242757193e-01 ||r(i)||/||b|| 1.424232592963e-09 > 76 KSP preconditioned resid norm 1.085014154415e-10 true resid norm 1.047268461651e-01 ||r(i)||/||b|| 1.207498581132e-09 > 77 KSP preconditioned resid norm 8.764582618532e-11 true resid norm 8.962364559579e-02 ||r(i)||/||b|| 1.033358960531e-09 > 78 KSP preconditioned resid norm 7.109092680274e-11 true resid norm 7.176047852904e-02 ||r(i)||/||b|| 8.273969777399e-10 > 79 KSP preconditioned resid norm 5.460763497752e-11 true resid norm 5.069849340150e-02 ||r(i)||/||b|| 5.845526824266e-10 > 80 KSP preconditioned resid norm 3.799942459039e-11 true resid norm 3.044234442091e-02 ||r(i)||/||b|| 3.509996628435e-10 > 81 KSP preconditioned resid norm 2.481109284531e-11 true resid norm 1.726059230919e-02 ||r(i)||/||b|| 1.990143070861e-10 > 82 KSP preconditioned resid norm 1.569622532234e-11 true resid norm 1.070220060596e-02 ||r(i)||/||b|| 1.233961731867e-10 > 83 KSP preconditioned resid norm 1.022582071414e-11 true resid norm 7.402265790954e-03 ||r(i)||/||b|| 8.534798637643e-11 > 84 KSP preconditioned resid norm 7.284827374238e-12 true resid norm 5.658340974708e-03 ||r(i)||/||b|| 6.524056580253e-11 > 85 KSP preconditioned resid norm 5.402886839508e-12 true resid norm 4.464802757767e-03 ||r(i)||/||b|| 5.147909244343e-11 > 86 KSP preconditioned resid norm 3.933784995327e-12 true resid norm 3.350654653931e-03 ||r(i)||/||b|| 3.863298560628e-11 > 87 KSP preconditioned resid norm 2.792049995877e-12 true resid norm 
2.402140873006e-03 ||r(i)||/||b|| 2.769663942007e-11 > 88 KSP preconditioned resid norm 2.058524741199e-12 true resid norm 1.747330249674e-03 ||r(i)||/||b|| 2.014668515774e-11 > 89 KSP preconditioned resid norm 1.568241303093e-12 true resid norm 1.266336540932e-03 ||r(i)||/||b|| 1.460083667564e-11 > 90 KSP preconditioned resid norm 1.164779378453e-12 true resid norm 8.484550691359e-04 ||r(i)||/||b|| 9.782671107287e-12 > 91 KSP preconditioned resid norm 7.995560038101e-13 true resid norm 5.065061038629e-04 ||r(i)||/||b|| 5.840005921551e-12 > Linear solve converged due to CONVERGED_RTOL iterations 91 > KSP Object: 8 MPI processes > type: gmres > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=300, initial guess is zero > tolerances: relative=1e-12, absolute=1e-20, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 8 MPI processes > type: fieldsplit > FieldSplit with MULTIPLICATIVE composition: total splits = 2 > Solver info for each split is in the following KSP objects: > Split number 0 Defined by IS > KSP Object: (fieldsplit_u_) 8 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_u_) 8 MPI processes > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.6 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type PMIS > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > Mat Object: (fieldsplit_u_) 8 MPI processes > type: mpiaij > rows=438420, cols=438420, bs=3 > total: nonzeros=7.95766e+07, allocated nonzeros=7.95766e+07 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 17349 nodes, limit used is 5 > Split number 1 Defined by IS > KSP Object: (fieldsplit_wp_) 8 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_wp_) 8 MPI processes > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.6 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: 
Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type PMIS > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > Mat Object: (fieldsplit_wp_) 8 MPI processes > type: mpiaij > rows=146140, cols=146140 > total: nonzeros=596012, allocated nonzeros=596012 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Mat Object: 8 MPI processes > type: mpiaij > rows=584560, cols=584560, bs=4 > total: nonzeros=9.29667e+07, allocated nonzeros=9.29667e+07 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 32431 nodes, limit used is 5 > KSPSolve completed > > > Giang > > On Wed, Jul 13, 2016 at 5:43 AM, Barry Smith wrote: > > It is not uncommon for an iterative linear solver to work fine for some time steps but then start to perform poorly at a later timestep because the physics (mathematically the conditioning or eigenstructure of the Jacobian) changes over time; perhaps becomes singular. Another possibility is the trajectory of the solution is very sensitive to the solution of the nonlinear problem at each time step so that an iterative linear solver and a direct linear solver result in very difficult physical solutions after many time steps. In other words after many time-steps the computed solutions can be very different and if the computed solution for the iterative linear solver is eventually "non-physical" or ill-conditioned the nonlinear solver could break down. > > Please run with the iterative solver (that eventually breaks) with the option -ksp_monitor_true_solution -ksp_converged_reason and and send ALL the output (it will be very large, don't worry about it). Then we can see if the linear solver is breaking down. Note that by default PETSc linear solvers do not generate an error that stops the program if the linear solve fails, hence your NR code should call KSPGetConvergedReason() after EVERY linear solve and if the reason is negative your code needs to do something different since the linear solve failed and your code should not just keep on running NR. > > Barry > > > > On Jul 12, 2016, at 9:52 AM, Matthew Knepley wrote: > > > > On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui wrote: > > Hi Matt > > > > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. > > > > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES affect the convergence, and it took more iterations. But the simulation still failed at a definite time step. > > > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. 
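Regarding the advice above to call KSPGetConvergedReason() after every linear solve inside a hand-rolled Newton-Raphson driver: a minimal sketch of that check follows. This is untested and only illustrative; the routine FormJacobianAndResidual() and the recovery action taken when the solve fails are placeholders, not part of the code discussed in this thread.

    /* inside the user's Newton-Raphson loop; ksp, A, b, x, dx already created */
    PetscErrorCode     ierr;
    KSPConvergedReason reason;
    PetscInt           it, max_newton_its = 30;

    for (it = 0; it < max_newton_its; it++) {
      ierr = FormJacobianAndResidual(user, x, A, b);CHKERRQ(ierr); /* hypothetical user routine */
      ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
      ierr = KSPSolve(ksp, b, dx);CHKERRQ(ierr);
      ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
      if (reason < 0) {
        /* the linear solve failed: do not keep iterating as if nothing happened */
        ierr = PetscPrintf(PETSC_COMM_WORLD, "Linear solve failed, KSPConvergedReason = %d\n", (int)reason);CHKERRQ(ierr);
        break; /* placeholder: cut the load/time step, switch to a stronger solver, etc. */
      }
      ierr = VecAXPY(x, -1.0, dx);CHKERRQ(ierr); /* Newton update; the sign depends on how the residual is defined */
      /* test the nonlinear residual for convergence here */
    }

The same effect can be had without code changes by running with -ksp_error_if_not_converged, which makes PETSc raise an error as soon as a linear solve fails instead of returning silently.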
> > > > In the log file I sent previously, the line > > > > KSP Object: (fieldsplit_wp_) 8 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > > > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 > > > > KSP "preonly" does no iterations, so it does not read the tolerance. If you want to lower the tolerance, > > choose a solver like GMRES > > > > -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 > > > > 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view > > > > I did not yet use SNES, instead using my NR iterator so I have no view for SNES. > > > > It is hard to debug an iteration which we did not code. It could be you have a bug. If not, then very small changes in > > the iterates are making a difference, which means your Jacobians are close to singular. A problem reformulation would > > probably help more than solver tweaking. > > > > Thanks, > > > > Matt > > > > 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. > > Yes it is ill-conditioned > > > > > > > > > > > > > > > > Giang > > > > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley wrote: > > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui wrote: > > Hello > > > > I encountered different convergence behaviour of Newton Raphson when using different solver settings with PETSc > > > > For the first solver configuration, I used direct solver > > -ksp_type preonly > > -pc_type lu > > -pc_factor_mat_solver_package mumps > > -mat_mumps_icntl_1 6 > > -mat_mumps_icntl_4 3 > > -mat_mumps_icntl_7 4 > > -mat_mumps_icntl_14 40 > > -mat_mumps_icntl_23 0 > > > > The simulation can run completely and the NR typically converged after 6/7 iterations. Of course, it's very slow. For the second solver configuration: > > -ksp_type gmres > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_view > > -pc_fieldsplit_type multiplicative > > -fieldsplit_u_pc_type hypre > > -fieldsplit_u_pc_hypre_type boomeramg > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 > > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 > > -fieldsplit_wp_ksp_rtol 1.0e-8 > > -fieldsplit_wp_pc_type hypre > > -fieldsplit_wp_pc_hypre_type boomeramg > > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 > > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 > > > > The solver runs much faster, but the NR does not converge in 30 iterations after some time steps. I thought setting the solver tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate with tolerance 1e-30 (see sample log file). Can we set the tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it doesn't work. > > > > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. > > > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. 
> > > > 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view > > > > 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. > > > > Thanks, > > > > Matt > > > > Sorry this problem is run with many time steps and is quite big so I cannot reproduce in a simple test case. > > > > Giang > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From aks084000 at utdallas.edu Wed Jul 13 14:30:13 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Wed, 13 Jul 2016 19:30:13 +0000 Subject: [petsc-users] Multigrid with PML Message-ID: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> Dear PETSc community, I am working on solving a Helmholtz problem with PML. The issue is that I am finding it very hard to deal with the resulting matrix system; I can get the correct solution for coarse meshes, but it takes roughly 2-4 times as long to converge for each successively refined mesh. I've noticed that without PML, I do not have problems with convergence speed. I am using the GMRES solver with GAMG as the preconditioner (with block-Jacobi preconditioner for the multigrid solves). I have also tried to assemble a separate preconditioning matrix with the complex shift 1+0.5i, that does not seem to improve the results. Currently I am running with -ksp_type fgmres \ -pc_type gamg \ -mg_levels_pc_type bjacobi \ -pc_mg_type full \ -ksp_gmres_restart 150 \ Can anyone suggest some way of speeding up the convergence? Any help would be appreciated. I am attaching the output from kspview. Best, Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: kspview Type: application/octet-stream Size: 33747 bytes Desc: kspview URL: From knepley at gmail.com Wed Jul 13 17:03:25 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Jul 2016 17:03:25 -0500 Subject: [petsc-users] Distribution of DMPlex for FEM In-Reply-To: References: Message-ID: On Wed, Jul 13, 2016 at 3:57 AM, Morten Nobel-J?rgensen wrote: > I?m having problems distributing a simple FEM model using DMPlex. For test > case I use 1x1x2 hex box elements (/cells) with 12 vertices. Each vertex > has one DOF. > When I distribute the system to two processors, each get a single element > and the local vector has the size 8 (one DOF for each vertex of a hex box) > as expected. > > My problem is that when I manually assemble the global stiffness matrix (a > 12x12 matrix) it seems like my ghost values are ignored. I?m sure that I?m > missing something obvious but cannot see what it is. > > In the attached example, I?m assembling the global stiffness matrix using > a simple local stiffness matrix of ones. This makes it very easy to see if > the matrix is assembled correctly. If I run it on one process, then global > stiffness matrix consists of 0?s, 1?s and 2?s and its trace is 16.0. 
But if > I run it distributed on on two, then it consists only of 0's and 1?s and > its trace is 12.0. > > I hope that somebody can spot my mistake and help me in the right > direction :) > This is my fault, and Stefano Zampini had already tried to tell me this was broken. I normally use DMPlexMatSetClosure(), which handles global indices correctly. I have fixed this in the branch knepley/fix-plex-l2g which is also merged to 'next'. I am attaching a version of your sample where all objects are freed correctly. Let me know if that works for you. Thanks, Matt > Kind regards, > Morten > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex18.c Type: text/x-csrc Size: 4813 bytes Desc: not available URL: From hengjiew at uci.edu Wed Jul 13 18:07:51 2016 From: hengjiew at uci.edu (frank) Date: Wed, 13 Jul 2016 16:07:51 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: <5786C9C7.1080309@uci.edu> Hi Dave, Sorry for the late reply. Thank you so much for your detailed reply. I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? Did I do sth wrong here? Because this seems too small. I am running this job on Bluewater I am using the 7 points FD stencil in 3D. I apologize that I made a stupid mistake in computing the memory per core. My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. I would do more test to see if there is still some memory issue. Regards, Frank On 07/11/2016 01:18 PM, Dave May wrote: > Hi Frank, > > > On 11 July 2016 at 19:14, frank > wrote: > > Hi Dave, > > I re-run the test using bjacobi as the preconditioner on the > coarse mesh of telescope. The Grid is 3072*256*768 and process > mesh is 96*8*24. The petsc option file is attached. > I still got the "Out Of Memory" error. The error occurred before > the linear solver finished one step. So I don't have the full info > from ksp_view. The info from ksp_view_pre is attached. > > > Okay - that is essentially useless (sorry) > > > It seems to me that the error occurred when the decomposition was > going to be changed. > > > Based on what information? > Running with -info would give us more clues, but will create a ton of > output. > Please try running the case which failed with -info > > I had another test with a grid of 1536*128*384 and the same > process mesh as above. There was no error. The ksp_view info is > attached for comparison. > Thank you. > > > > [3] Here is my crude estimate of your memory usage. > I'll target the biggest memory hogs only to get an order of magnitude > estimate > > * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per > MPI rank assuming double precision. 
> The indices for the AIJ could amount to another 0.3 GB (assuming 32 > bit integers) > > * You use 5 levels of coarsening, so the other operators should > represent (collectively) > 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the > communicator with 18432 ranks. > The coarse grid should consume ~ 0.5 MB per MPI rank on the > communicator with 18432 ranks. > > * You use a reduction factor of 64, making the new communicator with > 288 MPI ranks. > PCTelescope will first gather a temporary matrix associated with your > coarse level operator assuming a comm size of 288 living on the comm > with size 18432. > This matrix will require approximately 0.5 * 64 = 32 MB per core on > the 288 ranks. > This matrix is then used to form a new MPIAIJ matrix on the subcomm, > thus require another 32 MB per rank. > The temporary matrix is now destroyed. > > * Because a DMDA is detected, a permutation matrix is assembled. > This requires 2 doubles per point in the DMDA. > Your coarse DMDA contains 92 x 16 x 48 points. > Thus the permutation matrix will require < 1 MB per MPI rank on the > sub-comm. > > * Lastly, the matrix is permuted. This uses MatPtAP(), but the > resulting operator will have the same memory footprint as the > unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 > operators of size 32 MB are held in memory when the DMDA is provided. > > From my rough estimates, the worst case memory foot print for any > given core, given your options is approximately > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > This is way below 8 GB. > > Note this estimate completely ignores: > (1) the memory required for the restriction operator, > (2) the potential growth in the number of non-zeros per row due to > Galerkin coarsening (I wished -ksp_view_pre reported the output from > MatView so we could see the number of non-zeros required by the coarse > level operators) > (3) all temporary vectors required by the CG solver, and those > required by the smoothers. > (4) internal memory allocated by MatPtAP > (5) memory associated with IS's used within PCTelescope > > So either I am completely off in my estimates, or you have not > carefully estimated the memory usage of your application code. > Hopefully others might examine/correct my rough estimates > > Since I don't have your code I cannot access the latter. > Since I don't have access to the same machine you are running on, I > think we need to take a step back. > > [1] What machine are you running on? Send me a URL if its available > > [2] What discretization are you using? (I am guessing a scalar 7 point > FD stencil) > If it's a 7 point FD stencil, we should be able to examine the memory > usage of your solver configuration using a standard, light weight > existing PETSc example, run on your machine at the same scale. > This would hopefully enable us to correctly evaluate the actual memory > usage required by the solver configuration you are using. > > Thanks, > Dave > > > > Frank > > > > > On 07/08/2016 10:38 PM, Dave May wrote: >> >> >> On Saturday, 9 July 2016, frank wrote: >> >> Hi Barry and Dave, >> >> Thank both of you for the advice. >> >> @Barry >> I made a mistake in the file names in last email. I attached >> the correct files this time. >> For all the three tests, 'Telescope' is used as the coarse >> preconditioner. >> >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >> Part of the memory usage: Vector 125 124 3971904 0. 
>> Matrix 101 101 9462372 0 >> >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> Part of the memory usage: Vector 125 124 681672 0. >> Matrix 101 101 1462180 0. >> >> In theory, the memory usage in Test1 should be 8 times of >> Test2. In my case, it is about 6 times. >> >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. >> Sub-domain per process: 32*32*32 >> Here I get the out of memory error. >> >> I tried to use -mg_coarse jacobi. In this way, I don't need >> to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, >> right? >> The linear solver didn't work in this case. Petsc output some >> errors. >> >> @Dave >> In test3, I use only one instance of 'Telescope'. On the >> coarse mesh of 'Telescope', I used LU as the preconditioner >> instead of SVD. >> If my set the levels correctly, then on the last coarse mesh >> of MG where it calls 'Telescope', the sub-domain per process >> is 2*2*2. >> On the last coarse mesh of 'Telescope', there is only one >> grid point per process. >> I still got the OOM error. The detailed petsc option file is >> attached. >> >> >> Do you understand the expected memory usage for the >> particular parallel LU implementation you are using? I don't >> (seriously). Replace LU with bjacobi and re-run this test. My >> point about solver debugging is still valid. >> >> And please send the result of KSPView so we can see what is >> actually used in the computations >> >> Thanks >> Dave >> >> >> >> Thank you so much. >> >> Frank >> >> >> >> On 07/06/2016 02:51 PM, Barry Smith wrote: >> >> On Jul 6, 2016, at 4:19 PM, frank > > wrote: >> >> Hi Barry, >> >> Thank you for you advice. >> I tried three test. In the 1st test, the grid is >> 3072*256*768 and the process mesh is 96*8*24. >> The linear solver is 'cg' the preconditioner is 'mg' >> and 'telescope' is used as the preconditioner at the >> coarse mesh. >> The system gives me the "Out of Memory" error before >> the linear system is completely solved. >> The info from '-ksp_view_pre' is attached. I seems to >> me that the error occurs when it reaches the coarse mesh. >> >> The 2nd test uses a grid of 1536*128*384 and process >> mesh is 96*8*24. The 3rd test uses the same grid but >> a different process mesh 48*4*12. >> >> Are you sure this is right? The total matrix and >> vector memory usage goes from 2nd test >> Vector 384 383 8,193,712 0. >> Matrix 103 103 11,508,688 0. >> to 3rd test >> Vector 384 383 1,590,520 0. >> Matrix 103 103 3,508,664 0. >> that is the memory usage got smaller but if you have only >> 1/8th the processes and the same grid it should have >> gotten about 8 times bigger. Did you maybe cut the grid >> by a factor of 8 also? If so that still doesn't explain >> it because the memory usage changed by a factor of 5 >> something for the vectors and 3 something for the matrices. >> >> >> The linear solver and petsc options in 2nd and 3rd >> tests are the same in 1st test. The linear solver >> works fine in both test. >> I attached the memory usage of the 2nd and 3rd tests. >> The memory info is from the option '-log_summary'. I >> tried to use '-momery_info' as you suggested, but in >> my case petsc treated it as an unused option. It >> output nothing about the memory. Do I need to add sth >> to my code so I can use '-memory_info'? >> >> Sorry, my mistake the option is -memory_view >> >> Can you run the one case with -memory_view and >> -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't >> iterate forever) to see how much memory is used without >> the telescope? 
Also run case 2 the same way. >> >> Barry >> >> >> >> In both tests the memory usage is not large. >> >> It seems to me that it might be the 'telescope' >> preconditioner that allocated a lot of memory and >> caused the error in the 1st test. >> Is there is a way to show how much memory it allocated? >> >> Frank >> >> On 07/05/2016 03:37 PM, Barry Smith wrote: >> >> Frank, >> >> You can run with -ksp_view_pre to have it >> "view" the KSP before the solve so hopefully it >> gets that far. >> >> Please run the problem that does fit with >> -memory_info when the problem completes it will >> show the "high water mark" for PETSc allocated >> memory and total memory used. We first want to >> look at these numbers to see if it is using more >> memory than you expect. You could also run with >> say half the grid spacing to see how the memory >> usage scaled with the increase in grid points. >> Make the runs also with -log_view and send all >> the output from these options. >> >> Barry >> >> On Jul 5, 2016, at 5:23 PM, frank >> > >> wrote: >> >> Hi, >> >> I am using the CG ksp solver and Multigrid >> preconditioner to solve a linear system in >> parallel. >> I chose to use the 'Telescope' as the >> preconditioner on the coarse mesh for its >> good performance. >> The petsc options file is attached. >> >> The domain is a 3d box. >> It works well when the grid is 1536*128*384 >> and the process mesh is 96*8*24. When I >> double the size of grid and keep the same >> process mesh and petsc options, I get an "out >> of memory" error from the super-cluster I am >> using. >> Each process has access to at least 8G >> memory, which should be more than enough for >> my application. I am sure that all the other >> parts of my code( except the linear solver ) >> do not use much memory. So I doubt if there >> is something wrong with the linear solver. >> The error occurs before the linear system is >> completely solved so I don't have the info >> from ksp view. I am not able to re-produce >> the error with a smaller problem either. >> In addition, I tried to use the block jacobi >> as the preconditioner with the same grid and >> same decomposition. The linear solver runs >> extremely slow but there is no memory error. >> >> How can I diagnose what exactly cause the error? >> Thank you so much. >> >> Frank >> >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 13 18:28:46 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 18:28:46 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <5786C9C7.1080309@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> Message-ID: > On Jul 13, 2016, at 6:07 PM, frank wrote: > > Hi Dave, > > Sorry for the late reply. > Thank you so much for your detailed reply. > > I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: > 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? > Did I do sth wrong here? Because this seems too small. In addition to storing the non-zero values there are several integer arrays that need to be stored. For each nonzero it stores the column index so if integers are 4 bytes that is another 1.7M/2 . 
If PetscInt is 64 bit then the column indices take the same amount of space as the numerical values 1.74 M. In addition there are at least 7 PetscInt Arrays that are of size mlocal where mlocal is the number of rows local to the process. > > I am running this job on Bluewater > I am using the 7 points FD stencil in 3D. > > I apologize that I made a stupid mistake in computing the memory per core. My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. I would do more test to see if there is still some memory issue. > > Regards, > Frank > > > On 07/11/2016 01:18 PM, Dave May wrote: >> Hi Frank, >> >> >> On 11 July 2016 at 19:14, frank wrote: >> Hi Dave, >> >> I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. >> I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. >> >> Okay - that is essentially useless (sorry) >> >> >> It seems to me that the error occurred when the decomposition was going to be changed. >> >> Based on what information? >> Running with -info would give us more clues, but will create a ton of output. >> Please try running the case which failed with -info >> >> I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. >> Thank you. >> >> >> [3] Here is my crude estimate of your memory usage. >> I'll target the biggest memory hogs only to get an order of magnitude estimate >> >> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. >> The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) >> >> * You use 5 levels of coarsening, so the other operators should represent (collectively) >> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. >> The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. >> >> * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. >> PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. >> This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. >> This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. >> The temporary matrix is now destroyed. >> >> * Because a DMDA is detected, a permutation matrix is assembled. >> This requires 2 doubles per point in the DMDA. >> Your coarse DMDA contains 92 x 16 x 48 points. >> Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. >> >> * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. >> >> From my rough estimates, the worst case memory foot print for any given core, given your options is approximately >> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >> This is way below 8 GB. 
>> >> Note this estimate completely ignores: >> (1) the memory required for the restriction operator, >> (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) >> (3) all temporary vectors required by the CG solver, and those required by the smoothers. >> (4) internal memory allocated by MatPtAP >> (5) memory associated with IS's used within PCTelescope >> >> So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates >> >> Since I don't have your code I cannot access the latter. >> Since I don't have access to the same machine you are running on, I think we need to take a step back. >> >> [1] What machine are you running on? Send me a URL if its available >> >> [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) >> If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. >> This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. >> >> Thanks, >> Dave >> >> >> >> Frank >> >> >> >> >> On 07/08/2016 10:38 PM, Dave May wrote: >>> >>> >>> On Saturday, 9 July 2016, frank wrote: >>> Hi Barry and Dave, >>> >>> Thank both of you for the advice. >>> >>> @Barry >>> I made a mistake in the file names in last email. I attached the correct files this time. >>> For all the three tests, 'Telescope' is used as the coarse preconditioner. >>> >>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>> Part of the memory usage: Vector 125 124 3971904 0. >>> Matrix 101 101 9462372 0 >>> >>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>> Part of the memory usage: Vector 125 124 681672 0. >>> Matrix 101 101 1462180 0. >>> >>> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. >>> >>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 >>> Here I get the out of memory error. >>> >>> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>> The linear solver didn't work in this case. Petsc output some errors. >>> >>> @Dave >>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >>> On the last coarse mesh of 'Telescope', there is only one grid point per process. >>> I still got the OOM error. The detailed petsc option file is attached. >>> >>> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. >>> >>> And please send the result of KSPView so we can see what is actually used in the computations >>> >>> Thanks >>> Dave >>> >>> >>> >>> Thank you so much. >>> >>> Frank >>> >>> >>> >>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>> >>> Hi Barry, >>> >>> Thank you for you advice. >>> I tried three test. 
In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >>> The system gives me the "Out of Memory" error before the linear system is completely solved. >>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >>> >>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >>> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >>> Vector 384 383 8,193,712 0. >>> Matrix 103 103 11,508,688 0. >>> to 3rd test >>> Vector 384 383 1,590,520 0. >>> Matrix 103 103 3,508,664 0. >>> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >>> >>> >>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >>> Sorry, my mistake the option is -memory_view >>> >>> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >>> >>> Barry >>> >>> >>> >>> In both tests the memory usage is not large. >>> >>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >>> Is there is a way to show how much memory it allocated? >>> >>> Frank >>> >>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> Frank, >>> >>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>> >>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>> >>> Barry >>> >>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>> >>> Hi, >>> >>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>> The petsc options file is attached. >>> >>> The domain is a 3d box. >>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. 
So I doubt if there is something wrong with the linear solver. >>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>> >>> How can I diagnose what exactly cause the error? >>> Thank you so much. >>> >>> Frank >>> >>> >>> >> >> > From dave.mayhem23 at gmail.com Wed Jul 13 19:47:31 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 14 Jul 2016 02:47:31 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <5786C9C7.1080309@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> Message-ID: On 14 July 2016 at 01:07, frank wrote: > Hi Dave, > > Sorry for the late reply. > Thank you so much for your detailed reply. > > I have a question about the estimation of the memory usage. There are > 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is > used. So the memory per process is: > 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? > Did I do sth wrong here? Because this seems too small. > No - I totally f***ed it up. You are correct. That'll teach me for fumbling around with my iphone calculator and not using my brain. (Note that to convert to MB just divide by 1e6, not 1024^2 - although I apparently cannot convert between units correctly....) >From the PETSc objects associated with the solver, It looks like it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities are: somewhere in your usage of PETSc you've introduced a memory leak; PETSc is doing a huge over allocation (e.g. as per our discussion of MatPtAP); or in your application code there are other objects you have forgotten to log the memory for. > I am running this job on Bluewater > > I am using the 7 points FD stencil in 3D. > I thought so on both counts. > > I apologize that I made a stupid mistake in computing the memory per core. > My settings render each core can access only 2G memory on average instead > of 8G which I mentioned in previous email. I re-run the job with 8G memory > per core on average and there is no "Out Of Memory" error. I would do more > test to see if there is still some memory issue. > Ok. I'd still like to know where the memory was being used since my estimates were off. Thanks, Dave > > Regards, > Frank > > > > On 07/11/2016 01:18 PM, Dave May wrote: > > Hi Frank, > > > On 11 July 2016 at 19:14, frank wrote: > >> Hi Dave, >> >> I re-run the test using bjacobi as the preconditioner on the coarse mesh >> of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The >> petsc option file is attached. >> I still got the "Out Of Memory" error. The error occurred before the >> linear solver finished one step. So I don't have the full info from >> ksp_view. The info from ksp_view_pre is attached. >> > > Okay - that is essentially useless (sorry) > > >> >> It seems to me that the error occurred when the decomposition was going >> to be changed. >> > > Based on what information? > Running with -info would give us more clues, but will create a ton of > output. 
> Please try running the case which failed with -info > > >> I had another test with a grid of 1536*128*384 and the same process mesh >> as above. There was no error. The ksp_view info is attached for comparison. >> Thank you. >> > > > [3] Here is my crude estimate of your memory usage. > I'll target the biggest memory hogs only to get an order of magnitude > estimate > > * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI > rank assuming double precision. > The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit > integers) > > * You use 5 levels of coarsening, so the other operators should represent > (collectively) > 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the > communicator with 18432 ranks. > The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator > with 18432 ranks. > > * You use a reduction factor of 64, making the new communicator with 288 > MPI ranks. > PCTelescope will first gather a temporary matrix associated with your > coarse level operator assuming a comm size of 288 living on the comm with > size 18432. > This matrix will require approximately 0.5 * 64 = 32 MB per core on the > 288 ranks. > This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus > require another 32 MB per rank. > The temporary matrix is now destroyed. > > * Because a DMDA is detected, a permutation matrix is assembled. > This requires 2 doubles per point in the DMDA. > Your coarse DMDA contains 92 x 16 x 48 points. > Thus the permutation matrix will require < 1 MB per MPI rank on the > sub-comm. > > * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting > operator will have the same memory footprint as the unpermuted matrix (32 > MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held > in memory when the DMDA is provided. > > From my rough estimates, the worst case memory foot print for any given > core, given your options is approximately > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > This is way below 8 GB. > > Note this estimate completely ignores: > (1) the memory required for the restriction operator, > (2) the potential growth in the number of non-zeros per row due to > Galerkin coarsening (I wished -ksp_view_pre reported the output from > MatView so we could see the number of non-zeros required by the coarse > level operators) > (3) all temporary vectors required by the CG solver, and those required by > the smoothers. > (4) internal memory allocated by MatPtAP > (5) memory associated with IS's used within PCTelescope > > So either I am completely off in my estimates, or you have not carefully > estimated the memory usage of your application code. Hopefully others might > examine/correct my rough estimates > > Since I don't have your code I cannot access the latter. > Since I don't have access to the same machine you are running on, I think > we need to take a step back. > > [1] What machine are you running on? Send me a URL if its available > > [2] What discretization are you using? (I am guessing a scalar 7 point FD > stencil) > If it's a 7 point FD stencil, we should be able to examine the memory > usage of your solver configuration using a standard, light weight existing > PETSc example, run on your machine at the same scale. > This would hopefully enable us to correctly evaluate the actual memory > usage required by the solver configuration you are using. 
> > Thanks, > Dave > > >> >> >> Frank >> >> >> >> >> On 07/08/2016 10:38 PM, Dave May wrote: >> >> >> >> On Saturday, 9 July 2016, frank wrote: >> >>> Hi Barry and Dave, >>> >>> Thank both of you for the advice. >>> >>> @Barry >>> I made a mistake in the file names in last email. I attached the correct >>> files this time. >>> For all the three tests, 'Telescope' is used as the coarse >>> preconditioner. >>> >>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>> Part of the memory usage: Vector 125 124 3971904 0. >>> Matrix 101 101 >>> 9462372 0 >>> >>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>> Part of the memory usage: Vector 125 124 681672 0. >>> Matrix 101 101 >>> 1462180 0. >>> >>> In theory, the memory usage in Test1 should be 8 times of Test2. In my >>> case, it is about 6 times. >>> >>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per >>> process: 32*32*32 >>> Here I get the out of memory error. >>> >>> I tried to use -mg_coarse jacobi. In this way, I don't need to set >>> -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>> The linear solver didn't work in this case. Petsc output some errors. >>> >>> @Dave >>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of >>> 'Telescope', I used LU as the preconditioner instead of SVD. >>> If my set the levels correctly, then on the last coarse mesh of MG where >>> it calls 'Telescope', the sub-domain per process is 2*2*2. >>> On the last coarse mesh of 'Telescope', there is only one grid point per >>> process. >>> I still got the OOM error. The detailed petsc option file is attached. >> >> >> Do you understand the expected memory usage for the particular parallel >> LU implementation you are using? I don't (seriously). Replace LU with >> bjacobi and re-run this test. My point about solver debugging is still >> valid. >> >> And please send the result of KSPView so we can see what is actually used >> in the computations >> >> Thanks >> Dave >> >> >>> >>> >>> Thank you so much. >>> >>> Frank >>> >>> >>> >>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>> >>>> On Jul 6, 2016, at 4:19 PM, frank < hengjiew at uci.edu> >>>>> wrote: >>>>> >>>>> Hi Barry, >>>>> >>>>> Thank you for you advice. >>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the >>>>> process mesh is 96*8*24. >>>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' >>>>> is used as the preconditioner at the coarse mesh. >>>>> The system gives me the "Out of Memory" error before the linear system >>>>> is completely solved. >>>>> The info from '-ksp_view_pre' is attached. I seems to me that the >>>>> error occurs when it reaches the coarse mesh. >>>>> >>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. >>>>> The 3rd test uses the same grid but a different process mesh 48*4*12. >>>>> >>>> Are you sure this is right? The total matrix and vector memory >>>> usage goes from 2nd test >>>> Vector 384 383 8,193,712 0. >>>> Matrix 103 103 11,508,688 0. >>>> to 3rd test >>>> Vector 384 383 1,590,520 0. >>>> Matrix 103 103 3,508,664 0. >>>> that is the memory usage got smaller but if you have only 1/8th the >>>> processes and the same grid it should have gotten about 8 times bigger. Did >>>> you maybe cut the grid by a factor of 8 also? If so that still doesn't >>>> explain it because the memory usage changed by a factor of 5 something for >>>> the vectors and 3 something for the matrices. 
>>>> >>>> >>>> The linear solver and petsc options in 2nd and 3rd tests are the same >>>>> in 1st test. The linear solver works fine in both test. >>>>> I attached the memory usage of the 2nd and 3rd tests. The memory info >>>>> is from the option '-log_summary'. I tried to use '-momery_info' as you >>>>> suggested, but in my case petsc treated it as an unused option. It output >>>>> nothing about the memory. Do I need to add sth to my code so I can use >>>>> '-memory_info'? >>>>> >>>> Sorry, my mistake the option is -memory_view >>>> >>>> Can you run the one case with -memory_view and -mg_coarse jacobi >>>> -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory >>>> is used without the telescope? Also run case 2 the same way. >>>> >>>> Barry >>>> >>>> >>>> >>>> In both tests the memory usage is not large. >>>>> >>>>> It seems to me that it might be the 'telescope' preconditioner that >>>>> allocated a lot of memory and caused the error in the 1st test. >>>>> Is there is a way to show how much memory it allocated? >>>>> >>>>> Frank >>>>> >>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>>> >>>>>> Frank, >>>>>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP before >>>>>> the solve so hopefully it gets that far. >>>>>> >>>>>> Please run the problem that does fit with -memory_info when the >>>>>> problem completes it will show the "high water mark" for PETSc allocated >>>>>> memory and total memory used. We first want to look at these numbers to see >>>>>> if it is using more memory than you expect. You could also run with say >>>>>> half the grid spacing to see how the memory usage scaled with the increase >>>>>> in grid points. Make the runs also with -log_view and send all the output >>>>>> from these options. >>>>>> >>>>>> Barry >>>>>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank < >>>>>>> hengjiew at uci.edu> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I am using the CG ksp solver and Multigrid preconditioner to solve >>>>>>> a linear system in parallel. >>>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse >>>>>>> mesh for its good performance. >>>>>>> The petsc options file is attached. >>>>>>> >>>>>>> The domain is a 3d box. >>>>>>> It works well when the grid is 1536*128*384 and the process mesh is >>>>>>> 96*8*24. When I double the size of grid and keep the same process mesh and >>>>>>> petsc options, I get an "out of memory" error from the super-cluster I am >>>>>>> using. >>>>>>> Each process has access to at least 8G memory, which should be more >>>>>>> than enough for my application. I am sure that all the other parts of my >>>>>>> code( except the linear solver ) do not use much memory. So I doubt if >>>>>>> there is something wrong with the linear solver. >>>>>>> The error occurs before the linear system is completely solved so I >>>>>>> don't have the info from ksp view. I am not able to re-produce the error >>>>>>> with a smaller problem either. >>>>>>> In addition, I tried to use the block jacobi as the preconditioner >>>>>>> with the same grid and same decomposition. The linear solver runs extremely >>>>>>> slow but there is no memory error. >>>>>>> >>>>>>> How can I diagnose what exactly cause the error? >>>>>>> Thank you so much. >>>>>>> >>>>>>> Frank >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
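To track down where the memory is actually being used, here is a minimal, untested sketch of how the application code could be instrumented. It assumes the assembled fine-grid operator is available as a Mat named A and that a PETSc version >= 3.5 is in use; note that the MatInfo memory field is not filled in by every matrix type.

    PetscErrorCode ierr;
    PetscLogDouble rss, petscmem;
    MatInfo        info;

    /* per-process storage of the operator: allocated/used nonzeros and (if available) bytes */
    ierr = MatGetInfo(A, MAT_LOCAL, &info);CHKERRQ(ierr);
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,
             "mat: nz_allocated %g nz_used %g memory %g bytes mallocs %g\n",
             info.nz_allocated, info.nz_used, info.memory, info.mallocs);CHKERRQ(ierr);
    ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT);CHKERRQ(ierr);

    /* overall usage on this process: resident set size vs. memory obtained through PetscMalloc */
    ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);
    ierr = PetscMallocGetCurrentUsage(&petscmem);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "rank 0: rss %g bytes, PetscMalloc'd %g bytes\n", rss, petscmem);CHKERRQ(ierr);

Calling this before and after KSPSetUp()/KSPSolve() separates the solver's footprint from the rest of the application; running with -memory_view and -log_view, as suggested earlier in the thread, gives the same kind of summary without modifying the code.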
URL: From bsmith at mcs.anl.gov Wed Jul 13 20:10:52 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 20:10:52 -0500 Subject: [petsc-users] Multigrid with PML In-Reply-To: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> Message-ID: <32247878-9DC8-4830-9CE3-A1518D23E3D9@mcs.anl.gov> Can you run with the additional option -ksp_view_mat binary and email the resulting file which will be called binaryoutput to petsc-maint at mcs.anl.gov Barry > On Jul 13, 2016, at 2:30 PM, Safin, Artur wrote: > > Dear PETSc community, > > I am working on solving a Helmholtz problem with PML. The issue is that I am finding it very hard to deal with the resulting matrix system; I can get the correct solution for coarse meshes, but it takes roughly 2-4 times as long to converge for each successively refined mesh. I've noticed that without PML, I do not have problems with convergence speed. > > I am using the GMRES solver with GAMG as the preconditioner (with block-Jacobi preconditioner for the multigrid solves). I have also tried to assemble a separate preconditioning matrix with the complex shift 1+0.5i, that does not seem to improve the results. Currently I am running with > > -ksp_type fgmres \ > -pc_type gamg \ > -mg_levels_pc_type bjacobi \ > -pc_mg_type full \ > -ksp_gmres_restart 150 \ > > Can anyone suggest some way of speeding up the convergence? Any help would be appreciated. I am attaching the output from kspview. > > Best, > > Artur > > From bsmith at mcs.anl.gov Wed Jul 13 21:11:53 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 21:11:53 -0500 Subject: [petsc-users] What block size means in amg aggregation type In-Reply-To: References: Message-ID: <568382E0-1BD7-43CD-85E6-6B864A8AC044@mcs.anl.gov> Sorry know one answered it. I had hoped Mark Adams would since he knows much more about it then me. > On Jul 6, 2016, at 2:50 PM, Eduardo Jourdan wrote: > > Hi, > > I am kind of new to algebraic multigrid methods. I tried to figure it on my own but I'm not be sure about it. > > How the block size (bs) of a blocked matrix affects the AMG AGG? I mean, if bs = 4, then > in the coarsening phase and setup, blocks of 4x4 matrix elements are considered to remain in the coarse level and a certain quantity of block neighbors are restricted and remain in the finer level? Never a row inside a block matrix is selected and the other elements of this block aren't, am I right? Correct > The entire block is interpolated when it comes to the interpolation phase? Correct and they all use the same interpolation. > > If the original problem is not a system of equations, then bs=1? Yes. For a Poission operator it is 1 for linear elasticity it is 2 to 6 depending on the dimension and the model. > > Thank you, > > Eduardo > > From mono at dtu.dk Thu Jul 14 02:45:33 2016 From: mono at dtu.dk (=?Windows-1252?Q?Morten_Nobel-J=F8rgensen?=) Date: Thu, 14 Jul 2016 07:45:33 +0000 Subject: [petsc-users] Distribution of DMPlex for FEM In-Reply-To: References: Message-ID: Hi Matthew Thanks for your answer and your fix. It works :))) Kind regards, Morten Fra: Matthew Knepley > Dato: Thursday 14 July 2016 at 00:03 Til: Morten Nobel-Joergensen > Cc: "petsc-users at mcs.anl.gov" > Emne: Re: [petsc-users] Distribution of DMPlex for FEM On Wed, Jul 13, 2016 at 3:57 AM, Morten Nobel-J?rgensen > wrote: I?m having problems distributing a simple FEM model using DMPlex. For test case I use 1x1x2 hex box elements (/cells) with 12 vertices. 
Each vertex has one DOF. When I distribute the system to two processors, each get a single element and the local vector has the size 8 (one DOF for each vertex of a hex box) as expected. My problem is that when I manually assemble the global stiffness matrix (a 12x12 matrix) it seems like my ghost values are ignored. I?m sure that I?m missing something obvious but cannot see what it is. In the attached example, I?m assembling the global stiffness matrix using a simple local stiffness matrix of ones. This makes it very easy to see if the matrix is assembled correctly. If I run it on one process, then global stiffness matrix consists of 0?s, 1?s and 2?s and its trace is 16.0. But if I run it distributed on on two, then it consists only of 0's and 1?s and its trace is 12.0. I hope that somebody can spot my mistake and help me in the right direction :) This is my fault, and Stefano Zampini had already tried to tell me this was broken. I normally use DMPlexMatSetClosure(), which handles global indices correctly. I have fixed this in the branch knepley/fix-plex-l2g which is also merged to 'next'. I am attaching a version of your sample where all objects are freed correctly. Let me know if that works for you. Thanks, Matt Kind regards, Morten -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico_lahaye at yahoo.com Thu Jul 14 12:21:32 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Thu, 14 Jul 2016 17:21:32 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> Message-ID: <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> Dear PETSc team, 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure ? ? and likely not giving it as much time as it deserves. However, I do not see immediately ??? what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. ???? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently ???? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined ???? after calling DMCoarsenHierarchy, but that failed. ???? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform ???? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation ???? using DMDA as well. 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see @Misc{petsc-web-page, author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp and Dinesh Kaushik and Matthew~G. Knepley and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith and Stefano Zampini and Hong Zhang and Hong Zhang}, title = {{PETS}c {W}eb page}, url = {http://www.mcs.anl.gov/petsc}, howpublished = {\url{http://www.mcs.anl.gov/petsc}}, year = {2016} } Is the last author mentioned twice intentionally? 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see @misc{OpenFOAM, | | title | = | "OpenFOAM", | | | howpublished | = | "\url{http://www.openfoam.com}", | | | url | = | {http://www.openfoam.com}, | | | note | = | "OpenFOAM is a free, open source CFD software package. 
It allows PETSc linear algebra and solvers to be used underneath.", | | | key | = | "OpenFOAM 2.2.1" | } Do you have more information on the use of PETSc within OpenFoam? 4) @matt in response to a question he raised in Vienna MIPSE is a BEM solver. Details are on: http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE Cheers, Domenico Lahaye. -------------- next part -------------- An HTML attachment was scrubbed... URL: From amelie.compagna.1 at ulaval.ca Thu Jul 14 14:42:27 2016 From: amelie.compagna.1 at ulaval.ca (=?iso-8859-1?Q?Am=E9lie_Compagna?=) Date: Thu, 14 Jul 2016 19:42:27 +0000 Subject: [petsc-users] Slow convergence using Schur complement Message-ID: <1468525347118.92523@ulaval.ca> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: QuestionPetsc Type: application/octet-stream Size: 63842 bytes Desc: QuestionPetsc URL: From bsmith at mcs.anl.gov Thu Jul 14 15:05:12 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Jul 2016 15:05:12 -0500 Subject: [petsc-users] Slow convergence using Schur complement In-Reply-To: <1468525347118.92523@ulaval.ca> References: <1468525347118.92523@ulaval.ca> Message-ID: <13C19A0A-8EC6-471C-84E6-BD2B30C16E7E@mcs.anl.gov> So refreshing my memory > pc_fieldsplit_schur_precondition self selfp then the preconditioning for the Schur complement is generated from an explicitly-assembled approximation Sp = A11 - A10 inv(diag(A00)) A01 This is only a good preconditioner when diag(A00) is a good preconditioner for A00. Optionally, A00 can be lumped before extracting the diagonal using the additional option -fieldsplit_1_mat_schur_complement_ainv_type lump So first try adding the option (with the correct prefix) -fieldsplit_1_mat_schur_complement_ainv_type lump to see if the lumping helps the convergence. If suddenly it works well great but as the documentation says selfp may not be a good preconditioner at all for your problem and you'll have to consider the other ones. I don't know why it is printing the initial name and residual norm multiple times. What is is showing is the very slow convergence of the preconditioned system inv(Sp) S = inv(Sp) (A11 - A10 inv(A00) A01). Note I wrote inv() here because in both places you are using LU and hence it is a very accurate inverse operation. Barry > On Jul 14, 2016, at 2:42 PM, Am?lie Compagna wrote: > > ?Hi, > > I've been working on a finite element simulation of a 3 ionic species unsteady electrodiffusion model. The concentrations and the electric potential are defined using a unsteady diffusion equations. All the concentration being coupled to the potential giving a non symmetrical global system. > > I know that everything works since I've solved the system using LU. So far I've tried a lot of different things, but I am now trying to solve the system using a Schur complement, splitting the system in two groups [concentrations, potential], and I'm getting slow convergence. Here are the options I'm using. I've also attached a file with the ksp_view and the ksp_monitor. 
> > ====== > ksp_type gcr > pc_type fieldsplit > pc_fieldsplit_type schur > mat_type nest > ksp_monitor > ksp_view > > //Options concentrations block > > fieldsplit_a_00_ksp_type gcr > fieldsplit_a_00_pc_type lu > fieldsplit_a_00_ksp_rtol 1.0e-4 > fieldsplit_a_00_ksp_atol 1.0e-8 > > //Options potential block > fieldsplit_schur_mat_type schurcomplement > fieldsplit_schur_ksp_type gcr > pc_fieldsplit_schur_precondition selfp > pc_fieldsplit_schur_fact_type full > fieldsplit_schur_pc_type lu > fieldsplit_schur_ksp_monitor > fieldsplit_schur_ksp_rtol 1.0e-4 > fieldsplit_schur_ksp_atol 1.0e-8 > > > ksp_rtol 1.0e-5 > ksp_atol 1.0e-5 > ===== > > First of all, I'm wondering what exactly is showing on the screen when I use the fieldsplit_schur_ksp_monitor? > > Also, why is it printing twice each time as you can see in the attached file? When I use pc_fieldsplit_a_00_monitor (which is not included in the file I sent you because it only does one iteration, as it should since it's solving with LU) it prints it 3 times every time which gets pretty annoying. > > Finally, as you can see, it takes a long time to the fieldsplit_schur_ksp to converge, do you have any idea why it takes over 200 iterations to get down to 1e-02? Is there a way to get it to converge faster? > > Thank you for your time, > Am?lie? > > From andrewh0 at uw.edu Thu Jul 14 18:18:40 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 14 Jul 2016 16:18:40 -0700 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? Message-ID: I am trying to solve a simple ionization/recombination ODE using PETSc's quasi-newton SNES. This is a basic non-linear coupled ODE system: delta = -a u^2 + b u v d_t u = delta d_t v = -delta a and b are constants. I wrote a backwards Euler root finding function (yes, I know the TS module has BE implemented, but this is more of a learning exercise). Here is the function evaluation: struct ion_rec_ctx > { > PetscScalar rate_a, rate_b; > PetscScalar dt; > }; > PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) > { > const PetscScalar *xx; > PetscScalar *ff; > ion_rec_ctx& params = *reinterpret_cast(ctx); > CHKERRQ(VecGetArrayRead(x, &xx)); > CHKERRQ(VecGetArray(f,&ff)); > auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); > ff[0] = xx[0]-params.dt*delta; > ff[1] = xx[1]-params.dt*-delta; > CHKERRQ(VecRestoreArrayRead(x,&xx)); > CHKERRQ(VecRestoreArray(f,&ff)); > return 0; > } To setup the solver and solve one time step: // q0, q1, and res are Vec's previously initialized > // initial conditions: q0 = [1e19,1e19] > SNES solver; > CHKERRQ(SNESCreate(comm, &solver)); > CHKERRQ(SNESSetType(solver, SNESQN)); > CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); > ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; > CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); > CHKERRQ(SNESSolve(solver, q0, q1)); When I run this, the solver fails to converge to a solution for this rather large time step. The solution produced when the SNES module finally gives up is: q1 = [-2.72647e142, 2.72647e142] For reference, when I disable the scale and restart types, I get these values: q1 = [1.0279e17, 1.98972e19] This is only a problem when I use the SNES_QN_RESTART_POWELL restart type (seems to be regardless of the scale type type). I get reasonable answers for other combinations of restart/scale type. I've tried every combination of restart type/scale type except for SNES_QN_SCALE_JACOBIAN (my ultimate application doesn't have an available Jacobian), and only cases using SNES_QN_RESTART_POWELL are failing. 
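For reference, the restart/scale combinations described above can be switched without touching the rest of the setup; a minimal sketch, assuming the stock SNESQN interface (the command-line spellings below are a best guess and should be checked against the manual pages):

    SNES solver;
    SNESCreate(comm, &solver);
    SNESSetType(solver, SNESQN);
    SNESQNSetType(solver, SNES_QN_LBFGS);
    /* restart criterion: NONE, POWELL, or PERIODIC */
    SNESQNSetRestartType(solver, SNES_QN_RESTART_POWELL);
    /* initial Jacobian scaling: NONE, SHANNO, LINESEARCH, or JACOBIAN */
    SNESQNSetScaleType(solver, SNES_QN_SCALE_SHANNO);

or at run time with something like -snes_qn_restart_type powell -snes_qn_scale_type shanno.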
I'm unfamiliar with Powell's restart criterion, but is it suppose to work reasonably well with Quasi-Newton methods? I tried it on the simple problem given in this example: http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html And Powell restarts also fails to converge to a meaningful solution (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods do converge properly. Software information: PETSc version 3.7.2 (built from git maint branch) PETSc arch: arch-linux2-c-opt OS: Ubuntu 15.04 x64 Compiler: gcc 4.9.2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jul 14 18:22:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Jul 2016 18:22:32 -0500 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: References: Message-ID: On Thu, Jul 14, 2016 at 6:18 PM, Andrew Ho wrote: > I am trying to solve a simple ionization/recombination ODE using PETSc's > quasi-newton SNES. > > This is a basic non-linear coupled ODE system: > > delta = -a u^2 + b u v > d_t u = delta > d_t v = -delta > > a and b are constants. > > I wrote a backwards Euler root finding function (yes, I know the TS module > has BE implemented, but this is more of a learning exercise). > > Here is the function evaluation: > > struct ion_rec_ctx >> { >> PetscScalar rate_a, rate_b; >> PetscScalar dt; >> }; >> PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) >> { >> const PetscScalar *xx; >> PetscScalar *ff; >> ion_rec_ctx& params = *reinterpret_cast(ctx); >> CHKERRQ(VecGetArrayRead(x, &xx)); >> CHKERRQ(VecGetArray(f,&ff)); >> auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); >> ff[0] = xx[0]-params.dt*delta; >> > I do not understand this. Shouldn't it be (xx[0] - xxold[0]) here? Matt > ff[1] = xx[1]-params.dt*-delta; >> CHKERRQ(VecRestoreArrayRead(x,&xx)); >> CHKERRQ(VecRestoreArray(f,&ff)); >> return 0; >> } > > > To setup the solver and solve one time step: > > // q0, q1, and res are Vec's previously initialized >> // initial conditions: q0 = [1e19,1e19] >> SNES solver; >> CHKERRQ(SNESCreate(comm, &solver)); >> CHKERRQ(SNESSetType(solver, SNESQN)); >> CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); >> ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; >> CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); >> CHKERRQ(SNESSolve(solver, q0, q1)); > > > When I run this, the solver fails to converge to a solution for this > rather large time step. > The solution produced when the SNES module finally gives up is: > > q1 = [-2.72647e142, 2.72647e142] > > For reference, when I disable the scale and restart types, I get these > values: > > q1 = [1.0279e17, 1.98972e19] > > This is only a problem when I use the SNES_QN_RESTART_POWELL restart type > (seems to be regardless of the scale type type). I get reasonable answers > for other combinations of restart/scale type. I've tried every combination > of restart type/scale type except for SNES_QN_SCALE_JACOBIAN (my ultimate > application doesn't have an available Jacobian), and only cases using > SNES_QN_RESTART_POWELL are failing. > > I'm unfamiliar with Powell's restart criterion, but is it suppose to work > reasonably well with Quasi-Newton methods? 
I tried it on the simple problem > given in this example: > http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html > > And Powell restarts also fails to converge to a meaningful solution > (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods > do converge properly. > > Software information: > > PETSc version 3.7.2 (built from git maint branch) > PETSc arch: arch-linux2-c-opt > OS: Ubuntu 15.04 x64 > Compiler: gcc 4.9.2 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Jul 14 18:27:02 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 14 Jul 2016 19:27:02 -0400 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: > > > > Notice that there are 7 orders of magnitude between the apparent residual > (using the preconditioner), and the actual residual, Ax - b. > You are using Hypre, and this generally means the Hypre coarse grid > operator is crap. Please > > Huh?, this data looks fine, both the true and preconditioned residual stay separated by about 9 orders of magnitude. This just tells you that the norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Thu Jul 14 18:28:38 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 14 Jul 2016 16:28:38 -0700 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: References: Message-ID: On Thu, Jul 14, 2016 at 4:22 PM, Matthew Knepley wrote: > On Thu, Jul 14, 2016 at 6:18 PM, Andrew Ho wrote: > >> I am trying to solve a simple ionization/recombination ODE using PETSc's >> quasi-newton SNES. >> >> This is a basic non-linear coupled ODE system: >> >> delta = -a u^2 + b u v >> d_t u = delta >> d_t v = -delta >> >> a and b are constants. >> >> I wrote a backwards Euler root finding function (yes, I know the TS >> module has BE implemented, but this is more of a learning exercise). >> >> Here is the function evaluation: >> >> struct ion_rec_ctx >>> { >>> PetscScalar rate_a, rate_b; >>> PetscScalar dt; >>> }; >>> PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) >>> { >>> const PetscScalar *xx; >>> PetscScalar *ff; >>> ion_rec_ctx& params = *reinterpret_cast(ctx); >>> CHKERRQ(VecGetArrayRead(x, &xx)); >>> CHKERRQ(VecGetArray(f,&ff)); >>> auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); >>> ff[0] = xx[0]-params.dt*delta; >>> >> > I do not understand this. Shouldn't it be (xx[0] - xxold[0]) here? > > Matt > No, the time discretization is as such: xnew = xold + dt*f(xnew) I re-arrange this to be xnew - dt*f(xnew) = xold The left hand side I am defining as g(x), which is what the bdf1 function evaluates. The SNES module solves for g(x) = b, so I simply set b = xold. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Jul 14 18:29:06 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Jul 2016 18:29:06 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: > >> >> Notice that there are 7 orders of magnitude between the apparent residual >> (using the preconditioner), and the actual residual, Ax - b. >> You are using Hypre, and this generally means the Hypre coarse grid >> operator is crap. Please >> >> > Huh?, this data looks fine, both the true and preconditioned residual stay > separated by about 9 orders of magnitude. This just tells you that the norm > of A (or is it A^-1) is 10^9. Am I misunderstanding this? > This is why Barry and I asked for a comparsion with MUMPS. If you are right, and its just the condition number, the LU will not be any more accurate. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 14 19:50:26 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Jul 2016 19:50:26 -0500 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: References: Message-ID: <20513817-993F-41CC-8888-AAD5DF55922C@mcs.anl.gov> > On Jul 14, 2016, at 6:18 PM, Andrew Ho wrote: > > I am trying to solve a simple ionization/recombination ODE using PETSc's quasi-newton SNES. > > This is a basic non-linear coupled ODE system: > > delta = -a u^2 + b u v > d_t u = delta > d_t v = -delta > > a and b are constants. > > I wrote a backwards Euler root finding function (yes, I know the TS module has BE implemented, but this is more of a learning exercise). > > Here is the function evaluation: > > struct ion_rec_ctx > { > PetscScalar rate_a, rate_b; > PetscScalar dt; > }; > PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) > { > const PetscScalar *xx; > PetscScalar *ff; > ion_rec_ctx& params = *reinterpret_cast(ctx); > CHKERRQ(VecGetArrayRead(x, &xx)); > CHKERRQ(VecGetArray(f,&ff)); > auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); > ff[0] = xx[0]-params.dt*delta; > ff[1] = xx[1]-params.dt*-delta; > CHKERRQ(VecRestoreArrayRead(x,&xx)); > CHKERRQ(VecRestoreArray(f,&ff)); > return 0; > } > > To setup the solver and solve one time step: > > // q0, q1, and res are Vec's previously initialized > // initial conditions: q0 = [1e19,1e19] > SNES solver; > CHKERRQ(SNESCreate(comm, &solver)); > CHKERRQ(SNESSetType(solver, SNESQN)); > CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); > ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; > CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); > CHKERRQ(SNESSolve(solver, q0, q1)); > > When I run this, the solver fails to converge to a solution for this rather large time step. > The solution produced when the SNES module finally gives up is: > > q1 = [-2.72647e142, 2.72647e142] > > For reference, when I disable the scale and restart types, I get these values: > > q1 = [1.0279e17, 1.98972e19] > > This is only a problem when I use the SNES_QN_RESTART_POWELL restart type (seems to be regardless of the scale type type). I get reasonable answers for other combinations of restart/scale type. 
I've tried every combination of restart type/scale type except for SNES_QN_SCALE_JACOBIAN (my ultimate application doesn't have an available Jacobian), and only cases using SNES_QN_RESTART_POWELL are failing. > > I'm unfamiliar with Powell's restart criterion, but is it suppose to work reasonably well with Quasi-Newton methods? I tried it on the simple problem given in this example: http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html > > And Powell restarts also fails to converge to a meaningful solution (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods do converge properly. Could you please send the exact options you are using for the ex1.c that both fail and work and we'll see if there is some problem with the Powell restart. Thanks Barry > > Software information: > > PETSc version 3.7.2 (built from git maint branch) > PETSc arch: arch-linux2-c-opt > OS: Ubuntu 15.04 x64 > Compiler: gcc 4.9.2 From mfadams at lbl.gov Thu Jul 14 19:52:09 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 14 Jul 2016 20:52:09 -0400 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: On Thu, Jul 14, 2016 at 7:29 PM, Matthew Knepley wrote: > On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: > >> >>> >>> Notice that there are 7 orders of magnitude between the apparent >>> residual (using the preconditioner), and the actual residual, Ax - b. >>> You are using Hypre, and this generally means the Hypre coarse grid >>> operator is crap. Please >>> >>> >> Huh?, this data looks fine, both the true and preconditioned residual >> stay separated by about 9 orders of magnitude. This just tells you that the >> norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? >> > > This is why Barry and I asked for a comparsion with MUMPS. If you are > right, and its just the condition number, > I said norm not condition number. I trust I'm missing something in this thread. > the LU > will not be any more accurate. > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 14 20:10:27 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Jul 2016 20:10:27 -0500 Subject: [petsc-users] Multigrid with PML In-Reply-To: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> Message-ID: <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> This is a very difficult problem. I am not surprised that GAMG performs poorly, I would be surprised if it performed well at all. I think you need to do some googling of "helmholtz PML linear system solve" to find what other people have used. The first hit I got was this http://www.math.tau.ac.il/services/phd/dissertations/Singer_Ido.pdf and every iterative method he tried ended up requiring MANY iterations with refinement. This is 14 years old so there will be better suggestions out there. One that caught my eye was http://www.sciencedirect.com/science/article/pii/S0022247X11005063 Barry Just looking at the matrix makes it clear to me that conventional iterative methods are not going to work well, many of the diagonal entries are zero and even in rows with a diagonal entry it is much smaller in magnitude than the diagonal entries. 
> On Jul 13, 2016, at 2:30 PM, Safin, Artur wrote: > > Dear PETSc community, > > I am working on solving a Helmholtz problem with PML. The issue is that I am finding it very hard to deal with the resulting matrix system; I can get the correct solution for coarse meshes, but it takes roughly 2-4 times as long to converge for each successively refined mesh. I've noticed that without PML, I do not have problems with convergence speed. > > I am using the GMRES solver with GAMG as the preconditioner (with block-Jacobi preconditioner for the multigrid solves). I have also tried to assemble a separate preconditioning matrix with the complex shift 1+0.5i, that does not seem to improve the results. Currently I am running with > > -ksp_type fgmres \ > -pc_type gamg \ > -mg_levels_pc_type bjacobi \ > -pc_mg_type full \ > -ksp_gmres_restart 150 \ > > Can anyone suggest some way of speeding up the convergence? Any help would be appreciated. I am attaching the output from kspview. > > Best, > > Artur > > From andrewh0 at uw.edu Fri Jul 15 03:14:43 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Fri, 15 Jul 2016 01:14:43 -0700 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: <20513817-993F-41CC-8888-AAD5DF55922C@mcs.anl.gov> References: <20513817-993F-41CC-8888-AAD5DF55922C@mcs.anl.gov> Message-ID: I've attached two modified versions of ex1: ex1_powell.c uses the Powell restart ex1_none.c uses no restart For the default initial guess (x0 = [0.5,0.5]), both converge just fine. However, for the initial guess x0 = [3.,3.], the Powell solution fails to converge, while None and Periodic both still converge. This is with the "easy" equation set (run without -hard). Interestingly enough, the Powell restart still "finishes" in a reasonable number of iterations (7 iterations), but the residual is very large (on the order of 1e254). On Thu, Jul 14, 2016 at 5:50 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 6:18 PM, Andrew Ho wrote: > > > > I am trying to solve a simple ionization/recombination ODE using PETSc's > quasi-newton SNES. > > > > This is a basic non-linear coupled ODE system: > > > > delta = -a u^2 + b u v > > d_t u = delta > > d_t v = -delta > > > > a and b are constants. > > > > I wrote a backwards Euler root finding function (yes, I know the TS > module has BE implemented, but this is more of a learning exercise). > > > > Here is the function evaluation: > > > > struct ion_rec_ctx > > { > > PetscScalar rate_a, rate_b; > > PetscScalar dt; > > }; > > PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) > > { > > const PetscScalar *xx; > > PetscScalar *ff; > > ion_rec_ctx& params = *reinterpret_cast(ctx); > > CHKERRQ(VecGetArrayRead(x, &xx)); > > CHKERRQ(VecGetArray(f,&ff)); > > auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); > > ff[0] = xx[0]-params.dt*delta; > > ff[1] = xx[1]-params.dt*-delta; > > CHKERRQ(VecRestoreArrayRead(x,&xx)); > > CHKERRQ(VecRestoreArray(f,&ff)); > > return 0; > > } > > > > To setup the solver and solve one time step: > > > > // q0, q1, and res are Vec's previously initialized > > // initial conditions: q0 = [1e19,1e19] > > SNES solver; > > CHKERRQ(SNESCreate(comm, &solver)); > > CHKERRQ(SNESSetType(solver, SNESQN)); > > CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); > > ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; > > CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); > > CHKERRQ(SNESSolve(solver, q0, q1)); > > > > When I run this, the solver fails to converge to a solution for this > rather large time step. 
> > The solution produced when the SNES module finally gives up is: > > > > q1 = [-2.72647e142, 2.72647e142] > > > > For reference, when I disable the scale and restart types, I get these > values: > > > > q1 = [1.0279e17, 1.98972e19] > > > > This is only a problem when I use the SNES_QN_RESTART_POWELL restart > type (seems to be regardless of the scale type type). I get reasonable > answers for other combinations of restart/scale type. I've tried every > combination of restart type/scale type except for SNES_QN_SCALE_JACOBIAN > (my ultimate application doesn't have an available Jacobian), and only > cases using SNES_QN_RESTART_POWELL are failing. > > > > I'm unfamiliar with Powell's restart criterion, but is it suppose to > work reasonably well with Quasi-Newton methods? I tried it on the simple > problem given in this example: > http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html > > > > And Powell restarts also fails to converge to a meaningful solution > (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods > do converge properly. > > Could you please send the exact options you are using for the ex1.c > that both fail and work and we'll see if there is some problem with the > Powell restart. > > Thanks > > Barry > > > > > Software information: > > > > PETSc version 3.7.2 (built from git maint branch) > > PETSc arch: arch-linux2-c-opt > > OS: Ubuntu 15.04 x64 > > Compiler: gcc 4.9.2 > > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1_none.c Type: text/x-csrc Size: 9365 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1_powell.c Type: text/x-csrc Size: 9367 bytes Desc: not available URL: From mfadams at lbl.gov Fri Jul 15 03:46:47 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 15 Jul 2016 04:46:47 -0400 Subject: [petsc-users] Multigrid with PML In-Reply-To: <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> Message-ID: On Thu, Jul 14, 2016 at 9:10 PM, Barry Smith wrote: > > This is a very difficult problem. I am not surprised that GAMG performs > poorly, I would be surprised if it performed well at all. > > I think you need to do some googling of "helmholtz PML linear system > solve" to find what other people have used. The first hit I got was this > http://www.math.tau.ac.il/services/phd/dissertations/Singer_Ido.pdf and > every iterative method he tried ended up requiring MANY iterations with > refinement. This is 14 years old so there will be better suggestions out > there. One that caught my eye was > http://www.sciencedirect.com/science/article/pii/S0022247X11005063 > > > Barry > > Just looking at the matrix makes it clear to me that conventional > iterative methods are not going to work well, many of the diagonal entries > are zero and even in rows with a diagonal entry it is much smaller in > magnitude than the diagonal entries. > Indefinite Helmholtz is hard unless you are not shifting very far. This zero diagonals must come from PML. First get rid of PML and see if you can solve anything to your satisfaction. I have a paper on this, using AMG, and I tried to be inclusive, but I did miss a potentially useful method of adding a complex shift to damp the system. 
You can Google something like 'complex shift helmholtz damp'. If you are shifting deep (high frequency Helmholtz), then use direct solvers. > > > On Jul 13, 2016, at 2:30 PM, Safin, Artur > wrote: > > > > Dear PETSc community, > > > > I am working on solving a Helmholtz problem with PML. The issue is that > I am finding it very hard to deal with the resulting matrix system; I can > get the correct solution for coarse meshes, but it takes roughly 2-4 times > as long to converge for each successively refined mesh. I've noticed that > without PML, I do not have problems with convergence speed. > > > > I am using the GMRES solver with GAMG as the preconditioner (with > block-Jacobi preconditioner for the multigrid solves). I have also tried to > assemble a separate preconditioning matrix with the complex shift 1+0.5i, > that does not seem to improve the results. Currently I am running with > > > > -ksp_type fgmres \ > > -pc_type gamg \ > > -mg_levels_pc_type bjacobi \ > > -pc_mg_type full \ > > -ksp_gmres_restart 150 \ > > > > Can anyone suggest some way of speeding up the convergence? Any help > would be appreciated. I am attaching the output from kspview. > > > > Best, > > > > Artur > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Fri Jul 15 04:02:09 2016 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Fri, 15 Jul 2016 02:02:09 -0700 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> Message-ID: I agree, this is an extra hard problem when you add PML to it. Here is a link to a paper that presents a few tricks applied to some aspects of this problem. Koyama, T. and Govindjee, S., ``Solving generalized complex-symmetriceigenvalue problems arising fromresonant MEMS simulations with PETSc," in Proceedings in AppliedMathematics and Mechanics, 1141701-1141702 (2008) . http://dx.doi.org/10.1002/pamm.200700206 -sg On 7/15/16 1:46 AM, Mark Adams wrote: > > > On Thu, Jul 14, 2016 at 9:10 PM, Barry Smith > wrote: > > > This is a very difficult problem. I am not surprised that GAMG > performs poorly, I would be surprised if it performed well at all. > > I think you need to do some googling of "helmholtz PML linear > system solve" to find what other people have used. The first hit I > got was this > http://www.math.tau.ac.il/services/phd/dissertations/Singer_Ido.pdf > and every iterative method he tried ended up requiring MANY > iterations with refinement. This is 14 years old so there will be > better suggestions out there. One that caught my eye was > http://www.sciencedirect.com/science/article/pii/S0022247X11005063 > > > Barry > > Just looking at the matrix makes it clear to me that conventional > iterative methods are not going to work well, many of the diagonal > entries are zero and even in rows with a diagonal entry it is much > smaller in magnitude than the diagonal entries. > > > Indefinite Helmholtz is hard unless you are not shifting very far. > This zero diagonals must come from PML. > > First get rid of PML and see if you can solve anything to your > satisfaction. > > I have a paper on this, using AMG, and I tried to be inclusive, but I > did miss a potentially useful method of adding a complex shift to damp > the system. You can Google something like 'complex shift helmholtz > damp'. If you are shifting deep (high frequency Helmholtz), then use > direct solvers. 
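As a concrete illustration of the complex-shift idea, here is a rough sketch of how a separately assembled, shifted preconditioning matrix is wired into the solve (this assumes a complex-scalar PETSc build; A, Pshift, b and x are placeholder names, not anything from the codes discussed in this thread):

    Mat A;        /* assembled Helmholtz + PML operator                    */
    Mat Pshift;   /* second copy assembled with a complex (damping) shift  */
    Vec b, x;
    KSP ksp;
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    /* the Krylov method iterates with A, while the preconditioner         */
    /* (e.g. -pc_type gamg) is built from the damped matrix Pshift         */
    KSPSetOperators(ksp, A, Pshift);
    KSPSetFromOptions(ksp);
    KSPSolve(ksp, b, x);

The shift makes the preconditioning matrix much friendlier for multigrid, at the price of it being only an approximation to the true operator.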
> > > > On Jul 13, 2016, at 2:30 PM, Safin, Artur > > wrote: > > > > Dear PETSc community, > > > > I am working on solving a Helmholtz problem with PML. The issue > is that I am finding it very hard to deal with the resulting > matrix system; I can get the correct solution for coarse meshes, > but it takes roughly 2-4 times as long to converge for each > successively refined mesh. I've noticed that without PML, I do not > have problems with convergence speed. > > > > I am using the GMRES solver with GAMG as the preconditioner > (with block-Jacobi preconditioner for the multigrid solves). I > have also tried to assemble a separate preconditioning matrix with > the complex shift 1+0.5i, that does not seem to improve the > results. Currently I am running with > > > > -ksp_type fgmres \ > > -pc_type gamg \ > > -mg_levels_pc_type bjacobi \ > > -pc_mg_type full \ > > -ksp_gmres_restart 150 \ > > > > Can anyone suggest some way of speeding up the convergence? Any > help would be appreciated. I am attaching the output from kspview. > > > > Best, > > > > Artur > > > > > > -- ----------------------------------------------- Sanjay Govindjee, PhD, PE Professor of Civil Engineering 779 Davis Hall University of California Berkeley, CA 94720-1710 Voice: +1 510 642 6060 FAX: +1 510 643 5264 s_g at berkeley.edu http://www.ce.berkeley.edu/~sanjay ----------------------------------------------- Books: Engineering Mechanics of Deformable Solids: A Presentation with Exercises http://www.oup.com/us/catalog/general/subject/Physics/MaterialsScience/?view=usa&ci=9780199651641 http://ukcatalogue.oup.com/product/9780199651641.do http://amzn.com/0199651647 Engineering Mechanics 3 (Dynamics) 2nd Edition http://www.springer.com/978-3-642-53711-0 http://amzn.com/3642537111 Engineering Mechanics 3, Supplementary Problems: Dynamics http://www.amzn.com/B00SOXN8JU ----------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon at arrowtheory.com Fri Jul 15 07:29:13 2016 From: simon at arrowtheory.com (Simon Burton) Date: Fri, 15 Jul 2016 22:29:13 +1000 Subject: [petsc-users] slepc eating all my ram Message-ID: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> Hi, I'm running a slepc eigenvalue solver on a single machine with 198GB of ram, and solution space dimension 2^32. With double precision this means each vector is 32GB. I'm using shell matrices to implement the matrix vector product. I figured the easiest way to get eigenvalues is using the slepc power method, but it is still eating all the ram. Running in gdb I see that slepc is allocating a bunch of vectors in the spectral transform object (in STSetUp), and by this time it has consumed most of the 198GB of ram. I don't see why a spectral transform shift of zero needs to alloc a whole bunch of memory. I'm wondering if there are some other options to slepc that can reduce the memory footprint? A barebones implementation of the power method only needs to keep two vectors, perhaps I should just try doing this using petsc primitives. It's also possible that I could spread the computation over two or more machines but that's a whole other learning curve. The code I am running is essentially the laplacian grid example from slepc (src/eps/examples/tutorials/ex3.c): ./ex3 -eps_hermitian -eps_largest_magnitude -eps_monitor ascii -eps_nev 1 -eps_type power -n 65536 I also put this line in the source: EPSSetDimensions(eps,1,2,1); Cheers, Simon. 
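For what it is worth, a bare-bones power iteration on a shell matrix really does need only two work vectors; a rough, untested sketch with plain PETSc calls (no convergence test or error checking; A is the user's shell matrix and maxit a chosen iteration cap):

    Vec       x, y;        /* current iterate and A*x: two big vectors in total    */
    PetscReal nrm;
    PetscInt  it, maxit = 100;
    MatCreateVecs(A, &x, &y);        /* or create them to match the shell matrix   */
    VecSetRandom(x, NULL);
    VecNormalize(x, NULL);
    for (it = 0; it < maxit; it++) {
      MatMult(A, x, y);              /* user-provided shell mat-vec                */
      VecNorm(y, NORM_2, &nrm);      /* |A x| estimates the dominant |eigenvalue|  */
      VecScale(y, 1.0/nrm);
      VecCopy(y, x);                 /* or simply swap the two Vec handles         */
    }
    PetscPrintf(PETSC_COMM_WORLD, "largest |lambda| approx %g\n", (double)nrm);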
From domenico_lahaye at yahoo.com Fri Jul 15 08:02:00 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Fri, 15 Jul 2016 13:02:00 +0000 (UTC) Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> Message-ID: <1581901516.4078953.1468587720083.JavaMail.yahoo@mail.yahoo.com> Dear Artur, ? Out of a blend of curiosity and healthy naivity: have you tried complex shifted Laplace as a preconditioner? ? Greetings, Domenico Lahaye. From: Sanjay Govindjee To: petsc-users at mcs.anl.gov Sent: Friday, July 15, 2016 11:02 AM Subject: Re: [petsc-users] Multigrid with PML I agree, this is an extra hard problem when you add PML to it.? Here is a link to a paper that presents a few tricks applied to some aspects of this problem. Koyama, T. and Govindjee, S., ``Solving generalized complex-symmetriceigenvalue problems arising fromresonant MEMS simulations with PETSc," in Proceedings in AppliedMathematics and Mechanics, 1141701-1141702 (2008). http://dx.doi.org/10.1002/pamm.200700206 -sg On 7/15/16 1:46 AM, Mark Adams wrote: On Thu, Jul 14, 2016 at 9:10 PM, Barry Smith wrote: ? ?This is a very difficult problem. I am not surprised that GAMG performs poorly, I would be surprised if it performed well at all. ? ?I think you need to do some googling of? ?"helmholtz PML linear system solve" to find what other people have used. The first hit I got was this http://www.math.tau.ac.il/services/phd/dissertations/Singer_Ido.pdf and every iterative method he tried ended up requiring MANY iterations with refinement. This is 14 years old so there will be better suggestions out there. One that caught my eye was http://www.sciencedirect.com/science/article/pii/S0022247X11005063 ? Barry Just looking at the matrix makes it clear to me that conventional iterative methods are not going to work well, many of the diagonal entries are zero and even in rows with a diagonal entry it is much smaller in magnitude than the diagonal entries. Indefinite Helmholtz is hard unless you are not shifting very far. This zero diagonals must come from PML. First get rid of PML and see if you can solve anything to your satisfaction. I have a paper on this, using AMG, and I tried to be inclusive, but I did miss a potentially useful method of adding a complex shift to damp the system. You can Google something like 'complex shift helmholtz damp'.? If you are shifting deep (high frequency Helmholtz), then use direct solvers. ? > On Jul 13, 2016, at 2:30 PM, Safin, Artur wrote: > > Dear PETSc community, > > I am working on solving a Helmholtz problem with PML. The issue is that I am finding it very hard to deal with the resulting matrix system; I can get the correct solution for coarse meshes, but it takes roughly 2-4 times as long to converge for each successively refined mesh. I've noticed that without PML, I do not have problems with convergence speed. > > I am using the GMRES solver with GAMG as the preconditioner (with block-Jacobi preconditioner for the multigrid solves). I have also tried to assemble a separate preconditioning matrix with the complex shift 1+0.5i, that does not seem to improve the results. Currently I am running with > >? ? -ksp_type fgmres \ >? ? -pc_type gamg \ >? ? -mg_levels_pc_type bjacobi \ >? ? -pc_mg_type full \ >? ? -ksp_gmres_restart 150 \ > > Can anyone suggest some way of speeding up the convergence? Any help would be appreciated. I am attaching the output from kspview. 
> > Best, > > Artur > > -- ----------------------------------------------- Sanjay Govindjee, PhD, PE Professor of Civil Engineering 779 Davis Hall University of California Berkeley, CA 94720-1710 Voice: +1 510 642 6060 FAX: +1 510 643 5264 s_g at berkeley.edu http://www.ce.berkeley.edu/~sanjay ----------------------------------------------- Books: Engineering Mechanics of Deformable Solids: A Presentation with Exercises http://www.oup.com/us/catalog/general/subject/Physics/MaterialsScience/?view=usa&ci=9780199651641 http://ukcatalogue.oup.com/product/9780199651641.do http://amzn.com/0199651647 Engineering Mechanics 3 (Dynamics) 2nd Edition http://www.springer.com/978-3-642-53711-0 http://amzn.com/3642537111 Engineering Mechanics 3, Supplementary Problems: Dynamics http://www.amzn.com/B00SOXN8JU ----------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Fri Jul 15 11:13:27 2016 From: hzhang at mcs.anl.gov (Hong) Date: Fri, 15 Jul 2016 11:13:27 -0500 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> Message-ID: Simon : For '-eps_hermitian -eps_largest_magnitude', why do you need 'spectral transform'? Try slepc default method for ex3.c. Hong > > Hi, > > I'm running a slepc eigenvalue solver on a single machine with 198GB of > ram, > and solution space dimension 2^32. With double precision this means > each vector is 32GB. I'm using shell matrices to implement the matrix > vector product. I figured the easiest way to get eigenvalues is using > the slepc power method, but it is still eating all the ram. > > Running in gdb I see that slepc is allocating a bunch of vectors in > the spectral transform object (in STSetUp), and by this time it has > consumed > most of the 198GB of ram. I don't see why a spectral transform > shift of zero needs to alloc a whole bunch of memory. > > I'm wondering if there are some other options to slepc that can > reduce the memory footprint? A barebones implementation of the > power method only needs to keep two vectors, perhaps I should > just try doing this using petsc primitives. It's also possible that > I could spread the computation over two or more machines but > that's a whole other learning curve. > > The code I am running is essentially the laplacian grid > example from slepc (src/eps/examples/tutorials/ex3.c): > > ./ex3 -eps_hermitian -eps_largest_magnitude -eps_monitor ascii -eps_nev 1 > -eps_type power -n 65536 > > I also put this line in the source: > EPSSetDimensions(eps,1,2,1); > > Cheers, > > Simon. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Fri Jul 15 11:28:00 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Fri, 15 Jul 2016 18:28:00 +0200 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: I used -ksp_monitor_true_residual -ksp_monitor_true_solution -ksp_converged_reason with MUMPS but it does not compute the true residual. Should I compute that myself? 
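For reference, checking the true residual by hand only takes a mat-vec and a norm once the solve has returned; a minimal sketch, assuming the same A, b and computed solution x that were handed to the KSP:

    Vec       r;
    PetscReal rnorm, bnorm;
    VecDuplicate(b, &r);
    MatMult(A, x, r);           /* r = A x      */
    VecAYPX(r, -1.0, b);        /* r = b - A x  */
    VecNorm(r, NORM_2, &rnorm);
    VecNorm(b, NORM_2, &bnorm);
    PetscPrintf(PETSC_COMM_WORLD, "true residual %g, relative %g\n", (double)rnorm, (double)(rnorm/bnorm));
    VecDestroy(&r);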
Below is a sample for a full log of MUMPS https://www.dropbox.com/s/fy5uknooxw77r19/log13Jun16_mumps?dl=0 Giang On Fri, Jul 15, 2016 at 2:52 AM, Mark Adams wrote: > > > On Thu, Jul 14, 2016 at 7:29 PM, Matthew Knepley > wrote: > >> On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: >> >>> >>>> >>>> Notice that there are 7 orders of magnitude between the apparent >>>> residual (using the preconditioner), and the actual residual, Ax - b. >>>> You are using Hypre, and this generally means the Hypre coarse grid >>>> operator is crap. Please >>>> >>>> >>> Huh?, this data looks fine, both the true and preconditioned residual >>> stay separated by about 9 orders of magnitude. This just tells you that the >>> norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? >>> >> >> This is why Barry and I asked for a comparsion with MUMPS. If you are >> right, and its just the condition number, >> > > I said norm not condition number. I trust I'm missing something in this > thread. > > >> the LU >> will not be any more accurate. >> >> Matt >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 15 11:32:55 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Jul 2016 11:32:55 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: <7E31CA9A-7717-4E0D-9E58-3BE243A05AB4@mcs.anl.gov> Use -ksp_type gmres to get it to print the residuals. With preonly it doesn't compute or print them. > On Jul 15, 2016, at 11:28 AM, Hoang Giang Bui wrote: > > I used > > -ksp_monitor_true_residual > -ksp_monitor_true_solution > -ksp_converged_reason > > with MUMPS but it does not compute the true residual. Should I compute that myself? > > Below is a sample for a full log of MUMPS > https://www.dropbox.com/s/fy5uknooxw77r19/log13Jun16_mumps?dl=0 > > > Giang > > On Fri, Jul 15, 2016 at 2:52 AM, Mark Adams wrote: > > > On Thu, Jul 14, 2016 at 7:29 PM, Matthew Knepley wrote: > On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: > > > Notice that there are 7 orders of magnitude between the apparent residual (using the preconditioner), and the actual residual, Ax - b. > You are using Hypre, and this generally means the Hypre coarse grid operator is crap. Please > > > Huh?, this data looks fine, both the true and preconditioned residual stay separated by about 9 orders of magnitude. This just tells you that the norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? > > This is why Barry and I asked for a comparsion with MUMPS. If you are right, and its just the condition number, > > I said norm not condition number. I trust I'm missing something in this thread. > > the LU > will not be any more accurate. > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > From simon at arrowtheory.com Fri Jul 15 12:12:36 2016 From: simon at arrowtheory.com (Simon Burton) Date: Sat, 16 Jul 2016 03:12:36 +1000 Subject: [petsc-users] slepc eating all my ram In-Reply-To: References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> Message-ID: <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> Hi, just like this? 
./ex3 -eps_nev 1 -eps_type power -n 65536 -info I still see: [0] STSetUp(): Setting up new ST and that's when memory usage reaches to 192GB and the machine can't take it. I don't understand why the default behaviour creates a spectral transform object that then needs so much memory. thanks, Simon. On Fri, 15 Jul 2016 11:13:27 -0500 Hong wrote: > Simon : > For '-eps_hermitian -eps_largest_magnitude', why do you need 'spectral > transform'? > Try slepc default method for ex3.c. > > Hong > > > > > Hi, > > > > I'm running a slepc eigenvalue solver on a single machine with 198GB of > > ram, > > and solution space dimension 2^32. With double precision this means > > each vector is 32GB. I'm using shell matrices to implement the matrix > > vector product. I figured the easiest way to get eigenvalues is using > > the slepc power method, but it is still eating all the ram. > > > > Running in gdb I see that slepc is allocating a bunch of vectors in > > the spectral transform object (in STSetUp), and by this time it has > > consumed > > most of the 198GB of ram. I don't see why a spectral transform > > shift of zero needs to alloc a whole bunch of memory. > > > > I'm wondering if there are some other options to slepc that can > > reduce the memory footprint? A barebones implementation of the > > power method only needs to keep two vectors, perhaps I should > > just try doing this using petsc primitives. It's also possible that > > I could spread the computation over two or more machines but > > that's a whole other learning curve. > > > > The code I am running is essentially the laplacian grid > > example from slepc (src/eps/examples/tutorials/ex3.c): > > > > ./ex3 -eps_hermitian -eps_largest_magnitude -eps_monitor ascii -eps_nev 1 > > -eps_type power -n 65536 > > > > I also put this line in the source: > > EPSSetDimensions(eps,1,2,1); > > > > Cheers, > > > > Simon. > > > > From jroman at dsic.upv.es Fri Jul 15 12:53:31 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 15 Jul 2016 19:53:31 +0200 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> Message-ID: <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> > El 15 jul 2016, a las 19:12, Simon Burton escribi?: > > Hi, > > just like this? > ./ex3 -eps_nev 1 -eps_type power -n 65536 -info > > I still see: > [0] STSetUp(): Setting up new ST > > and that's when memory usage reaches to 192GB and the machine can't take it. > > I don't understand why the default behaviour creates a spectral transform > object that then needs so much memory. > > thanks, > > Simon. The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? Do you get the same behaviour with the original ex3 with the same problem size? Do you have the same problem with a smaller problem? 
(half size, say) Jose From simon at arrowtheory.com Fri Jul 15 16:17:44 2016 From: simon at arrowtheory.com (Simon Burton) Date: Sat, 16 Jul 2016 07:17:44 +1000 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> Message-ID: <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> On Fri, 15 Jul 2016 19:53:31 +0200 "Jose E. Roman" wrote: > > The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? Yes I think you are right. I can get beyond STSetUp with the right settings. Now the solver runs out of memory inside EPSGetStartVector. > > Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? good question. I didn't change much, let me try again the original. > Do you get the same behaviour with the original ex3 with the same problem size? Yes > > Do you have the same problem with a smaller problem? (half size, say) Halving n gives a quarter of the dimension, which is 8gb vector sizes. It works fine and uses a total of 48gb ram. Oh, I see at one point during initialization it hits a maximum of 56gb. So I guess it needs to keep 6 vectors in total. With the original problem size this becomes 192gb which is just a few gb too much to crunch. I guess I can still try it, but it doesn't feel good hitting the harddrive that much. Thanks for the suggestions. Simon. From aks084000 at utdallas.edu Fri Jul 15 18:29:58 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Fri, 15 Jul 2016 23:29:58 +0000 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov>, Message-ID: Barry, Thank you for taking a look at my problem. I will see if I can implement some of the methods available in literature. Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 15 22:26:04 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Jul 2016 22:26:04 -0500 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: References: <20513817-993F-41CC-8888-AAD5DF55922C@mcs.anl.gov> Message-ID: Andrew, Thanks for your code. I look through the QN code and it seems ok, the one funny thing is that it applies the Powell criteria after the first iterations (before the L-BFGS has properly started) which is why the solution just continues to grow and grow. Essentially with the Powell test it is never starting L-BFSG. I have made two changes 1) branch barry/fix-snes-qn-powell/maint that changes the code so that the Powel check is not done until the first full iteration of L-BFGS has been completed. This now gets the ex1_powell.c code to converge (with 18 iterations). Of course waiting for one full iteration of L-BFGS is arbitrary, perhaps 2 is better, I do not know. 2) barry/add-snes-divtol this adds a divergence test to SNES; it was goofy that even though residual norm was increasing without bound the SNES iteration continued to iterate. I added a new convergence test that if the residual grows (default) by 1e4 then the iteration is stopped with a divergence error. 
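A sketch of how the new test is meant to be driven once the branch is merged (the exact function and option names below are an assumption based on the branch name, so check the branch itself):

    /* cap the allowed growth of the residual norm before SNES flags divergence */
    SNESSetDivergenceTolerance(snes, 1.e4);
    /* or on the command line: -snes_divergence_tolerance 1e4 */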
Thanks for reporting these problems, Barry > On Jul 15, 2016, at 3:14 AM, Andrew Ho wrote: > > I've attached two modified versions of ex1: > > ex1_powell.c uses the Powell restart > ex1_none.c uses no restart > > For the default initial guess (x0 = [0.5,0.5]), both converge just fine. However, for the initial guess x0 = [3.,3.], the Powell solution fails to converge, while None and Periodic both still converge. This is with the "easy" equation set (run without -hard). > > Interestingly enough, the Powell restart still "finishes" in a reasonable number of iterations (7 iterations), but the residual is very large (on the order of 1e254). > > On Thu, Jul 14, 2016 at 5:50 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 6:18 PM, Andrew Ho wrote: > > > > I am trying to solve a simple ionization/recombination ODE using PETSc's quasi-newton SNES. > > > > This is a basic non-linear coupled ODE system: > > > > delta = -a u^2 + b u v > > d_t u = delta > > d_t v = -delta > > > > a and b are constants. > > > > I wrote a backwards Euler root finding function (yes, I know the TS module has BE implemented, but this is more of a learning exercise). > > > > Here is the function evaluation: > > > > struct ion_rec_ctx > > { > > PetscScalar rate_a, rate_b; > > PetscScalar dt; > > }; > > PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) > > { > > const PetscScalar *xx; > > PetscScalar *ff; > > ion_rec_ctx& params = *reinterpret_cast(ctx); > > CHKERRQ(VecGetArrayRead(x, &xx)); > > CHKERRQ(VecGetArray(f,&ff)); > > auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); > > ff[0] = xx[0]-params.dt*delta; > > ff[1] = xx[1]-params.dt*-delta; > > CHKERRQ(VecRestoreArrayRead(x,&xx)); > > CHKERRQ(VecRestoreArray(f,&ff)); > > return 0; > > } > > > > To setup the solver and solve one time step: > > > > // q0, q1, and res are Vec's previously initialized > > // initial conditions: q0 = [1e19,1e19] > > SNES solver; > > CHKERRQ(SNESCreate(comm, &solver)); > > CHKERRQ(SNESSetType(solver, SNESQN)); > > CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); > > ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; > > CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); > > CHKERRQ(SNESSolve(solver, q0, q1)); > > > > When I run this, the solver fails to converge to a solution for this rather large time step. > > The solution produced when the SNES module finally gives up is: > > > > q1 = [-2.72647e142, 2.72647e142] > > > > For reference, when I disable the scale and restart types, I get these values: > > > > q1 = [1.0279e17, 1.98972e19] > > > > This is only a problem when I use the SNES_QN_RESTART_POWELL restart type (seems to be regardless of the scale type type). I get reasonable answers for other combinations of restart/scale type. I've tried every combination of restart type/scale type except for SNES_QN_SCALE_JACOBIAN (my ultimate application doesn't have an available Jacobian), and only cases using SNES_QN_RESTART_POWELL are failing. > > > > I'm unfamiliar with Powell's restart criterion, but is it suppose to work reasonably well with Quasi-Newton methods? I tried it on the simple problem given in this example: http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html > > > > And Powell restarts also fails to converge to a meaningful solution (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods do converge properly. > > Could you please send the exact options you are using for the ex1.c that both fail and work and we'll see if there is some problem with the Powell restart. 
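For reference, the restart and scale variants under discussion can also be selected at run time rather than in code; the option names below are as of PETSc 3.7 and are worth re-checking with -help:

./ex1 -snes_type qn -snes_qn_restart_type powell -snes_monitor -snes_converged_reason
./ex1 -snes_type qn -snes_qn_restart_type none -snes_monitor -snes_converged_reason
./ex1 -snes_type qn -snes_qn_restart_type periodic -snes_qn_scale_type diagonal -snes_monitor -snes_converged_reason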
> > Thanks > > Barry > > > > > Software information: > > > > PETSc version 3.7.2 (built from git maint branch) > > PETSc arch: arch-linux2-c-opt > > OS: Ubuntu 15.04 x64 > > Compiler: gcc 4.9.2 > > > > > -- > Andrew Ho > From bsmith at mcs.anl.gov Fri Jul 15 22:48:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Jul 2016 22:48:14 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> Message-ID: > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c This example is really written as only a one level solver, making it work with geometric multigrid is not clean > I am still new to the DMDA structure > and likely not giving it as much time as it deserves. However, I do not see immediately > what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined > after calling DMCoarsenHierarchy, but that failed. > > I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform > a multigrid solve on the preconditioner. In a next stage I want to implement the deflation > using DMDA as well. You should look at ex25.c in the same directory. Here ierr = KSPSetDM(ksp,da);CHKERRQ(ierr); ierr = KSPSetComputeRHS(ksp,ComputeRHS,&user);CHKERRQ(ierr); ierr = KSPSetComputeOperators(ksp,ComputeMatrix,&user);CHKERRQ(ierr); make it straight forward to work with multigrid. The KSP object can mange the hierarchy of grids since it is provided with the DM and the ComputeRHS and ComputeMatrix provide a way for the multigrid preconditioner to automatically generate the needed matrix on each level without you having to manage it yourself. For example the rule in the makefile runex25: -@${MPIEXEC} -n 1 ./ex25 -pc_type mg -ksp_type fgmres -da_refine 2 -ksp_monitor_short -mg_levels_ksp_monitor_short -mg_levels_ksp_norm_type unpreconditioned -ksp_view -pc_mg_type full > ex25_1.tmp 2>&1; \ if (${DIFF} output/ex25_1.out ex25_1.tmp) then true; \ else printf "${PWD}\nPossible problem with ex25_1, diffs above\n=========================================\n"; fi; \ ${RM} -f ex25_1.tmp shows how to run with two levels. etc. > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, > author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune > and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp > and Dinesh Kaushik and Matthew~G. Knepley > and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith > and Stefano Zampini and Hong Zhang and Hong Zhang}, > title = {{PETS}c {W}eb page}, > url = {http://www.mcs.anl.gov/petsc}, > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > year = {2016} > } > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title = "OpenFOAM", > > howpublished = "\url{http://www.openfoam.com}", > > url = {http://www.openfoam.com}, > > note = "OpenFOAM is a free, open source CFD software package. 
It allows PETSc linear algebra and solvers to be used underneath.", > > key = "OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > From knepley at gmail.com Fri Jul 15 22:54:48 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Jul 2016 22:54:48 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Fri, Jul 15, 2016 at 10:48 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye > wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c > > This example is really written as only a one level solver, making it > work with geometric multigrid is not clean > > > I am still new to the DMDA structure > > and likely not giving it as much time as it deserves. However, I do > not see immediately > > what function is responsible for calling PCMGSetSmoother and > PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator > is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit > DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to > implement the deflation > > using DMDA as well. > > You should look at ex25.c in the same directory. Here > > ierr = KSPSetDM(ksp,da);CHKERRQ(ierr); > ierr = KSPSetComputeRHS(ksp,ComputeRHS,&user);CHKERRQ(ierr); > ierr = KSPSetComputeOperators(ksp,ComputeMatrix,&user);CHKERRQ(ierr); > > make it straight forward to work with multigrid. The KSP object can mange > the hierarchy of grids since it is provided with the DM > and the ComputeRHS and ComputeMatrix provide a way for the multigrid > preconditioner to automatically generate the needed matrix on each level > without you having to manage it yourself. For example the rule in the > makefile > > runex25: > -@${MPIEXEC} -n 1 ./ex25 -pc_type mg -ksp_type fgmres -da_refine 2 > -ksp_monitor_short -mg_levels_ksp_monitor_short -mg_levels_ksp_norm_type > unpreconditioned -ksp_view -pc_mg_type full > ex25_1.tmp 2>&1; \ > if (${DIFF} output/ex25_1.out ex25_1.tmp) then true; \ > else printf "${PWD}\nPossible problem with ex25_1, diffs > above\n=========================================\n"; fi; \ > ${RM} -f ex25_1.tmp > > shows how to run with two levels. etc. > > > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. > Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor > Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and > Barry~F. Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? 
> That is actually two different people with the same name. > > 3) On > http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 > I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It > allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > They only use solvers, and not the DM stuff as far as I know. > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE >From what I can tell, the code is not open source. Is that right? Thanks, Matt > > > Cheers, Domenico Lahaye. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon at arrowtheory.com Sat Jul 16 08:40:24 2016 From: simon at arrowtheory.com (Simon Burton) Date: Sat, 16 Jul 2016 23:40:24 +1000 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> Message-ID: <20160716234024.5d13e6ec0021548c2022bbe0@arrowtheory.com> Hi again, I found another machine with enough ram to run this (i think). Running into another problem now, with dgemv: [0] EPSSetUp_Power(): Warning: parameter mpd ignored [0] STSetUp(): Setting up new ST Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . 
[0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used I dug into this in gdb a bit: Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so (gdb) bt #0 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 #2 0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150 #3 0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191 #4 0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28, norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81 #5 0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214 #6 0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371 #7 0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758 #8 0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103 #9 0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101 #10 0x0000000000401430 in main () (gdb) up #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 274 if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one)); (gdb) print n $1 = 4294967296 (gdb) print sizeof(n) $2 = 8 (gdb) step Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . It looks to me like slepc is doing it right, but with error messages like this who knows. It's a bit beyond me debugging assembly. Originally I built petsc with --download-fblaslapack but i don't think it was working with 64bit indexes (?) Maybe I should try another blas. Simon. On Sat, 16 Jul 2016 07:17:44 +1000 Simon Burton wrote: > On Fri, 15 Jul 2016 19:53:31 +0200 > "Jose E. Roman" wrote: > > > > > The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? > > Yes I think you are right. > I can get beyond STSetUp with the right settings. > Now the solver runs out of memory inside EPSGetStartVector. > > > > > Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? > > good question. I didn't change much, let me try again the original. > > > Do you get the same behaviour with the original ex3 with the same problem size? > > Yes > > > > > Do you have the same problem with a smaller problem? (half size, say) > > Halving n gives a quarter of the dimension, which is 8gb vector sizes. 
> It works fine and uses a total of 48gb ram. Oh, I see at one point during > initialization it hits a maximum of 56gb. > > So I guess it needs to keep 6 vectors in total. > With the original problem size this becomes 192gb which is > just a few gb too much to crunch. I guess I can still try it, > but it doesn't feel good hitting the harddrive that much. > > Thanks for the suggestions. > > Simon. From bsmith at mcs.anl.gov Sat Jul 16 10:00:58 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 16 Jul 2016 10:00:58 -0500 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <20160716234024.5d13e6ec0021548c2022bbe0@arrowtheory.com> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> <20160716234024.5d13e6ec0021548c2022bbe0@arrowtheory.com> Message-ID: <27AC55B0-C1E7-4181-9ECD-A3CE6F795EAC@mcs.anl.gov> Send configure.log to petsc-maint at mcs.anl.gov > On Jul 16, 2016, at 8:40 AM, Simon Burton wrote: > > > Hi again, > > I found another machine with enough ram to run this (i think). > > Running into another problem now, with dgemv: > > [0] EPSSetUp_Power(): Warning: parameter mpd ignored > [0] STSetUp(): Setting up new ST > Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . > [0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used > > > I dug into this in gdb a bit: > > > Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ () > from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so > (gdb) bt > #0 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so > #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, > y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 > #2 0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) > at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150 > #3 0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) > at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191 > #4 0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28, > norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81 > #5 0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) > at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214 > #6 0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) > at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371 > #7 0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0) > at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758 > #8 0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103 > #9 0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101 > #10 0x0000000000401430 in main () > (gdb) up > #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, > y=0x75a3b0, 
mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 > 274 if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one)); > (gdb) print n > $1 = 4294967296 > (gdb) print sizeof(n) > $2 = 8 > (gdb) step > Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . > > > It looks to me like slepc is doing it right, but with error messages > like this who knows. It's a bit beyond me debugging assembly. > > Originally I built petsc with --download-fblaslapack but i don't think > it was working with 64bit indexes (?) > > Maybe I should try another blas. > > Simon. > > > On Sat, 16 Jul 2016 07:17:44 +1000 > Simon Burton wrote: > >> On Fri, 15 Jul 2016 19:53:31 +0200 >> "Jose E. Roman" wrote: >> >>> >>> The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? >> >> Yes I think you are right. >> I can get beyond STSetUp with the right settings. >> Now the solver runs out of memory inside EPSGetStartVector. >> >>> >>> Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? >> >> good question. I didn't change much, let me try again the original. >> >>> Do you get the same behaviour with the original ex3 with the same problem size? >> >> Yes >> >>> >>> Do you have the same problem with a smaller problem? (half size, say) >> >> Halving n gives a quarter of the dimension, which is 8gb vector sizes. >> It works fine and uses a total of 48gb ram. Oh, I see at one point during >> initialization it hits a maximum of 56gb. >> >> So I guess it needs to keep 6 vectors in total. >> With the original problem size this becomes 192gb which is >> just a few gb too much to crunch. I guess I can still try it, >> but it doesn't feel good hitting the harddrive that much. >> >> Thanks for the suggestions. >> >> Simon. From bsmith at mcs.anl.gov Sat Jul 16 22:11:23 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 16 Jul 2016 22:11:23 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> Message-ID: <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure > and likely not giving it as much time as it deserves. However, I do not see immediately > what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined > after calling DMCoarsenHierarchy, but that failed. > > I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform > a multigrid solve on the preconditioner. In a next stage I want to implement the deflation > using DMDA as well. > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, > author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune > and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp > and Dinesh Kaushik and Matthew~G. Knepley > and Lois Curfman McInnes and Karl Rupp and Barry~F. 
Smith > and Stefano Zampini and Hong Zhang and Hong Zhang}, > title = {{PETS}c {W}eb page}, > url = {http://www.mcs.anl.gov/petsc}, > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > year = {2016} > } > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title = "OpenFOAM", > > howpublished = "\url{http://www.openfoam.com}", > > url = {http://www.openfoam.com}, > > note = "OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > key = "OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. Barry > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > From knepley at gmail.com Sun Jul 17 07:29:59 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 17 Jul 2016 07:29:59 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> Message-ID: On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye > wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the > DMDA structure > > and likely not giving it as much time as it deserves. However, I do > not see immediately > > what function is responsible for calling PCMGSetSmoother and > PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator > is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit > DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to > implement the deflation > > using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. > Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor > Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and > Barry~F. Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? 
> > > > 3) On > http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 > I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It > allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > > Very good question. It seems that this citation is wrong or no longer > valid; I have removed it from the PETSc repository. I could find no mention > of PETSc usage in the OpenFoam and its third party packages. I think we > should not have been listing this citation. This suggests that people are using it with OpenFOAM: http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf Matt > > Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jul 17 12:40:52 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 17 Jul 2016 19:40:52 +0200 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <27AC55B0-C1E7-4181-9ECD-A3CE6F795EAC@mcs.anl.gov> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> <20160716234024.5d13e6ec0021548c2022bbe0@arrowtheory.com> <27AC55B0-C1E7-4181-9ECD-A3CE6F795EAC@mcs.anl.gov> Message-ID: Simon: I have made a few optimizations regarding memory management in EPS. In your case, these changes will allocate 1 vector less (maybe 2). If you are using the repository version, just pull and try again. Otherwise, wait until slepc-3.7.2 is released (in a few days). Jose > El 16 jul 2016, a las 17:00, Barry Smith escribi?: > > > Send configure.log to petsc-maint at mcs.anl.gov > > >> On Jul 16, 2016, at 8:40 AM, Simon Burton wrote: >> >> >> Hi again, >> >> I found another machine with enough ram to run this (i think). >> >> Running into another problem now, with dgemv: >> >> [0] EPSSetUp_Power(): Warning: parameter mpd ignored >> [0] STSetUp(): Setting up new ST >> Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . 
>> [0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used >> >> >> I dug into this in gdb a bit: >> >> >> Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ () >> from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so >> (gdb) bt >> #0 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so >> #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, >> y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 >> #2 0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) >> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150 >> #3 0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) >> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191 >> #4 0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28, >> norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81 >> #5 0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) >> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214 >> #6 0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) >> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371 >> #7 0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0) >> at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758 >> #8 0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103 >> #9 0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101 >> #10 0x0000000000401430 in main () >> (gdb) up >> #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, >> y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 >> 274 if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one)); >> (gdb) print n >> $1 = 4294967296 >> (gdb) print sizeof(n) >> $2 = 8 >> (gdb) step >> Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . >> >> >> It looks to me like slepc is doing it right, but with error messages >> like this who knows. It's a bit beyond me debugging assembly. >> >> Originally I built petsc with --download-fblaslapack but i don't think >> it was working with 64bit indexes (?) >> >> Maybe I should try another blas. >> >> Simon. >> >> >> On Sat, 16 Jul 2016 07:17:44 +1000 >> Simon Burton wrote: >> >>> On Fri, 15 Jul 2016 19:53:31 +0200 >>> "Jose E. Roman" wrote: >>> >>>> >>>> The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? >>> >>> Yes I think you are right. >>> I can get beyond STSetUp with the right settings. >>> Now the solver runs out of memory inside EPSGetStartVector. >>> >>>> >>>> Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? >>> >>> good question. I didn't change much, let me try again the original. >>> >>>> Do you get the same behaviour with the original ex3 with the same problem size? 
>>> >>> Yes >>> >>>> >>>> Do you have the same problem with a smaller problem? (half size, say) >>> >>> Halving n gives a quarter of the dimension, which is 8gb vector sizes. >>> It works fine and uses a total of 48gb ram. Oh, I see at one point during >>> initialization it hits a maximum of 56gb. >>> >>> So I guess it needs to keep 6 vectors in total. >>> With the original problem size this becomes 192gb which is >>> just a few gb too much to crunch. I guess I can still try it, >>> but it doesn't feel good hitting the harddrive that much. >>> >>> Thanks for the suggestions. >>> >>> Simon. > From domenico_lahaye at yahoo.com Mon Jul 18 00:59:30 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Mon, 18 Jul 2016 05:59:30 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> Message-ID: <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> Thanks for?all the?pointers.? I am happy to switch to?ksp/examples/tutorials/ex25.c in a first instance as you suggest. ? ? I am still stuck with the same issue as before though. I am trying to extract the hierarchy?? ? of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would?? ? like to modify these operators and define a multigrid cycle with the modified operators.? ? ? Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving?? ? both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me?? ? in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase?? Thanks, Domenico.? From: Matthew Knepley To: Barry Smith Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" Sent: Sunday, July 17, 2016 2:29 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure >? ? ?and likely not giving it as much time as it deserves. However, I do not see immediately >? ? ?what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > >? ? ? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently >? ? ? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined >? ? ? after calling DMCoarsenHierarchy, but that failed. > >? ? ? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform >? ? ? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation >? ? ? using DMDA as well. > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, >? ? ? ? ? ? ?author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune >? ? ? ? ? ? ? ? ? ? ? ?and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp >? ? ? ? ? ? ? ? ? ? ? ?and Dinesh Kaushik and Matthew~G. Knepley >? ? ? ? ? ? ? ? ? ? ? ?and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith >? ? ? ? ? ? ? ? ? ? ? ?and Stefano Zampini and Hong Zhang and Hong Zhang}, >? ? ? ? ? ? ?title =? {{PETS}c {W}eb page}, >? ? ? ? ? ? ?url =? ? {http://www.mcs.anl.gov/petsc}, >? ? ? ? ? ? 
?howpublished = {\url{http://www.mcs.anl.gov/petsc}}, >? ? ? ? ? ? ?year = {2016} >? ? ? ? ? ?} > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title =? ? ? ?"OpenFOAM", > > howpublished? =? ? ? ?"\url{http://www.openfoam.com}", > > url? ?=? ? ? ?{http://www.openfoam.com}, > > note? =? ? ? ?"OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > key? ?=? ? ? ?"OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? ? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. This suggests that people are using it with OpenFOAM:?http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: ??http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf ? ? ?Matt? ? ?Barry > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 18 01:16:59 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Jul 2016 01:16:59 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: > Thanks for all the pointers. > > I am happy to switch to ksp/examples/tutorials/ex25.c in a first instance > as you suggest. > > I am still stuck with the same issue as before though. I am trying to > extract the hierarchy > of coarser grid matrices and the intergrid transfer operators from the > DMDA data structure. I would > like to modify these operators and define a multigrid cycle with the > modified operators. > > Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to > define a multigrid cycle involving > both A^H and M^H. Can I rely on the multilevel DMDA structure to > construct A^H and M^H for me > in a set-up phase, plug them into a user-defined context, and plug > them back out in a solve phase? > If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator. The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks to attach different contexts to each MG level. Do you need this? Thanks, Matt > Thanks, Domenico. 
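For the "plug them back out" part of the question, a rough sketch of the query side is below (untested; it assumes the KSP has been set up with -pc_type mg, e.g. via KSPSetDM()/KSPSetComputeOperators() as in ex25.c, and that your PETSc version provides the PCMGGet* query routines; variable names are illustrative).

PC             pc;
PetscInt       nlevels,l;
PetscErrorCode ierr;

ierr = KSPSetUp(ksp);CHKERRQ(ierr);                 /* the level structure exists only after setup */
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCMGGetLevels(pc,&nlevels);CHKERRQ(ierr);
for (l=0; l<nlevels; l++) {
  KSP smoother;
  Mat Alevel,Plevel;
  ierr = PCMGGetSmoother(pc,l,&smoother);CHKERRQ(ierr);
  ierr = KSPGetOperators(smoother,&Alevel,&Plevel);CHKERRQ(ierr);   /* level operator, e.g. A^H or M^H on level l */
  if (l) {
    Mat interp;
    ierr = PCMGGetInterpolation(pc,l,&interp);CHKERRQ(ierr);        /* prolongation from level l-1 to level l */
    /* inspect or modify, then push back with PCMGSetInterpolation(pc,l,interp) if needed */
  }
}

The matrices obtained this way can then be modified and reset with PCMGSetInterpolation()/PCMGSetRestriction(), or the smoother operators replaced with KSPSetOperators(), before the actual KSPSolve().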
> > > ------------------------------ > *From:* Matthew Knepley > *To:* Barry Smith > *Cc:* domenico lahaye ; " > petsc-users at mcs.anl.gov" > *Sent:* Sunday, July 17, 2016 2:29 PM > *Subject:* Re: [petsc-users] Regarding ksp ex42 - Citations > > On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye > wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the > DMDA structure > > and likely not giving it as much time as it deserves. However, I do > not see immediately > > what function is responsible for calling PCMGSetSmoother and > PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator > is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit > DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to > implement the deflation > > using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. > Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor > Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and > Barry~F. Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? > > > > 3) On > http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 > I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It > allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > > Very good question. It seems that this citation is wrong or no longer > valid; I have removed it from the PETSc repository. I could find no mention > of PETSc usage in the OpenFoam and its third party packages. I think we > should not have been listing this citation. > > > This suggests that people are using it with OpenFOAM: > http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf > > In fact, they use PETSc in the dynamic overset grid implementation for > OpenFOAM, which I think is an approved extension: > > > http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf > > Matt > > > > Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico_lahaye at yahoo.com Mon Jul 18 01:41:24 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Mon, 18 Jul 2016 06:41:24 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> Message-ID: <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> Dear Matthew,? ? I would like to place the FormJacobian statement in ex25.c in such a way that I can view?the result on the different levels. Can you please point me to an example?? ? I would like to do above with Galerkin coarsening as well. So yes, I do expect that I will need the?hooks attached to the different MG levels. I appreciate more pointers here as well.? ? ?Thanks, Domenico. ? From: Matthew Knepley To: domenico lahaye Cc: PETSc Users List Sent: Monday, July 18, 2016 8:16 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: Thanks for?all the?pointers.? I am happy to switch to?ksp/examples/tutorials/ex25.c in a first instance as you suggest. ? ? I am still stuck with the same issue as before though. I am trying to extract the hierarchy?? ? of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would?? ? like to modify these operators and define a multigrid cycle with the modified operators.? ? ? Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving?? ? both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me?? ? in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase?? If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator.The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks toattach different contexts to each MG level. Do you need this? ? Thanks, ? ? ?Matt? Thanks, Domenico.? From: Matthew Knepley To: Barry Smith Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" Sent: Sunday, July 17, 2016 2:29 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure >? ? ?and likely not giving it as much time as it deserves. However, I do not see immediately >? ? ?what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > >? ? ? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently >? ? ? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined >? ? ? after calling DMCoarsenHierarchy, but that failed. > >? ? ? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform >? ? ? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation >? ? ? using DMDA as well. 
> > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, >? ? ? ? ? ? ?author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune >? ? ? ? ? ? ? ? ? ? ? ?and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp >? ? ? ? ? ? ? ? ? ? ? ?and Dinesh Kaushik and Matthew~G. Knepley >? ? ? ? ? ? ? ? ? ? ? ?and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith >? ? ? ? ? ? ? ? ? ? ? ?and Stefano Zampini and Hong Zhang and Hong Zhang}, >? ? ? ? ? ? ?title =? {{PETS}c {W}eb page}, >? ? ? ? ? ? ?url =? ? {http://www.mcs.anl.gov/petsc}, >? ? ? ? ? ? ?howpublished = {\url{http://www.mcs.anl.gov/petsc}}, >? ? ? ? ? ? ?year = {2016} >? ? ? ? ? ?} > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title =? ? ? ?"OpenFOAM", > > howpublished? =? ? ? ?"\url{http://www.openfoam.com}", > > url? ?=? ? ? ?{http://www.openfoam.com}, > > note? =? ? ? ?"OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > key? ?=? ? ? ?"OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? ? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. This suggests that people are using it with OpenFOAM:?http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: ??http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf ? ? ?Matt? ? ?Barry > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 18 02:11:48 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Jul 2016 02:11:48 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Mon, Jul 18, 2016 at 1:41 AM, domenico lahaye wrote: > Dear Matthew, > > I would like to place the FormJacobian statement in ex25.c in such a way > that I can view > the result on the different levels. Can you please point me to an example? > You can use options to do this. 
For any KSP solve, you can use -ksp_view_mat draw for whatever viewer you want. In the mg cycle, you can use -mg_level_2_ksp_view_mat draw or for all levels -mg_levels_ksp_view_mat draw I would like to do above with Galerkin coarsening as well. So yes, I do > expect that I will need the > hooks attached to the different MG levels. I appreciate more pointers here > as well. > The above should work with either method. Thanks, Matt > Thanks, Domenico. > > > *From:* Matthew Knepley > > > *To:* domenico lahaye > *Cc:* PETSc Users List > *Sent:* Monday, July 18, 2016 8:16 AM > > *Subject:* Re: [petsc-users] Regarding ksp ex42 - Citations > > On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye < > domenico_lahaye at yahoo.com> wrote: > > Thanks for all the pointers. > > I am happy to switch to ksp/examples/tutorials/ex25.c in a first instance > as you suggest. > > I am still stuck with the same issue as before though. I am trying to > extract the hierarchy > of coarser grid matrices and the intergrid transfer operators from the > DMDA data structure. I would > like to modify these operators and define a multigrid cycle with the > modified operators. > > Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to > define a multigrid cycle involving > both A^H and M^H. Can I rely on the multilevel DMDA structure to > construct A^H and M^H for me > in a set-up phase, plug them into a user-defined context, and plug > them back out in a solve phase? > > > If you are not using -pc_mg_galerkin, then the FormJacobian is called > separately on each level to rediscretize the operator. > The only thing that changes is the DMDA that is passed to the call. If you > need more information, there are hooks to > attach different contexts to each MG level. Do you need this? > > Thanks, > > Matt > > > Thanks, Domenico. > > > ------------------------------ > *From:* Matthew Knepley > *To:* Barry Smith > *Cc:* domenico lahaye ; " > petsc-users at mcs.anl.gov" > *Sent:* Sunday, July 17, 2016 2:29 PM > *Subject:* Re: [petsc-users] Regarding ksp ex42 - Citations > > On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye > wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the > DMDA structure > > and likely not giving it as much time as it deserves. However, I do > not see immediately > > what function is responsible for calling PCMGSetSmoother and > PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator > is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit > DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to > implement the deflation > > using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. > Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor > Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and > Barry~F. 
Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? > > > > 3) On > http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 > I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It > allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > > Very good question. It seems that this citation is wrong or no longer > valid; I have removed it from the PETSc repository. I could find no mention > of PETSc usage in the OpenFoam and its third party packages. I think we > should not have been listing this citation. > > > This suggests that people are using it with OpenFOAM: > http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf > > In fact, they use PETSc in the dynamic overset grid implementation for > OpenFOAM, which I think is an approved extension: > > > http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf > > Matt > > > > Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico_lahaye at yahoo.com Mon Jul 18 02:29:51 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Mon, 18 Jul 2016 07:29:51 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <872779534.685616.1468826653246.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <872779534.685616.1468826653246.JavaMail.yahoo@mail.yahoo.com> Message-ID: <1309408705.665690.1468826991415.JavaMail.yahoo@mail.yahoo.com> That is wonderful.? Given however that is a subsequent stage I would like to manipulate the grid?hierarchy in my code, I would like to know what the equivalent function calls?are (at least in my limited understanding).? I saw that snes/ex58.c has a FormJacobian using DMDA. 
I am looking for?something similar that *gets* the ?Jacobian (instead on forming it) on the?different levels (instead of on the finest level only).? Thanks again, Domenico.? From: Matthew Knepley To: domenico lahaye Cc: PETSc Users List Sent: Monday, July 18, 2016 9:11 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Mon, Jul 18, 2016 at 1:41 AM, domenico lahaye wrote: Dear Matthew,? ? I would like to place the FormJacobian statement in ex25.c in such a way that I can view?the result on the different levels. Can you please point me to an example?? You can use options to do this. For any KSP solve, you can use ? -ksp_view_mat draw for whatever viewer you want. In the mg cycle, you can use ? -mg_level_2_ksp_view_mat draw or for all levels ? -mg_levels_ksp_view_mat draw ? I would like to do above with Galerkin coarsening as well. So yes, I do expect that I will need the?hooks attached to the different MG levels. I appreciate more pointers here as well.? The above should work with either method. ? Thanks, ? ? Matt? ? ?Thanks, Domenico. ? From: Matthew Knepley To: domenico lahaye Cc: PETSc Users List Sent: Monday, July 18, 2016 8:16 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: Thanks for?all the?pointers.? I am happy to switch to?ksp/examples/tutorials/ex25.c in a first instance as you suggest. ? ? I am still stuck with the same issue as before though. I am trying to extract the hierarchy?? ? of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would?? ? like to modify these operators and define a multigrid cycle with the modified operators.? ? ? Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving?? ? both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me?? ? in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase?? If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator.The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks toattach different contexts to each MG level. Do you need this? ? Thanks, ? ? ?Matt? Thanks, Domenico.? From: Matthew Knepley To: Barry Smith Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" Sent: Sunday, July 17, 2016 2:29 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure >? ? ?and likely not giving it as much time as it deserves. However, I do not see immediately >? ? ?what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > >? ? ? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently >? ? ? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined >? ? ? after calling DMCoarsenHierarchy, but that failed. > >? ? ? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform >? ? ? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation >? ? ? using DMDA as well. > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, >? ? ? ? ? ? ?author = {Satish Balay and Shrirang Abhyankar and Mark~F. 
Adams and Jed Brown and Peter Brune >? ? ? ? ? ? ? ? ? ? ? ?and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp >? ? ? ? ? ? ? ? ? ? ? ?and Dinesh Kaushik and Matthew~G. Knepley >? ? ? ? ? ? ? ? ? ? ? ?and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith >? ? ? ? ? ? ? ? ? ? ? ?and Stefano Zampini and Hong Zhang and Hong Zhang}, >? ? ? ? ? ? ?title =? {{PETS}c {W}eb page}, >? ? ? ? ? ? ?url =? ? {http://www.mcs.anl.gov/petsc}, >? ? ? ? ? ? ?howpublished = {\url{http://www.mcs.anl.gov/petsc}}, >? ? ? ? ? ? ?year = {2016} >? ? ? ? ? ?} > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title =? ? ? ?"OpenFOAM", > > howpublished? =? ? ? ?"\url{http://www.openfoam.com}", > > url? ?=? ? ? ?{http://www.openfoam.com}, > > note? =? ? ? ?"OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > key? ?=? ? ? ?"OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? ? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. This suggests that people are using it with OpenFOAM:?http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: ??http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf ? ? ?Matt? ? ?Barry > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhassan at miners.utep.edu Mon Jul 18 14:01:28 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Mon, 18 Jul 2016 19:01:28 +0000 Subject: [petsc-users] Incorrect eigenvalues Message-ID: Hi all, I have been trying to solve generalized eigenvalue problem using matrices of size 10K. Sparsity of the matrix is 6%. I am using the following command ./solver -f1 hamold.petsc -f2 ovlbaby.petsc -st_ksp_type preonly -st_pc_type jacobi -st_pc_factor_mat_solver_package mumps -eps_interval -2,0 -eps_nev 1000 * solver is the program * f1 and f2 are the input file for both matrices in petsc binary (mpiaij) I am getting the following output: Generalized eigenproblem stored in file. Reading REAL matrices from binary files... TYPE OF MATRIX A: mpiaij TYPE OF MATRIX B: mpiaij Solving for Eigen values... Solved! 
1: -9771.8339 0 2: -9559.8347 0 3: -9408.5603 0 4: -9387.423 0 5: -9235.9137 0 6: -9102.5334 0 7: -9098.1307 0 8: -8970.3594 0 9: -8854.4964 0 10: -8850.3629 0 11: -8736.6619 0 12: -8637.1749 0 13: -8628.214 0 14: -8524.2494 0 15: -8440.0801 0 16: -8424.1789 0 17: -8327.5389 0 18: -8257.7763 0 19: -8233.9564 0 20: -8143.1251 0 21: -8086.9865 0 22: -8054.7899 0 23: -7968.7355 0 24: -7925.5421 0 25: -7884.7777 0 26: -7802.7577 0 27: -7771.913 0 28: -7722.537 0 29: -7643.9943 0 30: -7624.9684 0 ......................... ......................... 541: -24.947288 0 542: -24.945875 0 543: -24.94017 0 First column is the eigenvalues. My concern is, * Eigenvalues are not right * I defined the interval but still it's giving me eigenvalues outside of that interval Please help me out. M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Jul 18 14:43:18 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 18 Jul 2016 21:43:18 +0200 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: Message-ID: [Please do not send queries to both petsc-users and slepc-maint, only one of them is enough.] It seems that you are mixing a random number of options that make little sense. You cannot use preonly+jacobi to solve linear systems, even less in the case of the eps_interval option. For computing eigenvalues in an interval, follow the instructions in section 3.4.5 of the users manual. In particular, preonly+cholesky is required. Also, if done with MUMPS the option -mat_mumps_icntl_13 1 is also needed. Furthermore, I don't see that you are using -st_type sinvert, so I guess some options are inserted in the source code, which you did not show. Jose > El 18 jul 2016, a las 21:01, Hassan Md Mahmudulla escribi?: > > Hi all, > I have been trying to solve generalized eigenvalue problem using matrices of size 10K. Sparsity of the matrix is 6%. I am using the following command > > ./solver -f1 hamold.petsc -f2 ovlbaby.petsc -st_ksp_type preonly -st_pc_type jacobi -st_pc_factor_mat_solver_package mumps -eps_interval -2,0 -eps_nev 1000 > > ? solver is the program > ? f1 and f2 are the input file for both matrices in petsc binary (mpiaij) > I am getting the following output: > > > Generalized eigenproblem stored in file. > > Reading REAL matrices from binary files... > TYPE OF MATRIX A: mpiaij > TYPE OF MATRIX B: mpiaij > Solving for Eigen values... > Solved! > 1: -9771.8339 0 > 2: -9559.8347 0 > 3: -9408.5603 0 > 4: -9387.423 0 > 5: -9235.9137 0 > 6: -9102.5334 0 > 7: -9098.1307 0 > 8: -8970.3594 0 > 9: -8854.4964 0 > 10: -8850.3629 0 > 11: -8736.6619 0 > 12: -8637.1749 0 > 13: -8628.214 0 > 14: -8524.2494 0 > 15: -8440.0801 0 > 16: -8424.1789 0 > 17: -8327.5389 0 > 18: -8257.7763 0 > 19: -8233.9564 0 > 20: -8143.1251 0 > 21: -8086.9865 0 > 22: -8054.7899 0 > 23: -7968.7355 0 > 24: -7925.5421 0 > 25: -7884.7777 0 > 26: -7802.7577 0 > 27: -7771.913 0 > 28: -7722.537 0 > 29: -7643.9943 0 > 30: -7624.9684 0 > > ......................... > ......................... > > 541: -24.947288 0 > 542: -24.945875 0 > 543: -24.94017 0 > > > First column is the eigenvalues. > My concern is, > ? Eigenvalues are not right > ? I defined the interval but still it's giving me eigenvalues outside of that interval > Please help me out. 
> > M Hassan From mhassan at miners.utep.edu Mon Jul 18 14:48:53 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Mon, 18 Jul 2016 19:48:53 +0000 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: , Message-ID: Would you please give me an idea what combination of ksp solver and preconditioner I should use to solve this generalized symmetric hermitian problem? To get the convergence faster, do I need to use external solvers like mumps and superlu_dist? Thanks M Hassan ________________________________ From: Jose E. Roman Sent: Monday, July 18, 2016 1:43:18 PM To: Hassan Md Mahmudulla Cc: petsc-users at mcs.anl.gov; slepc-maint at upv.es Subject: Re: [petsc-users] Incorrect eigenvalues [Please do not send queries to both petsc-users and slepc-maint, only one of them is enough.] It seems that you are mixing a random number of options that make little sense. You cannot use preonly+jacobi to solve linear systems, even less in the case of the eps_interval option. For computing eigenvalues in an interval, follow the instructions in section 3.4.5 of the users manual. In particular, preonly+cholesky is required. Also, if done with MUMPS the option -mat_mumps_icntl_13 1 is also needed. Furthermore, I don't see that you are using -st_type sinvert, so I guess some options are inserted in the source code, which you did not show. Jose > El 18 jul 2016, a las 21:01, Hassan Md Mahmudulla escribi?: > > Hi all, > I have been trying to solve generalized eigenvalue problem using matrices of size 10K. Sparsity of the matrix is 6%. I am using the following command > > ./solver -f1 hamold.petsc -f2 ovlbaby.petsc -st_ksp_type preonly -st_pc_type jacobi -st_pc_factor_mat_solver_package mumps -eps_interval -2,0 -eps_nev 1000 > > ? solver is the program > ? f1 and f2 are the input file for both matrices in petsc binary (mpiaij) > I am getting the following output: > > > Generalized eigenproblem stored in file. > > Reading REAL matrices from binary files... > TYPE OF MATRIX A: mpiaij > TYPE OF MATRIX B: mpiaij > Solving for Eigen values... > Solved! > 1: -9771.8339 0 > 2: -9559.8347 0 > 3: -9408.5603 0 > 4: -9387.423 0 > 5: -9235.9137 0 > 6: -9102.5334 0 > 7: -9098.1307 0 > 8: -8970.3594 0 > 9: -8854.4964 0 > 10: -8850.3629 0 > 11: -8736.6619 0 > 12: -8637.1749 0 > 13: -8628.214 0 > 14: -8524.2494 0 > 15: -8440.0801 0 > 16: -8424.1789 0 > 17: -8327.5389 0 > 18: -8257.7763 0 > 19: -8233.9564 0 > 20: -8143.1251 0 > 21: -8086.9865 0 > 22: -8054.7899 0 > 23: -7968.7355 0 > 24: -7925.5421 0 > 25: -7884.7777 0 > 26: -7802.7577 0 > 27: -7771.913 0 > 28: -7722.537 0 > 29: -7643.9943 0 > 30: -7624.9684 0 > > ......................... > ......................... > > 541: -24.947288 0 > 542: -24.945875 0 > 543: -24.94017 0 > > > First column is the eigenvalues. > My concern is, > ? Eigenvalues are not right > ? I defined the interval but still it's giving me eigenvalues outside of that interval > Please help me out. > > M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Jul 18 15:00:16 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 18 Jul 2016 22:00:16 +0200 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: Message-ID: > El 18 jul 2016, a las 21:48, Hassan Md Mahmudulla escribi?: > > Would you please give me an idea what combination of ksp solver and preconditioner I should use to solve this generalized symmetric hermitian problem? 
To get the convergence faster, do I need to use external solvers like mumps and superlu_dist? > > Thanks > M Hassan For computing eigenvalues in an interval, you have to follow exactly what is written in section 3.4.5 of SLEPc's users manual. It is not possible to use preconditioners in that case. Also, superlu_dist cannot be used for this, only MUMPS or PETSc's cholesky (sequential). Jose From mhassan at miners.utep.edu Mon Jul 18 15:09:51 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Mon, 18 Jul 2016 20:09:51 +0000 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: , Message-ID: Thank you very much for your reply. Well, I actually can avoid using eps_interval since I don't really need that. I want to request 10% eigenvalues and I need them very fast. That's why I was trying with different combinations. My system size can be bigger. So, I was trying iterative solver like mumps as well. But the problem is almost all the preconditioners are giving me wrong answers. Would you suggest me any way so that I can solve my problem? I will try with section 3.4.5 though. M Hassan ________________________________ From: Jose E. Roman Sent: Monday, July 18, 2016 2:00:16 PM To: Hassan Md Mahmudulla Cc: petsc-users at mcs.anl.gov; slepc-maint at upv.es Subject: Re: [petsc-users] Incorrect eigenvalues > El 18 jul 2016, a las 21:48, Hassan Md Mahmudulla escribi?: > > Would you please give me an idea what combination of ksp solver and preconditioner I should use to solve this generalized symmetric hermitian problem? To get the convergence faster, do I need to use external solvers like mumps and superlu_dist? > > Thanks > M Hassan For computing eigenvalues in an interval, you have to follow exactly what is written in section 3.4.5 of SLEPc's users manual. It is not possible to use preconditioners in that case. Also, superlu_dist cannot be used for this, only MUMPS or PETSc's cholesky (sequential). Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Jul 18 15:32:20 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 18 Jul 2016 22:32:20 +0200 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: Message-ID: <7DDE1F06-993D-4365-9290-C1010EDC6289@dsic.upv.es> > El 18 jul 2016, a las 22:09, Hassan Md Mahmudulla escribi?: > > Thank you very much for your reply. Well, I actually can avoid using eps_interval since I don't really need that. I want to request 10% eigenvalues and I need them very fast. That's why I was trying with different combinations. My system size can be bigger. So, I was trying iterative solver like mumps as well. But the problem is almost all the preconditioners are giving me wrong answers. Would you suggest me any way so that I can solve my problem? I will try with section 3.4.5 though. > > M Hassan MUMPS is not an iterative solver, but a direct solver. For solving linear systems you first need to understand PETSc's KSP and PC objects. You cannot use preonly with jacobi because it won't give you the solution of the linear system (just one preconditioning step, which is enough for some SLEPc solvers but not for the default one). You can try an iterative method such as GMRES together with a preconditioner such as Jacobi. Again, this is discussed in SLEPc's documentation, for instance in section 3.4.1 of the manual. But eps_interval is an exception which supports direct solvers only. 
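For the spectrum-slicing case, putting together the options already mentioned in this thread, the invocation would look roughly like the line below (only a sketch; the executable and matrix file names are the ones from your first message, and the exact option names should be checked against the manual of the installed SLEPc/PETSc version):

  ./solver -f1 hamold.petsc -f2 ovlbaby.petsc -eps_interval -2,0 \
      -st_type sinvert -st_ksp_type preonly -st_pc_type cholesky \
      -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_13 1
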
Computing 10% of eigenvalues of a large matrix is generally a very expensive task, it cannot be done "very fast". Using eps_interval could be a good option if you know the interval containing the eigenvalues, but it will take time since it requires factorizing large matrices. Jose From mhassan at miners.utep.edu Tue Jul 19 05:42:21 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Tue, 19 Jul 2016 10:42:21 +0000 Subject: [petsc-users] Spectrum slicing with MUMPS (Segmentation fault) Message-ID: Hi all, I have been trying spectrum slicing with MUMPS external solver. The error output is the following: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015 [0]PETSC ERROR: /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS on a sandybridge named nid00281 by mhassan Tue Jul 19 02:54 :00 2016 [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-do uble=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mp i-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --kn own-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetec t=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with- fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 -- with-dependencies=0 --with-dependencies=0 --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 --with-superlu-incl ude=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu_dist-lib=/opt/cray /tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandy bridge/include --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a --with-metis=1 --with-metis-include=/o pt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a --with-pts cotch=1 --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/1 
40/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/ INTEL/140/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib -lsci_intel_mpi_mp -lsci_intel_mp" --wit h-mumps=1 --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/s andybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLA GS="-xavx -openmp -O3 " --FFLAGS="-xavx -openmp -O3 " --LIBS=-lstdc++ --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cra y/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --wi th-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/ INTEL/140/sandybridge/include --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial" [0]PETSC ERROR: #1 User provided function() line 0 in unknown file Rank 0 [Tue Jul 19 02:54:04 2016] [c1-0c1s6n1] application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 srun: error: nid00281: task 0: Aborted srun: Terminating job step 1330433.0 slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT 2016-07-19T02:54:04 *** srun: Job step aborted: Waiting up to 32 seconds for job step to finish. srun: error: nid00281: tasks 1-17: Killed srun: error: nid00282: tasks 18-35: Killed I ran the same code in my pc with 8 processor. It had no issues. But when I tried in a different machine, I am getting this. Any idea? Can I use Superlu_dist instead of MUMPS? I got INFOG(1)=-22 error from MUMPS in another run. Thanks, M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From loiseau.jc at gmail.com Tue Jul 19 06:51:00 2016 From: loiseau.jc at gmail.com (JC) Date: Tue, 19 Jul 2016 13:51:00 +0200 Subject: [petsc-users] petscviewerhdf5open undefined reference Message-ID: Hi everyone, I am a rather recent user of petsc. I have installed it on my mac using home-brew and have been to develop my CFD code quite efficiently thanks to that. I am now porting the code onto another machine which has linux mint 18 installed. I have installed petsc and its dependancies as follow: apt install --install-recommends --install-suggests pets-dev Though most of the code compiles correctly, I get the following error at some point: /home/jean-christophe/Codes/PETSc_LS/SOURCES/io.f90:162: undefined reference to `petscviewerhdf5open_? I have made sure that apt install the hdf5 library. All of the versions are exactly the same I use on my mac, yet I cannot compile correctly. Anyone has ever encountered the same problem? Thanks a lot anyway for this amazing library. Regards, JC From lixin_chu at yahoo.com Tue Jul 19 09:01:35 2016 From: lixin_chu at yahoo.com (lixin chu) Date: Tue, 19 Jul 2016 14:01:35 +0000 (UTC) Subject: [petsc-users] some beginner questions : matrix multiplication References: <932627683.1480276.1468936895326.JavaMail.yahoo.ref@mail.yahoo.com> Message-ID: <932627683.1480276.1468936895326.JavaMail.yahoo@mail.yahoo.com> Hello,I am new to PETsc, and I am looking for a library to support matrix multiplication. I have several questions and would like to confirm: 1. 
From MatMatMult API, for C=A*B, I assume we can support mixed sparse and dense matrix, i.e., either A or B can be dense; similarly, MatMatMatMult (A*B*C) can support A and C sparse, and B is dense. 2. We can also use mixed data type for MatMatMult/MatMatMatMult, for example, A is complex, double, and B is double. 3. Is there a way to estimate the total working memory required for MatMatMult/MatMatMatMult, given A,B and C information (like dimensions, and total none zero elements, data type)?4. do we have any performance/memory usage data when compared with other sparse matrix multiplication solutions. for example. PSBLAS ? thank you very much, lixin -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 19 09:37:11 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Jul 2016 16:37:11 +0200 Subject: [petsc-users] petscviewerhdf5open undefined reference In-Reply-To: References: Message-ID: On Tue, Jul 19, 2016 at 1:51 PM, JC wrote: > Hi everyone, > > I am a rather recent user of petsc. I have installed it on my mac using > home-brew and have been to develop my CFD code quite efficiently thanks to > that. I am now porting the code onto another machine which has linux mint > 18 installed. I have installed petsc and its dependancies as follow: > > apt install --install-recommends --install-suggests pets-dev > > Though most of the code compiles correctly, I get the following error at > some point: > > /home/jean-christophe/Codes/PETSc_LS/SOURCES/io.f90:162: undefined > reference to `petscviewerhdf5open_? > > I have made sure that apt install the hdf5 library. All of the versions > are exactly the same I use on my mac, yet I cannot compile correctly. > Anyone has ever encountered the same problem? > Its possible that the packager did not configure PETSc to use HDF5. Check $PETSC_DIR/include/petscconf.h for the lines #ifndef PETSC_HAVE_HDF5 #define PETSC_HAVE_HDF5 1 #endif If they are not there, you will have to install yourself using --download-hdf5, which should not be hard. Thanks, Matt > Thanks a lot anyway for this amazing library. > Regards, > JC -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 19 09:38:35 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Jul 2016 16:38:35 +0200 Subject: [petsc-users] Spectrum slicing with MUMPS (Segmentation fault) In-Reply-To: References: Message-ID: On Tue, Jul 19, 2016 at 12:42 PM, Hassan Md Mahmudulla < mhassan at miners.utep.edu> wrote: > Hi all, > > I have been trying spectrum slicing with MUMPS external solver. The error > output is the following: > A stack trace in the debugger would help, but it sounds like an error in MUMPS. You can try SuperLU_dist instead. 
Thanks, Matt > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015 > [0]PETSC ERROR: /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS > on a sandybridge named nid00281 by mhassan Tue Jul 19 02:54 > :00 2016 > [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 > --known-bits-per-byte=8 --known-sdot-returns-double=0 > --known-snrm2-returns-do > uble=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 > --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mp > i-c-double-complex=1 --known-mpi-long-double=1 > --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 > --known-sizeof-MPI_Fint=4 --kn > own-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 > --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 > --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 > --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetec > t=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 > --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with- > fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib > --with-scalar-type=real --with-shared-ld=ar --with-etags=0 -- > with-dependencies=0 --with-dependencies=0 > --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 > --with-superlu-incl > ude=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a > --with-superlu_dist=1 > --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-superlu_dist-lib=/opt/cray > /tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a --with-parmetis=1 > --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandy > bridge/include > --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a > --with-metis=1 --with-metis-include=/o > pt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a > --with-pts > cotch=1 > --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/1 > 40/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" > --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/ > INTEL/140/x86_64/include > --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib > -lsci_intel_mpi_mp -lsci_intel_mp" --wit > h-mumps=1 > --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/s > andybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps > -lmumps_common 
-lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLA > GS="-xavx -openmp -O3 " --FFLAGS="-xavx -openmp -O3 " --LIBS=-lstdc++ > --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cra > y/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 > --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --wi > th-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a > --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/ > INTEL/140/sandybridge/include > --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib > -lsundials_cvode -lsundials_cvodes > -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel > -lsundials_nvecserial" > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > Rank 0 [Tue Jul 19 02:54:04 2016] [c1-0c1s6n1] application called > MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > srun: error: nid00281: task 0: Aborted > srun: Terminating job step 1330433.0 > slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT > 2016-07-19T02:54:04 *** > srun: Job step aborted: Waiting up to 32 seconds for job step to finish. > srun: error: nid00281: tasks 1-17: Killed > srun: error: nid00282: tasks 18-35: Killed > > > I ran the same code in my pc with 8 processor. It had no issues. But when > I tried in a different machine, I am getting this. Any idea? Can I use > Superlu_dist instead of MUMPS? I got INFOG(1)=-22 error from MUMPS in > another run. > > > Thanks, > > > *M Hassan* > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Jul 19 09:42:23 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 19 Jul 2016 16:42:23 +0200 Subject: [petsc-users] Spectrum slicing with MUMPS (Segmentation fault) In-Reply-To: References: Message-ID: <95BF1DF1-BC04-4F6C-93D4-591C5E7E36F3@dsic.upv.es> SuperLU_dist can be used in general with shift-and-invert, but for spectrum slicint (eps_interval) it does not work because it does not provide inertia (MatGetInertia) which is required in that case. Jose > El 19 jul 2016, a las 16:38, Matthew Knepley escribi?: > > On Tue, Jul 19, 2016 at 12:42 PM, Hassan Md Mahmudulla wrote: > Hi all, > > I have been trying spectrum slicing with MUMPS external solver. The error output is the following: > > A stack trace in the debugger would help, but it sounds like an error in MUMPS. You can try SuperLU_dist instead. > > Thanks, > > Matt > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015 > [0]PETSC ERROR: /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS on a sandybridge named nid00281 by mhassan Tue Jul 19 02:54 > :00 2016 > [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-do > uble=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mp > i-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --kn > own-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 > --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetec > t=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with- > fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 -- > with-dependencies=0 --with-dependencies=0 --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 --with-superlu-incl > ude=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a > --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu_dist-lib=/opt/cray > /tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandy > bridge/include --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a --with-metis=1 --with-metis-include=/o > pt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a --with-pts > cotch=1 --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/1 > 40/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/ > INTEL/140/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib -lsci_intel_mpi_mp -lsci_intel_mp" --wit > h-mumps=1 --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/s > andybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLA > GS="-xavx -openmp -O3 " --FFLAGS="-xavx -openmp -O3 " --LIBS=-lstdc++ --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cra > y/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --wi > th-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/ > INTEL/140/sandybridge/include --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib -lsundials_cvode -lsundials_cvodes > -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial" > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > Rank 0 [Tue Jul 19 02:54:04 2016] [c1-0c1s6n1] application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > srun: error: nid00281: task 0: Aborted > srun: 
Terminating job step 1330433.0 > slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT 2016-07-19T02:54:04 *** > srun: Job step aborted: Waiting up to 32 seconds for job step to finish. > srun: error: nid00281: tasks 1-17: Killed > srun: error: nid00282: tasks 18-35: Killed > > > > I ran the same code in my pc with 8 processor. It had no issues. But when I tried in a different machine, I am getting this. Any idea? Can I use Superlu_dist instead of MUMPS? I got INFOG(1)=-22 error from MUMPS in another run. > > > > Thanks, > > > > M Hassan > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From hzhang at mcs.anl.gov Tue Jul 19 09:50:09 2016 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 19 Jul 2016 09:50:09 -0500 Subject: [petsc-users] Spectrum slicing with MUMPS (Segmentation fault) In-Reply-To: <95BF1DF1-BC04-4F6C-93D4-591C5E7E36F3@dsic.upv.es> References: <95BF1DF1-BC04-4F6C-93D4-591C5E7E36F3@dsic.upv.es> Message-ID: "I got INFOG(1)=-22 error from MUMPS in another run. " does not tell much about the error (check MUMPS's user manual). Suggest building petsc in debugging mode, then you may get more error info. Hong On Tue, Jul 19, 2016 at 9:42 AM, Jose E. Roman wrote: > SuperLU_dist can be used in general with shift-and-invert, but for > spectrum slicint (eps_interval) it does not work because it does not > provide inertia (MatGetInertia) which is required in that case. > > Jose > > > > El 19 jul 2016, a las 16:38, Matthew Knepley > escribi?: > > > > On Tue, Jul 19, 2016 at 12:42 PM, Hassan Md Mahmudulla < > mhassan at miners.utep.edu> wrote: > > Hi all, > > > > I have been trying spectrum slicing with MUMPS external solver. The > error output is the following: > > > > A stack trace in the debugger would help, but it sounds like an error in > MUMPS. You can try SuperLU_dist instead. > > > > Thanks, > > > > Matt > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
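Regarding the pointer-free Fortran interface mentioned in the helpdesk reply above: the idiom used by the examples it cites (ex4f etc.) is roughly the sketch below. This is only an illustration with made-up names; it assumes the usual PETSc Fortran include files for Vec are in effect, and it covers plain VecGetArray/VecRestoreArray access rather than the multi-dimensional indexing that DMDAVecGetArrayF90 provides (for a DMDA vector the local indices from DMDAGetCorners would have to be mapped onto the flat array by hand).

      subroutine zero_local_part(x, ierr)
      implicit none
!     Assumes the PETSc Fortran definitions for Vec, PetscScalar, etc.
!     are included; the exact include file names depend on the version.
      Vec            x
      PetscErrorCode ierr
      PetscScalar    x_array(1)
      PetscOffset    i_x
      PetscInt       i, nlocal

      call VecGetLocalSize(x, nlocal, ierr)
!     Legacy binding: entries are addressed as x_array(i_x + i), so no
!     Fortran pointer is involved.
      call VecGetArray(x, x_array, i_x, ierr)
      do i = 1, nlocal
         x_array(i_x + i) = 0.0
      end do
      call VecRestoreArray(x, x_array, i_x, ierr)
      end subroutine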
> > [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015 > > [0]PETSC ERROR: > /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS on a sandybridge > named nid00281 by mhassan Tue Jul 19 02:54 > > :00 2016 > > [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 > --known-bits-per-byte=8 --known-sdot-returns-double=0 > --known-snrm2-returns-do > > uble=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 > --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mp > > i-c-double-complex=1 --known-mpi-long-double=1 > --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 > --known-sizeof-MPI_Fint=4 --kn > > own-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 > --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 > > --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 > --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetec > > t=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 > --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with- > > fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib > --with-scalar-type=real --with-shared-ld=ar --with-etags=0 -- > > with-dependencies=0 --with-dependencies=0 > --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 > --with-superlu-incl > > ude=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a > > --with-superlu_dist=1 > --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-superlu_dist-lib=/opt/cray > > /tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a > --with-parmetis=1 > --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandy > > bridge/include > --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a > --with-metis=1 --with-metis-include=/o > > pt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a > --with-pts > > cotch=1 > --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/1 > > 40/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" > --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/ > > INTEL/140/x86_64/include > --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib > -lsci_intel_mpi_mp -lsci_intel_mp" --wit > > h-mumps=1 > --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/s > > andybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps > -lmumps_common -lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLA > > GS="-xavx -openmp -O3 " --FFLAGS="-xavx -openmp -O3 " --LIBS=-lstdc++ > --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cra > > y/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 > --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --wi > > th-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a > --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/ > > INTEL/140/sandybridge/include > --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib > -lsundials_cvode -lsundials_cvodes > > -lsundials_ida -lsundials_idas -lsundials_kinsol > -lsundials_nvecparallel -lsundials_nvecserial" > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > Rank 0 [Tue Jul 19 02:54:04 2016] 
[c1-0c1s6n1] application called > MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > srun: error: nid00281: task 0: Aborted > > srun: Terminating job step 1330433.0 > > slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT > 2016-07-19T02:54:04 *** > > srun: Job step aborted: Waiting up to 32 seconds for job step to finish. > > srun: error: nid00281: tasks 1-17: Killed > > srun: error: nid00282: tasks 18-35: Killed > > > > > > > > I ran the same code in my pc with 8 processor. It had no issues. But > when I tried in a different machine, I am getting this. Any idea? Can I use > Superlu_dist instead of MUMPS? I got INFOG(1)=-22 error from MUMPS in > another run. > > > > > > > > Thanks, > > > > > > > > M Hassan > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eduardojourdan92 at gmail.com Tue Jul 19 13:17:52 2016 From: eduardojourdan92 at gmail.com (Eduardo Jourdan) Date: Tue, 19 Jul 2016 15:17:52 -0300 Subject: [petsc-users] Questions for MatSolve Message-ID: Hi all, I would like to perform a specific number (for instance 4 of forward and backward sweeps with a seqaij matrix with block size 4, vectors b and x. Also, I need to do this same procedure with another matrix seqaij block size 16. I would appreciate if someone knows the best way to do it. 1 - I've been trying to use MatSolve. For the bs=4 it seems to work, but with the other matrix with bs=16 the residue diverges. When I call matConvert to convert the later matrix for a seqbaij with bs=16 the result changes and the linear residue is reduced. It is supposed to happen or it is more possible that i am doing something wrong? 2 - MatSolve for seqbaij and seqaij with the same block sizes gives the same results in terms of solution (not performace, memory) ? 3 - Can do I do a specific number of sweeps as told before with the KSP/PC interface? 4 - I saw the manual for the MatSolve and It says that it is for factored matrix. Can I use a matrix just after the MatAssembly calls? Best regards, Eduardo Jourdan -------------- next part -------------- An HTML attachment was scrubbed... URL: From aks084000 at utdallas.edu Tue Jul 19 14:53:22 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Tue, 19 Jul 2016 19:53:22 +0000 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> Message-ID: <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> Hello, In order to achieve reasonable performance for Helmholtz with PML, Erlangga in his paper used 1) Matrix dependent interpolation in the multigrid. The operators are nonlinear, for example an intermediate computation reads something like d = max(|a+c|, |b|, ?) 2) Full weighting (This is linear, so I believe I can achieve that with PCMGSetRestriction). 3) F-cycle with one pre- and postsmoothing with the Jacobi iteration and relaxation factor ? = 0.5. I am not sure how to do 1 & 3 in PETSc. Can anyone suggest a way of implementing these? Thanks, Artur PS. for anyone curious, the paper is "Advances in Iterative Methods and Preconditioners for the Helmholtz Equation" -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Jul 19 14:58:42 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Jul 2016 21:58:42 +0200 Subject: [petsc-users] Multigrid with PML In-Reply-To: <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> Message-ID: On Tue, Jul 19, 2016 at 9:53 PM, Safin, Artur wrote: > Hello, > > In order to achieve reasonable performance for Helmholtz with PML, > Erlangga in his paper used > > 1) Matrix dependent interpolation in the multigrid. The operators are > nonlinear, for example an intermediate computation reads something like > d = max(|a+c|, |b|, ?) > You can use this http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGSetInterpolation.html to set your own interpolation operators. > 2) Full weighting (This is linear, so I believe I can achieve that with > *PCMGSetRestriction*). > > 3) F-cycle with one pre- and postsmoothing with the Jacobi iteration and > relaxation factor ? = 0.5. > -pc_mg_type full -pc_mg_smoothup 1 -pc_mg_smoothdown 1 -mg_levels_pc_type sor -mg_leves_pc_sor_omega 0.5 and use -ksp_view to check that you have what you want. Matt > I am not sure how to do 1 & 3 in PETSc. Can anyone suggest a way of > implementing these? > > Thanks, > > Artur > > PS. for anyone curious, the paper is "Advances in Iterative Methods and > Preconditioners for the Helmholtz Equation" > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 19 18:20:19 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Jul 2016 16:20:19 -0700 Subject: [petsc-users] some beginner questions : matrix multiplication In-Reply-To: <932627683.1480276.1468936895326.JavaMail.yahoo@mail.yahoo.com> References: <932627683.1480276.1468936895326.JavaMail.yahoo.ref@mail.yahoo.com> <932627683.1480276.1468936895326.JavaMail.yahoo@mail.yahoo.com> Message-ID: > On Jul 19, 2016, at 7:01 AM, lixin chu wrote: > > Hello, > I am new to PETsc, and I am looking for a library to support matrix multiplication. I have several questions and would like to confirm: > > 1. From MatMatMult API, for C=A*B, I assume we can support mixed sparse and dense matrix, i.e., either A or B can be dense; similarly, MatMatMatMult (A*B*C) can support A and C sparse, and B is dense. We do not have code for all combinations. > > 2. We can also use mixed data type for MatMatMult/MatMatMatMult, for example, A is complex, double, and B is double. PETSc only supports all real or all complex, not missing. > > 3. Is there a way to estimate the total working memory required for MatMatMult/MatMatMatMult, given A,B and C information (like dimensions, and total none zero elements, data type) Whenever one of the matrices is dense the result is dense so it is easy to compute in that case. If all the matrices are sparse it is difficult to predict the sparsity of the final result (generally is is a bit denser than the most dense of the sparse matrices). We make some estimates before we start the symbolic multiple and if we need more space we allocate more. > > 4. do we have any performance/memory usage data when compared with other sparse matrix multiplication solutions. for example. PSBLAS No > ? 
> > thank you very much, > > lixin From bsmith at mcs.anl.gov Tue Jul 19 18:38:54 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Jul 2016 16:38:54 -0700 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> Message-ID: <834BCA08-682E-4141-B23C-D0E3D259B5E0@mcs.anl.gov> For jacobi smoothing with a .5 damping you need -ksp_type richardson -pc_type jacobi -ksp_richardson_scale .5 but instead of the scale you can try -ksp_richardson_self_scale which claims to use the optimal scale factor for each iteration (at a cost of some vector operations). Barry > On Jul 19, 2016, at 12:58 PM, Matthew Knepley wrote: > > On Tue, Jul 19, 2016 at 9:53 PM, Safin, Artur wrote: > Hello, > > In order to achieve reasonable performance for Helmholtz with PML, Erlangga in his paper used > > 1) Matrix dependent interpolation in the multigrid. The operators are nonlinear, for example an intermediate computation reads something like > d = max(|a+c|, |b|, ?) > > You can use this http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGSetInterpolation.html > to set your own interpolation operators. > > 2) Full weighting (This is linear, so I believe I can achieve that with PCMGSetRestriction). > > 3) F-cycle with one pre- and postsmoothing with the Jacobi iteration and relaxation factor ? = 0.5. > > -pc_mg_type full > -pc_mg_smoothup 1 > -pc_mg_smoothdown 1 > -mg_levels_pc_type sor > -mg_leves_pc_sor_omega 0.5 > > and use -ksp_view to check that you have what you want. > > Matt > > I am not sure how to do 1 & 3 in PETSc. Can anyone suggest a way of implementing these? > > Thanks, > > Artur > > PS. for anyone curious, the paper is "Advances in Iterative Methods and Preconditioners for the Helmholtz Equation" > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From knepley at gmail.com Tue Jul 19 22:03:55 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Jul 2016 05:03:55 +0200 Subject: [petsc-users] Questions for MatSolve In-Reply-To: References: Message-ID: On Tue, Jul 19, 2016 at 8:17 PM, Eduardo Jourdan wrote: > Hi all, > > I would like to perform a specific number (for instance 4 of forward and > backward sweeps with a seqaij matrix with block size 4, vectors b and x. > Also, I need to do this same procedure with another matrix seqaij block > size 16. I would appreciate if someone knows the best way to do it. > It sounds like you want PCSOR and PCApply, not MatSolve. Thanks, Matt > 1 - I've been trying to use MatSolve. For the bs=4 it seems to work, but > with the other matrix with bs=16 the residue diverges. When I call > matConvert to convert the later matrix for a seqbaij with bs=16 the result > changes and the linear residue is reduced. It is supposed to happen or it > is more possible that i am doing something wrong? > > 2 - MatSolve for seqbaij and seqaij with the same block sizes gives the > same results in terms of solution (not performace, memory) ? > > 3 - Can do I do a specific number of sweeps as told before with the KSP/PC > interface? > > 4 - I saw the manual for the MatSolve and It says that it is for factored > matrix. Can I use a matrix just after the MatAssembly calls? 
> > Best regards, > > Eduardo Jourdan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lixin_chu at yahoo.com Wed Jul 20 09:35:13 2016 From: lixin_chu at yahoo.com (lixin chu) Date: Wed, 20 Jul 2016 14:35:13 +0000 (UTC) Subject: [petsc-users] some beginner questions : matrix multiplication In-Reply-To: References: <932627683.1480276.1468936895326.JavaMail.yahoo.ref@mail.yahoo.com> <932627683.1480276.1468936895326.JavaMail.yahoo@mail.yahoo.com> Message-ID: <1936335791.1984783.1469025313270.JavaMail.yahoo@mail.yahoo.com> Thank you very much for the quick reply. Sent from Yahoo Mail on Android On Wed, 20 Jul, 2016 at 7:20, Barry Smith wrote: > On Jul 19, 2016, at 7:01 AM, lixin chu wrote: > > Hello, > I am new to PETsc, and I am looking for a library to support matrix multiplication. I have several questions and would like to confirm: > > 1. From MatMatMult API, for C=A*B, I assume we can support mixed sparse and dense matrix, i.e., either A or B can be dense; similarly, MatMatMatMult (A*B*C) can support A and C sparse, and B is dense. ? We do not have code for all combinations. > > 2. We can also use mixed data type for MatMatMult/MatMatMatMult, for example, A is complex, double, and B is double. ? PETSc only supports all real or all complex, not missing. > > 3. Is there a way to estimate the total working memory required for MatMatMult/MatMatMatMult, given A,B and C information (like dimensions, and total none zero elements, data type) ? Whenever one of the matrices is dense the result is dense so it is easy to compute in that case. ? If all the matrices are sparse it is difficult to predict the sparsity of the final result (generally is is a bit denser than the most dense of the sparse matrices). We make some estimates before we start the symbolic multiple and if we need more space we allocate more. >? > 4. do we have any performance/memory usage data when compared with other sparse matrix multiplication solutions. for example. PSBLAS ? No > ? > > thank you very much, > > lixin -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Jul 20 22:24:32 2016 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 21 Jul 2016 11:24:32 +0800 Subject: [petsc-users] Fwd: Re: Error with PETSc on K computer In-Reply-To: <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> References: <7423eeed-4b95-28e7-c55d-08e515911935@gmail.com> <3436e085-071a-db3f-3438-84e2536af2d5@gmail.com> <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> Message-ID: Dear all, I have emailed the K computer helpdesk and they have given their reply: /*This is HPCI helpdesk. *//* */ /* *//*Sorry for making you wait. *//* *//*We have received the investigation results from Operation Division. *//* */ /* *//*The cause of SIGSEGV by the Fujitsu compiler is that the implementation of *//* *//*the Fortran pointer is different from the Intel/GNU compiler. *//* */ /* *//*In the Fujitsu compiler, interoperability of the Fortran pointer and C language *//* *//*is implemented by the Fortran pointer interface of Fujitsu. *//* *//** The implementation of the Fortran pointer is processor-dependent.* *//* */ /* *//*On the other hand, PETSc is implemented assuming of the Fortran pointer interface of *//* *//*the Intel/GNU compiler. 
From zonexo at gmail.com Wed Jul 20 22:24:32 2016 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 21 Jul 2016 11:24:32 +0800 Subject: [petsc-users] Fwd: Re: Error with PETSc on K computer In-Reply-To: <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> References: <7423eeed-4b95-28e7-c55d-08e515911935@gmail.com> <3436e085-071a-db3f-3438-84e2536af2d5@gmail.com> <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> Message-ID:
Dear all, I have emailed the K computer helpdesk and they have given their reply:
"This is HPCI helpdesk. Sorry for making you wait. We have received the investigation results from Operation Division. The cause of the SIGSEGV with the Fujitsu compiler is that the implementation of the Fortran pointer is different from the Intel/GNU compiler. In the Fujitsu compiler, interoperability of the Fortran pointer and C is implemented by the Fortran pointer interface of Fujitsu. (The implementation of the Fortran pointer is processor-dependent.) On the other hand, PETSc is implemented assuming the Fortran pointer interface of the Intel/GNU compilers. The PETSc routine cannot correctly interpret the Fortran pointer of Fujitsu because the implementation of the Fortran pointer in the Fujitsu compiler and in the Intel/GNU compilers differs, and it terminates abnormally at execution. Please avoid the use of the Fortran pointer as a workaround. The sample programs of PETSc which do not use the Fortran pointer (ex4f etc.) run normally without getting SIGSEGV."
Hence, they advise avoiding the use of pointers. I made use of VecGetArrayF90, but I believe I can also use VecGetArray. But what about DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90? Can I use DMDAVecGetArray and DMDAVecRestoreArray instead in Fortran, thus avoiding pointers? I remember my segmentation fault always happens when calling DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90. In other words, can I use DMDA in Fortran without using any pointer? Thank you Yours sincerely, TAY wee-beng
On 10/6/2016 11:00 AM, Barry Smith wrote: > Without knowing the specifics of how this machine's Fortran compiler passes Fortran pointers to subroutines we cannot resolve this problem. This information can only be obtained from the experts on this machine. > > Barry > >> On Jun 9, 2016, at 9:28 PM, TAY wee-beng wrote: >> >> Hi, >> >> The current solution cannot work. May I know if there's any other solution to try? Meanwhile, I've also emailed the K computer helpdesk for help. >> >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 3/6/2016 10:33 PM, Satish Balay wrote: >>> Sorry - I'm not sure what's happening with this compiler. >>> >>> [for a build without the patch I sent] - can you edit >>> PETSC_ARCH/include/petscconf.h and remove the lines >>> >>> #ifndef PETSC_HAVE_F90_2PTR_ARG >>> #define PETSC_HAVE_F90_2PTR_ARG 1 >>> #endif >>> >>> And then build the libraries [do not run configure again]. >>> >>> Does this make a difference for this example? >>> >>> Satish >>> >>> On Fri, 3 Jun 2016, TAY wee-beng wrote: >>>> Hi, >>>> >>>> Is there any update on the issue below? >>>> >>>> No hurry, just to make sure that the email was sent successfully. >>>> >>>> Thanks >>>> >>>> -------- Forwarded Message -------- >>>> Subject: Re: [petsc-users] Error with PETSc on K computer >>>> Date: Thu, 2 Jun 2016 10:25:22 +0800 >>>> From: TAY wee-beng >>>> To: petsc-users >>>> >>>> Hi Satish, >>>> >>>> The X9 option is: Provides a different interpretation under Fortran 95 specifications for any parts not conforming to the language specifications of this compiler. >>>> >>>> I just patched and re-compiled but it still can't work. I've attached the >>>> configure.log for both builds.
>>>> >>>> FYI, some parts of the PETSc 3.6.3 code were initially patch to make it work >>>> with the K computer system: >>>> >>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/package.py.org >>>> petsc-3.6.3/config/BuildSystem/config/package.py >>>> --- petsc-3.6.3/config/BuildSystem/config/package.py.org 2015-12-04 >>>> 14:06:42.000000000 +0900 >>>> +++ petsc-3.6.3/config/BuildSystem/config/package.py 2016-01-22 >>>> 11:09:37.000000000 +0900 >>>> @@ -174,7 +174,7 @@ >>>> return '' >>>> >>>> def getSharedFlag(self,cflags): >>>> - for flag in ['-PIC', '-fPIC', '-KPIC', '-qpic']: >>>> + for flag in ['-KPIC', '-fPIC', '-PIC', '-qpic']: >>>> if cflags.find(flag) >=0: return flag >>>> return '' >>>> >>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/setCompilers.py.org >>>> petsc-3.6.3/config/BuildSystem/config/setCompilers.py >>>> --- petsc-3.6.3/config/BuildSystem/config/setCompilers.py.org 2015-07-23 >>>> 00:22:46.000000000 +0900 >>>> +++ petsc-3.6.3/config/BuildSystem/config/setCompilers.py 2016-01-22 >>>> 11:10:05.000000000 +0900 >>>> @@ -1017,7 +1017,7 @@ >>>> self.pushLanguage(language) >>>> #different compilers are sensitive to the order of testing these >>>> flags. So separete out GCC test. >>>> if config.setCompilers.Configure.isGNU(self.getCompiler()): testFlags = >>>> ['-fPIC'] >>>> - else: testFlags = ['-PIC', '-fPIC', '-KPIC','-qpic'] >>>> + else: testFlags = ['-KPIC', '-fPIC', '-PIC','-qpic'] >>>> for testFlag in testFlags: >>>> try: >>>> self.logPrint('Trying '+language+' compiler flag '+testFlag) >>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/packages/openmp.py.org >>>> petsc-3.6.3/config/BuildSystem/config/packages/openmp.py >>>> --- petsc-3.6.3/config/BuildSystem/config/packages/openmp.py.org 2016-01-25 >>>> 15:42:23.000000000+0900 >>>> +++ petsc-3.6.3/config/BuildSystem/config/packages/openmp.py 2016-01-22 >>>> 17:13:52.000000000 +0900 >>>> @@ -19,7 +19,8 @@ >>>> self.found = 0 >>>> self.setCompilers.pushLanguage('C') >>>> # >>>> - for flag in ["-fopenmp", # Gnu >>>> + for flag in ["-Kopenmp", # Fujitsu >>>> + "-fopenmp", # Gnu >>>> "-qsmp=omp",# IBM XL C/C++ >>>> "-h omp", # Cray. Must come after XL because XL >>>> interprets this option as meaning"-soname omp" >>>> "-mp", # Portland Group >>>> >>>> $ diff -u ./petsc-3.6.3/config/BuildSystem/config/compilers.py.org >>>> ./petsc-3.6.3/config/BuildSystem/config/compilers.py >>>> --- ./petsc-3.6.3/config/BuildSystem/config/compilers.py.org 2015-06-10 >>>> 06:24:49.000000000 +0900 >>>> +++ ./petsc-3.6.3/config/BuildSystem/config/compilers.py 2016-02-19 >>>> 11:56:12.000000000 +0900 >>>> @@ -164,7 +164,7 @@ >>>> def checkCLibraries(self): >>>> '''Determines the libraries needed to link with C''' >>>> oldFlags = self.setCompilers.LDFLAGS >>>> - self.setCompilers.LDFLAGS += ' -v' >>>> + self.setCompilers.LDFLAGS += ' -###' >>>> self.pushLanguage('C') >>>> (output, returnCode) = self.outputLink('', '') >>>> self.setCompilers.LDFLAGS = oldFlags >>>> @@ -413,7 +413,7 @@ >>>> def checkCxxLibraries(self): >>>> '''Determines the libraries needed to link with C++''' >>>> oldFlags = self.setCompilers.LDFLAGS >>>> - self.setCompilers.LDFLAGS += ' -v' >>>> + self.setCompilers.LDFLAGS += ' -###' >>>> self.pushLanguage('Cxx') >>>> (output, returnCode) = self.outputLink('', '') >>>> self.setCompilers.LDFLAGS = oldFlags >>>> >>>> >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 2/6/2016 3:18 AM, Satish Balay wrote: >>>>> What does -X9 in --FFLAGS="-X9 -O0" do? 
>>>>> >>>>> can you send configure.log for this build? >>>>> >>>>> And does the attached patch make a difference with this example? >>>>> [suggest doing a separate temporary build of PETSc - in a different source >>>>> location - to check this.] >>>>> >>>>> Satish >>>>> >>>>> On Wed, 1 Jun 2016, TAY wee-beng wrote: >>>>> >>>>>> Hi Satish, >>>>>> >>>>>> Only partially working: >>>>>> >>>>>> [t00196 at b04-036 tutorials]$ mpiexec -n 2 ./ex4f90 >>>>>> jwe1050i-w The hardware barrier couldn't be used and continues processing >>>>>> using the software barrier. >>>>>> taken to (standard) corrective action, execution continuing. >>>>>> jwe1050i-w The hardware barrier couldn't be used and continues processing >>>>>> using the software barrier. >>>>>> taken to (standard) corrective action, execution continuing. >>>>>> Vec Object:Vec Object:initial vector:initial vector: 1 MPI processes >>>>>> type: seq >>>>>> 10 >>>>>> 20 >>>>>> 30 >>>>>> 40 >>>>>> 50 >>>>>> 60 >>>>>> 1 MPI processes >>>>>> type: seq >>>>>> 10 >>>>>> 20 >>>>>> 30 >>>>>> 40 >>>>>> 50 >>>>>> 60 >>>>>> [1]PETSC ERROR: >>>>>> ------------------------------------------------------------------------ >>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>>> probably >>>>>> memory access out of range >>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>> [1]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> [1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>> to >>>>>> find memory corruption errors >>>>>> [1]PETSC ERROR: likely location of problem given in stack below >>>>>> [1]PETSC ERROR: --------------------- Stack Frames >>>>>> ------------------------------------ >>>>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>> available, >>>>>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>> [1]PETSC ERROR: is given. >>>>>> [1]PETSC ERROR: [1] F90Array1dCreate line 50 >>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>> ------------------------------------------[0]PETSC ERROR: >>>>>> ------------------------------------------------------------------------ >>>>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>>> probably >>>>>> memory access out of range >>>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>> [0]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> [0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>> to >>>>>> find memory corruption errors >>>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>>> ------------------------------------ >>>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>> available, >>>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>> [0]PETSC ERROR: is given. 
>>>>>> [0]PETSC ERROR: [0] F90Array1dCreate line 50 >>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [1]PETSC ERROR: Signal received >>>>>> [1]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. >>>>>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>> [1]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>> Wed >>>>>> Jun 1 13:23:41 2016 >>>>>> [1]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>> --LD_SHARED= >>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>> --with-scalapack-lib=-SCALAPACK >>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>> --with-hypre=1 >>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >>>>>> -------------------------------------------------------------------------- >>>>>> [mpi::mpi-api::mpi-abort] >>>>>> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD >>>>>> with errorcode 59. >>>>>> >>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>>> You may or may not see output from other processes, depending on >>>>>> exactly when Open MPI kills them. >>>>>> -------------------------------------------------------------------------- >>>>>> [b04-036:28998] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>> [0xffffffff11360404] >>>>>> [b04-036:28998] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>> [0xffffffff1110391c] >>>>>> [b04-036:28998] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>>> [0xffffffff1111b5ec] >>>>>> [b04-036:28998] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>>> [0xffffffff00281bf0] >>>>>> [b04-036:28998] ./ex4f90 [0x292548] >>>>>> [b04-036:28998] ./ex4f90 [0x29165c] >>>>>> [b04-036:28998] >>>>>> /opt/FJSVxosmmm/lib64/libmpgpthread.so.1(_IO_funlockfile+0x5c) >>>>>> [0xffffffff121e1974] >>>>>> [b04-036:28998] ./ex4f90 [0x9f6748] >>>>>> [b04-036:28998] ./ex4f90 [0x9f0ea4] >>>>>> [b04-036:28998] ./ex4f90 [0x2c76a0] >>>>>> [b04-036:28998] ./ex4f90(MAIN__+0x38c) [0x10688c] >>>>>> [b04-036:28998] ./ex4f90(main+0xec) [0x268e91c] >>>>>> [b04-036:28998] /lib64/libc.so.6(__libc_start_main+0x194) >>>>>> [0xffffffff138cb81c] >>>>>> [b04-036:28998] ./ex4f90 [0x1063ac] >>>>>> [1]PETSC ERROR: >>>>>> ------------------------------------------------------------------------ >>>>>> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >>>>>> batch >>>>>> system) has told this process to end >>>>>> [1]PETSC ERROR: Tr-------------------- >>>>>> [0]PETSC ERROR: Signal received >>>>>> [0]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. 
>>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>> [0]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>> Wed >>>>>> Jun 1 13:23:41 2016 >>>>>> [0]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>> --LD_SHARED= >>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>> --with-scalapack-lib=-SCALAPACK >>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>> --with-hypre=1 >>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>>>>> -------------------------------------------------------------------------- >>>>>> [mpi::mpi-api::mpi-abort] >>>>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>>>> with errorcode 59. >>>>>> >>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>>> You may or may not see output from other processes, depending on >>>>>> exactly when Open MPI kills them. >>>>>> -------------------------------------------------------------------------- >>>>>> [b04-036:28997] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>> [0xffffffff11360404] >>>>>> [b04-036:28997] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>> [0xffffffff1110391c] >>>>>> [b04-036:28997] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>>> [0xffffffff1111b5ec] >>>>>> [b04-036:28997] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>>> [0xffffffff00281bf0] >>>>>> [b04-036:28997] ./ex4f90 [0x292548] >>>>>> [b04-036:28997] ./ex4f90 [0x29165c] >>>>>> [b04-036:28997] >>>>>> /opt/FJSVxosmmm/lib64/libmpgpthread.so.1(_IO_funlockfile+0x5c) >>>>>> [0xffffffff121e1974] >>>>>> [b04-036:28997] ./ex4f90 [0x9f6748] >>>>>> [b04-036:28997] ./ex4f90 [0x9f0ea4] >>>>>> [b04-036:28997] ./ex4f90 [0x2c76a0] >>>>>> [b04-036:28997] ./ex4f90(MAIN__+0x38c) [0x10688c] >>>>>> [b04-036:28997] ./ex4f90(main+0xec) [0x268e91c] >>>>>> [b04-036:28997] /lib64/libc.so.6(__libc_start_main+0x194) >>>>>> [0xffffffff138cb81c] >>>>>> [b04-036:28997] ./ex4f90 [0x1063ac] >>>>>> [0]PETSC ERROR: >>>>>> ------------------------------------------------------------------------ >>>>>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >>>>>> batch >>>>>> system) has told this process to end >>>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>> [0]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> [0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>> to >>>>>> find memory corruption errors >>>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>>> ------------------------------------ >>>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>> available, >>>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>> [0]PETSC ERROR: is given. 
>>>>>> [0]PETSC ERROR: [0] F90Array1dCreate line 50 >>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [0]PETSC ERROR: Signal received >>>>>> [0]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. >>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>> [0]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>> Wed >>>>>> Jun 1 13:23:41 2016 >>>>>> [0]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>> --LD_SHARED= >>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>> --with-scalapack-lib=-SCALAPACK >>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>> --with-fortran-interfaces=1 --with-debuy option -start_in_debugger or >>>>>> -on_error_attach_debugger >>>>>> [1]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> [1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>> to >>>>>> find memory corruption errors >>>>>> [1]PETSC ERROR: likely location of problem given in stack below >>>>>> [1]PETSC ERROR: --------------------- Stack Frames >>>>>> ------------------------------------ >>>>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>> available, >>>>>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>> [1]PETSC ERROR: is given. >>>>>> [1]PETSC ERROR: [1] F90Array1dCreate line 50 >>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [1]PETSC ERROR: Signal received >>>>>> [1]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. >>>>>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>> [1]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>> Wed >>>>>> Jun 1 13:23:41 2016 >>>>>> [1]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>> --LD_SHARED= >>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>> --with-scalapack-lib=-SCALAPACK >>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>> --with-hypre=1 >>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>> [1]PETSC ERROR: #2 User provided function() line 0 in unknown file >>>>>> gging=1 --useThreads=0 --with-hypre=1 >>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>> [0]PETSC ERROR: #2 User provided function() line 0 in unknown file >>>>>> [ERR.] PLE 0019 plexec One of MPI processes was >>>>>> aborted.(rank=0)(nid=0x04180034)(CODE=1938,793745140674134016,15104) >>>>>> [t00196 at b04-036 tutorials]$ >>>>>> [ERR.] 
PLE 0021 plexec The interactive job has aborted with the >>>>>> signal.(sig=24) >>>>>> [INFO] PJM 0083 pjsub Interactive job 5211401 completed. >>>>>> >>>>>> Thank you >>>>>> >>>>>> Yours sincerely, >>>>>> >>>>>> TAY wee-beng >>>>>> >>>>>> On 1/6/2016 12:21 PM, Satish Balay wrote: >>>>>>> Do PETSc examples using VecGetArrayF90() work? >>>>>>> >>>>>>> say src/vec/vec/examples/tutorials/ex4f90.F >>>>>>> >>>>>>> Satish >>>>>>> >>>>>>> On Tue, 31 May 2016, TAY wee-beng wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I'm trying to run my MPI CFD code on Japan's K computer. My code can >>>>>>>> run >>>>>>>> if I >>>>>>>> didn't make use of the PETSc DMDAVecGetArrayF90 subroutine. If it's >>>>>>>> called >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> I get the error below. I have no problem with my code on other >>>>>>>> clusters >>>>>>>> using >>>>>>>> the new Intel compilers. I used to have problems with DM when using >>>>>>>> the >>>>>>>> old >>>>>>>> Intel compilers. Now on the K computer, I'm using Fujitsu's Fortran >>>>>>>> compiler. >>>>>>>> How can I troubleshoot? >>>>>>>> >>>>>>>> Btw, I also tested on the ex13f90 example and it didn't work too. The >>>>>>>> error is >>>>>>>> below. >>>>>>>> >>>>>>>> >>>>>>>> My code error: >>>>>>>> >>>>>>>> /* size_x,size_y,size_z 76x130x136*//* >>>>>>>> *//* total grid size = 1343680*//* >>>>>>>> *//* recommended cores (50k / core) = 26.87360000000000*//* >>>>>>>> *//* 0*//* >>>>>>>> *//* 1*//* >>>>>>>> *//* 1*//* >>>>>>>> *//*[3]PETSC ERROR: [1]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>> Violation, >>>>>>>> probably memory access out of range*//* >>>>>>>> *//*[1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[1]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[1]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[1]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[1]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[1]PETSC ERROR: is given.*//* >>>>>>>> *//*[1]PETSC ERROR: [1] F90Array3dCreate line 244 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//* 1*//* >>>>>>>> *//*------------------------------------------------------------------------*//* >>>>>>>> *//*[3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>> Violation, >>>>>>>> probably memory access out of range*//* >>>>>>>> *//*[3]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[3]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[3]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[3]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[3]PETSC ERROR: --------------------- Stack Frames >>>>>>>> 
------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>> Violation, >>>>>>>> probably memory access out of range*//* >>>>>>>> *//*[0]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[0]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[0]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[0]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[0]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[0]PETSC ERROR: is given.*//* >>>>>>>> *//*[0]PETSC ERROR: [0] F90Array3dCreate line 244 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[0]PETSC ERROR: --------------------- Error Message >>>>>>>> ----------------------------------------- 1*//* >>>>>>>> *//*[2]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>> Violation, >>>>>>>> probably memory access out of range*//* >>>>>>>> *//*[2]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[2]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[2]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[2]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[2]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[2]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[2]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[2]PETSC ERROR: is given.*//* >>>>>>>> *//*[2]PETSC ERROR: [2] F90Array3dCreate line 244 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[2]PETSC ERROR: --------------------- Error Message >>>>>>>> -----------------------------------------[3]PETSC ERROR: Note: The >>>>>>>> EXACT >>>>>>>> line >>>>>>>> numbers in the stack are not available,*//* >>>>>>>> *//*[3]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[3]PETSC ERROR: is given.*//* >>>>>>>> *//*[3]PETSC ERROR: [3] F90Array3dCreate line 244 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[3]PETSC ERROR: --------------------- Error Message >>>>>>>> --------------------------------------------------------------*//* >>>>>>>> *//*[3]PETSC ERROR: Signal received*//* >>>>>>>> *//*[3]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[3]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[3]PETSC ERROR: 
./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>> by >>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>> *//*[3]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared----------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Signal received*//* >>>>>>>> *//*[0]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[0]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>> by >>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>> *//*[0]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>> *//*[0]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>> file*//* >>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>> *//*[m---------------------*//* >>>>>>>> *//*[2]PETSC ERROR: Signal received*//* >>>>>>>> *//*[2]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[2]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[2]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>> by >>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>> *//*[2]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>> *//*[2]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>> file*//* >>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>> *//*[m[1]PETSC ERROR: --------------------- Error Message >>>>>>>> --------------------------------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Signal received*//* >>>>>>>> *//*[1]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[1]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug 
named b04-036 >>>>>>>> by >>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>> *//*[1]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>> *//*[1]PETSC ERROR: #1 User provided function() line 0 ilibraries=0 >>>>>>>> --with-blas-lapack-lib=-SSL2 --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>> *//*[3]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>> file*//* >>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>> *//*[mpi::mpi-api::mpi-abort]*//* >>>>>>>> *//*MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD*//* >>>>>>>> *//*with errorcode 59.*//* >>>>>>>> *//* >>>>>>>> *//*NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI >>>>>>>> processes.*//* >>>>>>>> *//*You may or may not see output from other processes, depending >>>>>>>> on*//* >>>>>>>> *//*exactly when Open MPI kills them.*//* >>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>> *//*[b04-036:28416] >>>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>>>> [0xffffffff11360404]*//* >>>>>>>> *//*[b04-036:28416] >>>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>>>> [0xffffffff1110391c]*//* >>>>>>>> *//*[b04-036:28416] >>>>>>>> /opt/FJSVtclang/GM-1.2.0-2pi::mpi-api::mpi-abort]*//* >>>>>>>> *//*MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD*//* >>>>>>>> *//*with errorcode 59.*/ >>>>>>>> >>>>>>>> ex13f90 error: >>>>>>>> >>>>>>>> >>>>>>>> /*[t00196 at b04-036 tutorials]$ mpiexec -np 2 ./ex13f90*//* >>>>>>>> *//*jwe1050i-w The hardware barrier couldn't be used and continues >>>>>>>> processing >>>>>>>> using the software barrier.*//* >>>>>>>> *//*taken to (standard) corrective action, execution continuing.*//* >>>>>>>> *//*jwe1050i-w The hardware barrier couldn't be used and continues >>>>>>>> processing >>>>>>>> using the software barrier.*//* >>>>>>>> *//*taken to (standard) corrective action, execution continuing.*//* >>>>>>>> *//* Hi! 
We're solving van der Pol using 2 processes.*//* >>>>>>>> *//* >>>>>>>> *//* t x1 x2*//* >>>>>>>> *//*[1]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>>>>>>> illegal >>>>>>>> memory access*//* >>>>>>>> *//*[1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[0]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>>>>>>> illegal >>>>>>>> memory access*//* >>>>>>>> *//*[0]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[0]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[1]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[1]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[1]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[1]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[1]PETSC ERROR: is given.*//* >>>>>>>> *//*[1]PETSC ERROR: [1] F90Array4dCreate line 337 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[0]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[0]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[0]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[0]PETSC ERROR: is given.*//* >>>>>>>> *//*[0]PETSC ERROR: [0] F90Array4dCreate line 337 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[1]PETSC ERROR: --------------------- Error Message >>>>>>>> --------------------------------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Signal received*//* >>>>>>>> *//*[1]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[1]PETSC ERROR: ./ex13f90 on a petsc-3.6.3_debug named b04-036 by >>>>>>>> Unknown >>>>>>>> Wed Jun 1 13:04:34 2016*//* >>>>>>>> *//*[1]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> 
--with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hyp*//* >>>>>>>> */ >>>>>>>> >>>>>>>> >>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 20 22:37:37 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Jul 2016 22:37:37 -0500 Subject: [petsc-users] Fwd: Re: Error with PETSc on K computer In-Reply-To: References: <7423eeed-4b95-28e7-c55d-08e515911935@gmail.com> <3436e085-071a-db3f-3438-84e2536af2d5@gmail.com> <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> Message-ID: There is no way to implement DMDAVecGetArray() to be used from Fortran. The only way we can support DMDAVecGetArrayF90() is that we be given the information about how the Fortran pointers are implemented in Fujitsu compiler and access to the machine to test the interface. Barry > On Jul 20, 2016, at 10:24 PM, TAY wee-beng wrote: > > Dear all, > > I have emailed the K computer helpdesk and they have given their reply: > > This is HPCI helpdesk. > > Sorry for making you wait. > We have received the investigation results from Operation Division. > > The cause of SIGSEGV by the Fujitsu compiler is that the implementation of > the Fortran pointer is different from the Intel/GNU compiler. > > In the Fujitsu compiler, interoperability of the Fortran pointer and C language > is implemented by the Fortran pointer interface of Fujitsu. > * The implementation of the Fortran pointer is processor-dependent.* > > On the other hand, PETSc is implemented assuming of the Fortran pointer interface of > the Intel/GNU compiler. > > The PETSc routine cannot correctly interpret the Fortran pointer of Fujitsu > because the implementation of the Fortran pointer of the Fujitsu compiler > and the Intel/GNU compiler is different, and it terminates abnormally at execution. > > Please avoid the use of the Fortran pointer as a workaround. > > The sample program of PETSc which does not use the Fotran pointer (ex4f etc.) > runs normaly without getting SIGSEGV. > > Hence, they advice avoiding the use of pointers. I made use of VecGetArrayF90 but I believe I can also use VecGetArray. > > But what about DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90? Can I use DMDAVecGetArray and DMDAVecRestoreArray instead in Fortran, thus avoiding using pointers? I remember my segmentation fault always happens when calling DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90. > > In other words, can I use DMDA in Fortran w/o using any pointer? > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 10/6/2016 11:00 AM, Barry Smith wrote: >> Without knowing the specifics of how this machine's Fortran compiler passes Fortran pointers to subroutines we cannot resolve this problem. This information can only be obtained from the experts on the this machine. >> >> Barry >> >> >>> On Jun 9, 2016, at 9:28 PM, TAY wee-beng >>> wrote: >>> >>> Hi, >>> >>> The current solution cannot work. May I know if there's any other solution to try. Meanwhile, I've also email the K computer helpdesk for help. >>> >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 3/6/2016 10:33 PM, Satish Balay wrote: >>> >>>> Sorry - I'm not sure whats hapenning with this compiler. >>>> >>>> [for a build without the patch I sent ] - can you edit >>>> PETSC_ARCH/include/petscconf.h and remove the lines >>>> >>>> #ifndef PETSC_HAVE_F90_2PTR_ARG >>>> #define PETSC_HAVE_F90_2PTR_ARG 1 >>>> #endif >>>> >>>> And then build the libraries [do not run configure again]. 
>>>> >>>> Does this make a difference for this example? >>>> >>>> Satish >>>> >>>> On Fri, 3 Jun 2016, TAY wee-beng wrote: >>>> >>>> >>>>> Hi, >>>>> >>>>> Is there any update to the issue below? >>>>> >>>>> No hurry, just to make sure that the email is sent successfully. >>>>> >>>>> >>>>> Thanks >>>>> >>>>> >>>>> >>>>> -------- Forwarded Message -------- >>>>> Subject: Re: [petsc-users] Error with PETSc on K computer >>>>> Date: Thu, 2 Jun 2016 10:25:22 +0800 >>>>> From: TAY wee-beng >>>>> >>>>> >>>>> To: petsc-users >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Hi Satish, >>>>> >>>>> The X9 option is : >>>>> >>>>> Provides a different interpretation under Fortran 95 specifications >>>>> for any parts not conforming to the language specifications of this >>>>> compiler >>>>> >>>>> I just patched and re-compiled but it still can't work. I've attached the >>>>> configure.log for both builds. >>>>> >>>>> FYI, some parts of the PETSc 3.6.3 code were initially patch to make it work >>>>> with the K computer system: >>>>> >>>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/package.py.org >>>>> petsc-3.6.3/config/BuildSystem/config/package.py >>>>> --- petsc-3.6.3/config/BuildSystem/config/package.py.org 2015-12-04 >>>>> 14:06:42.000000000 +0900 >>>>> +++ petsc-3.6.3/config/BuildSystem/config/package.py 2016-01-22 >>>>> 11:09:37.000000000 +0900 >>>>> @@ -174,7 +174,7 @@ >>>>> return '' >>>>> >>>>> def getSharedFlag(self,cflags): >>>>> - for flag in ['-PIC', '-fPIC', '-KPIC', '-qpic']: >>>>> + for flag in ['-KPIC', '-fPIC', '-PIC', '-qpic']: >>>>> if cflags.find(flag) >=0: return flag >>>>> return '' >>>>> >>>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/setCompilers.py.org >>>>> petsc-3.6.3/config/BuildSystem/config/setCompilers.py >>>>> --- petsc-3.6.3/config/BuildSystem/config/setCompilers.py.org 2015-07-23 >>>>> 00:22:46.000000000 +0900 >>>>> +++ petsc-3.6.3/config/BuildSystem/config/setCompilers.py 2016-01-22 >>>>> 11:10:05.000000000 +0900 >>>>> @@ -1017,7 +1017,7 @@ >>>>> self.pushLanguage(language) >>>>> #different compilers are sensitive to the order of testing these >>>>> flags. So separete out GCC test. >>>>> if config.setCompilers.Configure.isGNU(self.getCompiler()): testFlags = >>>>> ['-fPIC'] >>>>> - else: testFlags = ['-PIC', '-fPIC', '-KPIC','-qpic'] >>>>> + else: testFlags = ['-KPIC', '-fPIC', '-PIC','-qpic'] >>>>> for testFlag in testFlags: >>>>> try: >>>>> self.logPrint('Trying '+language+' compiler flag '+testFlag) >>>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/packages/openmp.py.org >>>>> petsc-3.6.3/config/BuildSystem/config/packages/openmp.py >>>>> --- petsc-3.6.3/config/BuildSystem/config/packages/openmp.py.org 2016-01-25 >>>>> 15:42:23.000000000+0900 >>>>> +++ petsc-3.6.3/config/BuildSystem/config/packages/openmp.py 2016-01-22 >>>>> 17:13:52.000000000 +0900 >>>>> @@ -19,7 +19,8 @@ >>>>> self.found = 0 >>>>> self.setCompilers.pushLanguage('C') >>>>> # >>>>> - for flag in ["-fopenmp", # Gnu >>>>> + for flag in ["-Kopenmp", # Fujitsu >>>>> + "-fopenmp", # Gnu >>>>> "-qsmp=omp",# IBM XL C/C++ >>>>> "-h omp", # Cray. 
Must come after XL because XL >>>>> interprets this option as meaning"-soname omp" >>>>> "-mp", # Portland Group >>>>> >>>>> $ diff -u ./petsc-3.6.3/config/BuildSystem/config/compilers.py.org >>>>> ./petsc-3.6.3/config/BuildSystem/config/compilers.py >>>>> --- ./petsc-3.6.3/config/BuildSystem/config/compilers.py.org 2015-06-10 >>>>> 06:24:49.000000000 +0900 >>>>> +++ ./petsc-3.6.3/config/BuildSystem/config/compilers.py 2016-02-19 >>>>> 11:56:12.000000000 +0900 >>>>> @@ -164,7 +164,7 @@ >>>>> def checkCLibraries(self): >>>>> '''Determines the libraries needed to link with C''' >>>>> oldFlags = self.setCompilers.LDFLAGS >>>>> - self.setCompilers.LDFLAGS += ' -v' >>>>> + self.setCompilers.LDFLAGS += ' -###' >>>>> self.pushLanguage('C') >>>>> (output, returnCode) = self.outputLink('', '') >>>>> self.setCompilers.LDFLAGS = oldFlags >>>>> @@ -413,7 +413,7 @@ >>>>> def checkCxxLibraries(self): >>>>> '''Determines the libraries needed to link with C++''' >>>>> oldFlags = self.setCompilers.LDFLAGS >>>>> - self.setCompilers.LDFLAGS += ' -v' >>>>> + self.setCompilers.LDFLAGS += ' -###' >>>>> self.pushLanguage('Cxx') >>>>> (output, returnCode) = self.outputLink('', '') >>>>> self.setCompilers.LDFLAGS = oldFlags >>>>> >>>>> >>>>> >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 2/6/2016 3:18 AM, Satish Balay wrote: >>>>> >>>>>> What does -X9 in --FFLAGS="-X9 -O0" do? >>>>>> >>>>>> can you send configure.log for this build? >>>>>> >>>>>> And does the attached patch make a difference with this example? >>>>>> [suggest doing a separate temporary build of PETSc - in a different source >>>>>> location - to check this.] >>>>>> >>>>>> Satish >>>>>> >>>>>> On Wed, 1 Jun 2016, TAY wee-beng wrote: >>>>>> >>>>>> >>>>>>> Hi Satish, >>>>>>> >>>>>>> Only partially working: >>>>>>> >>>>>>> [t00196 at b04-036 tutorials]$ mpiexec -n 2 ./ex4f90 >>>>>>> jwe1050i-w The hardware barrier couldn't be used and continues processing >>>>>>> using the software barrier. >>>>>>> taken to (standard) corrective action, execution continuing. >>>>>>> jwe1050i-w The hardware barrier couldn't be used and continues processing >>>>>>> using the software barrier. >>>>>>> taken to (standard) corrective action, execution continuing. >>>>>>> Vec Object:Vec Object:initial vector:initial vector: 1 MPI processes >>>>>>> type: seq >>>>>>> 10 >>>>>>> 20 >>>>>>> 30 >>>>>>> 40 >>>>>>> 50 >>>>>>> 60 >>>>>>> 1 MPI processes >>>>>>> type: seq >>>>>>> 10 >>>>>>> 20 >>>>>>> 30 >>>>>>> 40 >>>>>>> 50 >>>>>>> 60 >>>>>>> [1]PETSC ERROR: >>>>>>> ------------------------------------------------------------------------ >>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>>>> probably >>>>>>> memory access out of range >>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [1]PETSC ERROR: or see >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> [1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>>> to >>>>>>> find memory corruption errors >>>>>>> [1]PETSC ERROR: likely location of problem given in stack below >>>>>>> [1]PETSC ERROR: --------------------- Stack Frames >>>>>>> ------------------------------------ >>>>>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>> available, >>>>>>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>>> [1]PETSC ERROR: is given. 
>>>>>>> [1]PETSC ERROR: [1] F90Array1dCreate line 50 >>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>>> ------------------------------------------[0]PETSC ERROR: >>>>>>> ------------------------------------------------------------------------ >>>>>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>>>> probably >>>>>>> memory access out of range >>>>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [0]PETSC ERROR: or see >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> [0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>>> to >>>>>>> find memory corruption errors >>>>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>>>> ------------------------------------ >>>>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>> available, >>>>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>>> [0]PETSC ERROR: is given. >>>>>>> [0]PETSC ERROR: [0] F90Array1dCreate line 50 >>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>>> -------------------------------------------------------------- >>>>>>> [1]PETSC ERROR: Signal received >>>>>>> [1]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>> for >>>>>>> trouble shooting. >>>>>>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>>> [1]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>>> Wed >>>>>>> Jun 1 13:23:41 2016 >>>>>>> [1]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>> --LD_SHARED= >>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>> --with-hypre=1 >>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>>> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >>>>>>> -------------------------------------------------------------------------- >>>>>>> [mpi::mpi-api::mpi-abort] >>>>>>> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD >>>>>>> with errorcode 59. >>>>>>> >>>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>>>> You may or may not see output from other processes, depending on >>>>>>> exactly when Open MPI kills them. 
>>>>>>> -------------------------------------------------------------------------- >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>>> [0xffffffff11360404] >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>>> [0xffffffff1110391c] >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>>>> [0xffffffff1111b5ec] >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>>>> [0xffffffff00281bf0] >>>>>>> [b04-036:28998] ./ex4f90 [0x292548] >>>>>>> [b04-036:28998] ./ex4f90 [0x29165c] >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVxosmmm/lib64/libmpgpthread.so.1(_IO_funlockfile+0x5c) >>>>>>> [0xffffffff121e1974] >>>>>>> [b04-036:28998] ./ex4f90 [0x9f6748] >>>>>>> [b04-036:28998] ./ex4f90 [0x9f0ea4] >>>>>>> [b04-036:28998] ./ex4f90 [0x2c76a0] >>>>>>> [b04-036:28998] ./ex4f90(MAIN__+0x38c) [0x10688c] >>>>>>> [b04-036:28998] ./ex4f90(main+0xec) [0x268e91c] >>>>>>> [b04-036:28998] /lib64/libc.so.6(__libc_start_main+0x194) >>>>>>> [0xffffffff138cb81c] >>>>>>> [b04-036:28998] ./ex4f90 [0x1063ac] >>>>>>> [1]PETSC ERROR: >>>>>>> ------------------------------------------------------------------------ >>>>>>> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >>>>>>> batch >>>>>>> system) has told this process to end >>>>>>> [1]PETSC ERROR: Tr-------------------- >>>>>>> [0]PETSC ERROR: Signal received >>>>>>> [0]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>> for >>>>>>> trouble shooting. >>>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>>> [0]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>>> Wed >>>>>>> Jun 1 13:23:41 2016 >>>>>>> [0]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>> --LD_SHARED= >>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>> --with-hypre=1 >>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>>>>>> -------------------------------------------------------------------------- >>>>>>> [mpi::mpi-api::mpi-abort] >>>>>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>>>>> with errorcode 59. >>>>>>> >>>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>>>> You may or may not see output from other processes, depending on >>>>>>> exactly when Open MPI kills them. 
>>>>>>> -------------------------------------------------------------------------- >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>>> [0xffffffff11360404] >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>>> [0xffffffff1110391c] >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>>>> [0xffffffff1111b5ec] >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>>>> [0xffffffff00281bf0] >>>>>>> [b04-036:28997] ./ex4f90 [0x292548] >>>>>>> [b04-036:28997] ./ex4f90 [0x29165c] >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVxosmmm/lib64/libmpgpthread.so.1(_IO_funlockfile+0x5c) >>>>>>> [0xffffffff121e1974] >>>>>>> [b04-036:28997] ./ex4f90 [0x9f6748] >>>>>>> [b04-036:28997] ./ex4f90 [0x9f0ea4] >>>>>>> [b04-036:28997] ./ex4f90 [0x2c76a0] >>>>>>> [b04-036:28997] ./ex4f90(MAIN__+0x38c) [0x10688c] >>>>>>> [b04-036:28997] ./ex4f90(main+0xec) [0x268e91c] >>>>>>> [b04-036:28997] /lib64/libc.so.6(__libc_start_main+0x194) >>>>>>> [0xffffffff138cb81c] >>>>>>> [b04-036:28997] ./ex4f90 [0x1063ac] >>>>>>> [0]PETSC ERROR: >>>>>>> ------------------------------------------------------------------------ >>>>>>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >>>>>>> batch >>>>>>> system) has told this process to end >>>>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [0]PETSC ERROR: or see >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> [0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>>> to >>>>>>> find memory corruption errors >>>>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>>>> ------------------------------------ >>>>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>> available, >>>>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>>> [0]PETSC ERROR: is given. >>>>>>> [0]PETSC ERROR: [0] F90Array1dCreate line 50 >>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>>> -------------------------------------------------------------- >>>>>>> [0]PETSC ERROR: Signal received >>>>>>> [0]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>> for >>>>>>> trouble shooting. 
>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>>> [0]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>>> Wed >>>>>>> Jun 1 13:23:41 2016 >>>>>>> [0]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>> --LD_SHARED= >>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>> --with-fortran-interfaces=1 --with-debuy option -start_in_debugger or >>>>>>> -on_error_attach_debugger >>>>>>> [1]PETSC ERROR: or see >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> [1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>>> to >>>>>>> find memory corruption errors >>>>>>> [1]PETSC ERROR: likely location of problem given in stack below >>>>>>> [1]PETSC ERROR: --------------------- Stack Frames >>>>>>> ------------------------------------ >>>>>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>> available, >>>>>>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>>> [1]PETSC ERROR: is given. >>>>>>> [1]PETSC ERROR: [1] F90Array1dCreate line 50 >>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>>> -------------------------------------------------------------- >>>>>>> [1]PETSC ERROR: Signal received >>>>>>> [1]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>> for >>>>>>> trouble shooting. >>>>>>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>>> [1]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>>> Wed >>>>>>> Jun 1 13:23:41 2016 >>>>>>> [1]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>> --LD_SHARED= >>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>> --with-hypre=1 >>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>>> [1]PETSC ERROR: #2 User provided function() line 0 in unknown file >>>>>>> gging=1 --useThreads=0 --with-hypre=1 >>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>>> [0]PETSC ERROR: #2 User provided function() line 0 in unknown file >>>>>>> [ERR.] PLE 0019 plexec One of MPI processes was >>>>>>> aborted.(rank=0)(nid=0x04180034)(CODE=1938,793745140674134016,15104) >>>>>>> [t00196 at b04-036 tutorials]$ >>>>>>> [ERR.] PLE 0021 plexec The interactive job has aborted with the >>>>>>> signal.(sig=24) >>>>>>> [INFO] PJM 0083 pjsub Interactive job 5211401 completed. >>>>>>> >>>>>>> Thank you >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> On 1/6/2016 12:21 PM, Satish Balay wrote: >>>>>>> >>>>>>>> Do PETSc examples using VecGetArrayF90() work? 
>>>>>>>> >>>>>>>> say src/vec/vec/examples/tutorials/ex4f90.F >>>>>>>> >>>>>>>> Satish >>>>>>>> >>>>>>>> On Tue, 31 May 2016, TAY wee-beng wrote: >>>>>>>> >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I'm trying to run my MPI CFD code on Japan's K computer. My code can >>>>>>>>> run >>>>>>>>> if I >>>>>>>>> didn't make use of the PETSc DMDAVecGetArrayF90 subroutine. If it's >>>>>>>>> called >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> I get the error below. I have no problem with my code on other >>>>>>>>> clusters >>>>>>>>> using >>>>>>>>> the new Intel compilers. I used to have problems with DM when using >>>>>>>>> the >>>>>>>>> old >>>>>>>>> Intel compilers. Now on the K computer, I'm using Fujitsu's Fortran >>>>>>>>> compiler. >>>>>>>>> How can I troubleshoot? >>>>>>>>> >>>>>>>>> Btw, I also tested on the ex13f90 example and it didn't work too. The >>>>>>>>> error is >>>>>>>>> below. >>>>>>>>> >>>>>>>>> >>>>>>>>> My code error: >>>>>>>>> >>>>>>>>> /* size_x,size_y,size_z 76x130x136*//* >>>>>>>>> *//* total grid size = 1343680*//* >>>>>>>>> *//* recommended cores (50k / core) = 26.87360000000000*//* >>>>>>>>> *//* 0*//* >>>>>>>>> *//* 1*//* >>>>>>>>> *//* 1*//* >>>>>>>>> *//*[3]PETSC ERROR: [1]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>> Violation, >>>>>>>>> probably memory access out of range*//* >>>>>>>>> *//*[1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[1]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[1]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[1]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[1]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[1]PETSC ERROR: is given.*//* >>>>>>>>> *//*[1]PETSC ERROR: [1] F90Array3dCreate line 244 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//* 1*//* >>>>>>>>> *//*------------------------------------------------------------------------*//* >>>>>>>>> *//*[3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>> Violation, >>>>>>>>> probably memory access out of range*//* >>>>>>>>> *//*[3]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[3]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[3]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[3]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[3]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation 
>>>>>>>>> Violation, >>>>>>>>> probably memory access out of range*//* >>>>>>>>> *//*[0]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[0]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[0]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[0]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[0]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[0]PETSC ERROR: is given.*//* >>>>>>>>> *//*[0]PETSC ERROR: [0] F90Array3dCreate line 244 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[0]PETSC ERROR: --------------------- Error Message >>>>>>>>> ----------------------------------------- 1*//* >>>>>>>>> *//*[2]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>> Violation, >>>>>>>>> probably memory access out of range*//* >>>>>>>>> *//*[2]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[2]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[2]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[2]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[2]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[2]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[2]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[2]PETSC ERROR: is given.*//* >>>>>>>>> *//*[2]PETSC ERROR: [2] F90Array3dCreate line 244 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[2]PETSC ERROR: --------------------- Error Message >>>>>>>>> -----------------------------------------[3]PETSC ERROR: Note: The >>>>>>>>> EXACT >>>>>>>>> line >>>>>>>>> numbers in the stack are not available,*//* >>>>>>>>> *//*[3]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[3]PETSC ERROR: is given.*//* >>>>>>>>> *//*[3]PETSC ERROR: [3] F90Array3dCreate line 244 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[3]PETSC ERROR: --------------------- Error Message >>>>>>>>> --------------------------------------------------------------*//* >>>>>>>>> *//*[3]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[3]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[3]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[3]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>>> by >>>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* 
>>>>>>>>> *//*[3]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared----------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[0]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[0]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>>> by >>>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>>> *//*[0]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>>> *//*[0]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>>> file*//* >>>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>>> *//*[m---------------------*//* >>>>>>>>> *//*[2]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[2]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[2]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[2]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>>> by >>>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>>> *//*[2]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>>> *//*[2]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>>> file*//* >>>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>>> *//*[m[1]PETSC ERROR: --------------------- Error Message >>>>>>>>> --------------------------------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[1]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[1]PETSC ERROR: ./a-debug.out on a 
petsc-3.6.3_debug named b04-036 >>>>>>>>> by >>>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>>> *//*[1]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>>> *//*[1]PETSC ERROR: #1 User provided function() line 0 ilibraries=0 >>>>>>>>> --with-blas-lapack-lib=-SSL2 --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>>> *//*[3]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>>> file*//* >>>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>>> *//*[mpi::mpi-api::mpi-abort]*//* >>>>>>>>> *//*MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD*//* >>>>>>>>> *//*with errorcode 59.*//* >>>>>>>>> *//* >>>>>>>>> *//*NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI >>>>>>>>> processes.*//* >>>>>>>>> *//*You may or may not see output from other processes, depending >>>>>>>>> on*//* >>>>>>>>> *//*exactly when Open MPI kills them.*//* >>>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>>> *//*[b04-036:28416] >>>>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>>>>> [0xffffffff11360404]*//* >>>>>>>>> *//*[b04-036:28416] >>>>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>>>>> [0xffffffff1110391c]*//* >>>>>>>>> *//*[b04-036:28416] >>>>>>>>> /opt/FJSVtclang/GM-1.2.0-2pi::mpi-api::mpi-abort]*//* >>>>>>>>> *//*MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD*//* >>>>>>>>> *//*with errorcode 59.*/ >>>>>>>>> >>>>>>>>> ex13f90 error: >>>>>>>>> >>>>>>>>> >>>>>>>>> /*[t00196 at b04-036 tutorials]$ mpiexec -np 2 ./ex13f90*//* >>>>>>>>> *//*jwe1050i-w The hardware barrier couldn't be used and continues >>>>>>>>> processing >>>>>>>>> using the software barrier.*//* >>>>>>>>> *//*taken to (standard) corrective action, execution continuing.*//* >>>>>>>>> *//*jwe1050i-w The hardware barrier couldn't be used and continues >>>>>>>>> processing >>>>>>>>> using the software barrier.*//* >>>>>>>>> *//*taken to (standard) corrective action, execution continuing.*//* >>>>>>>>> *//* Hi! 
We're solving van der Pol using 2 processes.*//* >>>>>>>>> *//* >>>>>>>>> *//* t x1 x2*//* >>>>>>>>> *//*[1]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>>>>>>>> illegal >>>>>>>>> memory access*//* >>>>>>>>> *//*[1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[0]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>>>>>>>> illegal >>>>>>>>> memory access*//* >>>>>>>>> *//*[0]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[0]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[1]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[1]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[1]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[1]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[1]PETSC ERROR: is given.*//* >>>>>>>>> *//*[1]PETSC ERROR: [1] F90Array4dCreate line 337 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[0]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[0]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[0]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[0]PETSC ERROR: is given.*//* >>>>>>>>> *//*[0]PETSC ERROR: [0] F90Array4dCreate line 337 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[1]PETSC ERROR: --------------------- Error Message >>>>>>>>> --------------------------------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[1]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[1]PETSC ERROR: ./ex13f90 on a petsc-3.6.3_debug named b04-036 by >>>>>>>>> Unknown >>>>>>>>> Wed Jun 1 13:04:34 2016*//* >>>>>>>>> *//*[1]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared-libraries=0 
--with-blas-lapack-lib=-SSL2 >>>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hyp*//* >>>>>>>>> */ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> > From bsmith at mcs.anl.gov Wed Jul 20 22:49:11 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Jul 2016 22:49:11 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> Message-ID: > On Jul 18, 2016, at 1:41 AM, domenico lahaye wrote: > > Dear Matthew, > > I would like to place the FormJacobian statement in ex25.c in such a way that I can view > the result on the different levels. Can you please point me to an example? > > I would like to do above with Galerkin coarsening as well. So yes, I do expect that I will need the > hooks attached to the different MG levels. I appreciate more pointers here as well. The thing is some parts of the solver may not be constructed on each level until the actual solve is performed so it may not be possible to view/change things before the solve starts. You can try calling KSPSetUp() and then do as Matt suggested "You can always call http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGGetSmoother.html and then http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPGetPC.html and then http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGetOperators.html I would caution you against this, since it is very fragile in the code." When using SNES there is really no good time to call KSPSetUp() and then access the PCMGGetSmoother(). This is why PETSc is designed around callbacks, so rather than having you look over MG levels and get some object and modify it, you provide callbacks that SNES or KSP calls at the appropriate time with a single object and then your callback function does what you want it to do. If there are additional callbacks you think we should add please let us know. Barry > > Thanks, Domenico. > > > From: Matthew Knepley > > > To: domenico lahaye > Cc: PETSc Users List > Sent: Monday, July 18, 2016 8:16 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: > Thanks for all the pointers. > > I am happy to switch to ksp/examples/tutorials/ex25.c in a first instance as you suggest. > > I am still stuck with the same issue as before though. I am trying to extract the hierarchy > of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would > like to modify these operators and define a multigrid cycle with the modified operators. > > Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving > both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me > in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase? > > If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator. 
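For the KSP-only case that per-level rediscretization is usually wired up through KSPSetComputeOperators(); a bare-bones sketch follows, where the name ComputeMatrix and the user context are placeholders and not code taken from ex25.c or ex42.c:

   static PetscErrorCode ComputeMatrix(KSP ksp, Mat A, Mat P, void *ctx)
   {
     DM             da;
     PetscErrorCode ierr;

     PetscFunctionBeginUser;
     /* PCMG calls this once per level; only the DM handed back here changes */
     ierr = KSPGetDM(ksp, &da);CHKERRQ(ierr);
     /* ... query DMDAGetInfo(da, ...) for this level's grid and fill P accordingly ... */
     ierr = MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
     ierr = MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
     PetscFunctionReturn(0);
   }

   /* in main(), instead of calling KSPSetOperators() yourself: */
   ierr = KSPSetDM(ksp, da);CHKERRQ(ierr);
   ierr = KSPSetComputeOperators(ksp, ComputeMatrix, &user);CHKERRQ(ierr);
   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* e.g. -pc_type mg -pc_mg_levels 4 */
   ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);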
> The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks to > attach different contexts to each MG level. Do you need this? > > Thanks, > > Matt > > Thanks, Domenico. > > > From: Matthew Knepley > To: Barry Smith > Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" > Sent: Sunday, July 17, 2016 2:29 PM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure > > and likely not giving it as much time as it deserves. However, I do not see immediately > > what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to implement the deflation > > using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? > > > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > > Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. > > This suggests that people are using it with OpenFOAM: http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf > > In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: > > http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf > > Matt > > > Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > From domenico_lahaye at yahoo.com Thu Jul 21 03:41:04 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Thu, 21 Jul 2016 08:41:04 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> Message-ID: <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> Thanks.? KSPSetOperators() allows to precondition A^h with M^h.?This is lovely and great as it allows to implement the shifted Laplace?preconditioner for the Helmholtz equation.? Recently I managed to implement shifted Laplace using the DMDA?infrastructure in 2D. This implementation avoids having to construct?the hierarchy in Matlab as we did previously.? In next stage we would like to precondition A^H with M^H on a sequence?of coarser grids. This is what Calandra does on two levels and what we doon multiple levels.? We currently have an implement in which we construct the hierarchy on A^h?and M^h in Matlab, we read the hierarchy in PETSc, traverse the hierarchy and?do SetOperators and do a lot more of dark magic and witch craft by combining?preconditioners in a additive and multiplicative fashion.? It would be lovely to obtain a more readable piece of code. ? ? I am not sure what kind of additional callbacks I need. My first guess here?would be a multilevel extension of SetOperators allowing to define M^H?a preconditioner for A^H on a sequence of coarser levels. But I currently?fail to oversee the whole matter.? An alternative is to build a fragile code on top of DMDA first and get back?to you with more informed guesses on what kind of call backs I precisely need.?I think I prefer to go with this option.? Does this sound reasonable?? Domenico.? From: Barry Smith To: domenico lahaye Cc: PETSc Users List Sent: Thursday, July 21, 2016 5:49 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > On Jul 18, 2016, at 1:41 AM, domenico lahaye wrote: > > Dear Matthew, > >? I would like to place the FormJacobian statement in ex25.c in such a way that I can view > the result on the different levels. Can you please point me to an example? > >? I would like to do above with Galerkin coarsening as well. So yes, I do expect that I will need the > hooks attached to the different MG levels. I appreciate more pointers here as well. ? The thing is some parts of the solver may not be constructed on each level until the actual solve is performed so it may not be possible to view/change things before the solve starts. You can try calling KSPSetUp() and then do as Matt suggested "You can always call ? http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGGetSmoother.html and then ? http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPGetPC.html and then ? http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGetOperators.html I would caution you against this, since it is very fragile in the code." ? When using SNES there is really no good time to call KSPSetUp() and then access the PCMGGetSmoother(). 
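For the plain KSP case that fragile inspection looks roughly like the sketch below; it assumes the PC really is PCMG, that KSPSetUp() has already built the hierarchy, and the viewer option name is arbitrary:

   PetscInt nlevels, l;
   PC       pc;

   ierr = KSPSetUp(ksp);CHKERRQ(ierr);                /* forces the MG levels to exist */
   ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
   ierr = PCMGGetLevels(pc, &nlevels);CHKERRQ(ierr);
   for (l = 0; l < nlevels; l++) {
     KSP smoother;
     PC  subpc;
     Mat Amat, Pmat;

     ierr = PCMGGetSmoother(pc, l, &smoother);CHKERRQ(ierr);
     ierr = KSPGetPC(smoother, &subpc);CHKERRQ(ierr);
     ierr = PCGetOperators(subpc, &Amat, &Pmat);CHKERRQ(ierr);
     ierr = MatViewFromOptions(Pmat, NULL, "-level_pmat_view");CHKERRQ(ierr);  /* inspect or modify here */
   }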
This is why PETSc is designed around callbacks, so rather than having you look over MG levels and get some object and modify it, you provide callbacks that SNES or KSP calls at the appropriate time with a single object and then your callback function does what you want it to do. If there are additional callbacks you think we should add please let us know. ? Barry > >? ? Thanks, Domenico.? > > > From: Matthew Knepley > > > To: domenico lahaye > Cc: PETSc Users List > Sent: Monday, July 18, 2016 8:16 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: > Thanks for all the pointers. > > I am happy to switch to ksp/examples/tutorials/ex25.c in a first instance as you suggest. > >? ? I am still stuck with the same issue as before though. I am trying to extract the hierarchy >? ? of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would >? ? like to modify these operators and define a multigrid cycle with the modified operators. > >? ? Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving >? ? both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me >? ? in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase? > > If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator. > The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks to > attach different contexts to each MG level. Do you need this? > >? Thanks, > >? ? ? Matt >? > Thanks, Domenico. > > > From: Matthew Knepley > To: Barry Smith > Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" > Sent: Sunday, July 17, 2016 2:29 PM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure > >? ? and likely not giving it as much time as it deserves. However, I do not see immediately > >? ? what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > > > >? ? ? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > >? ? ? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined > >? ? ? after calling DMCoarsenHierarchy, but that failed. > > > >? ? ? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform > >? ? ? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation > >? ? ? using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > >? ? ? ? ? ? author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune > >? ? ? ? ? ? ? ? ? ? ? and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp > >? ? ? ? ? ? ? ? ? ? ? and Dinesh Kaushik and Matthew~G. Knepley > >? ? ? ? ? ? ? ? ? ? ? and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith > >? ? ? ? ? ? ? ? ? ? ? and Stefano Zampini and Hong Zhang and Hong Zhang}, > >? ? ? ? ? ? title =? {{PETS}c {W}eb page}, > >? ? ? ? ? ? url =? ? {http://www.mcs.anl.gov/petsc}, > >? ? ? ? ? ? howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > >? ? ? ? ? ? year = {2016} > >? ? ? ? ? 
} > > > > > > > > Is the last author mentioned twice intentionally? > > > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > > > @misc{OpenFOAM > > , > > > > > > title =? ? ? "OpenFOAM", > > > > howpublished? =? ? ? "\url{http://www.openfoam.com}", > > > > url? =? ? ? {http://www.openfoam.com}, > > > > note? =? ? ? "OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > > > key? =? ? ? "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > >? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. > > This suggests that people are using it with OpenFOAM: http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf > > In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: > >? http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf > >? ? ? Matt >? > >? ? Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Thu Jul 21 04:00:19 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 21 Jul 2016 10:00:19 +0100 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> Message-ID: > On 21 Jul 2016, at 09:41, domenico lahaye wrote: > > Thanks. > > KSPSetOperators() allows to precondition A^h with M^h. > This is lovely and great as it allows to implement the shifted Laplace > preconditioner for the Helmholtz equation. > > Recently I managed to implement shifted Laplace using the DMDA > infrastructure in 2D. This implementation avoids having to construct > the hierarchy in Matlab as we did previously. > > In next stage we would like to precondition A^H with M^H on a sequence > of coarser grids. This is what Calandra does on two levels and what we do > on multiple levels. 
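Spelled out in PETSc calls, and keeping in mind the caveat earlier in the thread that touching the smoothers directly is fragile, that per-level pairing might be sketched as follows; Alevel[] and Mlevel[] stand for Helmholtz and shifted-Laplace hierarchies that are assumed to exist already:

   PC       pc;
   PetscInt nlevels, l;

   ierr = KSPSetUp(ksp);CHKERRQ(ierr);
   ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
   ierr = PCMGGetLevels(pc, &nlevels);CHKERRQ(ierr);
   for (l = 0; l < nlevels; l++) {
     KSP smoother;

     ierr = PCMGGetSmoother(pc, l, &smoother);CHKERRQ(ierr);
     /* Helmholtz operator A^H as Amat, shifted Laplacian M^H as Pmat on every level */
     ierr = KSPSetOperators(smoother, Alevel[l], Mlevel[l]);CHKERRQ(ierr);
   }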
> > We currently have an implement in which we construct the hierarchy on A^h > and M^h in Matlab, we read the hierarchy in PETSc, traverse the hierarchy and > do SetOperators and do a lot more of dark magic and witch craft by combining > preconditioners in a additive and multiplicative fashion. > > It would be lovely to obtain a more readable piece of code. > > I am not sure what kind of additional callbacks I need. My first guess here > would be a multilevel extension of SetOperators allowing to define M^H > a preconditioner for A^H on a sequence of coarser levels. But I currently > fail to oversee the whole matter. > > An alternative is to build a fragile code on top of DMDA first and get back > to you with more informed guesses on what kind of call backs I precisely need. > I think I prefer to go with this option. > > Does this sound reasonable? It sounds like what you need is that the coarse DM should have a way of building the operators via a callback. I think this is already available. Rather than doing KSPSetOperators. You do KSPSetComputeOperators, providing the function to be called to build the operator. Now, you need a way for the coarse grids to allocate the matrices that will be used for your operators. If you have a DMDA, this is set up for you because the KSP calls DMCreateMatrix and the DMDA knows how to create a matrix. One wrinkle here is that the interface doesn't currently support making separate matrices for A and M. The code currently does (in KSPSetUp): if (using_dm) { DMCreateMatrix(ksp->dm, &A); KSPSetOperators(ksp, A, A); ... } For your needs you'd need this to be: if (using_dm) { DMCreateMatrices(ksp->dm, &A, &P); KSPSetOperators(ksp, A, P) ... } I think. Adding this call should not be too hard, there have been discussions before about it. See, for example, the thread here: http://lists.mcs.anl.gov/pipermail/petsc-dev/2015-March/017130.html (which started here http://lists.mcs.anl.gov/pipermail/petsc-dev/2015-February/017008.html) I note I never got round to making the suggested changes there. Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: Message signed with OpenPGP using GPGMail URL: From lawrence.mitchell at imperial.ac.uk Thu Jul 21 04:25:19 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 21 Jul 2016 10:25:19 +0100 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> Message-ID: <579094FF.9020708@imperial.ac.uk> [Reintroducing petsc-users in cc] On 21/07/16 10:18, domenico lahaye wrote: > Thanks Lauwrence. > > Does the fact that the coarse level preconditioner M^H should be > constructed > by Galerkin coarse (rather then rediscretization) cause additional > wrinkles? Do you want to rediscretise A, but use a galerkin coarse grid M? 
If so, that is currently unsupported in PCMG: In PCSetUp_MG (mg.c, line 660 or so): if (mg->galerkin == 1) { /* Currently only handle case where mat and pmat are the same on coarser levels */ ... } I guess if you're managing the creation of the coarse grid operators yourself via KSPSetComputeOperators and a putative (new) DMCreateMatrices then you'd have the flexibility to do separate things for A and M (including, I think, galerkin coarse M). Since you have access to the DM hierarchy inside your compute operators. Make sense? Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: OpenPGP digital signature URL: From domenico_lahaye at yahoo.com Thu Jul 21 04:55:45 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Thu, 21 Jul 2016 09:55:45 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <579094FF.9020708@imperial.ac.uk> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> Message-ID: <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> Apologies for being not sufficient clear in my previous message.? I would like to be able to Galerkin coarsen A^h to obtain A^H?and to separately Galerkin coarsen M^h to obtain M^H.? So, yes, the way in which I currently (partially) understand your?description of the new DMCreateMatrices would do the job.? What is a sensible way to proceed?? Thanks, Domenico.? From: Lawrence Mitchell To: domenico lahaye Cc: petsc-users at mcs.anl.gov Sent: Thursday, July 21, 2016 11:25 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations [Reintroducing petsc-users in cc] On 21/07/16 10:18, domenico lahaye wrote: > Thanks Lauwrence. > > Does the fact that the coarse level preconditioner M^H should be > constructed > by Galerkin coarse (rather then rediscretization) cause additional > wrinkles? Do you want to rediscretise A, but use a galerkin coarse grid M? If so, that is currently unsupported in PCMG:? In PCSetUp_MG (mg.c, line 660 or so): if (mg->galerkin == 1) { ? /* Currently only handle case where mat and pmat are the same on coarser levels */ ? ... } I guess if you're managing the creation of the coarse grid operators yourself via KSPSetComputeOperators and a putative (new) DMCreateMatrices then you'd have the flexibility to do separate things for A and M (including, I think, galerkin coarse M).? Since you have access to the DM hierarchy inside your compute operators. Make sense? Lawrence -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lawrence.mitchell at imperial.ac.uk Thu Jul 21 06:09:21 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 21 Jul 2016 12:09:21 +0100 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> Message-ID: <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> > On 21 Jul 2016, at 10:55, domenico lahaye wrote: > > Apologies for being not sufficient clear in my previous message. > > I would like to be able to Galerkin coarsen A^h to obtain A^H > and to separately Galerkin coarsen M^h to obtain M^H. > > So, yes, the way in which I currently (partially) understand your > description of the new DMCreateMatrices would do the job. If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: Message signed with OpenPGP using GPGMail URL: From knepley at gmail.com Thu Jul 21 08:04:57 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Jul 2016 15:04:57 +0200 Subject: [petsc-users] Mass matrix with PetscFE In-Reply-To: <556a1fbacdac5b3b1e98e9f06955f71a@tf.uni-kiel.de> References: <56CDA67F.6000906@tf.uni-kiel.de> <56CDB469.806@tf.uni-kiel.de> <56CDB84F.8020309@tf.uni-kiel.de> <518cc2f74e6b2267660acaf3871d52f9@tf.uni-kiel.de> <56CEB565.5010203@tf.uni-kiel.de> <556a1fbacdac5b3b1e98e9f06955f71a@tf.uni-kiel.de> Message-ID: On Mon, Mar 7, 2016 at 6:21 PM, Julian Andrej wrote: > Any news about this? I've seen you merged the dmforest branch into next. I am going through my mail, and see that this might have been dropped. Has your problem been solved? Sorry for the delay, Matt > > On 2016-02-26 01:22, Matthew Knepley wrote: > >> I am sorry about the delay. I have your example working but it exposed >> a bug in Plex so I need to push the fix first. I should have >> everything for you early next week. >> >> Thanks >> >> Matt >> >> On Feb 25, 2016 2:04 AM, "Julian Andrej" wrote: >> >> After a bit of rethinking the problem, the discrepancy between the >>> size of matrix A and the mass matrix M arises because of the >>> Dirichlet boundary conditions. So why aren't the BCs not imposed on >>> the mass matrix? Do I need to handle Dirichlet BCs differently in >>> this context (like zero rows and put one the diagonal?) >>> >>> On 24.02.2016 20 [1]:54, juan wrote: >>> I attached another example which creates the correct mass matrix >>> but also overwrites the DM for the SNES solve. Somehow i cannot >>> manage >>> to really copy the DM to dm_mass and use that. 
If i try to do that >>> with >>> DMClone(dm, &dm_mass) i get a smaller mass matrix (which is not of >>> size A). >>> >>> Maybe this helps in the discussion. >>> >>> Relevant code starts at line 455. >>> >>> On 2016-02-24 15:03, Julian Andrej wrote: >>> Thanks Matt, >>> >>> I attached the modified example. >>> >>> the corresponding code (and only changes to ex12) is starting at >>> line >>> 832. >>> >>> It also seems that the mass matrix is of size 169x169 and the >>> stiffness matrix is of dimension 225x225. I'd assume that if i >>> multiply test and trial function i'd get a matrix of same size (if >>> the >>> space/quadrature is the same for the stiffness matrix) >>> >>> On 24.02.2016 14 [2]:56, Matthew Knepley wrote: >>> On Wed, Feb 24, 2016 at 7:47 AM, Julian Andrej >> > wrote: >>> >>> I'm now using the petsc git master branch. >>> >>> I tried adding my code to the ex12 >>> >>> DM dm_mass; >>> PetscDS prob_mass; >>> PetscFE fe; >>> Mat M; >>> PetscFECreateDefault(dm, user.dim, 1, PETSC_TRUE, NULL, -1, >>> &fe); >>> >>> DMClone(dm, &dm_mass); >>> DMGetDS(dm_mass, &prob_mass); >>> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) fe); >>> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, NULL, >>> NULL); >>> DMCreateMatrix(dm_mass, &M); >>> >>> MatSetOptionsPrefix(M, "M_";) >>> >>> and receive the error on running >>> ./exe -interpolate -refinement_limit 0.0125 -petscspace_order 2 >>> -M_mat_view binary >>> >>> WARNING! There are options you set that were not used! >>> WARNING! could be spelling mistake, etc! >>> Option left: name:-M_mat_view value: binary >>> >>> I don't know if the matrix is actually there and assembled or if >>> the >>> option is ommitted because something is wrong. >>> >>> Its difficult to know when I cannot see the whole code. You can >>> always >>> insert >>> >>> MatViewFromOptions(M, NULL, "-mat_view"); >>> >>> Using >>> MatView(M, PETSC_VIEWER_STDOUT_WORLD); >>> >>> gives me a reasonable output to stdout. >>> >>> Good. >>> >>> But saving the matrix and analysing it in matlab, results in an >>> all >>> zero matrix. >>> >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD, "Mout",FILE_MODE_WRITE, >>> &viewer); >>> MatView(M, viewer); >>> >>> I cannot explain this, but it has to be something like you are >>> viewing >>> the matrix before it is >>> actually assembled. Feel free to send the code. It sounds like it is >>> mostly working. >>> >>> Matt >>> >>> Any hints? >>> >>> On 24.02.2016 13 [3] :58, Matthew Knepley >>> >>> wrote: >>> >>> On Wed, Feb 24, 2016 at 6:47 AM, Julian Andrej >>> >>> >> >>> wrote: >>> >>> Hi, >>> >>> i'm trying to assemble a mass matrix with the >>> PetscFE/DMPlex >>> interface. I found something in the examples of TAO >>> >>> >>> >> https://bitbucket.org/petsc/petsc/src/da8116b0e8d067e39fd79740a8a864b0fe207998/src/tao/examples/tutorials/ex3.c?at=master&fileviewer=file-view-default >> >>> >>> but using the lines >>> >>> DMClone(dm, &dm_mass); >>> DMSetNumFields(dm_mass, 1); >>> DMPlexCopyCoordinates(dm, dm_mass); >>> DMGetDS(dm_mass, &prob_mass); >>> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, >>> NULL, NULL); >>> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) >>> fe); >>> DMPlexSNESComputeJacobianFEM(dm_mass, u, M, M, NULL); >>> DMCreateMatrix(dm_mass, &M); >>> >>> leads to errors in DMPlexSNESComputeJacobianFEM (u is a >>> global vector). >>> >>> I don't can understand the necessary commands until >>> DMPlexSNESComputeJacobianFEM. What does it do and why >>> is it >>> necessary? 
(especially why does the naming involve >>> SNES?) >>> >>> Is there another/easier/better way to create a mass >>> matrix (the >>> inner product of the function space and the test >>> space)? >>> >>> 1) That example needs updating. First, look at SNES ex12 >>> which >>> is up to >>> date. >>> >>> 2) I assume you are using 3.6. If you use the development >>> version, you >>> can remove DMPlexCopyCoordinates(). >>> >>> 3) You need to create the matrix BEFORE calling the assembly >>> >>> 4) Always always always send the entire error messge >>> >>> Matt >>> >>> Regards >>> Julian Andrej >>> >>> -- >>> What most experimenters take for granted before they begin >>> their >>> experiments is infinitely more interesting than any results >>> to which >>> their experiments lead. >>> -- Norbert Wiener >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >>> >> >> >> Links: >> ------ >> [1] tel:24.02.2016%2020 >> [2] tel:24.02.2016%2014 >> [3] tel:24.02.2016%2013 >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From juan at tf.uni-kiel.de Thu Jul 21 08:59:06 2016 From: juan at tf.uni-kiel.de (Julian Andrej) Date: Thu, 21 Jul 2016 15:59:06 +0200 Subject: [petsc-users] Mass matrix with PetscFE In-Reply-To: References: <56CDA67F.6000906@tf.uni-kiel.de> <56CDB469.806@tf.uni-kiel.de> <56CDB84F.8020309@tf.uni-kiel.de> <518cc2f74e6b2267660acaf3871d52f9@tf.uni-kiel.de> <56CEB565.5010203@tf.uni-kiel.de> <556a1fbacdac5b3b1e98e9f06955f71a@tf.uni-kiel.de> Message-ID: Hey, yes, this issue was resolved a few weeks after the mail. I just tried again after some DMPlex commits ;). Thanks! On Thu, Jul 21, 2016 at 3:04 PM, Matthew Knepley wrote: > On Mon, Mar 7, 2016 at 6:21 PM, Julian Andrej wrote: >> >> Any news about this? I've seen you merged the dmforest branch into next. > > > I am going through my mail, and see that this might have been dropped. Has > your > problem been solved? > > Sorry for the delay, > > Matt > >> >> >> On 2016-02-26 01:22, Matthew Knepley wrote: >>> >>> I am sorry about the delay. I have your example working but it exposed >>> a bug in Plex so I need to push the fix first. I should have >>> everything for you early next week. >>> >>> Thanks >>> >>> Matt >>> >>> On Feb 25, 2016 2:04 AM, "Julian Andrej" wrote: >>> >>>> After a bit of rethinking the problem, the discrepancy between the >>>> size of matrix A and the mass matrix M arises because of the >>>> Dirichlet boundary conditions. So why aren't the BCs not imposed on >>>> the mass matrix? Do I need to handle Dirichlet BCs differently in >>>> this context (like zero rows and put one the diagonal?) >>>> >>>> On 24.02.2016 20 [1]:54, juan wrote: >>>> I attached another example which creates the correct mass matrix >>>> but also overwrites the DM for the SNES solve. Somehow i cannot >>>> manage >>>> to really copy the DM to dm_mass and use that. If i try to do that >>>> with >>>> DMClone(dm, &dm_mass) i get a smaller mass matrix (which is not of >>>> size A). >>>> >>>> Maybe this helps in the discussion. >>>> >>>> Relevant code starts at line 455. >>>> >>>> On 2016-02-24 15:03, Julian Andrej wrote: >>>> Thanks Matt, >>>> >>>> I attached the modified example. 
>>>> >>>> the corresponding code (and only changes to ex12) is starting at >>>> line >>>> 832. >>>> >>>> It also seems that the mass matrix is of size 169x169 and the >>>> stiffness matrix is of dimension 225x225. I'd assume that if i >>>> multiply test and trial function i'd get a matrix of same size (if >>>> the >>>> space/quadrature is the same for the stiffness matrix) >>>> >>>> On 24.02.2016 14 [2]:56, Matthew Knepley wrote: >>>> On Wed, Feb 24, 2016 at 7:47 AM, Julian Andrej >>> > wrote: >>>> >>>> I'm now using the petsc git master branch. >>>> >>>> I tried adding my code to the ex12 >>>> >>>> DM dm_mass; >>>> PetscDS prob_mass; >>>> PetscFE fe; >>>> Mat M; >>>> PetscFECreateDefault(dm, user.dim, 1, PETSC_TRUE, NULL, -1, >>>> &fe); >>>> >>>> DMClone(dm, &dm_mass); >>>> DMGetDS(dm_mass, &prob_mass); >>>> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) fe); >>>> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, NULL, >>>> NULL); >>>> DMCreateMatrix(dm_mass, &M); >>>> >>>> MatSetOptionsPrefix(M, "M_";) >>>> >>>> and receive the error on running >>>> ./exe -interpolate -refinement_limit 0.0125 -petscspace_order 2 >>>> -M_mat_view binary >>>> >>>> WARNING! There are options you set that were not used! >>>> WARNING! could be spelling mistake, etc! >>>> Option left: name:-M_mat_view value: binary >>>> >>>> I don't know if the matrix is actually there and assembled or if >>>> the >>>> option is ommitted because something is wrong. >>>> >>>> Its difficult to know when I cannot see the whole code. You can >>>> always >>>> insert >>>> >>>> MatViewFromOptions(M, NULL, "-mat_view"); >>>> >>>> Using >>>> MatView(M, PETSC_VIEWER_STDOUT_WORLD); >>>> >>>> gives me a reasonable output to stdout. >>>> >>>> Good. >>>> >>>> But saving the matrix and analysing it in matlab, results in an >>>> all >>>> zero matrix. >>>> >>>> PetscViewerBinaryOpen(PETSC_COMM_WORLD, "Mout",FILE_MODE_WRITE, >>>> &viewer); >>>> MatView(M, viewer); >>>> >>>> I cannot explain this, but it has to be something like you are >>>> viewing >>>> the matrix before it is >>>> actually assembled. Feel free to send the code. It sounds like it is >>>> mostly working. >>>> >>>> Matt >>>> >>>> Any hints? >>>> >>>> On 24.02.2016 13 [3] :58, Matthew Knepley >>>> >>>> wrote: >>>> >>>> On Wed, Feb 24, 2016 at 6:47 AM, Julian Andrej >>>> >>>> >> >>>> wrote: >>>> >>>> Hi, >>>> >>>> i'm trying to assemble a mass matrix with the >>>> PetscFE/DMPlex >>>> interface. I found something in the examples of TAO >>>> >>>> >>> >>> https://bitbucket.org/petsc/petsc/src/da8116b0e8d067e39fd79740a8a864b0fe207998/src/tao/examples/tutorials/ex3.c?at=master&fileviewer=file-view-default >>>> >>>> >>>> but using the lines >>>> >>>> DMClone(dm, &dm_mass); >>>> DMSetNumFields(dm_mass, 1); >>>> DMPlexCopyCoordinates(dm, dm_mass); >>>> DMGetDS(dm_mass, &prob_mass); >>>> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, >>>> NULL, NULL); >>>> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) >>>> fe); >>>> DMPlexSNESComputeJacobianFEM(dm_mass, u, M, M, NULL); >>>> DMCreateMatrix(dm_mass, &M); >>>> >>>> leads to errors in DMPlexSNESComputeJacobianFEM (u is a >>>> global vector). >>>> >>>> I don't can understand the necessary commands until >>>> DMPlexSNESComputeJacobianFEM. What does it do and why >>>> is it >>>> necessary? (especially why does the naming involve >>>> SNES?) >>>> >>>> Is there another/easier/better way to create a mass >>>> matrix (the >>>> inner product of the function space and the test >>>> space)? 
>>>> >>>> 1) That example needs updating. First, look at SNES ex12 >>>> which >>>> is up to >>>> date. >>>> >>>> 2) I assume you are using 3.6. If you use the development >>>> version, you >>>> can remove DMPlexCopyCoordinates(). >>>> >>>> 3) You need to create the matrix BEFORE calling the assembly >>>> >>>> 4) Always always always send the entire error messge >>>> >>>> Matt >>>> >>>> Regards >>>> Julian Andrej >>>> >>>> -- >>>> What most experimenters take for granted before they begin >>>> their >>>> experiments is infinitely more interesting than any results >>>> to which >>>> their experiments lead. >>>> -- Norbert Wiener >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which >>>> their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> Links: >>> ------ >>> [1] tel:24.02.2016%2020 >>> [2] tel:24.02.2016%2014 >>> [3] tel:24.02.2016%2013 > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener From domenico_lahaye at yahoo.com Thu Jul 21 09:09:02 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Thu, 21 Jul 2016 14:09:02 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> Message-ID: <517029271.2456281.1469110142974.JavaMail.yahoo@mail.yahoo.com> Thank you for sharing the additional insight. The separate Galerkin coarsening of A and M will be part of the overall algorithm only. I think it is wise to implement in two stages: first a fragile implementation and later a more stable one. Kind wishes, Domenico. From: Lawrence Mitchell To: domenico lahaye Cc: PETSc Users List Sent: Thursday, July 21, 2016 1:09 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > On 21 Jul 2016, at 10:55, domenico lahaye wrote: > > Apologies for being not sufficient clear in my previous message. > > I would like to be able to Galerkin coarsen A^h to obtain A^H > and to separately Galerkin coarsen M^h to obtain M^H. > > So, yes, the way in which I currently (partially) understand your > description of the new DMCreateMatrices would do the job. If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels.? Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. Cheers, Lawrence -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eduardojourdan92 at gmail.com Thu Jul 21 11:38:43 2016 From: eduardojourdan92 at gmail.com (Eduardo Jourdan) Date: Thu, 21 Jul 2016 13:38:43 -0300 Subject: [petsc-users] Questions for MatSolve In-Reply-To: References: Message-ID: Thank you for the quick answer. I didn't realize that I could use PC without KSP interface. I also think that it is what I wanted. Nevertheless, as long as I figured out from the source code, PcApply for PCSOR basically do some interface and preparations and then calls MatSOR. I saw that depending on the matrix ('BAIJ, SBAIJ, and AIJ matrices with Inodes') it does SOR smoothing or block SOR smoothing. I think that in my case the seqaij matrix with bs=4 had Inodes with size 4. That is why calling MatSOR with seqaij or calling with seqbaij converted from the seqaij seem to give the same result. However, with the matrix seqaij with bs = 16 I can guess that the rows inside a block dont have the same nonzero pattern, so Inodes size are different from block size. I happened to see the follow note in the MatSOR website page: "Developer Note: We should add block SOR support for AIJ matrices with block size set to great than one and no inodes ". This may be the reason why seqaij and seqbaij are leading to different results with my matrix of bs = 16. I think that answer all may previous questions. I am sorry, I've got confused and wrote MatSolve instead of MatSOR in my previous email, what changes it completely. Best Regards Eduardo 2016-07-20 0:03 GMT-03:00 Matthew Knepley : > On Tue, Jul 19, 2016 at 8:17 PM, Eduardo Jourdan < > eduardojourdan92 at gmail.com> wrote: > >> Hi all, >> >> I would like to perform a specific number (for instance 4 of forward and >> backward sweeps with a seqaij matrix with block size 4, vectors b and x. >> Also, I need to do this same procedure with another matrix seqaij block >> size 16. I would appreciate if someone knows the best way to do it. >> > > It sounds like you want PCSOR and PCApply, not MatSolve. > > Thanks, > > Matt > > >> 1 - I've been trying to use MatSolve. For the bs=4 it seems to work, but >> with the other matrix with bs=16 the residue diverges. When I call >> matConvert to convert the later matrix for a seqbaij with bs=16 the result >> changes and the linear residue is reduced. It is supposed to happen or it >> is more possible that i am doing something wrong? >> >> 2 - MatSolve for seqbaij and seqaij with the same block sizes gives the >> same results in terms of solution (not performace, memory) ? >> >> 3 - Can do I do a specific number of sweeps as told before with the >> KSP/PC interface? >> >> 4 - I saw the manual for the MatSolve and It says that it is for factored >> matrix. Can I use a matrix just after the MatAssembly calls? >> >> Best regards, >> >> Eduardo Jourdan >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 21 13:15:31 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 21 Jul 2016 13:15:31 -0500 Subject: [petsc-users] Questions for MatSolve In-Reply-To: References: Message-ID: <65228947-04A4-46AE-9023-861EDF530D87@mcs.anl.gov> > On Jul 21, 2016, at 11:38 AM, Eduardo Jourdan wrote: > > Thank you for the quick answer. > > I didn't realize that I could use PC without KSP interface. I also think that it is what I wanted. 
Nevertheless, as long as I figured out from the source code, PcApply for PCSOR basically do some interface and preparations and then calls MatSOR. I saw that depending on the matrix ('BAIJ, SBAIJ, and AIJ matrices with Inodes') it does SOR smoothing or block SOR smoothing. > > I think that in my case the seqaij matrix with bs=4 had Inodes with size 4. That is why calling MatSOR with seqaij or calling with seqbaij converted from the seqaij seem to give the same result. > However, with the matrix seqaij with bs = 16 I can guess that the rows inside a block dont have the same nonzero pattern, so Inodes size are different from block size. I happened to see the follow note in the MatSOR website page: "Developer Note: We should add block SOR support for AIJ matrices with block size set to great than one and no inodes ". This may be the reason why seqaij and seqbaij are leading to different results with my matrix of bs = 16. I think that answer all may previous questions. I am sorry, I've got confused and wrote MatSolve instead of MatSOR in my previous email, what changes it completely. Your analysis is correct. In general PCSOR will produce different convergence histories for AIJ and BIJ block size > 1. The BAIJ may convergence (due to the blocking) when the AIJ does not; I suppose the opposite may be possible but seems unlikely. Barry > > Best Regards > > Eduardo > > > > > > 2016-07-20 0:03 GMT-03:00 Matthew Knepley : > On Tue, Jul 19, 2016 at 8:17 PM, Eduardo Jourdan wrote: > Hi all, > > I would like to perform a specific number (for instance 4 of forward and backward sweeps with a seqaij matrix with block size 4, vectors b and x. Also, I need to do this same procedure with another matrix seqaij block size 16. I would appreciate if someone knows the best way to do it. > > It sounds like you want PCSOR and PCApply, not MatSolve. > > Thanks, > > Matt > > 1 - I've been trying to use MatSolve. For the bs=4 it seems to work, but with the other matrix with bs=16 the residue diverges. When I call matConvert to convert the later matrix for a seqbaij with bs=16 the result changes and the linear residue is reduced. It is supposed to happen or it is more possible that i am doing something wrong? > > 2 - MatSolve for seqbaij and seqaij with the same block sizes gives the same results in terms of solution (not performace, memory) ? > > 3 - Can do I do a specific number of sweeps as told before with the KSP/PC interface? > > 4 - I saw the manual for the MatSolve and It says that it is for factored matrix. Can I use a matrix just after the MatAssembly calls? > > Best regards, > > Eduardo Jourdan > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From overholt at capesim.com Thu Jul 21 16:00:57 2016 From: overholt at capesim.com (Matthew Overholt) Date: Thu, 21 Jul 2016 17:00:57 -0400 Subject: [petsc-users] PC Direct Solution failure Message-ID: <006001d1e392$fbaf6000$f30e2000$@capesim.com> PETSc Users, I am doing a KSPPREONLY solution (of the heat transfer equation using FEA) and comparing several packages like PARDISO and MUMPS, and I am encountering a MatSolve() failure that I am having trouble diagnosing. The matrix inversion fails and I get "nan". The failure only happens for certain input files, and its not (just) related to problem size. By making a slight change to the geometry of the problem I can get it to solve. 
The SuperLu solver is the only one that will give me any error message: -ksp_type preonly -pc_type lu -pc_mat_solver_package superlu -info I get the error message: [0] MatSolve(): MatFactorError 2 Is that a PCFailedReason of PC_FACTOR_NUMERIC_ZEROPIVOT? If so, is there a way to perturb the pivot in some way? In another (non-PETSc) code which uses MKL PARDISO I am able to solve the exact same problem by the same approach without any issues, and that code gives PARDISO a pivot perturbation flag value. Is there a better way to figure out what is happening? I have been running the code in TotalView with extreme memory checks and everything appears to be ok. Thanks, Matt Overholt CapeSym, Inc. --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 21 16:26:12 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 21 Jul 2016 16:26:12 -0500 Subject: [petsc-users] PC Direct Solution failure In-Reply-To: <006001d1e392$fbaf6000$f30e2000$@capesim.com> References: <006001d1e392$fbaf6000$f30e2000$@capesim.com> Message-ID: <2E47CB3E-E8D2-4EBB-83D9-05A46DA0A7E4@mcs.anl.gov> > On Jul 21, 2016, at 4:00 PM, Matthew Overholt wrote: > > PETSc Users, > > I am doing a KSPPREONLY solution (of the heat transfer equation using FEA) and comparing several packages like PARDISO and MUMPS, and I am encountering a MatSolve() failure that I am having trouble diagnosing. The matrix inversion fails and I get ?nan?. The failure only happens for certain input files, and its not (just) related to problem size. By making a slight change to the geometry of the problem I can get it to solve. > > The SuperLu solver is the only one that will give me any error message: > -ksp_type preonly -pc_type lu ?pc_mat_solver_package superlu ?info > I get the error message: > [0] MatSolve(): MatFactorError 2 > Is that a PCFailedReason of PC_FACTOR_NUMERIC_ZEROPIVOT? Yes typedef enum {MAT_FACTOR_NOERROR,MAT_FACTOR_STRUCT_ZEROPIVOT,MAT_FACTOR_NUMERIC_ZEROPIVOT,MAT_FACTOR_OUTMEMORY,MAT_FACTOR_OTHER} MatFactorError; There are a few SuperLU options that could potentially alleviate the problem of the zero pivot: Run with -help to see them all or look at the manual page for MATSOLVERSUPERLU + -mat_superlu_equil - Equil (None) . -mat_superlu_colperm - (choose one of) NATURAL MMD_ATA MMD_AT_PLUS_A COLAMD . -mat_superlu_iterrefine - (choose one of) NOREFINE SINGLE DOUBLE EXTRA . -mat_superlu_symmetricmode: - SymmetricMode (None) . -mat_superlu_diagpivotthresh <1> - DiagPivotThresh (None) . -mat_superlu_pivotgrowth - PivotGrowth (None) . -mat_superlu_conditionnumber - ConditionNumber (None) . -mat_superlu_rowperm - (choose one of) NOROWPERM LargeDiag . -mat_superlu_replacetinypivot - ReplaceTinyPivot (None) but they may introduce a different problem for a different matrix. The thing with sparse direct solvers is they can work fine for some matrices but when you change the matrix slightly they don't work, they can also work for some orderings and not for others and if you change the matrix it may be a different ordering is better for that matrix than the ordering for a different matrix. So generally for a particular matrix you might be able to get things to run but I know of no way to bullet proof the direct solver so it will always work when you throw different matrices at it unless you manually change some options. > If so, is there a way to perturb the pivot in some way? 
> > In another (non-PETSc) code which uses MKL PARDISO I am able to solve the exact same problem by the same approach without any issues, and that code gives PARDISO a pivot perturbation flag value. I'm not surprised. Different sparse solvers will work better on some classes of matrices than others but it is not easy to predict in advance which solver will be best. Generally for each type of simulation we do we try out the different sparse direct solvers and then pick the one that seems the most robust for that simulation. Note that since each solver package has its own tuning options this can be a annoying because you need to find the tuning options for each package and see if they help. Barry > > Is there a better way to figure out what is happening? I have been running the code in TotalView with extreme memory checks and everything appears to be ok. > > Thanks, > Matt Overholt > CapeSym, Inc. > > > > Virus-free. www.avast.com From bsmith at mcs.anl.gov Thu Jul 21 18:41:48 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 21 Jul 2016 18:41:48 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> Message-ID: I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult. I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE } PCMGGalerkinType; Barry > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: >> >> Apologies for being not sufficient clear in my previous message. >> >> I would like to be able to Galerkin coarsen A^h to obtain A^H >> and to separately Galerkin coarsen M^h to obtain M^H. >> >> So, yes, the way in which I currently (partially) understand your >> description of the new DMCreateMatrices would do the job. > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. 
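A minimal usage sketch of the interface proposed above, assuming the PCMGGalerkinType enum lands as written (ksp, A, M, b, x and nlevels are placeholders supplied by the application), would be:

PC pc;

KSPSetOperators(ksp, A, M);               /* A defines the operator, M the preconditioning matrix */
KSPGetPC(ksp, &pc);
PCSetType(pc, PCMG);
PCMGSetLevels(pc, nlevels, NULL);
PCMGSetGalerkin(pc, PC_MG_GALERKIN_BOTH); /* Galerkin-coarsen both: A^H = R A^h P and M^H = R M^h P */
/* interpolation operators come from an attached DM or from PCMGSetInterpolation() on each level */
KSPSetFromOptions(ksp);
KSPSolve(ksp, b, x);

The command-line equivalent mentioned later in this thread is -pc_type mg -pc_mg_galerkin both.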
> > Cheers, > > Lawrence > From domenico_lahaye at yahoo.com Fri Jul 22 03:42:00 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Fri, 22 Jul 2016 08:42:00 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> Message-ID: <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> Dear Barry, Thank you for your suggestion. I will be happy to test drive the new code when available. Kind wishes, Domenico. From: Barry Smith To: Lawrence Mitchell Cc: domenico lahaye ; PETSc Users List Sent: Friday, July 22, 2016 1:41 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult. I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE } PCMGGalerkinType; Barry > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: >> >> Apologies for being not sufficient clear in my previous message. >> >> I would like to be able to Galerkin coarsen A^h to obtain A^H >> and to separately Galerkin coarsen M^h to obtain M^H. >> >> So, yes, the way in which I currently (partially) understand your >> description of the new DMCreateMatrices would do the job. > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. > > Cheers, > > Lawrence > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From zocca.marco at gmail.com Sat Jul 23 04:25:47 2016 From: zocca.marco at gmail.com (Marco Zocca) Date: Sat, 23 Jul 2016 11:25:47 +0200 Subject: [petsc-users] [RFC] Docs: TeX -> HTML Message-ID: Dear all, following the discussion at PETSc'16, I have tried to render the TeX-based manual into HTML with latex2html [1] and pandoc [2] . Neither attempt was successful, because of the presence of certain external TeX packages used for rendering various custom aspects of the manual. There is no 1:1 way of converting such a document. However there are a number of templates for rendering static websites that use LaTeX math and verbatim source code (e.g. readthedocs [3] for manual-type documents, which also supports MathJax [4] and re-renders at every repository push). At any rate, the conversion requires copying blocks of text and code to the web-based version, i.e. removing all the LaTeX markup, therefore effectively committing to maintaining 2 versions of the manual up to date and in sync with each other. Before committing to any approach, I would like your input on this: 1) Do you have any preference for web rendering/site hosting solution? 2) Are you OK with the idea of essentially forking the manual into PDF output and web output ? It is not huge work (an afternoon of tweaking initially and a couple minutes at every new release) but we should be sure about the approach in the first place. Any and all feedback is welcome; Thank you and kind regards, Marco [1] https://www.ctan.org/tex-archive/support/latex2html/ [2] http://pandoc.org/ [3] https://readthedocs.org/ [4] http://mathjax.readthedocs.io/en/latest/tex.html
From wgropp at illinois.edu Sat Jul 23 08:42:53 2016 From: wgropp at illinois.edu (William Gropp) Date: Sat, 23 Jul 2016 08:42:53 -0500 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: Message-ID: Another option is to try tohtml, which is what I use for the MPI Standard.
It has a way to specify how to handle some TeX commands (it isn?t a full implementation of TeX, so some more sophisticated uses of TeX are beyond it). Bill William Gropp Director, Parallel Computing Institute Thomas M. Siebel Chair in Computer Science Chief Scientist, NCSA University of Illinois Urbana-Champaign On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: > Dear all, > > following the discussion at PETSc'16, I have tried to render the > TeX-based manual into HTML with latex2html [1] and pandoc [2] . > > Neither attempt was successful, because of the presence of certain > external TeX packages used for rendering various custom aspects of the > manual. > > There is no 1:1 way of converting such a document. However there are a > number of templates for rendering static websites that use LaTeX math > and verbatim source code (e.g. readthedocs [3] for manual-type > documents, which also supports MathJax [4] and re-renders at every > repository push). > > At any rate, the conversion requires copying blocks of text and code > to the web-based version, i.e. removing all the LaTeX markup, > therefore effectively committing to maintaining 2 versions of the > manual up to date and in sync with each other. > > > Before committing to any approach, I would like your input on this: > > 1) Do you have any preference for web rendering/site hosting solution? > > 2) Are you OK with the idea of essentially forking the manual into PDF > output and web output ? It is not huge work (an afternoon of tweaking > initially and a couple minutes at every new release) but we should be > sure about the approach in the first place. > > Any and all feedback is welcome; > > Thank you and kind regards, > Marco > > > [1] https://www.ctan.org/tex-archive/support/latex2html/ > [2] http://pandoc.org/ > [3] https://readthedocs.org/ > [4] http://mathjax.readthedocs.io/en/latest/tex.html From bhatiamanav at gmail.com Sat Jul 23 11:50:12 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Sat, 23 Jul 2016 11:50:12 -0500 Subject: [petsc-users] using DM constructs Message-ID: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> Hi, I am new to the DM constructs. I am curious if there is a compelling reason to move from handling IS sets to DM data structures. My applications are built on top of libMesh. They used IS sets for a long time, and in recent years I have seen DM constructs in the library. However, I do not know why this is beneficial or necessary. The Petsc manual discusses DMDA for structured mesh, and I see reference to DMForest in the code (for unstructured mesh?) which is not discussed in the manual. Is there a document that might provide the necessary background for DM and how best to derive from it, like in the libMesh source? Any guidance would be appreciated. Regards, Manav From bsmith at mcs.anl.gov Sat Jul 23 12:44:21 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 12:44:21 -0500 Subject: [petsc-users] using DM constructs In-Reply-To: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> References: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> Message-ID: <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> Manav, Each DM classes has two distinct interfaces: One interface that is common to all DM which "speaks linear algebra (algebraic solvers)", for example DMCreateGlobalVector() One interface that is specific to a particular DM (for example DMDA, or DMPlex or DMNetwork) it speaks in the language of the mesh/discretization model of the DM. 
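As a small illustration of the two interfaces, a sketch using a 2d DMDA (any DM implementation could stand in; the grid sizes are arbitrary placeholders) is:

DM       da;
Vec      x;
Mat      A;
PetscInt xs, ys, xm, ym;

DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DMDA_STENCIL_STAR,
             64, 64, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, &da);

/* the common DM interface, the part the algebraic solvers talk to */
DMCreateGlobalVector(da, &x);
DMCreateMatrix(da, &A);

/* the DMDA-specific interface, which speaks the language of structured grids */
DMDAGetCorners(da, &xs, &ys, NULL, &xm, &ym, NULL);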
So for example DMDA routines which are for structured grids speak in the language of structured grids and so you have things like DMDAGetCorners() which tells you the corners of the "box" of the structured grid you own. DMPlex speaks in a particular language of unstructured grids, DMNetwork speaks in the language of computations on networks (graphs) such as power grids where you have vertices and edges connecting vertices). DMForest speaks the languages of quad-tree and oct-tree grids. The DM is PETSc's approach for communicating between mesh/discretization data and algebraic solvers. It is suppose to handle all the busywork of coordinating the interactions of the mesh/discretization data and algebraic solvers for the application developer so they don't need to do it themselves. For example with geometric multigrid the DMXXX object can fill up all the vectors and matrices that are needed for each level without requiring the user to loop over the levels and put the vectors and matrices themselves into the PCMG data structures. IS are lower level basic data structures, often used by DMs. So one does not replace the use of IS with DM but one collects all the mesh/discretization interactions into a DMXXX and implements the DM operations (for example DMCreateGlobalVector()) using the data from DMXXX object. In some sense libMesh is a DM for unstructured meshes with finite elements but it was written before we came up with the concept of DMs and so naturally doesn't use the DM interfaces. So one would not write libMesh using DMDA or DMPlex or something rather you would write DMlibMesh or write a new DMlibMesh2 by refactoring the libMesh interfaces to match the DM paradigm. So if you are using libMesh and it satisfies your needs you should definitely not just switch to some DMXXX unless you have a good reason. Each DMXXX is for a particular class of problems/algorithms and you pick the DMXXX to use based on what you are doing. So use DMForest if you wish to use oct-trees, etc. Barry > On Jul 23, 2016, at 11:50 AM, Manav Bhatia wrote: > > Hi, > > I am new to the DM constructs. I am curious if there is a compelling reason to move from handling IS sets to DM data structures. > > My applications are built on top of libMesh. They used IS sets for a long time, and in recent years I have seen DM constructs in the library. However, I do not know why this is beneficial or necessary. The Petsc manual discusses DMDA for structured mesh, and I see reference to DMForest in the code (for unstructured mesh?) which is not discussed in the manual. > > Is there a document that might provide the necessary background for DM and how best to derive from it, like in the libMesh source? > > Any guidance would be appreciated. > > Regards, > Manav From patrick.sanan at gmail.com Sat Jul 23 13:16:14 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Sat, 23 Jul 2016 14:16:14 -0400 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: Message-ID: I have slowly been doing some work to clean up the manual a little bit, mainly just fixing the formatting where it needs attention, but also updating the content where it is obviously out of date, so I'm interested in working on resolving this. The latex version is of course nice in that it can look pretty with latex tools, but the advantage of having html documentation which is more friendly to search engines is undeniable. Which latex packages are giving trouble? Maybe we can figure out a way to sufficiently reduce the dependencies. 
On Sat, Jul 23, 2016 at 5:25 AM, Marco Zocca wrote: > Dear all, > > following the discussion at PETSc'16, I have tried to render the > TeX-based manual into HTML with latex2html [1] and pandoc [2] . > > Neither attempt was successful, because of the presence of certain > external TeX packages used for rendering various custom aspects of the > manual. > > There is no 1:1 way of converting such a document. However there are a > number of templates for rendering static websites that use LaTeX math > and verbatim source code (e.g. readthedocs [3] for manual-type > documents, which also supports MathJax [4] and re-renders at every > repository push). > > At any rate, the conversion requires copying blocks of text and code > to the web-based version, i.e. removing all the LaTeX markup, > therefore effectively committing to maintaining 2 versions of the > manual up to date and in sync with each other. > > > Before committing to any approach, I would like your input on this: > > 1) Do you have any preference for web rendering/site hosting solution? > > 2) Are you OK with the idea of essentially forking the manual into PDF > output and web output ? It is not huge work (an afternoon of tweaking > initially and a couple minutes at every new release) but we should be > sure about the approach in the first place. > > Any and all feedback is welcome; > > Thank you and kind regards, > Marco > > > [1] https://www.ctan.org/tex-archive/support/latex2html/ > [2] http://pandoc.org/ > [3] https://readthedocs.org/ > [4] http://mathjax.readthedocs.io/en/latest/tex.html From bsmith at mcs.anl.gov Sat Jul 23 13:23:07 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 13:23:07 -0500 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: Message-ID: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> Marco, Every rending I've seen of nontrivial latex documents to HTML looks dang ugly in HTML and is extra work to maintain (despite the poor quality). We've tried a couple of times with PETSc to keep an HTML version going and gave up both times. I don't like the idea of having two copies of the same thing, we'd never keep them in sync nor do I like the idea of ugly HTML pages. The one drawback of just having a PDF manual IMHO is that we cannot currently link directly to bookmarks inside the users manual from, say, a manual page html file. (Bookmarks inside the manual.pdf to other places inside the manual.pdf do work fine). The solution seems to be to use Adobe #nameddest=destination instead of bookmarks. These can be added in latex with \hypertarget{} for example \hypertarget{ch_performance} and then in the browser http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#nameddest=ch_performance will jump to the correct place. This works with the current chrome but does not work with the current Apple Safari (arg) and if you google nameddest doesn't work you find that often browsers seem to have this broken. If it wasn't so broken I would have (automated) adding all the hypertargets and the manual and augmented all the manual pages to have links to them. Still tempted but it seems they won't work except with Chrome (maybe firefox if properly configured). The problem of badly supported #nameddest goes back 10 years Barry > On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: > > Dear all, > > following the discussion at PETSc'16, I have tried to render the > TeX-based manual into HTML with latex2html [1] and pandoc [2] . 
> > Neither attempt was successful, because of the presence of certain > external TeX packages used for rendering various custom aspects of the > manual. > > There is no 1:1 way of converting such a document. However there are a > number of templates for rendering static websites that use LaTeX math > and verbatim source code (e.g. readthedocs [3] for manual-type > documents, which also supports MathJax [4] and re-renders at every > repository push). > > At any rate, the conversion requires copying blocks of text and code > to the web-based version, i.e. removing all the LaTeX markup, > therefore effectively committing to maintaining 2 versions of the > manual up to date and in sync with each other. > > > Before committing to any approach, I would like your input on this: > > 1) Do you have any preference for web rendering/site hosting solution? > > 2) Are you OK with the idea of essentially forking the manual into PDF > output and web output ? It is not huge work (an afternoon of tweaking > initially and a couple minutes at every new release) but we should be > sure about the approach in the first place. > > Any and all feedback is welcome; > > Thank you and kind regards, > Marco > > > [1] https://www.ctan.org/tex-archive/support/latex2html/ > [2] http://pandoc.org/ > [3] https://readthedocs.org/ > [4] http://mathjax.readthedocs.io/en/latest/tex.html From juan at tf.uni-kiel.de Sat Jul 23 13:40:02 2016 From: juan at tf.uni-kiel.de (Julian Andrej) Date: Sat, 23 Jul 2016 20:40:02 +0200 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> References: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> Message-ID: Small suggestion (it also came up at the Meeting) What is the opinion on a "main" documentation in markdown/restructured text or something like that? The conversion from one of these formats into pdf or any other format like html is handled by a variety of tools pretty well. On Sat, Jul 23, 2016 at 8:23 PM, Barry Smith wrote: > > Marco, > > Every rending I've seen of nontrivial latex documents to HTML looks dang ugly in HTML and is extra work to maintain (despite the poor quality). We've tried a couple of times with PETSc to keep an HTML version going and gave up both times. > > I don't like the idea of having two copies of the same thing, we'd never keep them in sync nor do I like the idea of ugly HTML pages. > > The one drawback of just having a PDF manual IMHO is that we cannot currently link directly to bookmarks inside the users manual from, say, a manual page html file. (Bookmarks inside the manual.pdf to other places inside the manual.pdf do work fine). The solution seems to be to use Adobe #nameddest=destination instead of bookmarks. These can be added in latex with \hypertarget{} for example \hypertarget{ch_performance} and then in the browser http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#nameddest=ch_performance will jump to the correct place. This works with the current chrome but does not work with the current Apple Safari (arg) and if you google nameddest doesn't work you find that often browsers seem to have this broken. If it wasn't so broken I would have (automated) adding all the hypertargets and the manual and augmented all the manual pages to have links to them. Still tempted but it seems they won't work except with Chrome (maybe firefox if properly configured). 
The problem of badly supported #nameddest goes back 10 years > > > Barry > > > > > > >> On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: >> >> Dear all, >> >> following the discussion at PETSc'16, I have tried to render the >> TeX-based manual into HTML with latex2html [1] and pandoc [2] . >> >> Neither attempt was successful, because of the presence of certain >> external TeX packages used for rendering various custom aspects of the >> manual. >> >> There is no 1:1 way of converting such a document. However there are a >> number of templates for rendering static websites that use LaTeX math >> and verbatim source code (e.g. readthedocs [3] for manual-type >> documents, which also supports MathJax [4] and re-renders at every >> repository push). >> >> At any rate, the conversion requires copying blocks of text and code >> to the web-based version, i.e. removing all the LaTeX markup, >> therefore effectively committing to maintaining 2 versions of the >> manual up to date and in sync with each other. >> >> >> Before committing to any approach, I would like your input on this: >> >> 1) Do you have any preference for web rendering/site hosting solution? >> >> 2) Are you OK with the idea of essentially forking the manual into PDF >> output and web output ? It is not huge work (an afternoon of tweaking >> initially and a couple minutes at every new release) but we should be >> sure about the approach in the first place. >> >> Any and all feedback is welcome; >> >> Thank you and kind regards, >> Marco >> >> >> [1] https://www.ctan.org/tex-archive/support/latex2html/ >> [2] http://pandoc.org/ >> [3] https://readthedocs.org/ >> [4] http://mathjax.readthedocs.io/en/latest/tex.html > From knepley at gmail.com Sat Jul 23 14:26:08 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 23 Jul 2016 21:26:08 +0200 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> Message-ID: On Sat, Jul 23, 2016 at 8:40 PM, Julian Andrej wrote: > Small suggestion (it also came up at the Meeting) > > What is the opinion on a "main" documentation in markdown/restructured > text or something like that? The conversion from one of these formats > into pdf or any other format like html is handled by a variety of > tools pretty well. 1) I am really opposed to two copies of the source. This never works out. 2) My reservation concerning Markdown is that it is so constricted. I am used to the freedom of TeX. I agree that this is not a definitive argument. Matt > On Sat, Jul 23, 2016 at 8:23 PM, Barry Smith wrote: > > > > Marco, > > > > Every rending I've seen of nontrivial latex documents to HTML looks > dang ugly in HTML and is extra work to maintain (despite the poor quality). > We've tried a couple of times with PETSc to keep an HTML version going and > gave up both times. > > > > I don't like the idea of having two copies of the same thing, we'd > never keep them in sync nor do I like the idea of ugly HTML pages. > > > > The one drawback of just having a PDF manual IMHO is that we cannot > currently link directly to bookmarks inside the users manual from, say, a > manual page html file. (Bookmarks inside the manual.pdf to other places > inside the manual.pdf do work fine). The solution seems to be to use Adobe > #nameddest=destination instead of bookmarks. 
These can be added in latex > with \hypertarget{} for example \hypertarget{ch_performance} and then in > the browser > http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#nameddest=ch_performance > will jump to the correct place. This works with the current chrome but does > not work with the current Apple Safari (arg) and if you google nameddest > doesn't work you find that often browsers seem to have this broken. If it > wasn't so broken I would have (automated) adding all the hypertargets and > the manual and augmented all the manual pages to have links to them. Still > tempted but it seems they won't work except with Chrome (maybe firefox if > properly configured). The problem of badly supported #nameddest goes back > 10 years > > > > > > Barry > > > > > > > > > > > > > >> On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: > >> > >> Dear all, > >> > >> following the discussion at PETSc'16, I have tried to render the > >> TeX-based manual into HTML with latex2html [1] and pandoc [2] . > >> > >> Neither attempt was successful, because of the presence of certain > >> external TeX packages used for rendering various custom aspects of the > >> manual. > >> > >> There is no 1:1 way of converting such a document. However there are a > >> number of templates for rendering static websites that use LaTeX math > >> and verbatim source code (e.g. readthedocs [3] for manual-type > >> documents, which also supports MathJax [4] and re-renders at every > >> repository push). > >> > >> At any rate, the conversion requires copying blocks of text and code > >> to the web-based version, i.e. removing all the LaTeX markup, > >> therefore effectively committing to maintaining 2 versions of the > >> manual up to date and in sync with each other. > >> > >> > >> Before committing to any approach, I would like your input on this: > >> > >> 1) Do you have any preference for web rendering/site hosting solution? > >> > >> 2) Are you OK with the idea of essentially forking the manual into PDF > >> output and web output ? It is not huge work (an afternoon of tweaking > >> initially and a couple minutes at every new release) but we should be > >> sure about the approach in the first place. > >> > >> Any and all feedback is welcome; > >> > >> Thank you and kind regards, > >> Marco > >> > >> > >> [1] https://www.ctan.org/tex-archive/support/latex2html/ > >> [2] http://pandoc.org/ > >> [3] https://readthedocs.org/ > >> [4] http://mathjax.readthedocs.io/en/latest/tex.html > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bhatiamanav at gmail.com Sat Jul 23 14:29:58 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Sat, 23 Jul 2016 14:29:58 -0500 Subject: [petsc-users] using DM constructs In-Reply-To: <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> References: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> Message-ID: Thanks, Barry. This gives me a good perspective. Are there specific functions that need to be implemented/provided by a DM derived object? What would be a good resource to learn about this? 
Regards, Manav > On Jul 23, 2016, at 12:44 PM, Barry Smith wrote: > > > Manav, > > Each DM classes has two distinct interfaces: > > One interface that is common to all DM which "speaks linear algebra (algebraic solvers)", for example DMCreateGlobalVector() > > One interface that is specific to a particular DM (for example DMDA, or DMPlex or DMNetwork) it speaks in the language of the mesh/discretization model of the DM. So for example DMDA routines which are for structured grids speak in the language of structured grids and so you have things like DMDAGetCorners() which tells you the corners of the "box" of the structured grid you own. DMPlex speaks in a particular language of unstructured grids, DMNetwork speaks in the language of computations on networks (graphs) such as power grids where you have vertices and edges connecting vertices). DMForest speaks the languages of quad-tree and oct-tree grids. > > The DM is PETSc's approach for communicating between mesh/discretization data and algebraic solvers. It is suppose to handle all the busywork of coordinating the interactions of the mesh/discretization data and algebraic solvers for the application developer so they don't need to do it themselves. For example with geometric multigrid the DMXXX object can fill up all the vectors and matrices that are needed for each level without requiring the user to loop over the levels and put the vectors and matrices themselves into the PCMG data structures. > > IS are lower level basic data structures, often used by DMs. So one does not replace the use of IS with DM but one collects all the mesh/discretization interactions into a DMXXX and implements the DM operations (for example DMCreateGlobalVector()) using the data from DMXXX object. > > In some sense libMesh is a DM for unstructured meshes with finite elements but it was written before we came up with the concept of DMs and so naturally doesn't use the DM interfaces. So one would not write libMesh using DMDA or DMPlex or something rather you would write DMlibMesh or write a new DMlibMesh2 by refactoring the libMesh interfaces to match the DM paradigm. > > So if you are using libMesh and it satisfies your needs you should definitely not just switch to some DMXXX unless you have a good reason. Each DMXXX is for a particular class of problems/algorithms and you pick the DMXXX to use based on what you are doing. So use DMForest if you wish to use oct-trees, etc. > > Barry > > > >> On Jul 23, 2016, at 11:50 AM, Manav Bhatia wrote: >> >> Hi, >> >> I am new to the DM constructs. I am curious if there is a compelling reason to move from handling IS sets to DM data structures. >> >> My applications are built on top of libMesh. They used IS sets for a long time, and in recent years I have seen DM constructs in the library. However, I do not know why this is beneficial or necessary. The Petsc manual discusses DMDA for structured mesh, and I see reference to DMForest in the code (for unstructured mesh?) which is not discussed in the manual. >> >> Is there a document that might provide the necessary background for DM and how best to derive from it, like in the libMesh source? >> >> Any guidance would be appreciated. 
>> >> Regards, >> Manav > From knepley at gmail.com Sat Jul 23 14:30:44 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 23 Jul 2016 21:30:44 +0200 Subject: [petsc-users] using DM constructs In-Reply-To: References: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> Message-ID: On Sat, Jul 23, 2016 at 9:29 PM, Manav Bhatia wrote: > Thanks, Barry. > > This gives me a good perspective. > > Are there specific functions that need to be implemented/provided by a DM > derived object? What would be a good resource to learn about this? > We talk a lot about this in the online tutorials. Matt > Regards, > Manav > > > > On Jul 23, 2016, at 12:44 PM, Barry Smith wrote: > > > > > > Manav, > > > > Each DM classes has two distinct interfaces: > > > > One interface that is common to all DM which "speaks linear algebra > (algebraic solvers)", for example DMCreateGlobalVector() > > > > One interface that is specific to a particular DM (for example DMDA, or > DMPlex or DMNetwork) it speaks in the language of the mesh/discretization > model of the DM. So for example DMDA routines which are for structured > grids speak in the language of structured grids and so you have things like > DMDAGetCorners() which tells you the corners of the "box" of the structured > grid you own. DMPlex speaks in a particular language of unstructured grids, > DMNetwork speaks in the language of computations on networks (graphs) such > as power grids where you have vertices and edges connecting vertices). > DMForest speaks the languages of quad-tree and oct-tree grids. > > > > The DM is PETSc's approach for communicating between > mesh/discretization data and algebraic solvers. It is suppose to handle all > the busywork of coordinating the interactions of the mesh/discretization > data and algebraic solvers for the application developer so they don't need > to do it themselves. For example with geometric multigrid the DMXXX object > can fill up all the vectors and matrices that are needed for each level > without requiring the user to loop over the levels and put the vectors and > matrices themselves into the PCMG data structures. > > > > IS are lower level basic data structures, often used by DMs. So one > does not replace the use of IS with DM but one collects all the > mesh/discretization interactions into a DMXXX and implements the DM > operations (for example DMCreateGlobalVector()) using the data from DMXXX > object. > > > > In some sense libMesh is a DM for unstructured meshes with finite > elements but it was written before we came up with the concept of DMs and > so naturally doesn't use the DM interfaces. So one would not write libMesh > using DMDA or DMPlex or something rather you would write DMlibMesh or write > a new DMlibMesh2 by refactoring the libMesh interfaces to match the DM > paradigm. > > > > So if you are using libMesh and it satisfies your needs you should > definitely not just switch to some DMXXX unless you have a good reason. > Each DMXXX is for a particular class of problems/algorithms and you pick > the DMXXX to use based on what you are doing. So use DMForest if you wish > to use oct-trees, etc. > > > > Barry > > > > > > > >> On Jul 23, 2016, at 11:50 AM, Manav Bhatia > wrote: > >> > >> Hi, > >> > >> I am new to the DM constructs. I am curious if there is a compelling > reason to move from handling IS sets to DM data structures. > >> > >> My applications are built on top of libMesh. 
They used IS sets for a > long time, and in recent years I have seen DM constructs in the library. > However, I do not know why this is beneficial or necessary. The Petsc > manual discusses DMDA for structured mesh, and I see reference to DMForest > in the code (for unstructured mesh?) which is not discussed in the manual. > >> > >> Is there a document that might provide the necessary background for DM > and how best to derive from it, like in the libMesh source? > >> > >> Any guidance would be appreciated. > >> > >> Regards, > >> Manav > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jul 23 14:46:19 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 14:46:19 -0500 Subject: [petsc-users] using DM constructs In-Reply-To: References: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> Message-ID: Unfortunately they are muddled up in the include files and source. That is functions that you need to implement are mixed in with functions in the base class. I cut and pasted below the basic ones from petscdm.h and removed the ones you do not need to implement. PETSC_EXTERN PetscErrorCode DMView(DM,PetscViewer); PETSC_EXTERN PetscErrorCode DMLoad(DM,PetscViewer); /* very useful but doesn't need to be implemented PETSC_EXTERN PetscErrorCode DMDestroy(DM*); PETSC_EXTERN PetscErrorCode DMCreateGlobalVector(DM,Vec*); PETSC_EXTERN PetscErrorCode DMCreateLocalVector(DM,Vec*); PETSC_EXTERN PetscErrorCode DMGetLocalToGlobalMapping(DM,ISLocalToGlobalMapping*); /* isn't always needed PETSC_EXTERN PetscErrorCode DMGetBlockSize(DM,PetscInt*); /* often doesn't mean anything, like for mixed methods PETSC_EXTERN PetscErrorCode DMCreateColoring(DM,ISColoringType,ISColoring*); /* not needed by very useful for automatically computing Jacobians via differencing PETSC_EXTERN PetscErrorCode DMCreateMatrix(DM,Mat*); PETSC_EXTERN PetscErrorCode DMSetMatrixPreallocateOnly(DM,PetscBool); PETSC_EXTERN PetscErrorCode DMCreateInterpolation(DM,DM,Mat*,Vec*); /* following are needed if you wish to use geometric multigrid; they don't necessarily make sense for all DM implementations. PETSC_EXTERN PetscErrorCode DMCreateRestriction(DM,DM,Mat*); PETSC_EXTERN PetscErrorCode DMRefine(DM,MPI_Comm,DM*); PETSC_EXTERN PetscErrorCode DMCoarsen(DM,MPI_Comm,DM*); PETSC_EXTERN PetscErrorCode DMRefineHierarchy(DM,PetscInt,DM[]); PETSC_EXTERN PetscErrorCode DMCoarsenHierarchy(DM,PetscInt,DM[]); PETSC_EXTERN PetscErrorCode DMSetFromOptions(DM); > On Jul 23, 2016, at 2:29 PM, Manav Bhatia wrote: > > Thanks, Barry. > > This gives me a good perspective. > > Are there specific functions that need to be implemented/provided by a DM derived object? What would be a good resource to learn about this? > > Regards, > Manav > > >> On Jul 23, 2016, at 12:44 PM, Barry Smith wrote: >> >> >> Manav, >> >> Each DM classes has two distinct interfaces: >> >> One interface that is common to all DM which "speaks linear algebra (algebraic solvers)", for example DMCreateGlobalVector() >> >> One interface that is specific to a particular DM (for example DMDA, or DMPlex or DMNetwork) it speaks in the language of the mesh/discretization model of the DM. 
So for example DMDA routines which are for structured grids speak in the language of structured grids and so you have things like DMDAGetCorners() which tells you the corners of the "box" of the structured grid you own. DMPlex speaks in a particular language of unstructured grids, DMNetwork speaks in the language of computations on networks (graphs) such as power grids where you have vertices and edges connecting vertices). DMForest speaks the languages of quad-tree and oct-tree grids. >> >> The DM is PETSc's approach for communicating between mesh/discretization data and algebraic solvers. It is suppose to handle all the busywork of coordinating the interactions of the mesh/discretization data and algebraic solvers for the application developer so they don't need to do it themselves. For example with geometric multigrid the DMXXX object can fill up all the vectors and matrices that are needed for each level without requiring the user to loop over the levels and put the vectors and matrices themselves into the PCMG data structures. >> >> IS are lower level basic data structures, often used by DMs. So one does not replace the use of IS with DM but one collects all the mesh/discretization interactions into a DMXXX and implements the DM operations (for example DMCreateGlobalVector()) using the data from DMXXX object. >> >> In some sense libMesh is a DM for unstructured meshes with finite elements but it was written before we came up with the concept of DMs and so naturally doesn't use the DM interfaces. So one would not write libMesh using DMDA or DMPlex or something rather you would write DMlibMesh or write a new DMlibMesh2 by refactoring the libMesh interfaces to match the DM paradigm. >> >> So if you are using libMesh and it satisfies your needs you should definitely not just switch to some DMXXX unless you have a good reason. Each DMXXX is for a particular class of problems/algorithms and you pick the DMXXX to use based on what you are doing. So use DMForest if you wish to use oct-trees, etc. >> >> Barry >> >> >> >>> On Jul 23, 2016, at 11:50 AM, Manav Bhatia wrote: >>> >>> Hi, >>> >>> I am new to the DM constructs. I am curious if there is a compelling reason to move from handling IS sets to DM data structures. >>> >>> My applications are built on top of libMesh. They used IS sets for a long time, and in recent years I have seen DM constructs in the library. However, I do not know why this is beneficial or necessary. The Petsc manual discusses DMDA for structured mesh, and I see reference to DMForest in the code (for unstructured mesh?) which is not discussed in the manual. >>> >>> Is there a document that might provide the necessary background for DM and how best to derive from it, like in the libMesh source? >>> >>> Any guidance would be appreciated. >>> >>> Regards, >>> Manav >> > From bsmith at mcs.anl.gov Sat Jul 23 15:06:36 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 15:06:36 -0500 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> Message-ID: <7746A6E1-FE80-4091-9F0B-29AAE4CDECDA@mcs.anl.gov> > On Jul 23, 2016, at 1:40 PM, Julian Andrej wrote: > > Small suggestion (it also came up at the Meeting) > > What is the opinion on a "main" documentation in markdown/restructured > text or something like that? The conversion from one of these formats > into pdf or any other format like html is handled by a variety of > tools pretty well. This might be possible. 
The drawback to that is markdown and friends are really limited in the types of formatting one can do. I like better the idea of generating nice html from latex if that is possible. Can you list what "certain external TeX packages used for rendering various custom aspects of the manual" chock pandoc? Maybe they can be redefined or seded out of the .tex file before passing to pandoc? Also if you can process part of the manual can you point to how it looks with pandoc so we can evaluate if it is too "ugly"? Thanks Barry > > On Sat, Jul 23, 2016 at 8:23 PM, Barry Smith wrote: >> >> Marco, >> >> Every rending I've seen of nontrivial latex documents to HTML looks dang ugly in HTML and is extra work to maintain (despite the poor quality). We've tried a couple of times with PETSc to keep an HTML version going and gave up both times. >> >> I don't like the idea of having two copies of the same thing, we'd never keep them in sync nor do I like the idea of ugly HTML pages. >> >> The one drawback of just having a PDF manual IMHO is that we cannot currently link directly to bookmarks inside the users manual from, say, a manual page html file. (Bookmarks inside the manual.pdf to other places inside the manual.pdf do work fine). The solution seems to be to use Adobe #nameddest=destination instead of bookmarks. These can be added in latex with \hypertarget{} for example \hypertarget{ch_performance} and then in the browser http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#nameddest=ch_performance will jump to the correct place. This works with the current chrome but does not work with the current Apple Safari (arg) and if you google nameddest doesn't work you find that often browsers seem to have this broken. If it wasn't so broken I would have (automated) adding all the hypertargets and the manual and augmented all the manual pages to have links to them. Still tempted but it seems they won't work except with Chrome (maybe firefox if properly configured). The problem of badly supported #nameddest goes back 10 years >> >> >> Barry >> >> >> >> >> >> >>> On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: >>> >>> Dear all, >>> >>> following the discussion at PETSc'16, I have tried to render the >>> TeX-based manual into HTML with latex2html [1] and pandoc [2] . >>> >>> Neither attempt was successful, because of the presence of certain >>> external TeX packages used for rendering various custom aspects of the >>> manual. >>> >>> There is no 1:1 way of converting such a document. However there are a >>> number of templates for rendering static websites that use LaTeX math >>> and verbatim source code (e.g. readthedocs [3] for manual-type >>> documents, which also supports MathJax [4] and re-renders at every >>> repository push). >>> >>> At any rate, the conversion requires copying blocks of text and code >>> to the web-based version, i.e. removing all the LaTeX markup, >>> therefore effectively committing to maintaining 2 versions of the >>> manual up to date and in sync with each other. >>> >>> >>> Before committing to any approach, I would like your input on this: >>> >>> 1) Do you have any preference for web rendering/site hosting solution? >>> >>> 2) Are you OK with the idea of essentially forking the manual into PDF >>> output and web output ? It is not huge work (an afternoon of tweaking >>> initially and a couple minutes at every new release) but we should be >>> sure about the approach in the first place. 
>>> >>> Any and all feedback is welcome; >>> >>> Thank you and kind regards, >>> Marco >>> >>> >>> [1] https://www.ctan.org/tex-archive/support/latex2html/ >>> [2] http://pandoc.org/ >>> [3] https://readthedocs.org/ >>> [4] http://mathjax.readthedocs.io/en/latest/tex.html >> From aks084000 at utdallas.edu Sat Jul 23 18:21:57 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Sat, 23 Jul 2016 23:21:57 +0000 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> Message-ID: <37055B11-8B43-4C7F-9E65-47B8C1CB31D7@utdallas.edu> Matt, Barry, Thank you for your help! Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jul 23 19:52:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 19:52:14 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> Message-ID: <8CB9F29A-77CA-46D2-9C3C-4E7CD494D2D0@mcs.anl.gov> Took a little more time than I expected but the branch barry/extend-pcmg-galerkin now supports PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool This allows computing either mat, or pmat or both via the Galerkin process so you should be able to provide A and M with KSPSetOperators() and then run with -pc_mg_galerkin both to get both generated on the coarse meshes via the Galekin process. Note that if you use the additional option -pc_use_amat false it will use only the M for both mat and pmat in the multigrid process (while A is only used for the outer Krylov solver definition of the operator.) For some problems this is actually a better approach. Please let me know if you have any difficulties with it. Barry > On Jul 22, 2016, at 3:42 AM, domenico lahaye wrote: > > Dear Barry, > > Thank you for your suggestion. > > I will be happy to test drive the new code when available. > > Kind wishes, Domenico. > > > > From: Barry Smith > To: Lawrence Mitchell > Cc: domenico lahaye ; PETSc Users List > Sent: Friday, July 22, 2016 1:41 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > > I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult. 
I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where > > typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE > } PCMGGalerkinType; > > Barry > > > > > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > > > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: > >> > >> Apologies for being not sufficient clear in my previous message. > >> > >> I would like to be able to Galerkin coarsen A^h to obtain A^H > >> and to separately Galerkin coarsen M^h to obtain M^H. > >> > >> So, yes, the way in which I currently (partially) understand your > >> description of the new DMCreateMatrices would do the job. > > > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. > > > > Cheers, > > > > Lawrence > > > > From mhassan at miners.utep.edu Sun Jul 24 12:50:30 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Sun, 24 Jul 2016 17:50:30 +0000 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working Message-ID: Hi, I am solving a generalized eigenvalue problem using spectrum slicing. I am using this example (http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html) as it is. Part of the code is: ierr =EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); But I am getting the following: [0]PETSC ERROR: Mismatch between number of values found and information from inertia, consider using EPSKrylovSchurSetDetectZeros() [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. It seems the input PETSC_TRUE is not working for EPSKrylovSchurSetDetectZeros(). Any idea? M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jul 24 14:27:04 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 24 Jul 2016 21:27:04 +0200 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: References: Message-ID: > El 24 jul 2016, a las 19:50, Hassan Md Mahmudulla escribi?: > > Hi, > I am solving a generalized eigenvalue problem using spectrum slicing. I am using this example (http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html) as it is. Part of the code is: > ierr =EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); > > But I am getting the following: > > [0]PETSC ERROR: Mismatch between number of values found and information from inertia, consider using EPSKrylovSchurSetDetectZeros() > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > It seems the input PETSC_TRUE is not working for EPSKrylovSchurSetDetectZeros(). Any idea? > > M Hassan It seems that you are not using MUMPS. Spectrum slicing can be used with PETSc's Cholesky, but for guaranteed robustness it is necessary to use MUMPS. 
Jose From mhassan at miners.utep.edu Sun Jul 24 14:31:36 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Sun, 24 Jul 2016 19:31:36 +0000 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: References: , Message-ID: Hi Jose, Here is the part of the code: ierr = STSetType(st,STSINVERT);CHKERRQ(ierr); ierr = STGetKSP(st,&ksp);CHKERRQ(ierr); ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr); ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); #if defined(PETSC_HAVE_MUMPS) #if defined(PETSC_USE_COMPLEX) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_SUP,"Spectrum slicing with MUMPS is not available for complex scalars"); #endif ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); ierr = EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); /* enforce zero detection */ ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); /* Add several MUMPS options (currently there is no better way of setting this in program): '-mat_mumps_icntl_13 1': turn off ScaLAPACK for matrix inertia '-mat_mumps_icntl_24 1': detect null pivots in factorization (for the case that a shift is equal to an eigenvalue) '-mat_mumps_cntl_3 ': a tolerance used for null pivot detection (must be larger than machine epsilon) Note: depending on the interval, it may be necessary also to increase the workspace: '-mat_mumps_icntl_14 ': increase workspace with a percentage (50, 100 or more) */ ierr = PetscOptionsInsertString(NULL,"-mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12");CHKERRQ(ierr); #endif /* Set solver parameters at runtime */ ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); I am using MUMPS. Actually it's the example I said before. I didn't modify it that much. M Hassan ________________________________ From: Jose E. Roman Sent: Sunday, July 24, 2016 1:27:04 PM To: Hassan Md Mahmudulla Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] EPSKrylovSchurSetDetectZeros() not working > El 24 jul 2016, a las 19:50, Hassan Md Mahmudulla escribi?: > > Hi, > I am solving a generalized eigenvalue problem using spectrum slicing. I am using this example (http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html) as it is. Part of the code is: > ierr =EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); > > But I am getting the following: > > [0]PETSC ERROR: Mismatch between number of values found and information from inertia, consider using EPSKrylovSchurSetDetectZeros() > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > It seems the input PETSC_TRUE is not working for EPSKrylovSchurSetDetectZeros(). Any idea? > > M Hassan It seems that you are not using MUMPS. Spectrum slicing can be used with PETSc's Cholesky, but for guaranteed robustness it is necessary to use MUMPS. Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jul 24 14:46:28 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 24 Jul 2016 21:46:28 +0200 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: References: Message-ID: <6CCDCF64-35E4-4272-86D0-2590FF81478C@dsic.upv.es> The PETSc configuration you are using (PETSC_ARCH) does not have MUMPS. You have to add the appropiate options to PETSc's configure script. 
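For reference, the options in question are the --download flags on PETSc's configure line; a sketch only (the arch name is a placeholder and the compiler/MPI settings are site-specific):

   ./configure PETSC_ARCH=arch-with-mumps \
       --download-mumps --download-scalapack \
       --download-metis --download-parmetis \
       <usual compiler and MPI options for the machine>

The application then has to be built and run against that same PETSC_ARCH, otherwise the binary keeps using a build without MUMPS.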
> El 24 jul 2016, a las 21:31, Hassan Md Mahmudulla escribi?: > > Hi Jose, > Here is the part of the code: > > ierr = STSetType(st,STSINVERT);CHKERRQ(ierr); > > ierr = STGetKSP(st,&ksp);CHKERRQ(ierr); > ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); > > #if defined(PETSC_HAVE_MUMPS) > #if defined(PETSC_USE_COMPLEX) > SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_SUP,"Spectrum slicing with MUMPS is not available for complex scalars"); > #endif > ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); > ierr = EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); /* enforce zero detection */ > ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); > /* > Add several MUMPS options (currently there is no better way of setting this in program): > '-mat_mumps_icntl_13 1': turn off ScaLAPACK for matrix inertia > '-mat_mumps_icntl_24 1': detect null pivots in factorization (for the case that a shift is equal to an eigenvalue) > '-mat_mumps_cntl_3 ': a tolerance used for null pivot detection (must be larger than machine epsilon) > > Note: depending on the interval, it may be necessary also to increase the workspace: > '-mat_mumps_icntl_14 ': increase workspace with a percentage (50, 100 or more) > */ > ierr = PetscOptionsInsertString(NULL,"-mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12");CHKERRQ(ierr); > #endif > > /* > Set solver parameters at runtime > */ > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > I am using MUMPS. Actually it's the example I said before. I didn't modify it that much. > > > M Hassan From mhassan at miners.utep.edu Sun Jul 24 14:56:30 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Sun, 24 Jul 2016 19:56:30 +0000 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: <6CCDCF64-35E4-4272-86D0-2590FF81478C@dsic.upv.es> References: , <6CCDCF64-35E4-4272-86D0-2590FF81478C@dsic.upv.es> Message-ID: Hi Jose, I don't think that my PETSc configuration doesn't have MUMPS. I configured that myself. I also got the output from this if-else code section ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); which works from inside of the if-else section. Please take a look at the configuration info from the error output also: [0]PETSC ERROR: Configure options --COPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --CXXOPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --FOPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --with-mpiexec=srun --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-shared-libraries=0 --with-x=0 LIBS=-lstdc++ PETSC_ARCH=arch-edison-opt64-intel --download-mumps --download-ptscotch --download-scalapack --download-metis --download-parmetis Thanks, M Hassan ________________________________ From: Jose E. Roman Sent: Sunday, July 24, 2016 1:46:28 PM To: Hassan Md Mahmudulla Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] EPSKrylovSchurSetDetectZeros() not working The PETSc configuration you are using (PETSC_ARCH) does not have MUMPS. You have to add the appropiate options to PETSc's configure script. 
> El 24 jul 2016, a las 21:31, Hassan Md Mahmudulla escribi?: > > Hi Jose, > Here is the part of the code: > > ierr = STSetType(st,STSINVERT);CHKERRQ(ierr); > > ierr = STGetKSP(st,&ksp);CHKERRQ(ierr); > ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); > > #if defined(PETSC_HAVE_MUMPS) > #if defined(PETSC_USE_COMPLEX) > SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_SUP,"Spectrum slicing with MUMPS is not available for complex scalars"); > #endif > ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); > ierr = EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); /* enforce zero detection */ > ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); > /* > Add several MUMPS options (currently there is no better way of setting this in program): > '-mat_mumps_icntl_13 1': turn off ScaLAPACK for matrix inertia > '-mat_mumps_icntl_24 1': detect null pivots in factorization (for the case that a shift is equal to an eigenvalue) > '-mat_mumps_cntl_3 ': a tolerance used for null pivot detection (must be larger than machine epsilon) > > Note: depending on the interval, it may be necessary also to increase the workspace: > '-mat_mumps_icntl_14 ': increase workspace with a percentage (50, 100 or more) > */ > ierr = PetscOptionsInsertString(NULL,"-mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12");CHKERRQ(ierr); > #endif > > /* > Set solver parameters at runtime > */ > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > I am using MUMPS. Actually it's the example I said before. I didn't modify it that much. > > > M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jul 24 15:34:39 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 24 Jul 2016 22:34:39 +0200 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: References: <6CCDCF64-35E4-4272-86D0-2590FF81478C@dsic.upv.es> Message-ID: <4DA115EF-0A4B-4094-9F52-998DDF8D1E07@dsic.upv.es> > El 24 jul 2016, a las 21:56, Hassan Md Mahmudulla escribi?: > > Hi Jose, > I don't think that my PETSc configuration doesn't have MUMPS. I configured that myself. I also got the output from this if-else code section > ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); > > which works from inside of the if-else section. > > Please take a look at the configuration info from the error output also: > > [0]PETSC ERROR: Configure options --COPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --CXXOPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --FOPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --with-mpiexec=srun --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-shared-libraries=0 --with-x=0 LIBS=-lstdc++ PETSC_ARCH=arch-edison-opt64-intel --download-mumps --download-ptscotch --download-scalapack --download-metis --download-parmetis > > > Thanks, > M Hassan Then I don't know what is happening. 
Jose From domenico_lahaye at yahoo.com Mon Jul 25 01:42:44 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Mon, 25 Jul 2016 06:42:44 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <8CB9F29A-77CA-46D2-9C3C-4E7CD494D2D0@mcs.anl.gov> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> <8CB9F29A-77CA-46D2-9C3C-4 E7CD494D2D0@mcs.anl.gov> Message-ID: <1278608375.3800440.1469428964428.JavaMail.yahoo@mail.yahoo.com> Thanks Barry.? I will give it a look. If not before my holidays, than in the second half of August.? Best wishes. Domenico.? ?From: Barry Smith To: domenico lahaye Cc: "petsc-users at mcs.anl.gov" Sent: Sunday, July 24, 2016 2:52 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations ? Took a little more time than I expected but the branch barry/extend-pcmg-galerkin now supports PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool This allows computing either mat, or pmat or both via the Galerkin process so you should be able to provide A and M with KSPSetOperators() and then run with -pc_mg_galerkin both to get both generated on the coarse meshes via the Galekin process.? Note that if you use the additional option -pc_use_amat false it will use only the M for both mat and pmat in the multigrid process (while A is only used for the outer Krylov solver definition of the operator.) For some problems this is actually a better approach. ? Please let me know if you have any difficulties with it. Barry > On Jul 22, 2016, at 3:42 AM, domenico lahaye wrote: > > Dear Barry, >? >? Thank you for your suggestion. >? >? I will be happy to test drive the new code when available. > >? Kind wishes, Domenico. > > > > From: Barry Smith > To: Lawrence Mitchell > Cc: domenico lahaye ; PETSc Users List > Sent: Friday, July 22, 2016 1:41 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > >? I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult.? I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where > > typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE > } PCMGGalerkinType; > > Barry > > > > > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > > > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: > >> > >> Apologies for being not sufficient clear in my previous message. > >> > >> I would like to be able to Galerkin coarsen A^h to obtain A^H > >> and to separately Galerkin coarsen M^h to obtain M^H. > >> > >> So, yes, the way in which I currently (partially) understand your > >> description of the new DMCreateMatrices would do the job. 
> > > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels.? Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. > > > > Cheers, > > > > Lawrence > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Mon Jul 25 11:17:05 2016 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 25 Jul 2016 11:17:05 -0500 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode Message-ID: Hi all, I am trying to solve my problem with a direct solver superLU_dist. But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" version and wanted to see what error info I can get from the PETSc. Surprisingly, it passed the solve and didn't output any errors in the "dbg" version. Does anyone have the similar experience? and what type of potential bugs it may have? --->test in StokesSolver::solve(): Start the KSP solve... [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-metis --download-parmetis --download-triangle --download-chaco --with-debugging=0 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Jul 25 11:33:19 2016 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 25 Jul 2016 11:33:19 -0500 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode In-Reply-To: References: Message-ID: Xujun: Test your code with valgrind to see if it is valgrind clean. Hong Hi all, > > I am trying to solve my problem with a direct solver superLU_dist. > But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" version > and wanted to see what error info I can get from the PETSc. Surprisingly, > it passed the solve and didn't output any errors in the "dbg" version. Does > anyone have the similar experience? and what type of potential bugs it may > have? > > > --->test in StokesSolver::solve(): Start the KSP solve... 
> > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > > [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named > mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-mpich --download-fblaslapack > --download-scalapack --download-mumps --download-superlu_dist > --download-hypre --download-ml --download-metis --download-parmetis > --download-triangle --download-chaco --with-debugging=0 > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 25 11:50:05 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 09:50:05 -0700 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode In-Reply-To: References: Message-ID: On Mon, Jul 25, 2016 at 9:17 AM, Xujun Zhao wrote: > Hi all, > > I am trying to solve my problem with a direct solver superLU_dist. > But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" version > and wanted to see what error info I can get from the PETSc. Surprisingly, > it passed the solve and didn't output any errors in the "dbg" version. Does > anyone have the similar experience? and what type of potential bugs it may > have? > Debugging mode initializes all variables, but as Hong says, valgrind will warn you of uninitialized variables. Matt > > --->test in StokesSolver::solve(): Start the KSP solve... > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > > [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named > mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-mpich --download-fblaslapack > --download-scalapack --download-mumps --download-superlu_dist > --download-hypre --download-ml --download-metis --download-parmetis > --download-triangle --download-chaco --with-debugging=0 > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Mon Jul 25 13:33:45 2016 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Mon, 25 Jul 2016 14:33:45 -0400 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 Message-ID: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> Hi, has someone tried OpenMPI 2.0 with Petsc 3.7.2? I am having some errors with petsc, maybe someone have them too? Here are the configure logs for PETSc: http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log And for OpenMPI: http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log (in fact, I am testing the ompi-release branch, a sort of petsc-master branch, since I need the commit 9ba6678156). For a set of parallel tests, I have 104 that works on 124 total tests. 
And the typical error: *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': free(): invalid pointer: ======= Backtrace: ========= /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] a similar one: *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': free(): invalid pointer: 0x00007f382a7c5bc0 *** ======= Backtrace: ========= /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] /lib64/libc.so.6(+0x78026)[0x7f3829f22026] /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] another one: *** Error in 
`/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': free(): invalid pointer: 0x00007f67b6d37bc0 *** ======= Backtrace: ========= /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] /lib64/libc.so.6(+0x78026)[0x7f67b6494026] /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] I feel like I should wait until someone else from Petsc have tested it too... Thanks, Eric From knepley at gmail.com Mon Jul 25 13:57:18 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 11:57:18 -0700 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 In-Reply-To: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> References: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> Message-ID: On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > Hi, > > has someone tried OpenMPI 2.0 with Petsc 3.7.2? > > I am having some errors with petsc, maybe someone have them too? > > Here are the configure logs for PETSc: > > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log > > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log > > And for OpenMPI: > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log > > (in fact, I am testing the ompi-release branch, a sort of petsc-master > branch, since I need the commit 9ba6678156). > > For a set of parallel tests, I have 104 that works on 124 total tests. > It appears that the fault happens when freeing the VecScatter we build for MatMult, which contains Request structures for the ISends and IRecvs. These looks like internal OpenMPI errors to me since the Request should be opaque. I would try at least two things: 1) Run under valgrind. 2) Switch the VecScatter implementation. All the options are here, http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate but maybe use alltoall. 
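In command-line terms both suggestions are runtime-only changes; with the binary named in the report above (its usual arguments elided), something along these lines:

   mpiexec -n 2 valgrind --leak-check=yes --track-origins=yes ./Test.ProblemeGD.dev <usual options>
   mpiexec -n 2 ./Test.ProblemeGD.dev <usual options> -vecscatter_alltoall

The second form swaps the default send/receive-based scatter for an all-to-all based one without touching the code.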
Thanks, Matt > And the typical error: > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': > free(): invalid pointer: > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] > /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] > /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] > > a similar one: > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': > free(): invalid pointer: 0x00007f382a7c5bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] > /lib64/libc.so.6(+0x78026)[0x7f3829f22026] > /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] > > 
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] > > another one: > > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': > free(): invalid pointer: 0x00007f67b6d37bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] > /lib64/libc.so.6(+0x78026)[0x7f67b6494026] > /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] > > I feel like I should wait until someone else from Petsc have tested it > too... > > Thanks, > > Eric > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Mon Jul 25 14:44:57 2016 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Mon, 25 Jul 2016 15:44:57 -0400 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 In-Reply-To: References: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> Message-ID: <33b3cb0d-78f8-fb84-2ad5-a447f5cdce9e@giref.ulaval.ca> Ok, here is the 2 points answered: #1) got valgrind output... 
here is the fatal free operation: ==107156== Invalid free() / delete / delete[] / realloc() ==107156== at 0x4C2A37C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==107156== by 0x1E63CD5F: opal_free (malloc.c:184) ==107156== by 0x27622627: mca_pml_ob1_recv_request_fini (pml_ob1_recvreq.h:133) ==107156== by 0x27622C4F: mca_pml_ob1_recv_request_free (pml_ob1_recvreq.c:90) ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) ==107156== by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219) ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) ==107156== by 0x14A33809: VecDestroy (vector.c:432) ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115) ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292) ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287) ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281) ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216) ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) ==107156== Address 0x1d6acbc0 is 0 bytes inside data symbol "ompi_mpi_double" --107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to 0x4c2f330 (__GI_stpcpy) ==107156== ==107156== Process terminating with default action of signal 6 (SIGABRT): dumping core ==107156== at 0x1DD520C7: raise (in /lib64/libc-2.19.so) ==107156== by 0x1DD53534: abort (in /lib64/libc-2.19.so) ==107156== by 0x1DD4B145: __assert_fail_base (in /lib64/libc-2.19.so) ==107156== by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so) ==107156== by 0x27626D12: mca_pml_ob1_send_request_fini (pml_ob1_sendreq.h:221) ==107156== by 0x276274C9: mca_pml_ob1_send_request_free (pml_ob1_sendreq.c:117) ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) ==107156== by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225) ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) ==107156== by 0x14A33809: VecDestroy (vector.c:432) ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115) ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292) ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287) ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281) ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216) ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) #2) For the run with -vecscatter_alltoall it works...! As an "end user", should I ever modify these VecScatterCreate options? How do they change the performances of the code on large problems? Thanks, Eric On 25/07/16 02:57 PM, Matthew Knepley wrote: > On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland > > wrote: > > Hi, > > has someone tried OpenMPI 2.0 with Petsc 3.7.2? > > I am having some errors with petsc, maybe someone have them too? 
> > Here are the configure logs for PETSc: > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log > > And for OpenMPI: > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log > > (in fact, I am testing the ompi-release branch, a sort of > petsc-master branch, since I need the commit 9ba6678156). > > For a set of parallel tests, I have 104 that works on 124 total tests. > > > It appears that the fault happens when freeing the VecScatter we build > for MatMult, which contains Request structures > for the ISends and IRecvs. These looks like internal OpenMPI errors to > me since the Request should be opaque. > I would try at least two things: > > 1) Run under valgrind. > > 2) Switch the VecScatter implementation. All the options are here, > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate > > but maybe use alltoall. > > Thanks, > > Matt > > > And the typical error: > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': > free(): invalid pointer: > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] > /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] > /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] > > a similar one: > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': > free(): invalid pointer: 0x00007f382a7c5bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] > /lib64/libc.so.6(+0x78026)[0x7f3829f22026] > 
/lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] > > another one: > > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': > free(): invalid pointer: 0x00007f67b6d37bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] > /lib64/libc.so.6(+0x78026)[0x7f67b6494026] > /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] > > I feel like I should wait until someone else from Petsc have tested > it too... > > Thanks, > > Eric > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From knepley at gmail.com Mon Jul 25 14:53:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 12:53:32 -0700 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 In-Reply-To: <33b3cb0d-78f8-fb84-2ad5-a447f5cdce9e@giref.ulaval.ca> References: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> <33b3cb0d-78f8-fb84-2ad5-a447f5cdce9e@giref.ulaval.ca> Message-ID: On Mon, Jul 25, 2016 at 12:44 PM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > Ok, > > here is the 2 points answered: > > #1) got valgrind output... here is the fatal free operation: > Okay, this is not the MatMult scatter, this is for local representations of ghosted vectors. However, to me it looks like OpenMPI mistakenly frees its built-in type for MPI_DOUBLE. 
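If it helps to take PETSc out of the loop: the failing call in both traces is MPI_Request_free on the persistent requests set up for the scatter, so a stand-alone test along the following lines (an assumption about where the fault lies, not a confirmed reproducer) would show whether a bare MPI_Send_init/MPI_Recv_init/MPI_Request_free cycle already misbehaves with this OpenMPI build:

/* Hypothetical minimal check: build persistent send/recv requests with
   MPI_DOUBLE, run them once, then free them with MPI_Request_free --
   the call that aborts in the traces above. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  double      sbuf[4] = {0., 1., 2., 3.}, rbuf[4];
  MPI_Request req[2];
  int         rank, size, next, prev;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  next = (rank + 1) % size;               /* simple ring exchange */
  prev = (rank + size - 1) % size;

  MPI_Send_init(sbuf, 4, MPI_DOUBLE, next, 0, MPI_COMM_WORLD, &req[0]);
  MPI_Recv_init(rbuf, 4, MPI_DOUBLE, prev, 0, MPI_COMM_WORLD, &req[1]);

  MPI_Startall(2, req);                   /* use the requests once */
  MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

  MPI_Request_free(&req[0]);              /* then free them */
  MPI_Request_free(&req[1]);

  if (!rank) printf("persistent requests freed cleanly\n");
  MPI_Finalize();
  return 0;
}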
> ==107156== Invalid free() / delete / delete[] / realloc() > ==107156== at 0x4C2A37C: free (in > /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) > ==107156== by 0x1E63CD5F: opal_free (malloc.c:184) > ==107156== by 0x27622627: mca_pml_ob1_recv_request_fini > (pml_ob1_recvreq.h:133) > ==107156== by 0x27622C4F: mca_pml_ob1_recv_request_free > (pml_ob1_recvreq.c:90) > ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) > ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) > ==107156== by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219) > ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) > ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) > ==107156== by 0x14A33809: VecDestroy (vector.c:432) > ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) > (girefConfigurationPETSc.h:115) > ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() > (VecteurPETSc.cc:2292) > ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:287) > ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:281) > ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() > (PPReactionsAppuiEL3D.cc:216) > ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in > /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) > ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) > ==107156== Address 0x1d6acbc0 is 0 bytes inside data symbol > "ompi_mpi_double" > --107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to > 0x4c2f330 (__GI_stpcpy) > ==107156== > ==107156== Process terminating with default action of signal 6 (SIGABRT): > dumping core > ==107156== at 0x1DD520C7: raise (in /lib64/libc-2.19.so) > ==107156== by 0x1DD53534: abort (in /lib64/libc-2.19.so) > ==107156== by 0x1DD4B145: __assert_fail_base (in /lib64/libc-2.19.so) > ==107156== by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so) > ==107156== by 0x27626D12: mca_pml_ob1_send_request_fini > (pml_ob1_sendreq.h:221) > ==107156== by 0x276274C9: mca_pml_ob1_send_request_free > (pml_ob1_sendreq.c:117) > ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) > ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) > ==107156== by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225) > ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) > ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) > ==107156== by 0x14A33809: VecDestroy (vector.c:432) > ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) > (girefConfigurationPETSc.h:115) > ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() > (VecteurPETSc.cc:2292) > ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:287) > ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:281) > ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() > (PPReactionsAppuiEL3D.cc:216) > ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in > /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) > ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) > > > #2) For the run with -vecscatter_alltoall it works...! > > As an "end user", should I ever modify these VecScatterCreate options? How > do they change the performances of the code on large problems? > Yep, those options are there because the different variants are better on different architectures, and you can't know which one to pick until runtime, (and without experimentation). 
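As for measuring the effect, the same binary can simply be timed with and without the flag and the scatter cost read off the logging summary (the process count here is a placeholder):

   mpiexec -n 128 ./Test.ProblemeGD.dev <usual options> -log_view
   mpiexec -n 128 ./Test.ProblemeGD.dev <usual options> -vecscatter_alltoall -log_view

and the VecScatterBegin/VecScatterEnd lines of the two summaries compared.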
Thanks, Matt > Thanks, > > Eric > > On 25/07/16 02:57 PM, Matthew Knepley wrote: > >> On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland >> > > wrote: >> >> Hi, >> >> has someone tried OpenMPI 2.0 with Petsc 3.7.2? >> >> I am having some errors with petsc, maybe someone have them too? >> >> Here are the configure logs for PETSc: >> >> >> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log >> >> >> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log >> >> And for OpenMPI: >> >> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log >> >> (in fact, I am testing the ompi-release branch, a sort of >> petsc-master branch, since I need the commit 9ba6678156). >> >> For a set of parallel tests, I have 104 that works on 124 total tests. >> >> >> It appears that the fault happens when freeing the VecScatter we build >> for MatMult, which contains Request structures >> for the ISends and IRecvs. These looks like internal OpenMPI errors to >> me since the Request should be opaque. >> I would try at least two things: >> >> 1) Run under valgrind. >> >> 2) Switch the VecScatter implementation. All the options are here, >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate >> >> but maybe use alltoall. >> >> Thanks, >> >> Matt >> >> >> And the typical error: >> *** Error in >> >> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': >> free(): invalid pointer: >> ======= Backtrace: ========= >> /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] >> /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] >> /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] >> >> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] >> >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] >> >> 
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] >> >> a similar one: >> *** Error in >> >> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': >> free(): invalid pointer: 0x00007f382a7c5bc0 *** >> ======= Backtrace: ========= >> /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] >> /lib64/libc.so.6(+0x78026)[0x7f3829f22026] >> /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] >> >> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] >> >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] >> >> another one: >> >> *** Error in >> >> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': >> free(): invalid pointer: 0x00007f67b6d37bc0 *** >> ======= Backtrace: ========= >> /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] >> /lib64/libc.so.6(+0x78026)[0x7f67b6494026] >> /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] >> >> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] >> >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] >> >> I feel like I should wait until someone else from Petsc have tested >> it too... >> >> Thanks, >> >> Eric >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bhatiamanav at gmail.com Mon Jul 25 15:13:10 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Mon, 25 Jul 2016 15:13:10 -0500 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm Message-ID: Hi, I have a multi physics application with discipline1 defined on comm1 and discipline2 on comm2. My intent is to use the nested matrix for the KSP solver where each diagonal block is provided by the disciplines, and the off-diagonal blocks are defined as shell-matrices with matrix vector products. I am a bit unclear about how to deal with the case of different set of processors on comm1 and comm2. I have the following questions and would appreciate some guidance: ? Would it make sense to define a comm_global as a union of comm1 and comm2 for the MatCreateNest? ? The diagonal blocks are available on comm1 and comm2 only. Should MatAssemblyBegin/End for these diagonal blocks be called on comm1 and comm2 separately? ? What comm should be used for the off-diagonal shell matrices? ? Likewise, when calling VecGetSubVector and VecRestoreSubVector to get sub-vectors corresponding to discipline1 (or 2), what comm should these function calls be made? Thanks, Manav From knepley at gmail.com Mon Jul 25 15:21:24 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 13:21:24 -0700 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: References: Message-ID: On Mon, Jul 25, 2016 at 1:13 PM, Manav Bhatia wrote: > Hi, > > I have a multi physics application with discipline1 defined on comm1 > and discipline2 on comm2. > > My intent is to use the nested matrix for the KSP solver where each > diagonal block is provided by the disciplines, and the off-diagonal blocks > are defined as shell-matrices with matrix vector products. > > I am a bit unclear about how to deal with the case of different set of > processors on comm1 and comm2. I have the following questions and would > appreciate some guidance: > > ? Would it make sense to define a comm_global as a union of comm1 and > comm2 for the MatCreateNest? > > ? The diagonal blocks are available on comm1 and comm2 only. Should > MatAssemblyBegin/End for these diagonal blocks be called on comm1 and comm2 > separately? > > ? What comm should be used for the off-diagonal shell matrices? > > ? Likewise, when calling VecGetSubVector and VecRestoreSubVector to get > sub-vectors corresponding to discipline1 (or 2), what comm should these > function calls be made? > I would first ask if you have a convincing reason for doing this, because it sounds like the genesis of a million programming errors. All the linear algebra objects would have to be in a global comm that contained any subcomms you want to use. I don't think it would make sense to define submatrices on subcomms. You can have your assembly code run on a subcomm certainly, but again this is a tricky business and I find it hard to understand the gain. Matt > Thanks, > Manav > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
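A minimal sketch of the layout Matt describes, with every block and the nest itself created on the same global communicator. The block sizes, the preallocation numbers and the choice to leave the off-diagonal blocks NULL are all hypothetical; the point is only that everything lives on PETSC_COMM_WORLD.

#include <petsc.h>

int main(int argc, char **argv)
{
  Mat            Aff, Ass, Anest, blocks[4];
  PetscInt       nf = 100, ns = 10;      /* local rows of each block on this rank (made up) */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* both diagonal blocks are created on the *global* communicator */
  ierr = MatCreateAIJ(PETSC_COMM_WORLD, nf, nf, PETSC_DETERMINE, PETSC_DETERMINE,
                      5, NULL, 2, NULL, &Aff); CHKERRQ(ierr);
  ierr = MatCreateAIJ(PETSC_COMM_WORLD, ns, ns, PETSC_DETERMINE, PETSC_DETERMINE,
                      5, NULL, 2, NULL, &Ass); CHKERRQ(ierr);

  /* MatSetValues() on each block would go here */
  ierr = MatAssemblyBegin(Aff, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Aff, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyBegin(Ass, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Ass, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

  /* off-diagonal coupling blocks left NULL (zero) here; shell matrices created
     with MatCreateShell(PETSC_COMM_WORLD, ...) could be plugged in instead */
  blocks[0] = Aff; blocks[1] = NULL;
  blocks[2] = NULL; blocks[3] = Ass;
  ierr = MatCreateNest(PETSC_COMM_WORLD, 2, NULL, 2, NULL, blocks, &Anest); CHKERRQ(ierr);

  ierr = MatDestroy(&Anest); CHKERRQ(ierr);
  ierr = MatDestroy(&Aff); CHKERRQ(ierr);
  ierr = MatDestroy(&Ass); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}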
URL: From bhatiamanav at gmail.com Mon Jul 25 15:34:08 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Mon, 25 Jul 2016 15:34:08 -0500 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: References: Message-ID: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> Thanks for your comments, Matt. I have a fluid-structural application with a really large fluid discretization and a really small structural discretization. Due to the relative difference in size, I have defined the structural system on only a single node, and the fluid system on (say) N nodes. So far, I have hand-coded a Schur-Complement for a frequency-domain analysis that is able to handle the difference in comms. I am attempting to migrate to the nested matrix constructs for some future work, and was looking at the possibility of reusing the same distribution of comms. Additionally, I am looking to add additional disciplines and was considering the possibility of defining the systems on different comms. I wasn?t sure if I was creating more problems with this approach than what I was trying to solve. Would you recommend that all objects exist on a global_comm so that there is no confusion about these operations? Thanks, Manav > On Jul 25, 2016, at 3:21 PM, Matthew Knepley wrote: > > On Mon, Jul 25, 2016 at 1:13 PM, Manav Bhatia > wrote: > Hi, > > I have a multi physics application with discipline1 defined on comm1 and discipline2 on comm2. > > My intent is to use the nested matrix for the KSP solver where each diagonal block is provided by the disciplines, and the off-diagonal blocks are defined as shell-matrices with matrix vector products. > > I am a bit unclear about how to deal with the case of different set of processors on comm1 and comm2. I have the following questions and would appreciate some guidance: > > ? Would it make sense to define a comm_global as a union of comm1 and comm2 for the MatCreateNest? > > ? The diagonal blocks are available on comm1 and comm2 only. Should MatAssemblyBegin/End for these diagonal blocks be called on comm1 and comm2 separately? > > ? What comm should be used for the off-diagonal shell matrices? > > ? Likewise, when calling VecGetSubVector and VecRestoreSubVector to get sub-vectors corresponding to discipline1 (or 2), what comm should these function calls be made? > > I would first ask if you have a convincing reason for doing this, because it sounds like the genesis of a million programming errors. > > All the linear algebra objects would have to be in a global comm that contained any subcomms you want to use. I don't > think it would make sense to define submatrices on subcomms. You can have your assembly code run on a subcomm certainly, > but again this is a tricky business and I find it hard to understand the gain. > > Matt > > Thanks, > Manav > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 25 15:43:14 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 13:43:14 -0700 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> References: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> Message-ID: On Mon, Jul 25, 2016 at 1:34 PM, Manav Bhatia wrote: > Thanks for your comments, Matt. 
> > I have a fluid-structural application with a really large fluid > discretization and a really small structural discretization. Due to the > relative difference in size, I have defined the structural system on only a > single node, and the fluid system on (say) N nodes. > > So far, I have hand-coded a Schur-Complement for a frequency-domain > analysis that is able to handle the difference in comms. > > I am attempting to migrate to the nested matrix constructs for some future > work, and was looking at the possibility of reusing the same distribution > of comms. Additionally, I am looking to add additional disciplines and was > considering the possibility of defining the systems on different comms. > > I wasn?t sure if I was creating more problems with this approach than what > I was trying to solve. > > Would you recommend that all objects exist on a global_comm so that there > is no confusion about these operations? > Yes. I think the confusion here is between the problem you are trying to solve, and the tool for doing it. Disparate size of subsystems seems to me to be a _load balancing_ problem. Here you can use data layout to alleviate this. On the global comm, you can put all the fluid unknowns on ranks 0..N-2, and the structural unknowns on N-1. You can have more general splits than that. IF for some reason in the structural assembly you used a large number of collective operations (like say did artificial timestepping to get to some steady state property), then it might make sense to pull out a subcomm of only the occupied ranks, but only above 1000 procs, and only on a non-BlueGene machine. This is also easily measure before you do this work. Matt > Thanks, > Manav > > > > On Jul 25, 2016, at 3:21 PM, Matthew Knepley wrote: > > On Mon, Jul 25, 2016 at 1:13 PM, Manav Bhatia > wrote: > >> Hi, >> >> I have a multi physics application with discipline1 defined on comm1 >> and discipline2 on comm2. >> >> My intent is to use the nested matrix for the KSP solver where each >> diagonal block is provided by the disciplines, and the off-diagonal blocks >> are defined as shell-matrices with matrix vector products. >> >> I am a bit unclear about how to deal with the case of different set >> of processors on comm1 and comm2. I have the following questions and would >> appreciate some guidance: >> >> ? Would it make sense to define a comm_global as a union of comm1 and >> comm2 for the MatCreateNest? >> >> ? The diagonal blocks are available on comm1 and comm2 only. Should >> MatAssemblyBegin/End for these diagonal blocks be called on comm1 and comm2 >> separately? >> >> ? What comm should be used for the off-diagonal shell matrices? >> >> ? Likewise, when calling VecGetSubVector and VecRestoreSubVector to get >> sub-vectors corresponding to discipline1 (or 2), what comm should these >> function calls be made? >> > > I would first ask if you have a convincing reason for doing this, because > it sounds like the genesis of a million programming errors. > > All the linear algebra objects would have to be in a global comm that > contained any subcomms you want to use. I don't > think it would make sense to define submatrices on subcomms. You can have > your assembly code run on a subcomm certainly, > but again this is a tricky business and I find it hard to understand the > gain. > > Matt > > >> Thanks, >> Manav >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bhatiamanav at gmail.com Mon Jul 25 16:30:20 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Mon, 25 Jul 2016 16:30:20 -0500 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: References: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> Message-ID: > On Jul 25, 2016, at 3:43 PM, Matthew Knepley wrote: > > Yes. I think the confusion here is between the problem you are trying to solve, and the tool for doing it. > > Disparate size of subsystems seems to me to be a _load balancing_ problem. Here you can use data layout to alleviate this. > On the global comm, you can put all the fluid unknowns on ranks 0..N-2, and the structural unknowns on N-1. You can have > more general splits than that. > Ok. So, if I do that, then there would still be one comm? If yes, then the distribution would be by specifying the number of local fluid dofs on N-1 to be zero? Sorry that this such is a basic question. > IF for some reason in the structural assembly you used a large number of collective operations (like say did artificial timestepping > to get to some steady state property), then it might make sense to pull out a subcomm of only the occupied ranks, but only above > 1000 procs, and only on a non-BlueGene machine. This is also easily measure before you do this work. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Mon Jul 25 16:39:35 2016 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 25 Jul 2016 16:39:35 -0500 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode In-Reply-To: References: Message-ID: Another interesting phenomenon is that it works for an iterative solver, but only failed for direct solvers(both superLU_dist and mumps). If something is not initialized correctly, why doesn't the iterative solver, for example, GMRES, throw any errors? On Mon, Jul 25, 2016 at 11:50 AM, Matthew Knepley wrote: > On Mon, Jul 25, 2016 at 9:17 AM, Xujun Zhao wrote: > >> Hi all, >> >> I am trying to solve my problem with a direct solver superLU_dist. >> But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" version >> and wanted to see what error info I can get from the PETSc. Surprisingly, >> it passed the solve and didn't output any errors in the "dbg" version. Does >> anyone have the similar experience? and what type of potential bugs it may >> have? >> > > Debugging mode initializes all variables, but as Hong says, valgrind will > warn you of uninitialized variables. > > Matt > > >> >> --->test in StokesSolver::solve(): Start the KSP solve... 
>> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS >> X to find memory corruption errors >> >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> >> [0]PETSC ERROR: to get more information on the crash. >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Signal received >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> >> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown >> >> [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named >> mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 >> >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --download-mpich --download-fblaslapack >> --download-scalapack --download-mumps --download-superlu_dist >> --download-hypre --download-ml --download-metis --download-parmetis >> --download-triangle --download-chaco --with-debugging=0 >> >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 25 16:55:58 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 14:55:58 -0700 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: References: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> Message-ID: On Mon, Jul 25, 2016 at 2:30 PM, Manav Bhatia wrote: > > On Jul 25, 2016, at 3:43 PM, Matthew Knepley wrote: > > Yes. I think the confusion here is between the problem you are trying to > solve, and the tool for doing it. > > Disparate size of subsystems seems to me to be a _load balancing_ problem. > Here you can use data layout to alleviate this. > On the global comm, you can put all the fluid unknowns on ranks 0..N-2, > and the structural unknowns on N-1. You can have > more general splits than that. > > > Ok. So, if I do that, then there would still be one comm? If yes, then the > distribution would be by specifying the number of local fluid dofs on N-1 > to be zero? > Yes. If all you want is good load balance, I think this is the best way. Thanks, Matt > Sorry that this such is a basic question. > > > IF for some reason in the structural assembly you used a large number of > collective operations (like say did artificial timestepping > to get to some steady state property), then it might make sense to pull > out a subcomm of only the occupied ranks, but only above > 1000 procs, and only on a non-BlueGene machine. This is also easily > measure before you do this work. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
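To make the zero-local-size idea concrete, a small sketch in which only the last rank owns structural unknowns while every object still lives on PETSC_COMM_WORLD (the global size of 500 is made up):

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            xs;
  PetscMPIInt    rank, size;
  PetscInt       Ns = 500, ns_local;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank); CHKERRQ(ierr);
  ierr = MPI_Comm_size(PETSC_COMM_WORLD, &size); CHKERRQ(ierr);

  /* all structural rows on the last rank, zero local rows everywhere else */
  ns_local = (rank == size - 1) ? Ns : 0;

  ierr = VecCreate(PETSC_COMM_WORLD, &xs); CHKERRQ(ierr);
  ierr = VecSetSizes(xs, ns_local, Ns); CHKERRQ(ierr);
  ierr = VecSetFromOptions(xs); CHKERRQ(ierr);

  /* a matrix block is handled the same way:
     MatSetSizes(Ass, ns_local, ns_local, Ns, Ns) before setup/preallocation */

  ierr = VecDestroy(&xs); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}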
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 25 16:56:51 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 14:56:51 -0700 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode In-Reply-To: References: Message-ID: On Mon, Jul 25, 2016 at 2:39 PM, Xujun Zhao wrote: > Another interesting phenomenon is that it works for an iterative solver, > but only failed for direct solvers(both superLU_dist and mumps). If > something is not initialized correctly, why doesn't the iterative solver, > for example, GMRES, throw any errors? > It would of course depend on what you have not initialized, and what value was sitting in that place to begin with. Use valgrind to clear all this up. Matt > On Mon, Jul 25, 2016 at 11:50 AM, Matthew Knepley > wrote: > >> On Mon, Jul 25, 2016 at 9:17 AM, Xujun Zhao wrote: >> >>> Hi all, >>> >>> I am trying to solve my problem with a direct solver superLU_dist. >>> But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" >>> version and wanted to see what error info I can get from the PETSc. >>> Surprisingly, it passed the solve and didn't output any errors in the "dbg" >>> version. Does anyone have the similar experience? and what type of >>> potential bugs it may have? >>> >> >> Debugging mode initializes all variables, but as Hong says, valgrind will >> warn you of uninitialized variables. >> >> Matt >> >> >>> >>> --->test in StokesSolver::solve(): Start the KSP solve... >>> >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> >>> [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> >>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >>> OS X to find memory corruption errors >>> >>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> >>> [0]PETSC ERROR: to get more information on the crash. >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> >>> [0]PETSC ERROR: Signal received >>> >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> for trouble shooting. >>> >>> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown >>> >>> [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named >>> mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 >>> >>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>> --with-fc=gfortran --download-mpich --download-fblaslapack >>> --download-scalapack --download-mumps --download-superlu_dist >>> --download-hypre --download-ml --download-metis --download-parmetis >>> --download-triangle --download-chaco --with-debugging=0 >>> >>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> >>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From loiseau.jc at gmail.com Tue Jul 26 10:27:02 2016 From: loiseau.jc at gmail.com (JC) Date: Tue, 26 Jul 2016 17:27:02 +0200 Subject: [petsc-users] Scheduled Relaxation Jacobi method Message-ID: <40FF0CE8-7589-4631-AB5B-0F4AF5205C99@gmail.com> Hej, I have been using a very simple Scheduled Relaxation Jacobi (SRJ) method for tests purpose in one of my code, and would now like to implement into the big version that uses PETSc. So far, I have figured that I can use the weighted Jacobi method since it is nothing but a Jacobi-preconditioned Richardson. While in the weighted Jacobi, the relaxation weight omega is fixed, in the SRJ method, the value of the relaxation factor depends on the grid size and the iteration. Is there any simple way to implement such a iteration-varying relaxation weight given that I have text files with their appropriate values? Thanks a lot. JC From aks084000 at utdallas.edu Tue Jul 26 20:00:13 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Wed, 27 Jul 2016 01:00:13 +0000 Subject: [petsc-users] Nested Fieldsplit for custom index sets Message-ID: Hello all, I would like to work out how to get nested fieldsplit to work correctly. I have a submatrix (labeled fieldsplit_P) that I would like to block precondition with sub-blocks A & B. To do this, I access the PC object within fieldsplit_P, and pass index sets corresponding to these sub-blocks (P_A_IS, P_B_IS) that tell how the matrix should be split. This is what I have: -------------------------------------------------------------------------------- KSP *ksp_all, ksp_P; PCFieldSplitGetSubKSP(pc, &i, &ksp_all); ksp_P = ksp_all[0]; PC pc_P; KSPGetPC(ksp_P, &pc_P); // This should extract the preconditioner for fieldsplit P PCFieldSplitSetIS(pc_P, "A", P_A_IS); PCFieldSplitSetIS(pc_P, "B", P_B_IS); -------------------------------------------------------------------------------- And these are the run-time arguments: -------------------------------------------------------------------------------- -pc_type fieldsplit -pc_fieldsplit_type multiplicative -fieldsplit_P_ksp_type gmres -fieldsplit_P_pc_type fieldsplit -fieldsplit_P_pc_fieldsplit_type multiplicative -fieldsplit_P_fieldsplit_A_ksp_type gmres -fieldsplit_P_fieldsplit_B_pc_type lu -fieldsplit_P_fieldsplit_B_ksp_type preonly -------------------------------------------------------------------------------- But that does not work: -------------------------------------------------------------------------------- [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Cannot locate function PCFieldSplitSetIS_C in object [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./main_2D on a x86_64 named artur-ubuntu by artur Tue Jul 26 18:55:29 2016 [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack [0]PETSC ERROR: #1 PCFieldSplitSetIS() line 1756 in /home/artur/Rorsrach/Packages/petsc-3.7.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Cannot locate function PCFieldSplitSetIS_C in object [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./main_2D on a x86_64 named artur-ubuntu by artur Tue Jul 26 18:55:29 2016 [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack [0]PETSC ERROR: #2 PCFieldSplitSetIS() line 1756 in /home/artur/Rorsrach/Packages/petsc-3.7.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c -------------------------------------------------------------------------------- It seems that the preconditioner object for fieldsplit_P does not know that it is also of fieldsplit type. Does anyone have any idea of how I can specify the proper fieldsplit? Best, Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 26 20:40:33 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 26 Jul 2016 21:40:33 -0400 Subject: [petsc-users] Scheduled Relaxation Jacobi method In-Reply-To: <40FF0CE8-7589-4631-AB5B-0F4AF5205C99@gmail.com> References: <40FF0CE8-7589-4631-AB5B-0F4AF5205C99@gmail.com> Message-ID: <3EFDE37C-513C-4894-9859-BA7DCA70A760@mcs.anl.gov> > On Jul 26, 2016, at 11:27 AM, JC wrote: > > Hej, > > I have been using a very simple Scheduled Relaxation Jacobi (SRJ) method for tests purpose in one of my code, and would now like to implement into the big version that uses PETSc. So far, I have figured that I can use the weighted Jacobi method since it is nothing but a Jacobi-preconditioned Richardson. While in the weighted Jacobi, the relaxation weight omega is fixed, in the SRJ method, the value of the relaxation factor depends on the grid size and the iteration. Is there any simple way to implement such a iteration-varying relaxation weight given that I have text files with their appropriate values? The dependence on grid size is easy. You just call KSPRichardsonSetScale(). By depends on the iteration do you mean the linear iteration, as in the first iteration you use .1 then in the second you use .2 etc? To do this use KSPSetMonitor() and have your monitor routine call KSPRichardsonSetScale() with the value you like which can depend on the iteration. Barry > > Thanks a lot. > JC From bsmith at mcs.anl.gov Tue Jul 26 20:54:44 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 26 Jul 2016 21:54:44 -0400 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: Message-ID: Do you have a call to KSPSetFromOptions() before the call PCFieldSplitGetSubKSP()? I am guessing not which means that the PC does not yet know that it is of type fieldplit. 
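Coming back to the SRJ question earlier in the thread, a rough sketch of the monitor approach Barry outlines (the registration routine is spelled KSPMonitorSet in recent releases; the omega table and its length stand in for the values read from JC's text files, and a new factor takes effect on the following sweep):

#include <petscksp.h>

typedef struct {
  PetscReal *omega;   /* relaxation schedule, filled from the text file */
  PetscInt   len;
} SRJCtx;

/* called once per Richardson iteration; updates the scale used afterwards */
static PetscErrorCode SRJMonitor(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
{
  SRJCtx        *srj = (SRJCtx*)ctx;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = KSPRichardsonSetScale(ksp, srj->omega[it % srj->len]); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

In the setup code, after KSPSetType(ksp, KSPRICHARDSON) and PCSetType(pc, PCJACOBI), the monitor is attached with

  ierr = KSPMonitorSet(ksp, SRJMonitor, &srjctx, NULL); CHKERRQ(ierr);

where srjctx is an SRJCtx instance owned by the caller.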
Barry > On Jul 26, 2016, at 9:00 PM, Safin, Artur wrote: > > Hello all, > > I would like to work out how to get nested fieldsplit to work correctly. I have a submatrix (labeled fieldsplit_P) that I would like to block precondition with sub-blocks A & B. To do this, I access the PC object within fieldsplit_P, and pass index sets corresponding to these sub-blocks (P_A_IS, P_B_IS) that tell how the matrix should be split. This is what I have: > > -------------------------------------------------------------------------------- > KSP *ksp_all, ksp_P; > PCFieldSplitGetSubKSP(pc, &i, &ksp_all); > > ksp_P = ksp_all[0]; > PC pc_P; > KSPGetPC(ksp_P, &pc_P); // This should extract the preconditioner for fieldsplit P > PCFieldSplitSetIS(pc_P, "A", P_A_IS); > PCFieldSplitSetIS(pc_P, "B", P_B_IS); > -------------------------------------------------------------------------------- > > And these are the run-time arguments: > > -------------------------------------------------------------------------------- > -pc_type fieldsplit > -pc_fieldsplit_type multiplicative > > -fieldsplit_P_ksp_type gmres > -fieldsplit_P_pc_type fieldsplit > -fieldsplit_P_pc_fieldsplit_type multiplicative > > -fieldsplit_P_fieldsplit_A_ksp_type gmres > -fieldsplit_P_fieldsplit_B_pc_type lu > -fieldsplit_P_fieldsplit_B_ksp_type preonly > -------------------------------------------------------------------------------- > > But that does not work: > > -------------------------------------------------------------------------------- > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Cannot locate function PCFieldSplitSetIS_C in object > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./main_2D on a x86_64 named artur-ubuntu by artur Tue Jul 26 18:55:29 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #1 PCFieldSplitSetIS() line 1756 in /home/artur/Rorsrach/Packages/petsc-3.7.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Cannot locate function PCFieldSplitSetIS_C in object > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./main_2D on a x86_64 named artur-ubuntu by artur Tue Jul 26 18:55:29 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #2 PCFieldSplitSetIS() line 1756 in /home/artur/Rorsrach/Packages/petsc-3.7.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c > -------------------------------------------------------------------------------- > > It seems that the preconditioner object for fieldsplit_P does not know that it is also of fieldsplit type. Does anyone have any idea of how I can specify the proper fieldsplit? 
> > Best, > > Artur From aks084000 at utdallas.edu Tue Jul 26 21:54:22 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Wed, 27 Jul 2016 02:54:22 +0000 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: , Message-ID: Barry, > Do you have a call to KSPSetFromOptions() before the call PCFieldSplitGetSubKSP()? I am guessing not which means that the PC does not yet know that it is of type fieldplit. Yes, I call KSPSetFromOptions() for the global matrix at the beginning of the code. Should I also do it for the ksp I obtain from PCFieldSplitGetSubKSP()? The program has no problem doing fieldsplit for the global matrix; my issue is that I cannot get it to recognize a fieldsplit within a fieldsplit. This is the whole code for the solver: -------------------------------------------------------------------------------------------------------------------------------- KSP ksp; KSPCreate(mpi_communicator, &ksp); KSPSetType(ksp, KSPGMRES); KSPSetOperators(ksp, A_petsc, A_petsc); KSPSetFromOptions(ksp); PC pc; KSPGetPC(ksp, &pc); // Define the fieldsplit for the global matrix PCFieldSplitSetIS(pc, "P", P_IS); PCFieldSplitSetIS(pc, "T", T_IS); // fieldsplit for submatrix P: KSP *ksp_all, ksp_P; PCFieldSplitGetSubKSP(pc, &i, &ksp_all); ksp_P = ksp_all[0]; PC pc_P; KSPGetPC(ksp_P, &pc_P); // This should be the preconditioner for fieldsplit P PCFieldSplitSetIS(pc_P, "A", P_A_IS); PCFieldSplitSetIS(pc_P, "B", P_B_IS); KSPSolve(ksp, b_petsc, u_petsc); -------------------------------------------------------------------------------------------------------------------------------- Thanks, Artur From lawrence.mitchell at imperial.ac.uk Wed Jul 27 03:00:32 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Wed, 27 Jul 2016 09:00:32 +0100 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: Message-ID: > On 27 Jul 2016, at 03:54, Safin, Artur wrote: > > Barry, > >> Do you have a call to KSPSetFromOptions() before the call PCFieldSplitGetSubKSP()? I am guessing not which means that the PC does not yet know that it is of type fieldplit. > > Yes, I call KSPSetFromOptions() for the global matrix at the beginning of the code. Should I also do it for the ksp I obtain from PCFieldSplitGetSubKSP()? > > The program has no problem doing fieldsplit for the global matrix; my issue is that I cannot get it to recognize a fieldsplit within a fieldsplit. I think the SubKSPs (and therefore SubPCs) are not set up until you call KSPSetUp(ksp) which your code does not do explicitly and is therefore done in KSPSolve. Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: Message signed with OpenPGP using GPGMail URL: From kandanovian at gmail.com Wed Jul 27 06:59:05 2016 From: kandanovian at gmail.com (Tim Steinhoff) Date: Wed, 27 Jul 2016 13:59:05 +0200 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc Message-ID: Hi all, we coupled PETSc with our fortran code. Is there any way to let PETSc (PetscInitialize) ignore all arguments passed by the command line? Since our code is controlled by command line arguements as well, it leads to a mess, when those arguments are read twice. 
Thanks and kind regards, Volker From knepley at gmail.com Wed Jul 27 09:04:52 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Jul 2016 07:04:52 -0700 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: Message-ID: On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff wrote: > Hi all, > > we coupled PETSc with our fortran code. Is there any way to let PETSc > (PetscInitialize) ignore all arguments passed by the command line? > Since our code is controlled by command line arguements as well, it > leads to a mess, when those arguments are read twice. > 1) You can use PetscInitializeNoArguments() 2) What goes wrong? PETSc should just ignore any options it does not recognize. Thanks, Matt > Thanks and kind regards, > > Volker > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kandanovian at gmail.com Wed Jul 27 09:55:42 2016 From: kandanovian at gmail.com (Tim Steinhoff) Date: Wed, 27 Jul 2016 16:55:42 +0200 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: Message-ID: 2016-07-27 16:04 GMT+02:00 Matthew Knepley : > On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff > wrote: >> >> Hi all, >> >> we coupled PETSc with our fortran code. Is there any way to let PETSc >> (PetscInitialize) ignore all arguments passed by the command line? >> Since our code is controlled by command line arguements as well, it >> leads to a mess, when those arguments are read twice. > > > 1) You can use PetscInitializeNoArguments() Thanks! I thought that function was for C/C++ only. > > 2) What goes wrong? PETSc should just ignore any options it does not > recognize. The problem is that our code uses the same or similar argument names as PETSc does and our end user should not have access to all petsc options. > > Thanks, > > Matt > >> >> Thanks and kind regards, >> >> Volker > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Wed Jul 27 11:09:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Jul 2016 12:09:14 -0400 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: Message-ID: Please send a complete code that you think should work that doesn't so we can understand the exact issue. > On Jul 26, 2016, at 10:54 PM, Safin, Artur wrote: > > Barry, > >> Do you have a call to KSPSetFromOptions() before the call PCFieldSplitGetSubKSP()? I am guessing not which means that the PC does not yet know that it is of type fieldplit. > > Yes, I call KSPSetFromOptions() for the global matrix at the beginning of the code. Should I also do it for the ksp I obtain from PCFieldSplitGetSubKSP()? > > The program has no problem doing fieldsplit for the global matrix; my issue is that I cannot get it to recognize a fieldsplit within a fieldsplit. 
> > This is the whole code for the solver: > > -------------------------------------------------------------------------------------------------------------------------------- > KSP ksp; > KSPCreate(mpi_communicator, &ksp); > KSPSetType(ksp, KSPGMRES); > KSPSetOperators(ksp, A_petsc, A_petsc); > KSPSetFromOptions(ksp); > > PC pc; > KSPGetPC(ksp, &pc); > > // Define the fieldsplit for the global matrix > PCFieldSplitSetIS(pc, "P", P_IS); > PCFieldSplitSetIS(pc, "T", T_IS); > > // fieldsplit for submatrix P: > KSP *ksp_all, ksp_P; > PCFieldSplitGetSubKSP(pc, &i, &ksp_all); > > ksp_P = ksp_all[0]; > PC pc_P; > KSPGetPC(ksp_P, &pc_P); // This should be the preconditioner for fieldsplit P > PCFieldSplitSetIS(pc_P, "A", P_A_IS); > PCFieldSplitSetIS(pc_P, "B", P_B_IS); > > KSPSolve(ksp, b_petsc, u_petsc); > -------------------------------------------------------------------------------------------------------------------------------- > > Thanks, > > Artur > From bsmith at mcs.anl.gov Wed Jul 27 14:42:21 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Jul 2016 15:42:21 -0400 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: Message-ID: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> Actually there is currently no way to PetscInitialize from Fortran without adding the command line options to the database. In the middle of petscinitialize_() is the code fragment PETScParseFortranArgs_Private(&PetscGlobalArgc,&PetscGlobalArgs); FIXCHAR(filename,len,t1); *ierr = PetscOptionsInsert(NULL,&PetscGlobalArgc,&PetscGlobalArgs,t1); We'll need to do a bit of code refactoring to provide a Fortran petscinitializenoarguments_(). The simplest way to refactor would be to change the name of petscinitialize_ to say PetscInitializeFortran_Internal() and add a bool argument whether to process the arguments and then write two trivial routines petscinitialize_ that calls the new routine with PETSC_TRUE and petscinitializenoarguments_() that calls it with PETSC_FALSE. Barry Of course you can have a C/C++ main routine that calls PetscInitializeNoArguments(); followed by PetscInitializeFortran() and then have the bulk of your code in Fortran. > On Jul 27, 2016, at 10:55 AM, Tim Steinhoff wrote: > > 2016-07-27 16:04 GMT+02:00 Matthew Knepley : >> On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff >> wrote: >>> >>> Hi all, >>> >>> we coupled PETSc with our fortran code. Is there any way to let PETSc >>> (PetscInitialize) ignore all arguments passed by the command line? >>> Since our code is controlled by command line arguements as well, it >>> leads to a mess, when those arguments are read twice. >> >> >> 1) You can use PetscInitializeNoArguments() > > Thanks! I thought that function was for C/C++ only. > >> >> 2) What goes wrong? PETSc should just ignore any options it does not >> recognize. > > > The problem is that our code uses the same or similar argument names > as PETSc does and our end user should not have access to all petsc > options. > > >> >> Thanks, >> >> Matt >> >>> >>> Thanks and kind regards, >>> >>> Volker >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments >> is infinitely more interesting than any results to which their experiments >> lead. 
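For reference, a minimal sketch of the C driver Barry mentions just above as the current workaround; the name of the Fortran entry point (and its trailing-underscore mangling) is only a guess and would be whatever the application actually provides:

#include <petscsys.h>

/* hypothetical Fortran entry point, e.g. "subroutine mymain()" in the existing
   code, minus its own call to PetscInitialize */
extern void mymain_(void);

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  /* argc/argv are never handed to PETSc, so the options database stays empty */
  ierr = PetscInitializeNoArguments(); if (ierr) return ierr;
  /* sets up the default communicators, null objects etc. for the Fortran side */
  ierr = PetscInitializeFortran(); if (ierr) return ierr;

  mymain_();

  ierr = PetscFinalize();
  return ierr;
}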
>> -- Norbert Wiener From epscodes at gmail.com Wed Jul 27 16:42:20 2016 From: epscodes at gmail.com (Xiangdong) Date: Wed, 27 Jul 2016 17:42:20 -0400 Subject: [petsc-users] vec norm for local portion of a vector Message-ID: Hello everyone, I have a global dmda vector vg. On each processor, if I want to know the norm of local portion of vg, which function should I call? So far I am thinking of using DMDAVecGetArray and then write a loop to compute the norm of this local array. Is there a simple function available to call? like *vg->ops->norm_local(vg,NORM_2, &normlocal)? Thanks. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From aks084000 at utdallas.edu Wed Jul 27 20:20:41 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Thu, 28 Jul 2016 01:20:41 +0000 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: , Message-ID: <0B3B3C93-5C07-4E07-A37E-DEBA9577D3EE@utdallas.edu> Barry, Lawrence, > I think the SubKSPs (and therefore SubPCs) are not set up until you call KSPSetUp(ksp) which your code does not do explicitly and is therefore done in KSPSolve. I added KSPSetUp(), but unfortunately the issue did not go away. I have created a MWE that replicates the issue. The program tries to solve a tridiagonal system, where the first fieldsplit partitions the global matrix [ P x ] [ x T ], and the nested fieldsplit partitions P into [ A x ] [ x B ]. Thanks for your help, Artur -------------- next part -------------- A non-text attachment was scrubbed... Name: ex.c Type: text/x-csrc Size: 2395 bytes Desc: ex.c URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: run.sh Type: application/x-shellscript Size: 351 bytes Desc: run.sh URL: From kandanovian at gmail.com Thu Jul 28 02:35:00 2016 From: kandanovian at gmail.com (Tim Steinhoff) Date: Thu, 28 Jul 2016 09:35:00 +0200 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> References: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> Message-ID: 2016-07-27 21:42 GMT+02:00 Barry Smith : > > Actually there is currently no way to PetscInitialize from Fortran without adding the command line options to the database. In the middle > of petscinitialize_() is the code fragment > > PETScParseFortranArgs_Private(&PetscGlobalArgc,&PetscGlobalArgs); > FIXCHAR(filename,len,t1); > *ierr = PetscOptionsInsert(NULL,&PetscGlobalArgc,&PetscGlobalArgs,t1); > > We'll need to do a bit of code refactoring to provide a Fortran petscinitializenoarguments_(). The simplest way to refactor would be to change the name of petscinitialize_ to say PetscInitializeFortran_Internal() and add a bool argument whether to process the arguments and then write two trivial routines petscinitialize_ that calls the new routine with PETSC_TRUE and petscinitializenoarguments_() that calls it with PETSC_FALSE. Thanks Barry. It would be really nice if PETSc comes with that feature in future, because I would prefer not to make any changes to the PETSc code that disappear with every new PETSc release. > > Barry > > Of course you can have a C/C++ main routine that calls PetscInitializeNoArguments(); followed by PetscInitializeFortran() and then have the bulk of your code in Fortran. That would work, but we have a rather large fortran code without any C. So, for now we will probably stick to your first approach and keep our code fotran only. 
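On Xiangdong's question above about the norm of only the locally owned part of a vector: I am not aware of a public single call for it, but a short loop over the local array is enough. A sketch (PetscAbsScalar keeps it valid for both real and complex builds; for a DMDA global vector this covers exactly the owned entries, no ghost points):

#include <petscvec.h>

/* 2-norm of the locally owned entries of a (possibly parallel) vector */
PetscErrorCode VecLocalNorm2(Vec vg, PetscReal *normlocal)
{
  const PetscScalar *a;
  PetscInt           i, nloc;
  PetscReal          sum = 0.0;
  PetscErrorCode     ierr;

  PetscFunctionBegin;
  ierr = VecGetLocalSize(vg, &nloc); CHKERRQ(ierr);
  ierr = VecGetArrayRead(vg, &a); CHKERRQ(ierr);
  for (i = 0; i < nloc; i++) sum += PetscAbsScalar(a[i]) * PetscAbsScalar(a[i]);
  ierr = VecRestoreArrayRead(vg, &a); CHKERRQ(ierr);
  *normlocal = PetscSqrtReal(sum);
  PetscFunctionReturn(0);
}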
Thanks again, Volker > > >> On Jul 27, 2016, at 10:55 AM, Tim Steinhoff wrote: >> >> 2016-07-27 16:04 GMT+02:00 Matthew Knepley : >>> On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff >>> wrote: >>>> >>>> Hi all, >>>> >>>> we coupled PETSc with our fortran code. Is there any way to let PETSc >>>> (PetscInitialize) ignore all arguments passed by the command line? >>>> Since our code is controlled by command line arguements as well, it >>>> leads to a mess, when those arguments are read twice. >>> >>> >>> 1) You can use PetscInitializeNoArguments() >> >> Thanks! I thought that function was for C/C++ only. >> >>> >>> 2) What goes wrong? PETSc should just ignore any options it does not >>> recognize. >> >> >> The problem is that our code uses the same or similar argument names >> as PETSc does and our end user should not have access to all petsc >> options. >> >> >>> >>> Thanks, >>> >>> Matt >>> >>>> >>>> Thanks and kind regards, >>>> >>>> Volker >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments >>> is infinitely more interesting than any results to which their experiments >>> lead. >>> -- Norbert Wiener > From lawrence.mitchell at imperial.ac.uk Thu Jul 28 03:35:30 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 28 Jul 2016 09:35:30 +0100 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: <0B3B3C93-5C07-4E07-A37E-DEBA9577D3EE@utdallas.edu> References: <0B3B3C93-5C07-4E07-A37E-DEBA9577D3EE@utdallas.edu> Message-ID: <5799C3D2.8000407@imperial.ac.uk> Dear Artur, On 28/07/16 02:20, Safin, Artur wrote: > Barry, Lawrence, > >> I think the SubKSPs (and therefore SubPCs) are not set up until you call KSPSetUp(ksp) which your code does not do explicitly and is therefore done in KSPSolve. > > I added KSPSetUp(), but unfortunately the issue did not go away. > > > > I have created a MWE that replicates the issue. The program tries to solve a tridiagonal system, where the first fieldsplit partitions the global matrix > > [ P x ] > [ x T ], > > and the nested fieldsplit partitions P into > > [ A x ] > [ x B ]. Two things: 1. Always check the return value from all PETSc calls. This will normally give you a very useful backtrace when something goes wrong. That is, annotate all your calls with: PetscErrorCode ierr; ierr = SomePetscFunction(...); CHKERRQ(ierr); If I do this, I see that the call to KSPSetUp fails: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: Unhandled case, must have at least two fields, not 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.7.2-931-g1e46b98 GIT Date: 2016-07-06 16:57:50 -0500 ... [0]PETSC ERROR: #1 PCFieldSplitSetDefaults() line 470 in /data/lmitche1/src/deps/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: #2 PCSetUp_FieldSplit() line 487 in /data/lmitche1/src/deps/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: #3 PCSetUp() line 968 in /data/lmitche1/src/deps/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #4 KSPSetUp() line 393 in /data/lmitche1/src/deps/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #5 main() line 65 in /homes/lmitche1/tmp/ex.c The reason is you need to call KSPSetUp *after* setting the outermost fieldsplit ISes. If I move the call to KSPSetUp, then things seem to work. 
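Condensing Lawrence's two points into one fragment, with the variable names from Artur's MWE: check every return code, and call KSPSetUp() after the outer index sets are set but before touching the sub-KSPs. A sketch assuming A, b, x and the four index sets already exist:

KSP            ksp, *subksp;
PC             pc, pc_P;
PetscInt       nsplits;
PetscErrorCode ierr;

ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
ierr = KSPSetOperators(ksp, A, A); CHKERRQ(ierr);
ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);           /* -pc_type fieldsplit read here */

ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
ierr = PCFieldSplitSetIS(pc, "P", P_IS); CHKERRQ(ierr); /* outer split first */
ierr = PCFieldSplitSetIS(pc, "T", T_IS); CHKERRQ(ierr);

ierr = KSPSetUp(ksp); CHKERRQ(ierr);                    /* after the outer ISes: creates the sub-KSPs */

ierr = PCFieldSplitGetSubKSP(pc, &nsplits, &subksp); CHKERRQ(ierr);
ierr = KSPGetPC(subksp[0], &pc_P); CHKERRQ(ierr);       /* PC of the "P" split */
ierr = PCFieldSplitSetIS(pc_P, "A", P_A_IS); CHKERRQ(ierr);
ierr = PCFieldSplitSetIS(pc_P, "B", P_B_IS); CHKERRQ(ierr);

ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);
ierr = PetscFree(subksp); CHKERRQ(ierr);                /* array returned by PCFieldSplitGetSubKSP */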
I've attached the working code. Cheers, Lawrence $ cat options.txt -pc_type fieldsplit -pc_fieldsplit_type multiplicative -fieldsplit_T_ksp_type bcgs -fieldsplit_P_ksp_type gmres -fieldsplit_P_pc_type fieldsplit -fieldsplit_P_pc_fieldsplit_type multiplicative -fieldsplit_P_fieldsplit_A_ksp_type gmres -fieldsplit_P_fieldsplit_B_pc_type lu -fieldsplit_P_fieldsplit_B_ksp_type preonly -ksp_converged_reason -ksp_monitor_true_residual -ksp_view $ ./ex -options_file options.txt 0 KSP preconditioned resid norm 5.774607007892e+00 true resid norm 1.414213562373e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.921795888956e-01 true resid norm 4.802975385197e-02 ||r(i)||/||b|| 3.396216464745e-02 2 KSP preconditioned resid norm 1.436304589027e-12 true resid norm 2.435255920058e-13 ||r(i)||/||b|| 1.721985974998e-13 Linear solve converged due to CONVERGED_RTOL iterations 2 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (fieldsplit_P_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_P_) 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=25, cols=25 package used to perform factorization: petsc total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: seqaij rows=25, cols=25 total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Defined by IS KSP Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1.43836 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=25, cols=25 package used to perform factorization: petsc total: nonzeros=105, allocated nonzeros=105 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: seqaij rows=25, cols=25 total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_) 1 MPI processes type: seqaij rows=50, cols=50 total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Defined by IS KSP Object: (fieldsplit_T_) 1 MPI processes type: bcgs maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_T_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=50, cols=50 package used to perform factorization: petsc total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_T_) 1 MPI processes type: seqaij rows=50, cols=50 total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=100, cols=100 total: nonzeros=298, allocated nonzeros=500 total number of mallocs used during MatSetValues calls =0 not using I-node routines -------------- next part -------------- A non-text attachment was scrubbed... Name: ex.c Type: text/x-csrc Size: 2885 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: OpenPGP digital signature URL: From C.Klaij at marin.nl Thu Jul 28 03:38:54 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 28 Jul 2016 08:38:54 +0000 Subject: [petsc-users] block matrix without MatCreateNest Message-ID: <1469695134232.97712@marin.nl> I'm trying to understand how to assemble a block matrix in a format-independent manner, so that I can switch between types mpiaij and matnest. The manual states that the key to format-independent assembly is to use MatGetLocalSubMatrix. So, in the code below, I'm using this to assemble a 3-by-3 block matrix A and setting the diagonal of block A02. This seems to work for type mpiaij, but not for type matnest. What am I missing? Chris $ cat mattry.F90 program mattry use petscksp implicit none #include PetscInt :: n=4 ! 
setting 4 cells per process PetscErrorCode :: ierr PetscInt :: size,rank,i Mat :: A,A02 IS :: isg0,isg1,isg2 IS :: isl0,isl1,isl2 ISLocalToGlobalMapping :: map integer, allocatable, dimension(:) :: idx call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) ! local index sets for 3 fields allocate(idx(n)) idx=(/ (i-1, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! global index sets for 3 fields allocate(idx(n)) idx=(/ (i-1+rank*3*n, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); CHKERRQ(ierr) ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! local-to-global mapping allocate(idx(3*n)) idx=(/ (i-1+rank*3*n, i=1,3*n) /) call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! create the 3-by-3 block matrix call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) ! call MatSetType(A,MATNEST,ierr); CHKERRQ(ierr) call MatSetUp(A,ierr); CHKERRQ(ierr) call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) call MatSetFromOptions(A,ierr); CHKERRQ(ierr) ! set diagonal of block A02 to 0.65 call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) do i=1,n call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) end do call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) ! verify call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) call PetscFinalize(ierr) end program mattry $ mpiexec -n 2 ./mattry -A_mat_type mpiaij Mat Object: 2 MPI processes type: mpiaij row 0: (0, 0.65) row 1: (1, 0.65) row 2: (2, 0.65) row 3: (3, 0.65) row 4: (4, 0.65) row 5: (5, 0.65) row 6: (6, 0.65) row 7: (7, 0.65) $ mpiexec -n 2 ./mattry -A_mat_type nest [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Null argument, when expecting valid pointer [0]PETSC ERROR: Null Pointer: Parameter # 3 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [0]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: Null Pointer: Parameter # 3 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [1]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 85. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [lin0322.marin.local:11985] 1 more process has sent help message help-mpi-api.txt / mpi-abort [lin0322.marin.local:11985] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages $ dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm From loiseau.jc at gmail.com Thu Jul 28 11:07:43 2016 From: loiseau.jc at gmail.com (JC) Date: Thu, 28 Jul 2016 18:07:43 +0200 Subject: [petsc-users] Comprehensive example for KSPRegister() Message-ID: Hey everyone, I was wondering if any of you had a comprehensive example of KSPRegister() to create our own KSP solver in fortran? I have tried to look online but have not been able to find it. 
Thanks a lot, JC From knepley at gmail.com Thu Jul 28 12:22:41 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Jul 2016 10:22:41 -0700 Subject: [petsc-users] Comprehensive example for KSPRegister() In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 9:07 AM, JC wrote: > Hey everyone, > > I was wondering if any of you had a comprehensive example of KSPRegister() > to create our own KSP solver in fortran? > I have tried to look online but have not been able to find it. > We do not currently have one in Fortran, although we do in C. Thanks, Matt > Thanks a lot, > JC -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From valeria.mele at unina.it Thu Jul 28 12:48:47 2016 From: valeria.mele at unina.it (Valeria Mele) Date: Thu, 28 Jul 2016 12:48:47 -0500 Subject: [petsc-users] PCMG with matrix-free operators accessing DMDA Message-ID: Hi everyone, this time I am using PETSc to do something that is more complicated than my usual and I want to do it at the highest possible abstraction level. To put it in a nutshell, my intent is to build a parallel multigrid to solve a linear system via DM, KSP and PCMG (I would like to use DMMG but probably I should have the same problems or more). I created the distributed object, *da*, with DMDACreate3d, even if it is distributed (as yet) only in the x-dimension and has 3 dof. Then I create the KSP (type KSPRICHARDSON) and set the nonzero initial guess and PCMG as preconditioner. Here I start to tune the MG. The point is that I need to define all the operators as matrix-free, since they will do several operations on *x* to obtain *y*, and I am not familiar with the way to access all the elements or informations in the two levels involved and/or among the processors with a so-high level interface. So please (please please please please) tell me if I correctly understand the mechanism or I am on the wrong way and clear my doubts. That is, let's say that my operation for the shell are: - *A_mult(Mat mat,Vec x, Vec y) *//coefficients matrix in this case the level is only one but should I write it taking into account only the local data (I think so) and accessing them via the informations in *da*? For example, if I use DMDAVecGetArray, DMDAVecGetCorners (or DMDAGetGhostCorners) and DMDAVecRestoreArray, will they retrieve informations from the right level each time (I am pretty sure that in some official examples it is done in this way)? Or should I handle just *Vec*s as local structures with their usual indices (through VecGetArray and VecRestoreArray)? - *P_mult(Mat mat,Vec x, Vec y) *//interpolation matrix that is NOT conceptually the traspose of Restriction in this case *x* and *y* will be from two different levels (respectively L and L+1), so, if I retrieve informations from the *da*... how can I access the two at different levels? I am sorry if it seems that they are trivial questions, and I will be grateful to anyone will help me. Thanks a lot, Valeria --------------------------------------------------------------------------------------------- PhD Valeria Mele University of Naples Federico II Department of Mathematics and Applications "R. Caccioppoli" Complesso Universitario M.S. 
Angelo, Via Cinthia 80126 Naples --------------------------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Thu Jul 28 13:31:07 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 28 Jul 2016 11:31:07 -0700 Subject: [petsc-users] Implementing discontinuous Galerkin FEM? Message-ID: I am trying to implement a discontinuous Galerkin discretization using the PETSc DM features to handle most of the topology/geometry specific functions. However, I'm not really sure which direction to approach this from since DG is kind of a middle ground between finite volume and traditional continuous Galerkin finite element methods. It appears to me that if I want to implement a nodal DG method, then it would be more practical to extend the PetscFE interface, but for a modal DG method perhaps the PetscFV interface is better? There are still a few questions that I don't know the answers to, though. Questions about implementing nodal DG: 1. Does PetscFE support sub/super parametric element types? If so, how do I express the internal node structure for a nodal DG method (say, for example located at the abscissa of a Gauss-Lobatto quadrature scheme)? 2. How would I go about making the dataset stored discontinuous between neighboring elements (specifically at shared nodes for a nodal DG method)? 3. Similar to 2, how would I handle boundary conditions? Specifically, I need a layer of data space of just the boundary nodes (not a complete "ghost" element), and these are the actual constrained points. Questions about implementing modal DG: A. What does specifying the quadrature object for a PetscFV object actually do? Is it purely a surface flux integration quadrature? How does the quadrature object handle simplex-type elements in 2D/3D? B. How would I go about modifying the limiters to take into account these multiple modes? -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Thu Jul 28 17:43:24 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 28 Jul 2016 15:43:24 -0700 Subject: [petsc-users] How to create a quadrature object? Message-ID: I am trying to create a very simple quadrature object, but for some reason PETSc keeps giving me an "invalid argument" error. Relevant code: #include > int main(int argc, char** argv) > { > CHKERRQ(PetscInitialize(&argc, &argv, nullptr, "quadrature testing")); > PetscQuadrature quad; > CHKERRQ(PetscQuadratureCreate(PETSC_COMM_SELF, &quad)); > CHKERRQ(PetscQuadratureDestroy(&quad)); CHKERRQ(PetscFinalize()); > } Error message: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------- > ------------------ > > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Invalid object classid 0 > This could happen if you compile with PETSC_HAVE_DYNAMIC_LIBRARIES, but > link with static libraries. > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for troubleshooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3 > -march=native" CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 > -march=native" --prefix=/usr/local > [0]PETSC ERROR: #1 PetscClassRegLogGetClass() line 290 in > /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c > [0]PETSC ERROR: #2 PetscLogObjCreateDefault() line 317 in > /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c > [0]PETSC ERROR: #3 PetscQuadratureCreate() line 54 in > /home/andrew/tools/petsc/petsc/src/dm/dt/interface/dt.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Invalid object classid 0 > This could happen if you compile with PETSC_HAVE_DYNAMIC_LIBRARIES, but > link with static libraries. > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for troubleshooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3 > -march=native" CXXOPTFLAGS=" > -O3 -march=native" FOPTFLAGS="-O3 -march=native" --prefix=/usr/local > [0]PETSC ERROR: #4 PetscClassRegLogGetClass() line 290 in > /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c > > [0]PETSC ERROR: #5 PetscLogObjCreateDefault() line 317 in > /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c > > [0]PETSC ERROR: #6 PetscQuadratureCreate() line 54 in > /home/andrew/tools/petsc/petsc/src/dm/dt/interface/dt.c > > [0]PETSC ERROR: #7 main() line 5 in /home/user/tests/pquad.cpp > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- I've had no problems using other parts of PETSc with the exact same build options/install (SNES, linear solvers), so I suspect this is likely just user error. I have tried building PETSc both as a shared and static library, and both methods fail in the same way. As a related question, what is the "Order" of quadrature object suppose to be? The documentation for "PetscQuadratureGetOrder" and "PetscQuadratureSetOrder" says this is the highest degree polynomial that is exactly integrated. However, when I looked at the source code for "PetscDTGaussTensorQuadrature", this appears to set the order to the number of quadrature points, which for Gaussian quadrature means it can integrate a 2*npoints-1 polynomial exactly. If I wanted to implement my own quadrature scheme (Gauss-Lobatto) which is exact for polynomials up to 2*npoints-3, what should I set the quadrature order to? -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jul 28 18:35:33 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Jul 2016 16:35:33 -0700 Subject: [petsc-users] How to create a quadrature object? In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 3:43 PM, Andrew Ho wrote: > I am trying to create a very simple quadrature object, but for some reason > PETSc keeps giving me an "invalid argument" error. 
> > Relevant code: > > #include >> int main(int argc, char** argv) >> { >> CHKERRQ(PetscInitialize(&argc, &argv, nullptr, "quadrature testing")); >> PetscQuadrature quad; >> CHKERRQ(PetscQuadratureCreate(PETSC_COMM_SELF, &quad)); >> > CHKERRQ(PetscQuadratureDestroy(&quad)); > > CHKERRQ(PetscFinalize()); >> } > > > Error message: > >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------- >> ------------------ >> >> [0]PETSC ERROR: Invalid argument >> [0]PETSC ERROR: Invalid object classid 0 >> This could happen if you compile with PETSC_HAVE_DYNAMIC_LIBRARIES, but >> link with static libraries. >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for troubleshooting. >> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown >> [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3 >> -march=native" CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 >> -march=native" --prefix=/usr/local >> [0]PETSC ERROR: #1 PetscClassRegLogGetClass() line 290 in >> /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c >> [0]PETSC ERROR: #2 PetscLogObjCreateDefault() line 317 in >> /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c >> [0]PETSC ERROR: #3 PetscQuadratureCreate() line 54 in >> /home/andrew/tools/petsc/petsc/src/dm/dt/interface/dt.c >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Invalid argument >> [0]PETSC ERROR: Invalid object classid 0 >> This could happen if you compile with PETSC_HAVE_DYNAMIC_LIBRARIES, but >> link with static libraries. >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for troubleshooting. >> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown >> [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3 >> -march=native" CXXOPTFLAGS=" >> -O3 -march=native" FOPTFLAGS="-O3 -march=native" --prefix=/usr/local >> [0]PETSC ERROR: #4 PetscClassRegLogGetClass() line 290 in >> /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c >> >> [0]PETSC ERROR: #5 PetscLogObjCreateDefault() line 317 in >> /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c >> >> [0]PETSC ERROR: #6 PetscQuadratureCreate() line 54 in >> /home/andrew/tools/petsc/petsc/src/dm/dt/interface/dt.c >> >> [0]PETSC ERROR: #7 main() line 5 in /home/user/tests/pquad.cpp >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire >> error message to petsc-maint at mcs.anl.gov---------- > > > I've had no problems using other parts of PETSc with the exact same build > options/install (SNES, linear solvers), so I suspect this is likely just > user error. I have tried building PETSc both as a shared and static > library, and both methods fail in the same way. > This is strange. First, you should only have uninitialized classids if you build without dynamics libraries. Did you? If you include the entire error output, it would show us. If so, then PetscInitialize should set the value of this classid, unless you built this with PETSC_HAVE_DYNAMIC_LIBRARIES, but linked against static libraries, as the error message says. Do you have multiple versions of PETSc on your machine? > As a related question, what is the "Order" of quadrature object suppose to > be? 
The documentation for "PetscQuadratureGetOrder" and > "PetscQuadratureSetOrder" says this is the highest degree polynomial that > is exactly integrated. > That is what it is supposed to be, but its not. We are changing this now. However, at the moment, order just means the number of points/dim used to define the quadrature. > However, when I looked at the source code for > "PetscDTGaussTensorQuadrature", this appears to set the order to the number > of quadrature points, which for Gaussian quadrature means it can integrate > a 2*npoints-1 polynomial exactly. > Agree completely. > If I wanted to implement my own quadrature scheme (Gauss-Lobatto) which is > exact for polynomials up to 2*npoints-3, what should I set the quadrature > order to? > I would set it to be the number of points now, since that is what should be fed to the low level routines. Then we will go through and complete a higher layer that translates an order request into a number of points for each type. Matt > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Thu Jul 28 21:08:49 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 28 Jul 2016 19:08:49 -0700 Subject: [petsc-users] How to create a quadrature object? In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 4:35 PM, Matthew Knepley wrote: > > This is strange. First, you should only have uninitialized classids if you > build without dynamics libraries. Did you? If you > include the entire error output, it would show us. > > This is pretty much the entire error output; I trimmed off the ending "MPI Abort called" message (it's the standard message printed by OpenMPI when MPIAbort is called). The code snippet I had is a complete working example which is able to give me this error. > If so, then PetscInitialize should set the value of this classid, unless > you built this with PETSC_HAVE_DYNAMIC_LIBRARIES, > but linked against static libraries, as the error message says. Do you > have multiple versions of PETSc on your machine? > I have a single install of PETSc at any one time. I've tried builds with --with-shared-libraries on and off with the same results (I made sure to purge any previous install first). Here's a diff if you want to test the same configs I tried running. Just run *make ex4* in the src/dm/dt/examples/tests folder Configure command: ./configure --with-debugging=0 COPTFLAGS="-O3 -march=native" CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native" -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: quad_test.patch Type: text/x-patch Size: 1134 bytes Desc: not available URL: From knepley at gmail.com Thu Jul 28 22:58:45 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Jul 2016 20:58:45 -0700 Subject: [petsc-users] How to create a quadrature object? In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 7:08 PM, Andrew Ho wrote: > > On Thu, Jul 28, 2016 at 4:35 PM, Matthew Knepley > wrote: > >> >> This is strange. First, you should only have uninitialized classids if >> you build without dynamics libraries. Did you? If you >> include the entire error output, it would show us. 
>> >> > This is pretty much the entire error output; I trimmed off the ending "MPI > Abort called" message (it's the standard message printed by OpenMPI when > MPIAbort is called). The code snippet I had is a complete working example > which is able to give me this error. > > >> If so, then PetscInitialize should set the value of this classid, unless >> you built this with PETSC_HAVE_DYNAMIC_LIBRARIES, >> but linked against static libraries, as the error message says. Do you >> have multiple versions of PETSc on your machine? >> > > I have a single install of PETSc at any one time. I've tried builds with > --with-shared-libraries on and off with the same results (I made sure to > purge any previous install first). Here's a diff if you want to test the > same configs I tried running. > Crap. We changed the way initialization works, but this was left behind. You can make this change diff --git a/src/dm/dt/interface/dt.c b/src/dm/dt/interface/dt.c index 5d12959..d6f3454 100644 --- a/src/dm/dt/interface/dt.c +++ b/src/dm/dt/interface/dt.c @@ -50,7 +50,7 @@ PetscErrorCode PetscQuadratureCreate(MPI_Comm comm, PetscQuadrature *q) PetscFunctionBegin; PetscValidPointer(q, 2); - ierr = DMInitializePackage();CHKERRQ(ierr); + ierr = PetscSysInitializePackage();CHKERRQ(ierr); ierr = PetscHeaderCreate(*q,PETSC_OBJECT_CLASSID,"PetscQuadrature","Quadrature","DT",comm,PetscQuadratureDestroy,PetscQuadratureView);CHKERRQ(ierr); (*q)->dim = -1; (*q)->order = -1; and then rebuild cd $PETSC_DIR make -f ./gmakefile and try running your example again. I will get this change in. Thanks, Matt > Just run *make ex4* in the src/dm/dt/examples/tests folder > > Configure command: > > ./configure --with-debugging=0 COPTFLAGS="-O3 -march=native" > CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native" > > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 29 09:41:19 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 09:41:19 -0500 Subject: [petsc-users] vec norm for local portion of a vector In-Reply-To: References: Message-ID: <79191996-1FC0-4DFE-B1EB-33A29A5D8CA2@mcs.anl.gov> > On Jul 27, 2016, at 4:42 PM, Xiangdong wrote: > > Hello everyone, > > I have a global dmda vector vg. On each processor, if I want to know the norm of local portion of vg, which function should I call? > > So far I am thinking of using DMDAVecGetArray and then write a loop to compute the norm of this local array. > > Is there a simple function available to call? like *vg->ops->norm_local(vg,NORM_2, &normlocal)? There isn't a public interface to this call because it really isn't a mathematically well defined object; the subdomains in the decomposition of the array are arbitrary based on the number of processes used. Anyways if you want it and it is the NON-overlapping portion then yes, you can write a little routine (basically just cut and paste VecNorm()) call it say VecNormLocal() and have it call the function pointer you indicated above. Note for the 2 norm the norm_local() returns the square of the norm so you need to take the square root. If you want the overlapping portion of the vector then you should just do the DMDAVecGetArray() as you already do. Barry > > Thanks. 
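(For reference, a minimal sketch of the little routine described above. It assumes the PETSc 3.7 private header layout, where the per-type local-norm kernel is reachable as x->ops->norm_local through petsc/private/vecimpl.h; the name VecNormLocal and the square-root adjustment for NORM_2 are taken from the explanation above, so treat this as an illustration of the cut-and-paste approach, not a supported PETSc interface.)

#include <petsc/private/vecimpl.h>   /* private header: exposes x->ops->norm_local */

/* Norm of the process-local (non-overlapping) portion of a parallel Vec,
   modeled on VecNorm(). Assumes a vector type that implements the
   norm_local kernel (e.g. the MPI vectors created by a DMDA).          */
PetscErrorCode VecNormLocal(Vec x,NormType type,PetscReal *val)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  PetscValidHeaderSpecific(x,VEC_CLASSID,1);
  ierr = (*x->ops->norm_local)(x,type,val);CHKERRQ(ierr);
  if (type == NORM_2) *val = PetscSqrtReal(*val);  /* local kernel returns the square for NORM_2 */
  PetscFunctionReturn(0);
}

It is called the same way as VecNorm(), e.g. VecNormLocal(vg,NORM_2,&nrm) on each process after vg has been assembled.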
> > Best, > Xiangdong > From bsmith at mcs.anl.gov Fri Jul 29 09:49:29 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 09:49:29 -0500 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> Message-ID: > On Jul 28, 2016, at 2:35 AM, Tim Steinhoff wrote: > > 2016-07-27 21:42 GMT+02:00 Barry Smith : >> >> Actually there is currently no way to PetscInitialize from Fortran without adding the command line options to the database. In the middle >> of petscinitialize_() is the code fragment >> >> PETScParseFortranArgs_Private(&PetscGlobalArgc,&PetscGlobalArgs); >> FIXCHAR(filename,len,t1); >> *ierr = PetscOptionsInsert(NULL,&PetscGlobalArgc,&PetscGlobalArgs,t1); >> >> We'll need to do a bit of code refactoring to provide a Fortran petscinitializenoarguments_(). The simplest way to refactor would be to change the name of petscinitialize_ to say PetscInitializeFortran_Internal() and add a bool argument whether to process the arguments and then write two trivial routines petscinitialize_ that calls the new routine with PETSC_TRUE and petscinitializenoarguments_() that calls it with PETSC_FALSE. > > Thanks Barry. It would be really nice if PETSc comes with that feature > in future, because I would prefer not to make any changes to the PETSc > code that disappear with every new PETSc release. Understood. You could make a pull request with your changes https://bitbucket.org/petsc/petsc/wiki/pull-request-instructions-git otherwise I will add it but it will take a few days since I am backlogged. Barry > >> >> Barry >> >> Of course you can have a C/C++ main routine that calls PetscInitializeNoArguments(); followed by PetscInitializeFortran() and then have the bulk of your code in Fortran. > That would work, but we have a rather large fortran code without any > C. So, for now we will probably stick to your first approach and keep > our code fotran only. > > Thanks again, > Volker > > >> >> >>> On Jul 27, 2016, at 10:55 AM, Tim Steinhoff wrote: >>> >>> 2016-07-27 16:04 GMT+02:00 Matthew Knepley : >>>> On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff >>>> wrote: >>>>> >>>>> Hi all, >>>>> >>>>> we coupled PETSc with our fortran code. Is there any way to let PETSc >>>>> (PetscInitialize) ignore all arguments passed by the command line? >>>>> Since our code is controlled by command line arguements as well, it >>>>> leads to a mess, when those arguments are read twice. >>>> >>>> >>>> 1) You can use PetscInitializeNoArguments() >>> >>> Thanks! I thought that function was for C/C++ only. >>> >>>> >>>> 2) What goes wrong? PETSc should just ignore any options it does not >>>> recognize. >>> >>> >>> The problem is that our code uses the same or similar argument names >>> as PETSc does and our end user should not have access to all petsc >>> options. >>> >>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> >>>>> Thanks and kind regards, >>>>> >>>>> Volker >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments >>>> is infinitely more interesting than any results to which their experiments >>>> lead. 
>>>> -- Norbert Wiener >> From bsmith at mcs.anl.gov Fri Jul 29 10:29:34 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 10:29:34 -0500 Subject: [petsc-users] Scheduled Relaxation Jacobi method In-Reply-To: References: <40FF0CE8-7589-4631-AB5B-0F4AF5205C99@gmail.com> <3EFDE37C-513C-4894-9859-BA7DCA70A760@mcs.anl.gov> <8DFDAF89-70EA-4245-967B-3C0D6A7E136F@gmail.com> Message-ID: Make sure you always respond to petsc-users so the email doesn't just get sent to me. Someone else would have already helped you. Ahh, the problem is we copy the scale value out of the KSP object at the beginning of the routine into a local variable so it remains the same value even though you correctly change the value in the KSP object. I have changed the Richardson code in the maint and master branch so if you use it there the scaling will work as you desire. (Just follow the git instructions at http://www.mcs.anl.gov/petsc/download/index.html for obtaining PETSc.) Barry > On Jul 28, 2016, at 9:24 AM, JC wrote: > > Hej, > > I have tried to use kspMonitorSet to change the value of the scale in the Richardson iteration, however it does not seem to take it into account when actually solving the problem. Here is my monitoring routine. relaxation is an array declared in my module. If I do print *, relaxation(ind), then the correct value is printed. It is not passed to the KSP framework however despite the call to ksprichardsonsetscale. Any idea why? > > Thanks a lot, > JC > > > subroutine MyKSPMonitor(solver, iter, dummy_1, dummy_2, ierr) > > !----- Inputs -----! > > KSP, intent(inout) :: solver > PetscInt, intent(in) :: iter > PetscReal, intent(in) :: dummy_1 > PetscInt, intent(in) :: dummy_2 > > !----- Output -----! > > PetscErrorCode, intent(out) :: ierr > > !----- Miscellaneous -----! > > PetscInt :: ind > PetscReal :: weight > > ind = mod(iter, max_srj) > weight = relaxation(ind) > call ksprichardsonsetscale(solver, weight, ierr) > if (ierr/=0) call abort(ierr, 'Failed to set the relaxation weights.', nrank) > ierr = 0 > > return > end subroutine MyKSPMonitor > > >> On 27 Jul 2016, at 18:11, Barry Smith wrote: >> >> >>> On Jul 27, 2016, at 6:27 AM, JC wrote: >>> >>>> The dependence on grid size is easy. >>> >>> Knowing the grid size, I have a list of files I can read to load the correct relaxation weights. This is indeed the easy part. >>> >>>> By depends on the iteration do you mean the linear iteration, as in the first iteration you use .1 then in the second you use .2 etc? >>> >>> That is exactly what I meant. At the first iteration of the solver (i.e. the first matrix-vector product), the relaxation is say omega_1. At the second iteration, it is omega_2, so on so forth. >>> >>>> To do this use KSPSetMonitor() and have your monitor routine call KSPRichardsonSetScale() with the value you like which can depend on the iteration. >>> >>> So basically, KSPSetMonitor() allows me to define a callback procedure that will be executed at the end of each iteration of the KSP solver, right? 
>> >> Yes >> >>> >>> Thanks a lot, >>> JC >> > From bsmith at mcs.anl.gov Fri Jul 29 10:39:02 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 10:39:02 -0500 Subject: [petsc-users] Comprehensive example for KSPRegister() In-Reply-To: References: Message-ID: <3D74EA20-9D78-4730-A088-CC2EBEDF32F2@mcs.anl.gov> > On Jul 28, 2016, at 12:22 PM, Matthew Knepley wrote: > > On Thu, Jul 28, 2016 at 9:07 AM, JC wrote: > Hey everyone, > > I was wondering if any of you had a comprehensive example of KSPRegister() to create our own KSP solver in fortran? > I have tried to look online but have not been able to find it. > > We do not currently have one in Fortran, although we do in C. Basically you would copy the file src/ksp/ksp/impls/cg/cg.c and replace the bodies of each of the methods (KSPSolve_CG etc) with calls to your Fortran routines that implement each method. Given the large number of KSP methods we already have implemented if what you want to implement is a variation of what we already have it would likely require much less new code if you worked in C and simple derived off a subclass of an already implemented class where you made the the changes. Barry > > Thanks, > > Matt > > Thanks a lot, > JC > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From jshen25 at jhu.edu Fri Jul 29 11:46:54 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Fri, 29 Jul 2016 12:46:54 -0400 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver Message-ID: Dear PETSC developers, Thank you for developing such a powerful tool for scientific computations. I'm currently trying to run a simple cantilever beam FEM to test the scalability of PETSC on multi-processors. I also want to verify whether iterative solver or direct solver is more efficient for parallel large FEM problem. Problem description, An Euler elementary cantilever beam with point load at the end along -y direction. Each node has 2 DOF (deflection and rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based on the connectivity. Loop with elements in each processor to assemble the global matrix with same element stiffness matrix. The boundary condition is set using call MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); Based on what I have done, I find the computations work well, i.e the results are correct compared with theoretical solution, for small mesh size (small than 5000 elements) using both solvers with different numbers of processes. However, there are several confusing issues when I increase the mesh size to 10000 and more elements with iterative solve(CG + PCBJACOBI) 1. For 10k elements, I can get accurate solution using iterative solver with uni-processor(i.e. only one process). However, when I use 2-8 processes, it tells the linear solver converged with different iterations, but, the results are all different for different processes and erroneous. The wired thing is when I use >9 processes, the results are correct again. I am really confused by this. Could you explain me why? If my parallelization is not correct, why it works for small cases? And I check the global matrix and RHS vector and didn't see any mallocs during the process. 2. For 30k elements, if I use one process, it says: Linear solve did not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for large sparse matrix? 
If so, is there any stable solver or pc for large problem? For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can only get accuracy when the number of elements are below 5000. There must be something wrong. The way I use the superlu_dist solver is first convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the sequential version of the same problem. The results shows that iterative solver works well for <50k elements, while SUPER_LU only gets right solution below 5k elements. Can I say iterative solver is better than SUPER_LU for large problem? How can I improve the solver to copy with very large problem, such as million by million? Another thing is it's still doubtable of performance of SUPER_LU. For the inaccuracy issue, do you think it may be due to the memory? However, there is no memory error showing during the execution. I really appreciate someone could resolve those puzzles above for me. My goal is to replace the current SUPER_LU solver in my parallel CPFEM main program with the iterative solver using PETSC. Please let me if you would like to see my code in detail. Thank you very much. Bests, Jinlei -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Fri Jul 29 11:54:02 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Fri, 29 Jul 2016 09:54:02 -0700 Subject: [petsc-users] How to create a quadrature object? In-Reply-To: References: Message-ID: Thanks, this fix works. On Thu, Jul 28, 2016 at 8:58 PM, Matthew Knepley wrote: > On Thu, Jul 28, 2016 at 7:08 PM, Andrew Ho wrote: > >> >> On Thu, Jul 28, 2016 at 4:35 PM, Matthew Knepley >> wrote: >> >>> >>> This is strange. First, you should only have uninitialized classids if >>> you build without dynamics libraries. Did you? If you >>> include the entire error output, it would show us. >>> >>> >> This is pretty much the entire error output; I trimmed off the ending >> "MPI Abort called" message (it's the standard message printed by OpenMPI >> when MPIAbort is called). The code snippet I had is a complete working >> example which is able to give me this error. >> >> >>> If so, then PetscInitialize should set the value of this classid, unless >>> you built this with PETSC_HAVE_DYNAMIC_LIBRARIES, >>> but linked against static libraries, as the error message says. Do you >>> have multiple versions of PETSc on your machine? >>> >> >> I have a single install of PETSc at any one time. I've tried builds with >> --with-shared-libraries on and off with the same results (I made sure to >> purge any previous install first). Here's a diff if you want to test the >> same configs I tried running. >> > > Crap. We changed the way initialization works, but this was left behind. 
> You can make this change > > diff --git a/src/dm/dt/interface/dt.c b/src/dm/dt/interface/dt.c > index 5d12959..d6f3454 100644 > --- a/src/dm/dt/interface/dt.c > +++ b/src/dm/dt/interface/dt.c > @@ -50,7 +50,7 @@ PetscErrorCode PetscQuadratureCreate(MPI_Comm comm, > PetscQuadrature *q) > > PetscFunctionBegin; > PetscValidPointer(q, 2); > - ierr = DMInitializePackage();CHKERRQ(ierr); > + ierr = PetscSysInitializePackage();CHKERRQ(ierr); > ierr = > PetscHeaderCreate(*q,PETSC_OBJECT_CLASSID,"PetscQuadrature","Quadrature","DT",comm,PetscQuadratureDestroy,PetscQuadratureView);CHKERRQ(ierr); > (*q)->dim = -1; > (*q)->order = -1; > > and then rebuild > > cd $PETSC_DIR > make -f ./gmakefile > > and try running your example again. > > I will get this change in. > > Thanks, > > Matt > > >> Just run *make ex4* in the src/dm/dt/examples/tests folder >> >> Configure command: >> >> ./configure --with-debugging=0 COPTFLAGS="-O3 -march=native" >> CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native" >> >> -- >> Andrew Ho >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 29 12:19:11 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 12:19:11 -0500 Subject: [petsc-users] PCMG with matrix-free operators accessing DMDA In-Reply-To: References: Message-ID: <4706BBEA-9216-4973-8CAE-E7DCCD9F0F89@mcs.anl.gov> > On Jul 28, 2016, at 12:48 PM, Valeria Mele wrote: > > Hi everyone, > this time I am using PETSc to do something that is more complicated than my usual and I want to do it at the highest possible abstraction level. > > To put it in a nutshell, my intent is to build a parallel multigrid to solve a linear system via DM, KSP and PCMG (I would like to use DMMG but probably I should have the same problems or more). DMMG doesn't exist anymore. It was refactored away many years ago, its functionality is handled by PCMG and DM. > > I created the distributed object, da, with DMDACreate3d, even if it is distributed (as yet) only in the x-dimension and has 3 dof. > Then I create the KSP (type KSPRICHARDSON) and set the nonzero initial guess and PCMG as preconditioner. Here I start to tune the MG. > > The point is that I need to define all the operators as matrix-free, since they will do several operations on x to obtain y, and I am not familiar with the way to access all the elements or informations in the two levels involved and/or among the processors with a so-high level interface. > > So please (please please please please) tell me if I correctly understand the mechanism or I am on the wrong way and clear my doubts. > > That is, let's say that my operation for the shell are: > ? A_mult(Mat mat,Vec x, Vec y) //coefficients matrix > in this case the level is only one but should I write it taking into account only the local data (I think so) and accessing them via the informations in da? Yes. You can use VecGetDM(x,&da) to get the DMDA object > > For example, if I use DMDAVecGetArray, DMDAVecGetCorners (or DMDAGetGhostCorners) and DMDAVecRestoreArray, will they retrieve informations from the right level each time (I am pretty sure that in some official examples it is done in this way)? > > Or should I handle just Vecs as local structures with their usual indices (through VecGetArray and VecRestoreArray)? 
No, no, no because then you would need to mange all the structured grid information yourself, since the DMDA manages it for you you should use it. > ? P_mult(Mat mat,Vec x, Vec y) //interpolation matrix that is NOT conceptually the traspose of Restriction > in this case x and y will be from two different levels (respectively L and L+1), so, if I retrieve informations from the da... how can I access the two at different levels? Use VecGetDM(x, and VecGetDM(y to get access to both DMDA. > > I am sorry if it seems that they are trivial questions, and I will be grateful to anyone will help me. Additional information. Since the PCMG will be requesting the matrices and the interpolation/restriction operations (rather than you setting them into each level of multigrid) you will need to use DMShellSetCreateMatrix() and DMShellSetCreateInterpolation() and DMShellSetCreateRestriction() to provide the routines that will create the Shell matrices you need to represent the operators on the levels and the restriction and interpolation (Even though you are using a DMDA you can still call these routines). Barry > > Thanks a lot, > Valeria > > > > > > > --------------------------------------------------------------------------------------------- > PhD Valeria Mele > > University of Naples Federico II > Department of Mathematics and Applications "R. Caccioppoli" > Complesso Universitario M.S. Angelo, Via Cinthia > 80126 Naples > --------------------------------------------------------------------------------------------- From bsmith at mcs.anl.gov Fri Jul 29 13:09:46 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 13:09:46 -0500 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: Message-ID: First run under valgrind all the cases to make sure there is not some use of uninitialized data or overwriting of data. Go to http://www.mcs.anl.gov/petsc follow the link to FAQ and search for valgrind (the web server seems to be broken at the moment). Second it is possible that your code the assembles the matrices and vectors is not correctly assembling it for either the sequential or parallel case. Hence a different number of processes could be generating a different linear system hence inconsistent results. How are you handling the parallelism? How do you know the matrix generated in parallel is identically to that sequentially? Simple preconditioners such as pbjacobi will converge slower and slower with more elements. Note that you should run with -ksp_monitor_true_residual and -ksp_converged_reason to make sure that the iterative solver is even converging. By default PETSc KSP solvers do not stop with a big error message if they do not converge so you need make sure they are always converging. Barry > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: > > Dear PETSC developers, > > Thank you for developing such a powerful tool for scientific computations. > > I'm currently trying to run a simple cantilever beam FEM to test the scalability of PETSC on multi-processors. I also want to verify whether iterative solver or direct solver is more efficient for parallel large FEM problem. > > Problem description, An Euler elementary cantilever beam with point load at the end along -y direction. Each node has 2 DOF (deflection and rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based on the connectivity. Loop with elements in each processor to assemble the global matrix with same element stiffness matrix. 
The boundary condition is set using call MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); > > Based on what I have done, I find the computations work well, i.e the results are correct compared with theoretical solution, for small mesh size (small than 5000 elements) using both solvers with different numbers of processes. > > However, there are several confusing issues when I increase the mesh size to 10000 and more elements with iterative solve(CG + PCBJACOBI) > > 1. For 10k elements, I can get accurate solution using iterative solver with uni-processor(i.e. only one process). However, when I use 2-8 processes, it tells the linear solver converged with different iterations, but, the results are all different for different processes and erroneous. The wired thing is when I use >9 processes, the results are correct again. I am really confused by this. Could you explain me why? If my parallelization is not correct, why it works for small cases? And I check the global matrix and RHS vector and didn't see any mallocs during the process. > > 2. For 30k elements, if I use one process, it says: Linear solve did not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for large sparse matrix? If so, is there any stable solver or pc for large problem? > > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can only get accuracy when the number of elements are below 5000. There must be something wrong. The way I use the superlu_dist solver is first convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? > > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the sequential version of the same problem. The results shows that iterative solver works well for <50k elements, while SUPER_LU only gets right solution below 5k elements. Can I say iterative solver is better than SUPER_LU for large problem? How can I improve the solver to copy with very large problem, such as million by million? Another thing is it's still doubtable of performance of SUPER_LU. > > For the inaccuracy issue, do you think it may be due to the memory? However, there is no memory error showing during the execution. > > I really appreciate someone could resolve those puzzles above for me. My goal is to replace the current SUPER_LU solver in my parallel CPFEM main program with the iterative solver using PETSC. > > > Please let me if you would like to see my code in detail. > > Thank you very much. > > Bests, > Jinlei > > > > > > > From valeria.mele at unina.it Fri Jul 29 13:58:53 2016 From: valeria.mele at unina.it (Valeria Mele) Date: Fri, 29 Jul 2016 13:58:53 -0500 Subject: [petsc-users] PCMG with matrix-free operators accessing DMDA In-Reply-To: <4706BBEA-9216-4973-8CAE-E7DCCD9F0F89@mcs.anl.gov> References: <4706BBEA-9216-4973-8CAE-E7DCCD9F0F89@mcs.anl.gov> Message-ID: Thank you very much Barry. Apparently I missed many things about "DMShell..." that I didn't find in the current users manual, and I was trying to create the operators through matCreateShell() and MatShellSetOperation(). If I use "DMShellSetCreate..." to define the matrices I shouldn't have any doubt about the da to refer to. Now I can go on. Thank you again. Best, Valeria --------------------------------------------------------------------------------------------- PhD Valeria Mele University of Naples Federico II Department of Mathematics and Applications "R. Caccioppoli" Complesso Universitario M.S. 
Angelo, Via Cinthia 80126 Naples --------------------------------------------------------------------------------------------- 2016-07-29 12:19 GMT-05:00 Barry Smith : > > > On Jul 28, 2016, at 12:48 PM, Valeria Mele > wrote: > > > > Hi everyone, > > this time I am using PETSc to do something that is more complicated than > my usual and I want to do it at the highest possible abstraction level. > > > > To put it in a nutshell, my intent is to build a parallel multigrid to > solve a linear system via DM, KSP and PCMG (I would like to use DMMG but > probably I should have the same problems or more). > > DMMG doesn't exist anymore. It was refactored away many years ago, its > functionality is handled by PCMG and DM. > > > > > > I created the distributed object, da, with DMDACreate3d, even if it is > distributed (as yet) only in the x-dimension and has 3 dof. > > Then I create the KSP (type KSPRICHARDSON) and set the nonzero initial > guess and PCMG as preconditioner. Here I start to tune the MG. > > > > The point is that I need to define all the operators as matrix-free, > since they will do several operations on x to obtain y, and I am not > familiar with the way to access all the elements or informations in the two > levels involved and/or among the processors with a so-high level interface. > > > > So please (please please please please) tell me if I correctly > understand the mechanism or I am on the wrong way and clear my doubts. > > > > That is, let's say that my operation for the shell are: > > ? A_mult(Mat mat,Vec x, Vec y) //coefficients matrix > > in this case the level is only one but should I write it taking into > account only the local data (I think so) and accessing them via the > informations in da? > > Yes. You can use VecGetDM(x,&da) to get the DMDA object > > > > > For example, if I use DMDAVecGetArray, DMDAVecGetCorners (or > DMDAGetGhostCorners) and DMDAVecRestoreArray, will they retrieve > informations from the right level each time (I am pretty sure that in some > official examples it is done in this way)? > > > > Or should I handle just Vecs as local structures with their usual > indices (through VecGetArray and VecRestoreArray)? > > No, no, no because then you would need to mange all the structured grid > information yourself, since the DMDA manages it for you you should use it. > > > ? P_mult(Mat mat,Vec x, Vec y) //interpolation matrix that is NOT > conceptually the traspose of Restriction > > in this case x and y will be from two different levels (respectively L > and L+1), so, if I retrieve informations from the da... how can I access > the two at different levels? > > Use VecGetDM(x, and VecGetDM(y to get access to both DMDA. > > > > > I am sorry if it seems that they are trivial questions, and I will be > grateful to anyone will help me. > > Additional information. Since the PCMG will be requesting the matrices > and the interpolation/restriction operations (rather than you setting them > into each level of multigrid) you will need to use DMShellSetCreateMatrix() > and DMShellSetCreateInterpolation() and DMShellSetCreateRestriction() to > provide the routines that will create the Shell matrices you need to > represent the operators on the levels and the restriction and interpolation > (Even though you are using a DMDA you can still call these routines). 
> > Barry > > > > > Thanks a lot, > > Valeria > > > > > > > > > > > > > > > --------------------------------------------------------------------------------------------- > > PhD Valeria Mele > > > > University of Naples Federico II > > Department of Mathematics and Applications "R. Caccioppoli" > > Complesso Universitario M.S. Angelo, Via Cinthia > > 80126 Naples > > > --------------------------------------------------------------------------------------------- > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 29 16:22:51 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 16:22:51 -0500 Subject: [petsc-users] PCMG with matrix-free operators accessing DMDA In-Reply-To: References: <4706BBEA-9216-4973-8CAE-E7DCCD9F0F89@mcs.anl.gov> Message-ID: > On Jul 29, 2016, at 1:58 PM, Valeria Mele wrote: > > Thank you very much Barry. > Apparently I missed many things about "DMShell..." that I didn't find in the current users manual, and I was trying to create the operators through matCreateShell() and MatShellSetOperation(). You do need to use MatCreateShell and MatShellSetOperation()! But you need to call these from within the DMShellSetCreateMatrix() and interpolation/restriction routines that you provide. Barry > If I use "DMShellSetCreate..." to define the matrices I shouldn't have any doubt about the da to refer to. > > Now I can go on. > Thank you again. > > Best, > Valeria > > > --------------------------------------------------------------------------------------------- > PhD Valeria Mele > > University of Naples Federico II > Department of Mathematics and Applications "R. Caccioppoli" > Complesso Universitario M.S. Angelo, Via Cinthia > 80126 Naples > --------------------------------------------------------------------------------------------- > > 2016-07-29 12:19 GMT-05:00 Barry Smith : > > > On Jul 28, 2016, at 12:48 PM, Valeria Mele wrote: > > > > Hi everyone, > > this time I am using PETSc to do something that is more complicated than my usual and I want to do it at the highest possible abstraction level. > > > > To put it in a nutshell, my intent is to build a parallel multigrid to solve a linear system via DM, KSP and PCMG (I would like to use DMMG but probably I should have the same problems or more). > > DMMG doesn't exist anymore. It was refactored away many years ago, its functionality is handled by PCMG and DM. > > > > > > I created the distributed object, da, with DMDACreate3d, even if it is distributed (as yet) only in the x-dimension and has 3 dof. > > Then I create the KSP (type KSPRICHARDSON) and set the nonzero initial guess and PCMG as preconditioner. Here I start to tune the MG. > > > > The point is that I need to define all the operators as matrix-free, since they will do several operations on x to obtain y, and I am not familiar with the way to access all the elements or informations in the two levels involved and/or among the processors with a so-high level interface. > > > > So please (please please please please) tell me if I correctly understand the mechanism or I am on the wrong way and clear my doubts. > > > > That is, let's say that my operation for the shell are: > > ? A_mult(Mat mat,Vec x, Vec y) //coefficients matrix > > in this case the level is only one but should I write it taking into account only the local data (I think so) and accessing them via the informations in da? > > Yes. 
You can use VecGetDM(x,&da) to get the DMDA object > > > > > For example, if I use DMDAVecGetArray, DMDAVecGetCorners (or DMDAGetGhostCorners) and DMDAVecRestoreArray, will they retrieve informations from the right level each time (I am pretty sure that in some official examples it is done in this way)? > > > > Or should I handle just Vecs as local structures with their usual indices (through VecGetArray and VecRestoreArray)? > > No, no, no because then you would need to mange all the structured grid information yourself, since the DMDA manages it for you you should use it. > > > ? P_mult(Mat mat,Vec x, Vec y) //interpolation matrix that is NOT conceptually the traspose of Restriction > > in this case x and y will be from two different levels (respectively L and L+1), so, if I retrieve informations from the da... how can I access the two at different levels? > > Use VecGetDM(x, and VecGetDM(y to get access to both DMDA. > > > > > I am sorry if it seems that they are trivial questions, and I will be grateful to anyone will help me. > > Additional information. Since the PCMG will be requesting the matrices and the interpolation/restriction operations (rather than you setting them into each level of multigrid) you will need to use DMShellSetCreateMatrix() and DMShellSetCreateInterpolation() and DMShellSetCreateRestriction() to provide the routines that will create the Shell matrices you need to represent the operators on the levels and the restriction and interpolation (Even though you are using a DMDA you can still call these routines). > > Barry > > > > > Thanks a lot, > > Valeria > > > > > > > > > > > > > > --------------------------------------------------------------------------------------------- > > PhD Valeria Mele > > > > University of Naples Federico II > > Department of Mathematics and Applications "R. Caccioppoli" > > Complesso Universitario M.S. Angelo, Via Cinthia > > 80126 Naples > > --------------------------------------------------------------------------------------------- > > > From C.Klaij at marin.nl Sat Jul 30 09:41:58 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Sat, 30 Jul 2016 14:41:58 +0000 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: References: Message-ID: <1469889718285.98025@marin.nl> Anyone? (my guess is an if-statement, something like "if type nest then setup nest"...) > Date: Thu, 28 Jul 2016 08:38:54 +0000 > From: "Klaij, Christiaan" > To: "petsc-users at mcs.anl.gov" > Subject: [petsc-users] block matrix without MatCreateNest > Message-ID: <1469695134232.97712 at marin.nl> > Content-Type: text/plain; charset="utf-8" > > I'm trying to understand how to assemble a block matrix in a > format-independent manner, so that I can switch between types > mpiaij and matnest. > > The manual states that the key to format-independent assembly is > to use MatGetLocalSubMatrix. So, in the code below, I'm using > this to assemble a 3-by-3 block matrix A and setting the diagonal > of block A02. This seems to work for type mpiaij, but not for > type matnest. What am I missing? > > Chris > > > $ cat mattry.F90 > program mattry > > use petscksp > implicit none > #include > > PetscInt :: n=4 ! 
setting 4 cells per process > > PetscErrorCode :: ierr > PetscInt :: size,rank,i > Mat :: A,A02 > IS :: isg0,isg1,isg2 > IS :: isl0,isl1,isl2 > ISLocalToGlobalMapping :: map > > integer, allocatable, dimension(:) :: idx > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) > > ! local index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1, i=1,n) /) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) > ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! global index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1+rank*3*n, i=1,n) /) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); CHKERRQ(ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); CHKERRQ(ierr) > ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! local-to-global mapping > allocate(idx(3*n)) > idx=(/ (i-1+rank*3*n, i=1,3*n) /) > call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) > ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! create the 3-by-3 block matrix > call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) > call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) > ! call MatSetType(A,MATNEST,ierr); CHKERRQ(ierr) > call MatSetUp(A,ierr); CHKERRQ(ierr) > call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) > call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) > call MatSetFromOptions(A,ierr); CHKERRQ(ierr) > > ! set diagonal of block A02 to 0.65 > call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > do i=1,n > call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) > end do > call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > > ! verify > call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) > call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) > > call PetscFinalize(ierr) > > end program mattry > > $ mpiexec -n 2 ./mattry -A_mat_type mpiaij > Mat Object: 2 MPI processes > type: mpiaij > row 0: (0, 0.65) > row 1: (1, 0.65) > row 2: (2, 0.65) > row 3: (3, 0.65) > row 4: (4, 0.65) > row 5: (5, 0.65) > row 6: (6, 0.65) > row 7: (7, 0.65) > > $ mpiexec -n 2 ./mattry -A_mat_type nest > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Pointer: Parameter # 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 > [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 > [0]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Pointer: Parameter # 3 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 > [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 > [1]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c > #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD > with errorcode 85. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [lin0322.marin.local:11985] 1 more process has sent help message help-mpi-api.txt / mpi-abort > [lin0322.marin.local:11985] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > $ > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm > dr. ir. 
Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/Joint-Industry-Project-LifeLine-kicks-off.htm From knepley at gmail.com Sat Jul 30 10:02:21 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 30 Jul 2016 10:02:21 -0500 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1469695134232.97712@marin.nl> References: <1469695134232.97712@marin.nl> Message-ID: On Thu, Jul 28, 2016 at 3:38 AM, Klaij, Christiaan wrote: > I'm trying to understand how to assemble a block matrix in a > format-independent manner, so that I can switch between types > mpiaij and matnest. > > The manual states that the key to format-independent assembly is > to use MatGetLocalSubMatrix. So, in the code below, I'm using > this to assemble a 3-by-3 block matrix A and setting the diagonal > of block A02. This seems to work for type mpiaij, but not for > type matnest. What am I missing? > > Chris > > > $ cat mattry.F90 > program mattry > > use petscksp > implicit none > #include > > PetscInt :: n=4 ! setting 4 cells per process > > PetscErrorCode :: ierr > PetscInt :: size,rank,i > Mat :: A,A02 > IS :: isg0,isg1,isg2 > IS :: isl0,isl1,isl2 > ISLocalToGlobalMapping :: map > > integer, allocatable, dimension(:) :: idx > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) > > ! local index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1, i=1,n) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) > ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! global index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1+rank*3*n, i=1,n) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); > CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); > CHKERRQ(ierr) > ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! local-to-global mapping > allocate(idx(3*n)) > idx=(/ (i-1+rank*3*n, i=1,3*n) /) > call > ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); > CHKERRQ(ierr) > ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); > CHKERRQ(ierr) > deallocate(idx) > > ! create the 3-by-3 block matrix > call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) > call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) > ! call MatSetType(A,MATNEST,ierr); CHKERRQ(ierr) > call MatSetUp(A,ierr); CHKERRQ(ierr) > I am sorry I have not had time to run this, but I believe you need to insert a call here: MatNestSetSubMats(A, 3, [isg0, isg1, isg2], 3, [isg0, isg1, isg2], NULL); coming from http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatNestSetSubMats.html#MatNestSetSubMats Thanks, Matt call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) > call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) > call MatSetFromOptions(A,ierr); CHKERRQ(ierr) > > ! 
set diagonal of block A02 to 0.65 > call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > do i=1,n > call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); > CHKERRQ(ierr) > end do > call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > > ! verify > call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); > CHKERRQ(ierr) > call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) > > call PetscFinalize(ierr) > > end program mattry > > $ mpiexec -n 2 ./mattry -A_mat_type mpiaij > Mat Object: 2 MPI processes > type: mpiaij > row 0: (0, 0.65) > row 1: (1, 0.65) > row 2: (2, 0.65) > row 3: (3, 0.65) > row 4: (4, 0.65) > row 5: (5, 0.65) > row 6: (6, 0.65) > row 7: (7, 0.65) > > $ mpiexec -n 2 ./mattry -A_mat_type nest > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Pointer: Parameter # 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./mattry > > > on a linux_64bit_debug named > lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 > [0]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [0]PETSC ERROR: #1 MatNestFindIS() line 298 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Pointer: Parameter # 3 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [1]PETSC ERROR: ./mattry > > > on a linux_64bit_debug named > lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 > [1]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [1]PETSC ERROR: #1 MatNestFindIS() line 298 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #2 MatNestFindSubMat() line 371 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c > #2 MatNestFindSubMat() line 371 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD > with errorcode 85. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [lin0322.marin.local:11985] 1 more process has sent help message > help-mpi-api.txt / mpi-abort > [lin0322.marin.local:11985] Set MCA parameter "orte_base_help_aggregate" > to 0 to see all help / error messages > $ > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: > http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jul 30 11:04:43 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 30 Jul 2016 11:04:43 -0500 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1469889718285.98025@marin.nl> References: <1469889718285.98025@marin.nl> Message-ID: <47333B7B-15AA-4089-88CA-8BEB471F87A1@mcs.anl.gov> You need to call MatNestSetSubMats() after you set the mattype. Yes the manual pages are missing needed cross links. > On Jul 30, 2016, at 9:41 AM, Klaij, Christiaan wrote: > > Anyone? > (my guess is an if-statement, something like "if type nest then > setup nest"...) 
> >> Date: Thu, 28 Jul 2016 08:38:54 +0000 >> From: "Klaij, Christiaan" >> To: "petsc-users at mcs.anl.gov" >> Subject: [petsc-users] block matrix without MatCreateNest >> Message-ID: <1469695134232.97712 at marin.nl> >> Content-Type: text/plain; charset="utf-8" >> >> I'm trying to understand how to assemble a block matrix in a >> format-independent manner, so that I can switch between types >> mpiaij and matnest. >> >> The manual states that the key to format-independent assembly is >> to use MatGetLocalSubMatrix. So, in the code below, I'm using >> this to assemble a 3-by-3 block matrix A and setting the diagonal >> of block A02. This seems to work for type mpiaij, but not for >> type matnest. What am I missing? >> >> Chris >> >> >> $ cat mattry.F90 >> program mattry >> >> use petscksp >> implicit none >> #include >> >> PetscInt :: n=4 ! setting 4 cells per process >> >> PetscErrorCode :: ierr >> PetscInt :: size,rank,i >> Mat :: A,A02 >> IS :: isg0,isg1,isg2 >> IS :: isl0,isl1,isl2 >> ISLocalToGlobalMapping :: map >> >> integer, allocatable, dimension(:) :: idx >> >> call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) >> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) >> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) >> >> ! local index sets for 3 fields >> allocate(idx(n)) >> idx=(/ (i-1, i=1,n) /) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) >> ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) >> deallocate(idx) >> >> ! global index sets for 3 fields >> allocate(idx(n)) >> idx=(/ (i-1+rank*3*n, i=1,n) /) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); CHKERRQ(ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); CHKERRQ(ierr) >> ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) >> deallocate(idx) >> >> ! local-to-global mapping >> allocate(idx(3*n)) >> idx=(/ (i-1+rank*3*n, i=1,3*n) /) >> call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) >> ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) >> deallocate(idx) >> >> ! create the 3-by-3 block matrix >> call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) >> call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) >> ! call MatSetType(A,MATNEST,ierr); CHKERRQ(ierr) >> call MatSetUp(A,ierr); CHKERRQ(ierr) >> call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) >> call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) >> call MatSetFromOptions(A,ierr); CHKERRQ(ierr) >> >> ! set diagonal of block A02 to 0.65 >> call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) >> do i=1,n >> call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) >> end do >> call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) >> >> ! 
verify >> call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) >> call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) >> >> call PetscFinalize(ierr) >> >> end program mattry >> >> $ mpiexec -n 2 ./mattry -A_mat_type mpiaij >> Mat Object: 2 MPI processes >> type: mpiaij >> row 0: (0, 0.65) >> row 1: (1, 0.65) >> row 2: (2, 0.65) >> row 3: (3, 0.65) >> row 4: (4, 0.65) >> row 5: (5, 0.65) >> row 6: (6, 0.65) >> row 7: (7, 0.65) >> >> $ mpiexec -n 2 ./mattry -A_mat_type nest >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Null argument, when expecting valid pointer >> [0]PETSC ERROR: Null Pointer: Parameter # 3 >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 >> [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 >> [0]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Null argument, when expecting valid pointer >> [1]PETSC ERROR: Null Pointer: Parameter # 3 >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 >> [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 >> [1]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [1]PETSC ERROR: #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [1]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [1]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c >> #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [0]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [0]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD >> with errorcode 85. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. 
>> -------------------------------------------------------------------------- >> [lin0322.marin.local:11985] 1 more process has sent help message help-mpi-api.txt / mpi-abort >> [lin0322.marin.local:11985] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >> $ >> >> >> dr. ir. Christiaan Klaij | CFD Researcher | Research & Development >> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl >> >> MARIN news: http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm >> > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: http://www.marin.nl/web/News/News-items/Joint-Industry-Project-LifeLine-kicks-off.htm > From andrewh0 at uw.edu Sat Jul 30 12:19:45 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Sat, 30 Jul 2016 10:19:45 -0700 Subject: [petsc-users] Multi-physics meshes with PETSc DM? Message-ID: I am trying to solve a multi-physics problem consisting of some physics on a rectangular domain which is split in half such that one set of physics is solved on the left, and the other set of physics is solved on the right. Each set has their own set of variable components, and I would like to not allocate both variable sets across the entire domain because the physics in one subdomain happens to have lots of components per mesh element, which the other subdomain doesn't need except to compute boundary interactions. For testing right now, I am using the attached gmsh file to generate a mesh with 2 physical groups to represent each subdomain (called "left" and "right"). It has periodic boundaries on all sides. However, when I try to load the generated mesh into PETSc using the *DMPlexCreateFromFile* function, PETSc complains that the mesh is not a valid Gmsh file. I've attached the sample mesh, as well as the error message PETSc spits out. Here's the relevant code (should be a complete working example) which re-creates what I'm doing: #include > int main(int argc, char** argv) > { > PetscInitialize(&argc, &argv, NULL, "multi physics testing"); > DM dm; > CHKERRQ(DMPlexCreateFromFile(PETSC_COMM_WORLD, "periodic_square.msh", > PETSC_TRUE, &dm)); > PetscFinalize(); > } What is the correct procedure for creating a multi-physics mesh using PETSc DM objects for mesh management? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: periodic_square.geo Type: application/octet-stream Size: 521 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: periodic_square.msh Type: model/mesh Size: 433 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc_error.log Type: text/x-log Size: 3518 bytes Desc: not available URL: From knepley at gmail.com Sat Jul 30 12:49:25 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 30 Jul 2016 12:49:25 -0500 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: On Sat, Jul 30, 2016 at 12:19 PM, Andrew Ho wrote: > I am trying to solve a multi-physics problem consisting of some physics on > a rectangular domain which is split in half such that one set of physics is > solved on the left, and the other set of physics is solved on the right. 
> > Each set has their own set of variable components, and I would like to not > allocate both variable sets across the entire domain because the physics in > one subdomain happens to have lots of components per mesh element, which > the other subdomain doesn't need except to compute boundary interactions. > > For testing right now, I am using the attached gmsh file to generate a > mesh with 2 physical groups to represent each subdomain (called "left" and > "right"). It has periodic boundaries on all sides. > > However, when I try to load the generated mesh into PETSc using the > *DMPlexCreateFromFile* function, PETSc complains that the mesh is not a > valid Gmsh file. I've attached the sample mesh, as well as the error > message PETSc spits out. > > Here's the relevant code (should be a complete working example) which > re-creates what I'm doing: > > #include > > >> int main(int argc, char** argv) >> { >> PetscInitialize(&argc, &argv, NULL, "multi physics testing"); >> DM dm; >> CHKERRQ(DMPlexCreateFromFile(PETSC_COMM_WORLD, "periodic_square.msh", >> PETSC_TRUE, &dm)); >> PetscFinalize(); >> } > > > What is the correct procedure for creating a multi-physics mesh using > PETSc DM objects for mesh management? > 1) I don't use Physical Groups from GMsh since its unclear how this would be reflected in the discretization 2) You should make a PetscSection representing your data layout, which is discussed in the manual and in the tutorials. The number of dofs on different cells/edges/vertices will be different across the mesh (it sounds like from your description). 3) Obviously this means the closures of different cells will be different sizes. I am not sure how your assembly is setup to handle this. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Sat Jul 30 13:06:26 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Sat, 30 Jul 2016 11:06:26 -0700 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: > > 1) I don't use Physical Groups from GMsh since its unclear how this would > be reflected in the discretization If I'm not using physical groups in GMsh, how do I easily denote what part of the domain should be handled with which physics? I would like to be able to use the same code with similar but not identical meshes (for example to do a convergence study), so manually iterating through a list of vertices at the element height stratum in a chart doesn't provide any hints on which subdomain an element is suppose to belong in. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jul 30 13:11:15 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 30 Jul 2016 13:11:15 -0500 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: On Sat, Jul 30, 2016 at 1:06 PM, Andrew Ho wrote: > 1) I don't use Physical Groups from GMsh since its unclear how this would >> be reflected in the discretization > > > If I'm not using physical groups in GMsh, how do I easily denote what part > of the domain should be handled with which physics? 
I would like to be able > to use the same code with similar but not identical meshes (for example to > do a convergence study), so manually iterating through a list of vertices > at the element height stratum in a chart doesn't provide any hints on which > subdomain an element is suppose to belong in. > I think the right way to handle all this is to just mark pieces of the mesh. Mesh formats should just have a generic marking ability which does not differentiate between vertices, edges, faces, and cells. Some formats come close (ExodusII) and some are just crazy (GMsh). If you can point me toward the documentation for the GMsh format, I will put in code to translate whatever part marks cells to a cell label, as we do for ExodusII. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Sat Jul 30 13:35:41 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Sat, 30 Jul 2016 11:35:41 -0700 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: Is there a reason the physical groups aren't sufficient for handling this? As far as I can tell, this is the only way in GMsh to have any kind of grouping of elements. The Gmsh file format can be found here (happens to be the ASCII version, but binary version is below that): http://gmsh.info/doc/texinfo/gmsh.html#MSH-ASCII-file-format All tags are attributed to elements; there may be multiple element types (points, lines, triangles, etc.), but at the end of the day each element just has a list of indices indicating which physical group(s) each element belongs to. >From the documentation for ASCII formatted mesh files: number-of-tags gives the number of integer tags that follow for the n-th element. By > default, the first tag is the number of the physical entity to which the > element belongs; the second is the number of the elementary geometrical > entity to which the element belongs; the third is the number of mesh > partitions to which the element belongs, followed by the partition ids > (negative partition ids indicate ghost cells). A zero tag is equivalent to > no tag. Gmsh and most codes using the MSH 2 format require at least the > first two tags (physical and elementary tags). My understanding is to support markers you only need to add a 4th stratum level which has one node per physical group. It would be helpful (though not necessary) if this subdomain marker stratum level had the physical tag name labels properly associated with the corresponding nodes on the graph, but this is not necessary since it's just as easy to refer to them by node number as long as the node numbering matches or is a simple transform of the numbering scheme in the original physical group id's. On Sat, Jul 30, 2016 at 11:11 AM, Matthew Knepley wrote: > On Sat, Jul 30, 2016 at 1:06 PM, Andrew Ho wrote: > >> 1) I don't use Physical Groups from GMsh since its unclear how this would >>> be reflected in the discretization >> >> >> If I'm not using physical groups in GMsh, how do I easily denote what >> part of the domain should be handled with which physics? 
I would like to be >> able to use the same code with similar but not identical meshes (for >> example to do a convergence study), so manually iterating through a list of >> vertices at the element height stratum in a chart doesn't provide any hints >> on which subdomain an element is suppose to belong in. >> > > I think the right way to handle all this is to just mark pieces of the > mesh. Mesh formats should just have a generic marking > ability which does not differentiate between vertices, edges, faces, and > cells. Some formats come close (ExodusII) and some > are just crazy (GMsh). If you can point me toward the documentation for > the GMsh format, I will put in code to translate whatever > part marks cells to a cell label, as we do for ExodusII. > > Thanks, > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL:
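Tying together the two suggestions in this last thread, namely a cell label for the subdomain marking and a PetscSection for the uneven data layout, a rough C sketch follows. The label name "Cell Sets" is borrowed from PETSc's ExodusII reader and is an assumption here (the thread indicates the Gmsh reader did not yet create such a label), and the marker values 1/2 and the dof counts 8/2 are made up for illustration only.

#include <petscdmplex.h>

PetscErrorCode BuildTwoPhysicsSection(DM dm, PetscSection *section)
{
  PetscErrorCode ierr;
  PetscInt       pStart, pEnd, cStart, cEnd, c, marker;

  PetscFunctionBeginUser;
  ierr = DMPlexGetChart(dm, &pStart, &pEnd);CHKERRQ(ierr);
  ierr = DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);CHKERRQ(ierr);   /* height 0 = cells */
  ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dm), section);CHKERRQ(ierr);
  ierr = PetscSectionSetChart(*section, pStart, pEnd);CHKERRQ(ierr);
  for (c = cStart; c < cEnd; ++c) {
    /* assumed label translating the Gmsh physical groups into per-cell markers */
    ierr = DMGetLabelValue(dm, "Cell Sets", c, &marker);CHKERRQ(ierr);
    /* many unknowns for the "left" physics, few for the "right" (illustrative counts) */
    ierr = PetscSectionSetDof(*section, c, (marker == 1) ? 8 : 2);CHKERRQ(ierr);
  }
  ierr = PetscSectionSetUp(*section);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The section would then be attached with DMSetDefaultSection() (the name in the 3.7 API), so vectors created from the DM carry only the dofs each subdomain needs; as noted in Matt's third point, cell closures will then have different sizes across the interface and the assembly loop has to allow for that.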