From zonexo at gmail.com Fri Jul 1 01:33:54 2016 From: zonexo at gmail.com (TAY wee-beng) Date: Fri, 1 Jul 2016 14:33:54 +0800 Subject: [petsc-users] DMDAVecRestoreArrayF90 error Message-ID: Hi, I had problems with DMDAVecRestoreArrayF90 last time when I used an old version of the Intel Fortran compiler. It works fine with gfortran and the new version of the Intel compiler. It was determined to be a bug by the PETSc team. To use it with the old version of Intel, I had to use -O1 instead of -O3 -ipo in the subroutines where DMDAVecRestoreArrayF90 is called. Recently, my cluster was reset and all files were deleted. I uploaded my files and compiled my code again. However, this time, with a new version of Intel, I got a segmentation error with DMDAVecRestoreArrayF90. Changing to -O1 works. But I thought it had been working fine, so maybe I need to check with my admin whether the compiler versions before and after the reset are the same. Another thing I compared was PETSc ver 3.6.4 and 3.7.2. Using v3.6.4 (compiled with -O3 -ipo), it encountered a segmentation error right from the start of the code, when DMDAVecRestoreArrayF90 is called. However, for v3.7.2 (compiled with -O3 -ipo), it only happened during the 2nd time step, when I need to use DMDAVecRestoreArrayF90 in order to use KSP to solve the linear equation later on. So I wonder why v3.7.2 can get past the 1st time step w/o problem, getting the right answer, and only gives errors at the 2nd time step. Any explanation for this? Btw, the result is the same whether MPI is used or not. -- Thank you Yours sincerely, TAY wee-beng From bsmith at mcs.anl.gov Fri Jul 1 12:25:49 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 1 Jul 2016 12:25:49 -0500 Subject: [petsc-users] user provided local preconditioner with additive schwarz preconditioner In-Reply-To: References: Message-ID: <32C3C9D7-36FF-4281-A3F6-2E1EF6E88E10@mcs.anl.gov> > On Jun 30, 2016, at 11:48 PM, Duan Zhaowen wrote: > > Thank you Barry. I'll try and follow the code. In my code the global matrix A was partitioned into CSR format. The local preconditioner I want to use only affects the local (diagonal) part of matrix A. What do you mean? In additive Schwarz the subproblems contain overlapping sets of variables that are solved for and then updated. > So in MyApplyFunc(PC, Vec x, Vec y) of the shell preconditioner, should I only take care of the local part of vector y and leave alone its non-local part (or overlap)? The local solve updates the entire y (which has overlap with other y from other subdomains). There are several variants of overlapping Schwarz /*E PCASMType - Type of additive Schwarz method to use $ PC_ASM_BASIC - Symmetric version where residuals from the ghost points are used $ and computed values in ghost regions are added together. $ Classical standard additive Schwarz. $ PC_ASM_RESTRICT - Residuals from ghost points are used but computed values in ghost $ region are discarded. $ Default. $ PC_ASM_INTERPOLATE - Residuals from ghost points are not used, computed values in ghost $ region are added back in. $ PC_ASM_NONE - Residuals from ghost points are not used, computed ghost values are $ discarded. $ Not very good. Level: beginner .seealso: PCASMSetType() E*/ typedef enum {PC_ASM_BASIC = 3,PC_ASM_RESTRICT = 1,PC_ASM_INTERPOLATE = 2,PC_ASM_NONE = 0} PCASMType; but this is all handled inside the PCASM code. Your local function doesn't know or care which of the variants is used. It is the job of your local function to solve for all the values in the y output based on all the values in the x input. Barry > > Thank you again.
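A minimal sketch of the kind of local apply routine described above; the context struct, the names MyShellCtx and MyShellApply, and the use of an inner KSP for the subdomain solve are illustrative assumptions, not code from this thread:

#include <petscksp.h>

/* Illustrative context for the subdomain shell preconditioner. */
typedef struct {
  KSP localksp;   /* any sequential solver set up on the (overlapping) local block */
} MyShellCtx;

/* Apply routine to register with PCShellSetApply(subpc, MyShellApply).
   It must produce ALL entries of y from ALL entries of x; PCASM itself
   handles restriction to and extension from the overlapping subdomain. */
static PetscErrorCode MyShellApply(PC pc, Vec x, Vec y)
{
  MyShellCtx     *ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PCShellGetContext(pc, (void**)&ctx);CHKERRQ(ierr);
  ierr = KSPSolve(ctx->localksp, x, y);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The context would be attached with PCShellSetContext(subpc, ctx) inside the loop over the local sub-KSPs shown in the quoted setup code below.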
If I have more problems I will let you know. > > Zhaowen > > On Thu, Jun 30, 2016 at 6:09 PM, Barry Smith wrote: > > I don't think we have an example that does exactly that. > > If you are working with KSP directly and not SNES, here is how to proceed: > > KSPGetPC(ksp,&pc); > PCSetType(pc,PCASM); > KSPSetOperators() > KSPSetUp() <--- this must be called before the code below otherwise the subksps don't exist yet > > PetscInt n_local; > KSP *subksps; > > PCASMGetSubKSP(pc,&n_local,NULL,&subksps); > for (i=0; i<n_local; i++) { > PC subpc; > > KSPGetPC(subksps[i],&subpc); > PCSetType(subpc,PCSHELL); > PCShellSetApply(subpc,yourapplyfunction); > /* anything else you need to set for your shell preconditioner here */ > } > KSPSetUpOnBlocks(ksp); > > KSPSolve(); > > Now if you want to solve with a different right-hand side or different entries in your matrix just call > KSPSolve() again; you don't need to repeat the code above. > > Barry > > Note that any of PETSc's preconditioners can be used on the subdomains, so normally you can just use -sub_pc_type typeyouwant and you don't need to mess with shell preconditioners. > > > > > > > > > On Jun 30, 2016, at 5:49 PM, Duan Zhaowen wrote: > > > > Hi, > > > > I was trying to define a shell preconditioner for the local partition and let it work with the global additive Schwarz preconditioner for parallel computing. Can anyone give an example of this kind of preconditioner combination? Thanks! > > > > ZW > > From zhangjiang.dudu at gmail.com Fri Jul 1 13:26:31 2016 From: zhangjiang.dudu at gmail.com (=?utf-8?B?5byg5rGf?=) Date: Fri, 1 Jul 2016 13:26:31 -0500 Subject: [petsc-users] pets error Segmentation Violation Message-ID: <734C6F9A-4824-464C-8F84-038AC6BF9AAA@gmail.com> Hi, I am trying to read a large data set (11.7GB) with libmesh (integrated with PETSc) and use it for my application. The program runs well when using just one process. But in parallel (mpirun -n 4), some errors came out: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, [0]PETSC ERROR: INSTEAD the line number of the start of the function [0]PETSC ERROR: is given. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./ptracer on a arch-linux2-c-debug named compute001 by jiangzhang Fri Jul 1 10:07:07 2016 [0]PETSC ERROR: Configure options --prefix=/nfs/proj-tpeterka/jiang/opt/petsc-3.7.2 --download-fblaslapack --with-mpi-dir=/nfs/proj-tpeterka/jiang/libraries/mpich-3.2 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 Anybody know the possible causes? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at mcs.anl.gov Fri Jul 1 14:35:56 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 1 Jul 2016 14:35:56 -0500 Subject: [petsc-users] [petsc-dev] pets error Segmentation Violation In-Reply-To: <734C6F9A-4824-464C-8F84-038AC6BF9AAA@gmail.com> References: <734C6F9A-4824-464C-8F84-038AC6BF9AAA@gmail.com> Message-ID: <4ACC6758-E5B7-44A9-B0D5-A1FACAB7AB90@mcs.anl.gov> No idea. You need to do what it says and run with valgrind or in a debugger. From the crash message it looks like it is crashing in your main program. Barry > On Jul 1, 2016, at 1:26 PM, ?? wrote: > > Hi, > > I am trying to read a large data (11.7GB) with libmesh (integrated with PETSc) and use it for my application. The program runs well when using just one process. But in parallel (mpirun -n 4), some errors came out: > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available, > [0]PETSC ERROR: INSTEAD the line number of the start of the function > [0]PETSC ERROR: is given. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./ptracer on a arch-linux2-c-debug named compute001 by jiangzhang Fri Jul 1 10:07:07 2016 > [0]PETSC ERROR: Configure options --prefix=/nfs/proj-tpeterka/jiang/opt/petsc-3.7.2 --download-fblaslapack --with-mpi-dir=/nfs/proj-tpeterka/jiang/libraries/mpich-3.2 > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > Anybody know the possible causes? > From zocca.marco at gmail.com Sat Jul 2 12:22:06 2016 From: zocca.marco at gmail.com (Marco Zocca) Date: Sat, 2 Jul 2016 19:22:06 +0200 Subject: [petsc-users] Re. PETSc user meeting 2016 Message-ID: Dear colleagues near and far, it has been a pleasure to meet you all in person during this very interesting and lively meeting. I have learned much about branches of science I had not previously considered and the current state of HPC, and for this I would like to thank all of the speakers and poster presenters, but first and foremost the PETSc team, the sponsors and our excellent host Karl. Hoping to meet you soon again, perhaps at PETSc'17, Kind regards, Marco Zocca ---------------- https://github.com/ocramz/petsc-hs https://github.com/ocramz/petsc-hs-docker From rupp at iue.tuwien.ac.at Sat Jul 2 12:26:17 2016 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Sat, 2 Jul 2016 19:26:17 +0200 Subject: [petsc-users] Re. 
PETSc user meeting 2016 In-Reply-To: References: Message-ID: <5777F939.3000701@iue.tuwien.ac.at> Hi Marco, thank you for your words, it has been a pleasure for us :-) Best regards, Karli On 07/02/2016 07:22 PM, Marco Zocca wrote: > Dear colleagues near and far, > > it has been a pleasure to meet you all in person during this very > interesting and lively meeting. I have learned much about branches of > science I had not previously considered and the current state of HPC, > and for this I would like to thank all of the speakers and poster > presenters, but first and foremost the PETSc team, the sponsors and > our excellent host Karl. > > Hoping to meet you soon again, perhaps at PETSc'17, > Kind regards, > > Marco Zocca > > ---------------- > > https://github.com/ocramz/petsc-hs > https://github.com/ocramz/petsc-hs-docker > From rupp at iue.tuwien.ac.at Sat Jul 2 12:31:33 2016 From: rupp at iue.tuwien.ac.at (Karl Rupp) Date: Sat, 2 Jul 2016 19:31:33 +0200 Subject: [petsc-users] DMDAVecRestoreArrayF90 error In-Reply-To: References: Message-ID: <5777FA75.3080804@iue.tuwien.ac.at> Hi, what you describe looks a lot like memory corruption. Does your code run cleanly through valgrind? Best regards, Karli On 07/01/2016 08:33 AM, TAY wee-beng wrote: > Hi, > > I had problems with DMDAVecRestoreArrayF90 last time when I used an old > version of Intel Fortran compiler. It works fine in gfortran and new > version of Intel compiler. It was determined as a bug by the PETSc team. > > To use in old version of Intel, I had to use -O1 instead of -O3 -ipo in > subroutines when DMDAVecRestoreArrayF90 is called. > > Recently, my cluster was reset and all files were deleted. I upload my > files and compile my code again. However, this time, with a new version > of Intel, I got segmentation error with DMDAVecRestoreArrayF90. Changing > to -O1 works. But I thought it was working fine. So maybe I need to > check with my admin if the ver before and after reset are the same. > > Another thing was using PETSc ver 3.6.4 and 3.7.2. Using v3.6.4 > (compiled with -O3 -ipo), it encountered segmentation err right from the > code start, when DMDAVecRestoreArrayF90 is called. > > However, for v3.7.2 (compiled with -O3 -ipo), it only happened during > the 2nd time step, when I need to use DMDAVecRestoreArrayF90 in order to > use KSP to solve the linear equation later on. > > So I wonder why v3.7.2 can get pass the 1st time step w/o problem and > getting the right answer and only give errors at the 2nd time step. > > Any explanation for this? > > Btw, same result whether MPI is used or not. > > > From gpau at lbl.gov Sat Jul 2 17:53:36 2016 From: gpau at lbl.gov (George Pau) Date: Sat, 2 Jul 2016 15:53:36 -0700 Subject: [petsc-users] hdf5 libraries Message-ID: Hi, I am trying to debug an error I am getting when using the HDF5 viewer. I am working on NERSC systems, and they have a precompiled hdf5 (module cray-hdf5-parallel). 
When I linked petsc libraries to their hdf5 libraries, it gives the following error at run time when I tried to do a ISView: Rank 0 [Sat Jul 2 15:34:48 2016] [c0-0c0s15n0] Fatal error in MPI_Type_create_hindexed: Invalid argument, error stack: MPI_Type_create_hindexed(150): MPI_Type_create_hindexed(count=1, array_of_blocklengths=0x478b1e0, array_of_displacements=0x478b200, MPI_BYTE, newtype=0x7fffffff3598) failed MPI_Type_create_hindexed(98).: Invalid value for blocklength, must be non-negative but is -1927660792 The hdf5 version on NERSC is 1.8.16 but the version that PETSc downloaded when using --download-hdf5 is 1.8.12. So, could the error be due to this difference? I also see the above error only when the length of the IS is big (tested for about 200M total entries, using 1024 cores). I don't have these errors when I used --download-hdf5=1 during configure step. However, while I was able to use --download-hdf5=1 on Edison, PETSc was not able to compile hdf5 libraries properly on Cori. The OS of Cori was recently updated. My primary interest in trying to use the version provided by NERSC is to see if there is any improvement in the IO performance. Thanks, George -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74R316C Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/profiles/george-shu-heng-pau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Sun Jul 3 03:06:38 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Sun, 3 Jul 2016 10:06:38 +0200 Subject: [petsc-users] Dose Petsc has DMPlex example In-Reply-To: References: <201605030929463862822@163.com> Message-ID: Hi Matt I tried to run ex62 with 1 proc (petsc 3.7.2), but it all produces zero The output is: hbui at bermuda:~/workspace/petsc/snes$ es$ ./ex62 run_type full -bc_type dirichlet -refinement_limit 0.00625 -interpolate 1 -snes_monitor_short -snes_converged_reason -snes_view -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -ksp_monitor_short -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi 0 SNES Function norm 0.265165 0 KSP Residual norm 0.265165 Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 SNES Object: 1 MPI processes type: newtonls maximum iterations=50, maximum function evaluations=10000 tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 total number of linear solver iterations=0 total number of function evaluations=1 norm schedule ALWAYS SNESLineSearch Object: 1 MPI processes type: bt interpolation: cubic alpha=1.000000e-04 maxstep=1.000000e+08, minlambda=1.000000e-12 tolerances: relative=1.000000e-08, absolute=1.000000e-15, lambda=1.000000e-08 maximum iterations=40 KSP Object: 1 MPI processes type: fgmres GMRES: restart=100, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-09, absolute=1e-50, divergence=10000. 
right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with Schur preconditioner, factorization FULL Preconditioner for the Schur complement formed from A11 Split info: Split number 0 Defined by IS Split number 1 Defined by IS KSP solver for A00 block KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=512, cols=512, bs=2 package used to perform factorization: petsc total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=512, cols=512, bs=2 total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 KSP solver for S = A11 - A10 inv(A00) A01 KSP Object: (fieldsplit_pressure_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-10, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_pressure_) 1 MPI processes type: jacobi linear system matrix followed by preconditioner matrix: Mat Object: (fieldsplit_pressure_) 1 MPI processes type: schurcomplement rows=256, cols=256 has attached null space Schur complement A11 - A10 inv(A00) A01 A11 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=256, cols=256 total: nonzeros=256, allocated nonzeros=256 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines A10 Mat Object: 1 MPI processes type: seqaij rows=256, cols=512 total: nonzeros=512, allocated nonzeros=512 total number of mallocs used during MatSetValues calls =0 not using I-node routines KSP of A00 KSP Object: (fieldsplit_velocity_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_velocity_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1. 
Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=512, cols=512, bs=2 package used to perform factorization: petsc total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (fieldsplit_velocity_) 1 MPI processes type: seqaij rows=512, cols=512, bs=2 total: nonzeros=1024, allocated nonzeros=1024 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 A01 Mat Object: 1 MPI processes type: seqaij rows=512, cols=256, rbs=2, cbs = 1 total: nonzeros=512, allocated nonzeros=512 total number of mallocs used during MatSetValues calls =0 using I-node routines: found 256 nodes, limit used is 5 Mat Object: (fieldsplit_pressure_) 1 MPI processes type: seqaij rows=256, cols=256 total: nonzeros=256, allocated nonzeros=256 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=768, cols=768 total: nonzeros=2304, allocated nonzeros=2304 total number of mallocs used during MatSetValues calls =0 has attached null space using I-node routines: found 256 nodes, limit used is 5 Number of SNES iterations = 0 L_2 Error: 1.01 [0.929, 0.407] Solution Vec Object: 1 MPI processes type: seq 0. 0. .... Am I doing something wrong? Giang Giang On Tue, May 3, 2016 at 4:44 AM, Matthew Knepley wrote: > On Mon, May 2, 2016 at 8:29 PM, ztdepyahoo at 163.com > wrote: > >> Dear professor: >> I want to write a parallel 3D CFD code based on unstructred grid, >> does Petsc has DMPlex examples to start with. >> > > SNES ex62 is an unstructured grid Stokes problem discretized with > low-order finite elements. > > Of course, all the different possible choices will impact the design. > > Matt > > >> Regards >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jychang48 at gmail.com Sun Jul 3 03:15:09 2016 From: jychang48 at gmail.com (Justin Chang) Date: Sun, 3 Jul 2016 09:15:09 +0100 Subject: [petsc-users] Dose Petsc has DMPlex example In-Reply-To: References: <201605030929463862822@163.com> Message-ID: Hoang, if you run this example shown from the config/builder.py ./ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi -snes_monitor_short -ksp_monitor_short -snes_converged_reason -ksp_converged_reason -snes_view -show_solution 0 it should work On Sun, Jul 3, 2016 at 9:06 AM, Hoang Giang Bui wrote: > Hi Matt > > I tried to run ex62 with 1 proc (petsc 3.7.2), but it all produces zero > > The output is: > hbui at bermuda:~/workspace/petsc/snes$ es$ ./ex62 run_type full -bc_type > dirichlet -refinement_limit 0.00625 -interpolate 1 -snes_monitor_short > -snes_converged_reason -snes_view -ksp_type fgmres -ksp_gmres_restart 100 > -ksp_rtol 1.0e-9 -ksp_monitor_short -pc_type fieldsplit -pc_fieldsplit_type > schur -pc_fieldsplit_schur_factorization_type full > -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu > -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi > 0 SNES Function norm 0.265165 > 0 KSP Residual norm 0.265165 > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 > SNES Object: 1 MPI processes > type: newtonls > maximum iterations=50, maximum function evaluations=10000 > tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 > total number of linear solver iterations=0 > total number of function evaluations=1 > norm schedule ALWAYS > SNESLineSearch Object: 1 MPI processes > type: bt > interpolation: cubic > alpha=1.000000e-04 > maxstep=1.000000e+08, minlambda=1.000000e-12 > tolerances: relative=1.000000e-08, absolute=1.000000e-15, > lambda=1.000000e-08 > maximum iterations=40 > KSP Object: 1 MPI processes > type: fgmres > GMRES: restart=100, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-09, absolute=1e-50, divergence=10000. > right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: fieldsplit > FieldSplit with Schur preconditioner, factorization FULL > Preconditioner for the Schur complement formed from A11 > Split info: > Split number 0 Defined by IS > Split number 1 Defined by IS > KSP solver for A00 block > KSP Object: (fieldsplit_velocity_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_velocity_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=512, cols=512, bs=2 > package used to perform factorization: petsc > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 256 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (fieldsplit_velocity_) 1 MPI > processes > type: seqaij > rows=512, cols=512, bs=2 > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 256 nodes, limit used is 5 > KSP solver for S = A11 - A10 inv(A00) A01 > KSP Object: (fieldsplit_pressure_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-10, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: (fieldsplit_pressure_) 1 MPI processes > type: jacobi > linear system matrix followed by preconditioner matrix: > Mat Object: (fieldsplit_pressure_) 1 MPI > processes > type: schurcomplement > rows=256, cols=256 > has attached null space > Schur complement A11 - A10 inv(A00) A01 > A11 > Mat Object: > (fieldsplit_pressure_) 1 MPI processes > type: seqaij > rows=256, cols=256 > total: nonzeros=256, allocated nonzeros=256 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > A10 > Mat Object: 1 MPI processes > type: seqaij > rows=256, cols=512 > total: nonzeros=512, allocated nonzeros=512 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > KSP of A00 > KSP Object: > (fieldsplit_velocity_) 1 MPI processes > type: gmres > GMRES: restart=30, using Classical (unmodified) > Gram-Schmidt Orthogonalization with no iterative refinement > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, > divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: > (fieldsplit_velocity_) 1 MPI processes > type: lu > LU: out-of-place factorization > tolerance for zero pivot 2.22045e-14 > matrix ordering: nd > factor fill ratio given 5., needed 1. 
> Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=512, cols=512, bs=2 > package used to perform factorization: petsc > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues > calls =0 > using I-node routines: found 256 nodes, limit > used is 5 > linear system matrix = precond matrix: > Mat Object: > (fieldsplit_velocity_) 1 MPI processes > type: seqaij > rows=512, cols=512, bs=2 > total: nonzeros=1024, allocated nonzeros=1024 > total number of mallocs used during MatSetValues calls > =0 > using I-node routines: found 256 nodes, limit used > is 5 > A01 > Mat Object: 1 MPI processes > type: seqaij > rows=512, cols=256, rbs=2, cbs = 1 > total: nonzeros=512, allocated nonzeros=512 > total number of mallocs used during MatSetValues calls =0 > using I-node routines: found 256 nodes, limit used is 5 > Mat Object: (fieldsplit_pressure_) 1 MPI > processes > type: seqaij > rows=256, cols=256 > total: nonzeros=256, allocated nonzeros=256 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=768, cols=768 > total: nonzeros=2304, allocated nonzeros=2304 > total number of mallocs used during MatSetValues calls =0 > has attached null space > using I-node routines: found 256 nodes, limit used is 5 > Number of SNES iterations = 0 > L_2 Error: 1.01 [0.929, 0.407] > Solution > Vec Object: 1 MPI processes > type: seq > 0. > 0. > .... > > Am I doing something wrong? > > Giang > > > Giang > > On Tue, May 3, 2016 at 4:44 AM, Matthew Knepley wrote: > >> On Mon, May 2, 2016 at 8:29 PM, ztdepyahoo at 163.com >> wrote: >> >>> Dear professor: >>> I want to write a parallel 3D CFD code based on unstructred grid, >>> does Petsc has DMPlex examples to start with. >>> >> >> SNES ex62 is an unstructured grid Stokes problem discretized with >> low-order finite elements. >> >> Of course, all the different possible choices will impact the design. >> >> Matt >> >> >>> Regards >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Sun Jul 3 03:49:05 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Sun, 3 Jul 2016 10:49:05 +0200 Subject: [petsc-users] Dose Petsc has DMPlex example In-Reply-To: References: <201605030929463862822@163.com> Message-ID: Thanks Justin. It works. The difference is these parameters: -vel_petscspace_order 2 -pres_petscspace_order 1. Which is quite cool since you can play with those orders to see how LBB condition affects the results. 
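As a rough illustration of what those two options control, here is a sketch of setting up the stable P2/P1 (Taylor-Hood) pair as two prefixed PetscFE objects. This assumes the PETSc 3.7-era PetscFECreateDefault() signature and simplicial cells; the function and variable names are illustrative, not the actual ex62 source.

#include <petscfe.h>

/* One PetscFE per field, with options prefixes matching the command line, so that
   -vel_petscspace_order 2 and -pres_petscspace_order 1 select quadratic velocity
   and linear pressure spaces, a pairing that satisfies the LBB (inf-sup) condition. */
static PetscErrorCode CreateStokesDiscretization(MPI_Comm comm, PetscInt dim, PetscFE *fe_u, PetscFE *fe_p)
{
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscFECreateDefault(comm, dim, dim, PETSC_TRUE, "vel_", PETSC_DEFAULT, fe_u);CHKERRQ(ierr);  /* vector-valued velocity */
  ierr = PetscFECreateDefault(comm, dim, 1, PETSC_TRUE, "pres_", PETSC_DEFAULT, fe_p);CHKERRQ(ierr);   /* scalar pressure */
  PetscFunctionReturn(0);
}

With equal-order spaces the discrete inf-sup condition fails and the pressure is not well determined, which is the kind of behavior one can probe by varying these orders.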
Giang Giang On Sun, Jul 3, 2016 at 10:15 AM, Justin Chang wrote: > Hoang, if you run this example shown from the config/builder.py > > ./ex62 -run_type full -refinement_limit 0.00625 -bc_type dirichlet > -interpolate 1 -vel_petscspace_order 2 -pres_petscspace_order 1 -ksp_type > fgmres -ksp_gmres_restart 100 -ksp_rtol 1.0e-9 -pc_type fieldsplit > -pc_fieldsplit_type schur -pc_fieldsplit_schur_factorization_type full > -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_velocity_ksp_type gmres > -fieldsplit_velocity_pc_type lu -fieldsplit_pressure_pc_type jacobi > -snes_monitor_short -ksp_monitor_short -snes_converged_reason > -ksp_converged_reason -snes_view -show_solution 0 > > > it should work > > On Sun, Jul 3, 2016 at 9:06 AM, Hoang Giang Bui > wrote: > >> Hi Matt >> >> I tried to run ex62 with 1 proc (petsc 3.7.2), but it all produces zero >> >> The output is: >> hbui at bermuda:~/workspace/petsc/snes$ es$ ./ex62 run_type full -bc_type >> dirichlet -refinement_limit 0.00625 -interpolate 1 -snes_monitor_short >> -snes_converged_reason -snes_view -ksp_type fgmres -ksp_gmres_restart 100 >> -ksp_rtol 1.0e-9 -ksp_monitor_short -pc_type fieldsplit -pc_fieldsplit_type >> schur -pc_fieldsplit_schur_factorization_type full >> -fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu >> -fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi >> 0 SNES Function norm 0.265165 >> 0 KSP Residual norm 0.265165 >> Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 >> SNES Object: 1 MPI processes >> type: newtonls >> maximum iterations=50, maximum function evaluations=10000 >> tolerances: relative=1e-08, absolute=1e-50, solution=1e-08 >> total number of linear solver iterations=0 >> total number of function evaluations=1 >> norm schedule ALWAYS >> SNESLineSearch Object: 1 MPI processes >> type: bt >> interpolation: cubic >> alpha=1.000000e-04 >> maxstep=1.000000e+08, minlambda=1.000000e-12 >> tolerances: relative=1.000000e-08, absolute=1.000000e-15, >> lambda=1.000000e-08 >> maximum iterations=40 >> KSP Object: 1 MPI processes >> type: fgmres >> GMRES: restart=100, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-09, absolute=1e-50, divergence=10000. >> right preconditioning >> using UNPRECONDITIONED norm type for convergence test >> PC Object: 1 MPI processes >> type: fieldsplit >> FieldSplit with Schur preconditioner, factorization FULL >> Preconditioner for the Schur complement formed from A11 >> Split info: >> Split number 0 Defined by IS >> Split number 1 Defined by IS >> KSP solver for A00 block >> KSP Object: (fieldsplit_velocity_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (fieldsplit_velocity_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 1. 
>> Factored matrix follows: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=512, cols=512, bs=2 >> package used to perform factorization: petsc >> total: nonzeros=1024, allocated nonzeros=1024 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node routines: found 256 nodes, limit used is >> 5 >> linear system matrix = precond matrix: >> Mat Object: (fieldsplit_velocity_) 1 MPI >> processes >> type: seqaij >> rows=512, cols=512, bs=2 >> total: nonzeros=1024, allocated nonzeros=1024 >> total number of mallocs used during MatSetValues calls =0 >> using I-node routines: found 256 nodes, limit used is 5 >> KSP solver for S = A11 - A10 inv(A00) A01 >> KSP Object: (fieldsplit_pressure_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt >> Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-10, absolute=1e-50, divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: (fieldsplit_pressure_) 1 MPI processes >> type: jacobi >> linear system matrix followed by preconditioner matrix: >> Mat Object: (fieldsplit_pressure_) 1 MPI >> processes >> type: schurcomplement >> rows=256, cols=256 >> has attached null space >> Schur complement A11 - A10 inv(A00) A01 >> A11 >> Mat Object: >> (fieldsplit_pressure_) 1 MPI processes >> type: seqaij >> rows=256, cols=256 >> total: nonzeros=256, allocated nonzeros=256 >> total number of mallocs used during MatSetValues calls >> =0 >> has attached null space >> not using I-node routines >> A10 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=256, cols=512 >> total: nonzeros=512, allocated nonzeros=512 >> total number of mallocs used during MatSetValues calls >> =0 >> not using I-node routines >> KSP of A00 >> KSP Object: >> (fieldsplit_velocity_) 1 MPI processes >> type: gmres >> GMRES: restart=30, using Classical (unmodified) >> Gram-Schmidt Orthogonalization with no iterative refinement >> GMRES: happy breakdown tolerance 1e-30 >> maximum iterations=10000, initial guess is zero >> tolerances: relative=1e-05, absolute=1e-50, >> divergence=10000. >> left preconditioning >> using PRECONDITIONED norm type for convergence test >> PC Object: >> (fieldsplit_velocity_) 1 MPI processes >> type: lu >> LU: out-of-place factorization >> tolerance for zero pivot 2.22045e-14 >> matrix ordering: nd >> factor fill ratio given 5., needed 1. 
>> Factored matrix follows: >> Mat Object: 1 MPI >> processes >> type: seqaij >> rows=512, cols=512, bs=2 >> package used to perform factorization: petsc >> total: nonzeros=1024, allocated nonzeros=1024 >> total number of mallocs used during >> MatSetValues calls =0 >> using I-node routines: found 256 nodes, limit >> used is 5 >> linear system matrix = precond matrix: >> Mat Object: >> (fieldsplit_velocity_) 1 MPI processes >> type: seqaij >> rows=512, cols=512, bs=2 >> total: nonzeros=1024, allocated nonzeros=1024 >> total number of mallocs used during MatSetValues >> calls =0 >> using I-node routines: found 256 nodes, limit used >> is 5 >> A01 >> Mat Object: 1 MPI processes >> type: seqaij >> rows=512, cols=256, rbs=2, cbs = 1 >> total: nonzeros=512, allocated nonzeros=512 >> total number of mallocs used during MatSetValues calls >> =0 >> using I-node routines: found 256 nodes, limit used is >> 5 >> Mat Object: (fieldsplit_pressure_) 1 MPI >> processes >> type: seqaij >> rows=256, cols=256 >> total: nonzeros=256, allocated nonzeros=256 >> total number of mallocs used during MatSetValues calls =0 >> has attached null space >> not using I-node routines >> linear system matrix = precond matrix: >> Mat Object: 1 MPI processes >> type: seqaij >> rows=768, cols=768 >> total: nonzeros=2304, allocated nonzeros=2304 >> total number of mallocs used during MatSetValues calls =0 >> has attached null space >> using I-node routines: found 256 nodes, limit used is 5 >> Number of SNES iterations = 0 >> L_2 Error: 1.01 [0.929, 0.407] >> Solution >> Vec Object: 1 MPI processes >> type: seq >> 0. >> 0. >> .... >> >> Am I doing something wrong? >> >> Giang >> >> >> Giang >> >> On Tue, May 3, 2016 at 4:44 AM, Matthew Knepley >> wrote: >> >>> On Mon, May 2, 2016 at 8:29 PM, ztdepyahoo at 163.com >>> wrote: >>> >>>> Dear professor: >>>> I want to write a parallel 3D CFD code based on unstructred grid, >>>> does Petsc has DMPlex examples to start with. >>>> >>> >>> SNES ex62 is an unstructured grid Stokes problem discretized with >>> low-order finite elements. >>> >>> Of course, all the different possible choices will impact the design. >>> >>> Matt >>> >>> >>>> Regards >>>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sun Jul 3 17:13:33 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sun, 3 Jul 2016 17:13:33 -0500 Subject: [petsc-users] hdf5 libraries In-Reply-To: References: Message-ID: <2EC53254-CC98-42BB-94C1-212F72574F65@mcs.anl.gov> Please send $PETSC_ARCH/lib/petsc/conf/configure.log for both cases to petsc-maint at mcs.anl.gov. The fact that it kicks in for very large problems and reports a negative block length is indicative of a 64 bit integer not fitting into a 32 bit integer location. Barry > On Jul 2, 2016, at 5:53 PM, George Pau wrote: > > Hi, > > I am trying to debug an error I am getting when using the HDF5 viewer. I am working on NERSC systems, and they have a precompiled hdf5 (module cray-hdf5-parallel). 
When I linked petsc libraries to their hdf5 libraries, it gives the following error at run time when I tried to do a ISView: > > Rank 0 [Sat Jul 2 15:34:48 2016] [c0-0c0s15n0] Fatal error in MPI_Type_create_hindexed: Invalid argument, error stack: > MPI_Type_create_hindexed(150): MPI_Type_create_hindexed(count=1, array_of_blocklengths=0x478b1e0, array_of_displacements=0x478b200, MPI_BYTE, newtype=0x7fffffff3598) failed > MPI_Type_create_hindexed(98).: Invalid value for blocklength, must be non-negative but is -1927660792 > > The hdf5 version on NERSC is 1.8.16 but the version that PETSc downloaded when using --download-hdf5 is 1.8.12. So, could the error be due to this difference? I also see the above error only when the length of the IS is big (tested for about 200M total entries, using 1024 cores). > > I don't have these errors when I used --download-hdf5=1 during configure step. However, while I was able to use --download-hdf5=1 on Edison, PETSc was not able to compile hdf5 libraries properly on Cori. The OS of Cori was recently updated. My primary interest in trying to use the version provided by NERSC is to see if there is any improvement in the IO performance. > > Thanks, > George > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74R316C > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/profiles/george-shu-heng-pau/ From Hassan.Raiesi at aero.bombardier.com Mon Jul 4 13:48:08 2016 From: Hassan.Raiesi at aero.bombardier.com (Hassan Raiesi) Date: Mon, 4 Jul 2016 18:48:08 +0000 Subject: [petsc-users] reusing matrix created with MatCreateMPIAIJWithSplitArrays In-Reply-To: <333B1A41-ACE3-49E4-ADB3-8317D177ED14@mcs.anl.gov> References: <333B1A41-ACE3-49E4-ADB3-8317D177ED14@mcs.anl.gov> Message-ID: Thanks Barry, That works, however, the code seems to run a bit faster when I destroy and re-create the matrix at each time step as suggested by Dave!, I added this right after updating the values of the diagonal and off-diagonal parts, ierr = PetscObjectStateIncrease((PetscObject)(mat)); to avoid calls to MatAssemblyBegin/MatAssemblyEnd() and it does the trick! -H -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Thursday, June 30, 2016 6:17 PM To: Hassan Raiesi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] reusing matrix created with MatCreateMPIAIJWithSplitArrays > On Jun 30, 2016, at 2:40 PM, Hassan Raiesi wrote: > > Hello, > > We are using PETSC in our CFD code, and noticed that using ?MatCreateMPIAIJWithSplitArrays? is almost 60% faster for large problem size (i.e DOF > 725M, using GAMG each time-step only takes 5sec, compared to 8.3 sec when assembling the matrix one row at a time using matsetvaluesblocked() as recommended). > > The problem is that the memory usage goes up after each call to MatCreateMPIAIJWithSplitArrays to update the matrix values. As MatCreateMPIAIJWithSplitArrays is not supposed to copy the values, do we need to call it each time to update the values? We tried to just update the values of the diagonal and off-diagonal part of the arrays passed to ?MatCreateMPIAIJWithSplitArrays?, (the sparsity structure is fixed) but it looks like that the values are not updated, what is the proper way to update the values of the matrix created by MatCreateMPIAIJWithSplitArrays? 
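A sketch of the in-place update pattern being asked about here (and answered in the reply quoted just below). The helper name, the array names da/oa, and the uniform scaling are illustrative assumptions; the matrix is assumed to have been created once with MatCreateMPIAIJWithSplitArrays(), with da and oa the caller-owned value arrays of the diagonal and off-diagonal blocks.

#include <petscmat.h>

static PetscErrorCode UpdateSplitArrayValues(Mat A, PetscScalar *da, PetscInt nd, PetscScalar *oa, PetscInt no, PetscScalar scale)
{
  PetscErrorCode ierr;
  PetscInt       k;

  PetscFunctionBeginUser;
  for (k = 0; k < nd; k++) da[k] *= scale;   /* overwrite diagonal-block values in place */
  for (k = 0; k < no; k++) oa[k] *= scale;   /* overwrite off-diagonal-block values in place */
  /* Tell PETSc the numerical values changed (the sparsity pattern must be unchanged): */
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  /* Alternatively, as reported earlier in the thread, bumping the object state also works:
     ierr = PetscObjectStateIncrease((PetscObject)A);CHKERRQ(ierr); */
  PetscFunctionReturn(0);
}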
Since you have direct access to the two numeric arrays passed to MatCreateMPIAIJWithSplitArrays() you can simply change the values in those locations AND THEN immediately CALL MatAssemblyBegin/MatAssemblyEnd() on the matrix; this will increase the the PETSc object state value for the matrix so the matrix routines (and preconditioner) will know you changed the matrix values. If you don't call the MatAssemblyBegin/MatAssemblyEnd() the preconditioner will think the matrix has not been changed so just use its old values as you observed. Barry Of course if you change any nonzero locations in the matrix you need to destroy the matrix and call MatCreateMPIAIJWithSplitArrays() again. > > > Thank you > > Hassan Raiesi, > Advanced Aerodynamics Department > Bombardier Aerospace > > hassan.raiesi at aero.bombardier.com > > 2351 boul. Alfred-Nobel (BAN1) > Ville Saint-Laurent, Qu?bec, H4S 2A9 > > > > T?l. > 514-855-5001 # 62204 > > > > > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. > If you are not the intended recipient or received this communication > by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. From bsmith at mcs.anl.gov Mon Jul 4 13:51:57 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Jul 2016 13:51:57 -0500 Subject: [petsc-users] reusing matrix created with MatCreateMPIAIJWithSplitArrays In-Reply-To: References: <333B1A41-ACE3-49E4-ADB3-8317D177ED14@mcs.anl.gov> Message-ID: <2C797373-AE40-438A-9B00-04C6B4EC76E4@mcs.anl.gov> > On Jul 4, 2016, at 1:48 PM, Hassan Raiesi wrote: > > Thanks Barry, > > That works, however, the code seems to run a bit faster when I destroy and re-create the matrix at each time step as suggested by Dave!, That seems odd, but ok. > > I added this right after updating the values of the diagonal and off-diagonal parts, > > ierr = PetscObjectStateIncrease((PetscObject)(mat)); > > to avoid calls to MatAssemblyBegin/MatAssemblyEnd() and it does the trick! > > -H > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Thursday, June 30, 2016 6:17 PM > To: Hassan Raiesi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] reusing matrix created with MatCreateMPIAIJWithSplitArrays > > >> On Jun 30, 2016, at 2:40 PM, Hassan Raiesi wrote: >> >> Hello, >> >> We are using PETSC in our CFD code, and noticed that using ?MatCreateMPIAIJWithSplitArrays? is almost 60% faster for large problem size (i.e DOF > 725M, using GAMG each time-step only takes 5sec, compared to 8.3 sec when assembling the matrix one row at a time using matsetvaluesblocked() as recommended). >> >> The problem is that the memory usage goes up after each call to MatCreateMPIAIJWithSplitArrays to update the matrix values. As MatCreateMPIAIJWithSplitArrays is not supposed to copy the values, do we need to call it each time to update the values? We tried to just update the values of the diagonal and off-diagonal part of the arrays passed to ?MatCreateMPIAIJWithSplitArrays?, (the sparsity structure is fixed) but it looks like that the values are not updated, what is the proper way to update the values of the matrix created by MatCreateMPIAIJWithSplitArrays? 
> > Since you have direct access to the two numeric arrays passed to MatCreateMPIAIJWithSplitArrays() you can simply change the values in those locations > > AND THEN immediately CALL MatAssemblyBegin/MatAssemblyEnd() on the matrix; this will increase the the PETSc object state value for the matrix so the matrix routines (and preconditioner) will know you changed the matrix values. If you don't call the MatAssemblyBegin/MatAssemblyEnd() the preconditioner will think the matrix has not been changed so just use its old values as you observed. > > Barry > > Of course if you change any nonzero locations in the matrix you need to destroy the matrix and call MatCreateMPIAIJWithSplitArrays() again. > >> >> >> Thank you >> >> Hassan Raiesi, >> Advanced Aerodynamics Department >> Bombardier Aerospace >> >> hassan.raiesi at aero.bombardier.com >> >> 2351 boul. Alfred-Nobel (BAN1) >> Ville Saint-Laurent, Qu?bec, H4S 2A9 >> >> >> >> T?l. >> 514-855-5001 # 62204 >> >> >> >> >> >> >> CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. >> If you are not the intended recipient or received this communication >> by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. > > From gpau at lbl.gov Mon Jul 4 17:53:24 2016 From: gpau at lbl.gov (George Pau) Date: Mon, 4 Jul 2016 15:53:24 -0700 Subject: [petsc-users] hdf5 libraries In-Reply-To: <2EC53254-CC98-42BB-94C1-212F72574F65@mcs.anl.gov> References: <2EC53254-CC98-42BB-94C1-212F72574F65@mcs.anl.gov> Message-ID: Attached is the configure.log for the case where --download-hdf5 failed. For the run time error, the IS has only 298.8M entries (from ISGetSize). So, it shouldn't need a 64 bit integer. In addition, I was able to get the right output when petsc builds its own hdf5. In addition, I don't have an issue as well if I am using the Petsc Binary Viewer. If this is really an issue with NERSC's libraries, then I will work with NERSC to see if they can help me figure out what is wrong. Thanks George On Sun, Jul 3, 2016 at 3:13 PM, Barry Smith wrote: > > Please send $PETSC_ARCH/lib/petsc/conf/configure.log for both cases to > petsc-maint at mcs.anl.gov. > > The fact that it kicks in for very large problems and reports a > negative block length is indicative of a 64 bit integer not fitting into a > 32 bit integer location. > > Barry > > > > On Jul 2, 2016, at 5:53 PM, George Pau wrote: > > > > Hi, > > > > I am trying to debug an error I am getting when using the HDF5 viewer. > I am working on NERSC systems, and they have a precompiled hdf5 (module > cray-hdf5-parallel). When I linked petsc libraries to their hdf5 > libraries, it gives the following error at run time when I tried to do a > ISView: > > > > Rank 0 [Sat Jul 2 15:34:48 2016] [c0-0c0s15n0] Fatal error in > MPI_Type_create_hindexed: Invalid argument, error stack: > > MPI_Type_create_hindexed(150): MPI_Type_create_hindexed(count=1, > array_of_blocklengths=0x478b1e0, array_of_displacements=0x478b200, > MPI_BYTE, newtype=0x7fffffff3598) failed > > MPI_Type_create_hindexed(98).: Invalid value for blocklength, must be > non-negative but is -1927660792 > > > > The hdf5 version on NERSC is 1.8.16 but the version that PETSc > downloaded when using --download-hdf5 is 1.8.12. So, could the error be due > to this difference? I also see the above error only when the length of the > IS is big (tested for about 200M total entries, using 1024 cores). 
> > > > I don't have these errors when I used --download-hdf5=1 during configure > step. However, while I was able to use --download-hdf5=1 on Edison, PETSc > was not able to compile hdf5 libraries properly on Cori. The OS of Cori > was recently updated. My primary interest in trying to use the version > provided by NERSC is to see if there is any improvement in the IO > performance. > > > > Thanks, > > George > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74R316C > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/profiles/george-shu-heng-pau/ > > -- George Pau Earth Sciences Division Lawrence Berkeley National Laboratory One Cyclotron, MS 74R316C Berkeley, CA 94720 (510) 486-7196 gpau at lbl.gov http://esd.lbl.gov/profiles/george-shu-heng-pau/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2822116 bytes Desc: not available URL: From bsmith at mcs.anl.gov Mon Jul 4 18:48:00 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Mon, 4 Jul 2016 18:48:00 -0500 Subject: [petsc-users] hdf5 libraries In-Reply-To: References: <2EC53254-CC98-42BB-94C1-212F72574F65@mcs.anl.gov> Message-ID: We've had lots of people having trouble trying to build the HDF on the Nersc machines; not much we can do about it. /bin/sh: line 4: 115507 Segmentation fault LD_LIBRARY_PATH="$LD_LIBRARY_PATH`echo | sed -e 's/-L/:/g' -e 's/ //g'`" ./H5make_libsettings > H5lib_settings.c Even though I agree it looks like 32 bit integers should be enough you could try configuring PETSc with --with-64-bit-indices to see if this resolves the problem. Barry > On Jul 4, 2016, at 5:53 PM, George Pau wrote: > > Attached is the configure.log for the case where --download-hdf5 failed. > > For the run time error, the IS has only 298.8M entries (from ISGetSize). So, it shouldn't need a 64 bit integer. In addition, I was able to get the right output when petsc builds its own hdf5. In addition, I don't have an issue as well if I am using the Petsc Binary Viewer. If this is really an issue with NERSC's libraries, then I will work with NERSC to see if they can help me figure out what is wrong. > > Thanks > George > > > On Sun, Jul 3, 2016 at 3:13 PM, Barry Smith wrote: > > Please send $PETSC_ARCH/lib/petsc/conf/configure.log for both cases to petsc-maint at mcs.anl.gov. > > The fact that it kicks in for very large problems and reports a negative block length is indicative of a 64 bit integer not fitting into a 32 bit integer location. > > Barry > > > > On Jul 2, 2016, at 5:53 PM, George Pau wrote: > > > > Hi, > > > > I am trying to debug an error I am getting when using the HDF5 viewer. I am working on NERSC systems, and they have a precompiled hdf5 (module cray-hdf5-parallel). 
When I linked petsc libraries to their hdf5 libraries, it gives the following error at run time when I tried to do a ISView: > > > > Rank 0 [Sat Jul 2 15:34:48 2016] [c0-0c0s15n0] Fatal error in MPI_Type_create_hindexed: Invalid argument, error stack: > > MPI_Type_create_hindexed(150): MPI_Type_create_hindexed(count=1, array_of_blocklengths=0x478b1e0, array_of_displacements=0x478b200, MPI_BYTE, newtype=0x7fffffff3598) failed > > MPI_Type_create_hindexed(98).: Invalid value for blocklength, must be non-negative but is -1927660792 > > > > The hdf5 version on NERSC is 1.8.16 but the version that PETSc downloaded when using --download-hdf5 is 1.8.12. So, could the error be due to this difference? I also see the above error only when the length of the IS is big (tested for about 200M total entries, using 1024 cores). > > > > I don't have these errors when I used --download-hdf5=1 during configure step. However, while I was able to use --download-hdf5=1 on Edison, PETSc was not able to compile hdf5 libraries properly on Cori. The OS of Cori was recently updated. My primary interest in trying to use the version provided by NERSC is to see if there is any improvement in the IO performance. > > > > Thanks, > > George > > > > > > -- > > George Pau > > Earth Sciences Division > > Lawrence Berkeley National Laboratory > > One Cyclotron, MS 74R316C > > Berkeley, CA 94720 > > > > (510) 486-7196 > > gpau at lbl.gov > > http://esd.lbl.gov/profiles/george-shu-heng-pau/ > > > > > -- > George Pau > Earth Sciences Division > Lawrence Berkeley National Laboratory > One Cyclotron, MS 74R316C > Berkeley, CA 94720 > > (510) 486-7196 > gpau at lbl.gov > http://esd.lbl.gov/profiles/george-shu-heng-pau/ > From mono at dtu.dk Tue Jul 5 04:17:29 2016 From: mono at dtu.dk (=?Windows-1252?Q?Morten_Nobel-J=F8rgensen?=) Date: Tue, 5 Jul 2016 09:17:29 +0000 Subject: [petsc-users] Duplicate cells when exporting a distributed dmplex Message-ID: Hi all, I hope someone can help me with the following: I?m having some problems when exporting a distributed DMPlex ? the cells (+cell types) seems to be duplicated. When I?m running the code on a non-distributed system it works as expected, but when I run it on multiple processors (2 in my case) the output is invalid. I have attached a simple example and the output for np=1 and np=2. Abbreviated the code essentially does the following: ' PetscInt dim = 3; PetscInt cells[] = {1, 1, 2}; PetscInt overlap = 1; PetscInitialize(&argc, &argv, NULL, help); DMPlexCreateHexBoxMesh(PETSC_COMM_WORLD, dim, cells, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, &dm); DMPlexDistribute(dm, overlap, NULL, &dist); dm = dist; SetupDOFs(dm); Vec V; DMCreateGlobalVector(dm, &V); AssignSomeValues(V); PetscViewer viewer; const char* fn = "output.vtk"; PetscViewerVTKOpen(PETSC_COMM_WORLD,fn,FILE_MODE_WRITE,&viewer); VecView(V,viewer); PetscViewerDestroy(&viewer); Kind regards, Morten -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex_vtk_export.cc Type: application/octet-stream Size: 2716 bytes Desc: ex_vtk_export.cc URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: output-np2.vtk Type: application/octet-stream Size: 909 bytes Desc: output-np2.vtk URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: output-np1.vtk Type: application/octet-stream Size: 863 bytes Desc: output-np1.vtk URL: From jychang48 at gmail.com Tue Jul 5 11:46:09 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 5 Jul 2016 11:46:09 -0500 Subject: [petsc-users] View wall-clock time of a PETSc function via command-line Message-ID: Hi all, Is there a quick way (e.g., through command-line options) to output the wall-clock time of a PETSc function (e.g., SNESSolve(), KSPSolve(), etc) without outputting the entire -log_view? Thanks, Justin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 5 11:50:56 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 11:50:56 -0500 Subject: [petsc-users] View wall-clock time of a PETSc function via command-line In-Reply-To: References: Message-ID: <6ACBC74D-9B09-40DF-ABC0-E5DF485B850C@mcs.anl.gov> ./ex1 -log_view | grep KSPSolve or ./ex1 -log_view | egrep "(KSPSolve|SNESSolve)" > On Jul 5, 2016, at 11:46 AM, Justin Chang wrote: > > Hi all, > > Is there a quick way (e.g., through command-line options) to output the wall-clock time of a PETSc function (e.g., SNESSolve(), KSPSolve(), etc) without outputting the entire -log_view? > > Thanks, > Justin From jychang48 at gmail.com Tue Jul 5 11:52:29 2016 From: jychang48 at gmail.com (Justin Chang) Date: Tue, 5 Jul 2016 11:52:29 -0500 Subject: [petsc-users] View wall-clock time of a PETSc function via command-line In-Reply-To: <6ACBC74D-9B09-40DF-ABC0-E5DF485B850C@mcs.anl.gov> References: <6ACBC74D-9B09-40DF-ABC0-E5DF485B850C@mcs.anl.gov> Message-ID: Okay, thanks! On Tue, Jul 5, 2016 at 11:50 AM, Barry Smith wrote: > > ./ex1 -log_view | grep KSPSolve > > or > > ./ex1 -log_view | egrep "(KSPSolve|SNESSolve)" > > > > On Jul 5, 2016, at 11:46 AM, Justin Chang wrote: > > > > Hi all, > > > > Is there a quick way (e.g., through command-line options) to output the > wall-clock time of a PETSc function (e.g., SNESSolve(), KSPSolve(), etc) > without outputting the entire -log_view? > > > > Thanks, > > Justin > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 5 12:14:05 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Jul 2016 12:14:05 -0500 Subject: [petsc-users] View wall-clock time of a PETSc function via command-line In-Reply-To: References: <6ACBC74D-9B09-40DF-ABC0-E5DF485B850C@mcs.anl.gov> Message-ID: On Tue, Jul 5, 2016 at 11:52 AM, Justin Chang wrote: > Okay, thanks! > Or -log_view ::ascii_info_detailed and then upload that module to Python. This is how I do it in scripts. Matt > On Tue, Jul 5, 2016 at 11:50 AM, Barry Smith wrote: > >> >> ./ex1 -log_view | grep KSPSolve >> >> or >> >> ./ex1 -log_view | egrep "(KSPSolve|SNESSolve)" >> >> >> > On Jul 5, 2016, at 11:46 AM, Justin Chang wrote: >> > >> > Hi all, >> > >> > Is there a quick way (e.g., through command-line options) to output the >> wall-clock time of a PETSc function (e.g., SNESSolve(), KSPSolve(), etc) >> without outputting the entire -log_view? >> > >> > Thanks, >> > Justin >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From Hassan.Raiesi at aero.bombardier.com Tue Jul 5 15:42:54 2016 From: Hassan.Raiesi at aero.bombardier.com (Hassan Raiesi) Date: Tue, 5 Jul 2016 20:42:54 +0000 Subject: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 Message-ID: Hi, PETSc 3.7.2 seems to have a much higher memory usage when compared with PETSc- 3.1.1 c, to a point that it crashes our code for large problems that we ran with version 3.6.1 in the past. I have re-compiled the code with same options, and ran the same code linked with the two versions, here are the log-summarie: -flow_ksp_max_it 20 -flow_ksp_monitor_true_residual -flow_ksp_rtol 0.1 -flow_ksp_type fgmres -flow_mg_coarse_pc_factor_mat_solver_package mumps -flow_mg_coarse_pc_type lu -flow_mg_levels_ksp_type richardson -flow_mg_levels_pc_type sor -flow_pc_gamg_agg_nsmooths 0 -flow_pc_gamg_coarse_eq_limit 2000 -flow_pc_gamg_process_eq_limit 2500 -flow_pc_gamg_repartition true -flow_pc_gamg_reuse_interpolation true -flow_pc_gamg_square_graph 3 -flow_pc_gamg_sym_graph true -flow_pc_gamg_type agg -flow_pc_mg_cycle v -flow_pc_mg_levels 20 -flow_pc_mg_type kaskade -flow_pc_type gamg -log_summary Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: 2016-07-05 12:04:34 -0400 Max Max/Min Avg Total Time (sec): 6.760e+02 1.00006 6.760e+02 Objects: 1.284e+03 1.00469 1.279e+03 Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 MPI Reductions: 4.023e+03 1.00149 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 99.9% 7.674e+04 99.9% 4.010e+03 99.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 
1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 3.2e+03 22100 96 90 79 22100 96 90 79 91007 PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 
------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 302 299 1992700700 0. Matrix Partitioning 6 6 3888 0. Matrix Coarsen 6 6 3768 0. Vector 600 600 1582204168 0. Vector Scatter 87 87 5614432 0. Krylov Solver 11 11 59472 0. Preconditioner 11 11 11120 0. PetscRandom 1 1 638 0. Viewer 1 0 0 0. Index Set 247 247 9008420 0. Star Forest Bipartite Graph 12 12 10176 0. ======================================================================================================================== And for petsc 3.6.1: Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: 2015-08-06 11:50:34 -0500 Max Max/Min Avg Total Time (sec): 5.515e+02 1.00001 5.515e+02 Objects: 1.231e+03 1.00490 1.226e+03 Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 MPI Reductions: 4.012e+03 1.00150 Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) e.g., VecAXPY() for real vectors of length N --> 2N flops and VecAXPY() for complex vectors of length N --> 8N flops Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- Avg %Total Avg %Total counts %Total Avg %Total counts %Total 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 99.9% 5.020e+04 99.9% 3.999e+03 99.7% ------------------------------------------------------------------------------------------------------------------------ See the 'Profiling' chapter of the users' manual for details on interpreting output. Phase summary info: Count: number of times phase was executed Time and Flops: Max - maximum over all processors Ratio - ratio of maximum to minimum over all processors Mess: number of messages sent Avg. len: average message length (bytes) Reduct: number of global reductions Global: entire computation Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
%T - percent time in this phase %F - percent flops in this phase %M - percent messages in this phase %L - percent message lengths in this phase %R - percent reductions in this phase Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flops --- Global --- --- Stage --- Total Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 
0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 3.2e+03 25100 95 83 79 25100 95 83 79 94396 PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 
------------------------------------------------------------------------------------------------------------------------ Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Matrix 246 243 1730595756 0 Matrix Partitioning 6 6 3816 0 Matrix Coarsen 6 6 3720 0 Vector 602 602 1603749672 0 Vector Scatter 87 87 4291136 0 Krylov Solver 12 12 60416 0 Preconditioner 12 12 12040 0 Viewer 1 0 0 0 Index Set 247 247 9018060 0 Star Forest Bipartite Graph 12 12 10080 0 ======================================================================================================================== Any idea why there are more matrix created with version 3.7.2? I only have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others are internally created. Thank you, Hassan Raiesi, PhD Advanced Aerodynamics Department Bombardier Aerospace hassan.raiesi at aero.bombardier.com 2351 boul. Alfred-Nobel (BAN1) Ville Saint-Laurent, Qu?bec, H4S 2A9 T?l. 514-855-5001 # 62204 [cid:image001.png at 01D1D6DA.DC1D3010] CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. If you are not the intended recipient or received this communication by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6402 bytes Desc: image001.png URL: From gbisht at lbl.gov Tue Jul 5 16:07:38 2016 From: gbisht at lbl.gov (Gautam Bisht) Date: Tue, 5 Jul 2016 14:07:38 -0700 Subject: [petsc-users] How to determine the type of SNESLineSearch? Message-ID: Hi PETSc, After SNESSolve converges, I want to perform few additional operations only when SNESLineSearchType is not SNESLINESEARCHBASIC. But, there is no SNESLineSearch*Get*Type routine. Any idea on how I can determine the type of LineSearch set by a user using command line option? Thanks, -Gautam. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 5 16:13:35 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Jul 2016 16:13:35 -0500 Subject: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: On Tue, Jul 5, 2016 at 3:42 PM, Hassan Raiesi < Hassan.Raiesi at aero.bombardier.com> wrote: > Hi, > > > > PETSc 3.7.2 seems to have a much higher memory usage when compared with > PETSc- 3.1.1 c, to a point that it crashes our code for large problems that > we ran with version 3.6.1 in the past. > > I have re-compiled the code with same options, and ran the same code > linked with the two versions, here are the log-summarie: > According to the log_summary (which you NEED to send in full if we are to understand anything), the memory usage is largely the same. There are more matrices, which leads me to believe that GAMG is not coarsening as quickly. You might consider a non-zero threshold for it. The best way to understand what is happening is to run Massif (from valgrind) on both. 
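Spelled out, the two suggestions above look roughly like the following; the -flow_ prefix comes from Hassan's option list, ./mysolver stands in for the actual executable, the threshold value is only a starting guess, and the Massif flags are standard valgrind usage:

mpiexec -n 8 ./mysolver <usual options> -flow_pc_gamg_threshold 0.01
mpiexec -n 8 valgrind --tool=massif --massif-out-file=massif.%p.out ./mysolver <usual options>
ms_print massif.<pid>.out | head -40

Running both the 3.6.1 and 3.7.2 builds under Massif and comparing the peak snapshots should show which allocation site accounts for the extra memory.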
Thanks, Matt > -flow_ksp_max_it 20 > > -flow_ksp_monitor_true_residual > > -flow_ksp_rtol 0.1 > > -flow_ksp_type fgmres > > -flow_mg_coarse_pc_factor_mat_solver_package mumps > > -flow_mg_coarse_pc_type lu > > -flow_mg_levels_ksp_type richardson > > -flow_mg_levels_pc_type sor > > -flow_pc_gamg_agg_nsmooths 0 > > -flow_pc_gamg_coarse_eq_limit 2000 > > -flow_pc_gamg_process_eq_limit 2500 > > -flow_pc_gamg_repartition true > > -flow_pc_gamg_reuse_interpolation true > > -flow_pc_gamg_square_graph 3 > > -flow_pc_gamg_sym_graph true > > -flow_pc_gamg_type agg > > -flow_pc_mg_cycle v > > -flow_pc_mg_levels 20 > > -flow_pc_mg_type kaskade > > -flow_pc_type gamg > > -log_summary > > > > Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more > memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). > > > > > > > > Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: > 2016-07-05 12:04:34 -0400 > > > > Max Max/Min Avg Total > > Time (sec): 6.760e+02 1.00006 6.760e+02 > > Objects: 1.284e+03 1.00469 1.279e+03 > > Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 > > Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 > > MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 > > MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 > > MPI Reductions: 4.023e+03 1.00149 > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N > --> 2N flops > > and VecAXPY() for complex vectors of length N > --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 > 99.9% 7.674e+04 99.9% 4.010e+03 99.7% > > > > > ------------------------------------------------------------------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all processors > > Mess: number of messages sent > > Avg. len: average message length (bytes) > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this > phase > > %M - percent messages in this phase %L - percent message lengths > in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 > 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 > > MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 > 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 > > MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 > 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 > > MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 > 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 > > MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 > > MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 > > MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 > 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 > > MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 > 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 > > MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 > 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 > > MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 > 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > > MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 > 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > > MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 > 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 > > MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 > 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 > > MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 > 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 > > MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 > 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 > > MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 > 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > > MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 > 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 > > MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 > 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 > > 
VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 > 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 > > VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 0.0e+00 > 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 > > VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 > 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 > > VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 > > VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 > > VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 > > VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 > > VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 > > VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 > 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 > > VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 > 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 > > KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 > 3.2e+03 22100 96 90 79 22100 96 90 79 91007 > > PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > > PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 > 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 > > PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 > 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 > > PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 > 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 > > Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > > MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 > 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > > SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 > 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 > > SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 > 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 > > GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 > 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 > > repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 > 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 > > Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 > > Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 > 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 > > Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 > 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 > > PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 > 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 > > PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 > > PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 > 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 > > SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 > 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > 
> SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 > 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > > SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 > 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 > > > ------------------------------------------------------------------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 302 299 1992700700 0. > > Matrix Partitioning 6 6 3888 0. > > Matrix Coarsen 6 6 3768 0. > > Vector 600 600 1582204168 0. > > Vector Scatter 87 87 5614432 0. > > Krylov Solver 11 11 59472 0. > > Preconditioner 11 11 11120 0. > > PetscRandom 1 1 638 0. > > Viewer 1 0 0 0. > > Index Set 247 247 9008420 0. > > Star Forest Bipartite Graph 12 12 10176 0. > > > ======================================================================================================================== > > > > And for petsc 3.6.1: > > > > Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: > 2015-08-06 11:50:34 -0500 > > > > Max Max/Min Avg Total > > Time (sec): 5.515e+02 1.00001 5.515e+02 > > Objects: 1.231e+03 1.00490 1.226e+03 > > Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 > > Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 > > MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 > > MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 > > MPI Reductions: 4.012e+03 1.00150 > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N > --> 2N flops > > and VecAXPY() for complex vectors of length N > --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 > 99.9% 5.020e+04 99.9% 3.999e+03 99.7% > > > > > ------------------------------------------------------------------------------------------------------------------------ > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all processors > > Mess: number of messages sent > > Avg. len: average message length (bytes) > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() and > PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this > phase > > %M - percent messages in this phase %L - percent message lengths > in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) > Flops --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > > ------------------------------------------------------------------------------------------------------------------------ > > > > --- Event Stage 0: Main Stage > > > > MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 > 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 > > MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 > 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 > > MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 > 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 > > MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 > 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 > > MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 > > MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 > > MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 > 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 > > MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 > 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 > > MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 > 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 > > MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 > > MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 > 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > > MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 > 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > > MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 > 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 > > MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 > 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 > > MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 > 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 > > MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 > 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 > > MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 > 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > > MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 > 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 > > MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 > 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 > > 
MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 > 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 > > VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 > 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 > > VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 > 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 > > VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 > > VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 > > VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 > > VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 > > VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 > > VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 > 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 > > VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 > 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 > > KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 > 3.2e+03 25100 95 83 79 25100 95 83 79 94396 > > PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > > PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 > 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 > > PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 > 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 > > PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 > 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 > > Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > > MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 > 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > > SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 > 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 > > SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 > 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 > > GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 > 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 > > repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 > 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 > > Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 > > Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 > 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 > > Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 > 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 > > PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 > 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 > > PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 > > PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 > 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 > > SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 
0 > > SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 > 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > > SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 > 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > > SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > > ------------------------------------------------------------------------------------------------------------------------ > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 246 243 1730595756 0 > > Matrix Partitioning 6 6 3816 0 > > Matrix Coarsen 6 6 3720 0 > > Vector 602 602 1603749672 0 > > Vector Scatter 87 87 4291136 0 > > Krylov Solver 12 12 60416 0 > > Preconditioner 12 12 12040 0 > > Viewer 1 0 0 0 > > Index Set 247 247 9018060 0 > > Star Forest Bipartite Graph 12 12 10080 0 > > > ======================================================================================================================== > > > > Any idea why there are more matrix created with version 3.7.2? I only have > 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others > are internally created. > > > > > > Thank you, > > > > > > *Hassan Raiesi, PhD* > > > > Advanced Aerodynamics Department > > Bombardier Aerospace > > > > hassan.raiesi at aero.bombardier.com > > > > *2351 boul. Alfred-Nobel (BAN1)* > > *Ville Saint-Laurent, Qu?bec, H4S 2A9* > > > > > > > > T?l. > > 514-855-5001 # 62204 > > > > > > > > > > > > *CONFIDENTIALITY NOTICE* - This communication may contain privileged or > confidential information. > If you are not the intended recipient or received this communication by > error, please notify the sender > and delete the message without copying, forwarding and/or disclosing it. > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6402 bytes Desc: not available URL: From it.sadr at gmail.com Tue Jul 5 17:13:36 2016 From: it.sadr at gmail.com (ehsan sadrfaridpour) Date: Tue, 5 Jul 2016 18:13:36 -0400 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: I faced a problem with my code. The problem is related to MatCreateSeqAIJ(). I comment the rest of my code and just keeping the below lines cause me the error. 
*Code:* Mat * m_WA_nt_local; MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, m_WA_nt_local); PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); exit(1); *Error:* > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Pointer: Parameter # 2 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [2]PETSC ERROR: Null argument, when expecting valid pointer > [2]PETSC ERROR: Null Pointer: Parameter # 2 > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc > --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [2]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > Null Pointer: Parameter # 2 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc > --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [0]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc > --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [1]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. I tried to debug it with -start_in_debugger, but I got another error. > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o > -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 > -fPIC -I/home/esfp/tools/libraries/petsc/include > -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include > -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o > ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o > partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o > OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis > -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib > -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran > -lm -lquadmath -lm -lmpi_cxx -lstdc++ > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -ldl > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi > -lgcc_s -lpthread -ldl -o ut_main > /bin/rm -f ut_main.o > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display :0 > on machine grappelli > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display :0 > on machine grappelli > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display :0 > on machine grappelli > And I got below error in gdb GUI: [image: Inline image 1] I appreciate your support. Best regards, Ehsan On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > On all other processes don't pass in 1 pass in 0 since all other > processes want 0 sub matrices > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour > wrote: > > > > Thanks, the IS problem is solved. > > But now I have another problem to compile the code. > > > > I use below code: > > Mat m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > &m_WA_nt_local); > > IS set; > > if(rank ==0){ > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > } > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > The error I get is : > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to > ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* > const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > > > I tried to go around it by define a array of Matrices using "Mat * > m_WA_nt_local" > > So, the first 2 lines changed to below and I can compile the code. > > Mat * m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > m_WA_nt_local); > > > > > > > > However, I get errors like below when I run the code with 2 mpi process. 
> > --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Invalid argument > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Wed > Jun 29 16:21:04 2016 > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in > /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > I think I need to do something for other processes, but I don't know > what I need to do. > > > > Best, > > Ehsan > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May > wrote: > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour > wrote: > > I faced the below error during compiling my code for using > MatGetSubMatrices. > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for argument > ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, > _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, MAT_INITIAL_MATRIX, > &m_local_W); > > > > My code : > > PetscMPIInt rank; > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > if(rank ==0){ > > Mat m_local_W; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, > NULL,&m_local_W);// try to reserve space for only number of final non zero > entries for each fine node (e.g. 4) > > IS set; > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, > MAT_INITIAL_MATRIX, &m_local_W); > > > > } > > > > I followed below example: > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > This code won't work in parallel. > > The man page says this function is collective on Mat. You need to move > the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour > wrote: > > Thanks a lot for great support. > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith wrote: > > > > MatGetSubmatrices() just have the first process request all the rows > and columns and the others request none. You can use ISCreateStride() to > create the ISs without having to make an array of all the indices. > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour > wrote: > > > > > > Hi, > > > > > > I need to have access to most of elements of a parallel MPIAIJ matrix > only from 1 process (rank 0). > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > How can I have a local copy of a matrix which is distributed on > multiple process? I don't want to update the matrix, and the read-only > version of it would be enough. > > > > > > Best, > > > Ehsan > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
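The "Null Pointer: Parameter # 2" above is consistent with passing a NULL or uninitialized Mat* where MatCreateSeqAIJ() expects the address of a Mat handle. A minimal sketch of the calling pattern, reusing the variable names from the mail (whether the pre-created matrix is needed at all depends on how the submatrix is used afterwards):

Mat m_WA_nt_local;                          /* a Mat handle, not an uninitialized Mat* */
MatCreateSeqAIJ(PETSC_COMM_SELF, num_points, num_points, pre_init_size, NULL, &m_WA_nt_local);

/* MatGetSubMatrices() allocates and returns an array of Mats, so the output
   argument is a Mat* passed by address; every rank must call it, requesting
   0 submatrices away from rank 0, as Barry suggests earlier in the thread. */
Mat *local = NULL;
IS set;
PetscInt nsub = (rank == 0) ? 1 : 0;
ISCreateStride(PETSC_COMM_SELF, (rank == 0) ? num_points : 0, 0, 1, &set);
MatGetSubMatrices(m_WA_norm_T, nsub, &set, &set, MAT_INITIAL_MATRIX, &local);
/* on rank 0, local[0] now holds the sequential copy */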
Name: image.png Type: image/png Size: 3695 bytes Desc: not available URL: From bsmith at mcs.anl.gov Tue Jul 5 17:18:51 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 17:18:51 -0500 Subject: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: <45E9D625-7398-4B46-88BF-BF936D5066D8@mcs.anl.gov> Hassan, This memory usage increase is not expected. How are you measuring memory usage? Since the problem occurs even with a simple solver you should debug with the simpler solver and only after resolving that move on to GAMG and see if the problem persists. Also do the test on the smallest case that clearly demonstrates the problem; if you have a 1 process run that shows a nontrivial memory usage increase then debug with that, don't run a huge problem unless you absolutely have to. How much code, if any, did you need to change in your application in going from 3.6.1 to 3.7.2 ? Here is the way to track down the problem. It may seem burdensome but requires no guesswork or speculation. Use the bisection capability of git. First obtain PETSc via git if you have not gotten that way http://www.mcs.anl.gov/petsc/download/index.html Then in the PETSc directory run git bisect start git bisect good v3.6.1 git bisect bad v3.7.2 It will then change to a new commit where you need to run configure and make on PETSc and then compile and run your application If the application uses the excessive memory then in the PETSc directory do git bisect bad otherwise type git bisect good if the code won't compile (if the PETSc API changes you may have to adjust your code slightly to get it to compile and you should do that; but if PETSc won't configure to build with the given commit then just do the skip) or crashes then type git bisect skip Now git will switch to another commit where you need again do the same process of configure make and run the application. After a few iterations git bisect will show the EXACT commit (code changes) that resulted in your very different memory usage and we can take a look at the code changes in PETSc and figure out how to reduce the memory usage. I realize this seems like a burdensome process but remember a great deal of changes took place in the PETSc code and this is the ONLY well defined way to figure out exactly which change caused the problem. Otherwise we can guess until the end of time. Barry > On Jul 5, 2016, at 3:42 PM, Hassan Raiesi wrote: > > Hi, > > PETSc 3.7.2 seems to have a much higher memory usage when compared with PETSc- 3.1.1 c, to a point that it crashes our code for large problems that we ran with version 3.6.1 in the past. 
> I have re-compiled the code with same options, and ran the same code linked with the two versions, here are the log-summarie: > > -flow_ksp_max_it 20 > -flow_ksp_monitor_true_residual > -flow_ksp_rtol 0.1 > -flow_ksp_type fgmres > -flow_mg_coarse_pc_factor_mat_solver_package mumps > -flow_mg_coarse_pc_type lu > -flow_mg_levels_ksp_type richardson > -flow_mg_levels_pc_type sor > -flow_pc_gamg_agg_nsmooths 0 > -flow_pc_gamg_coarse_eq_limit 2000 > -flow_pc_gamg_process_eq_limit 2500 > -flow_pc_gamg_repartition true > -flow_pc_gamg_reuse_interpolation true > -flow_pc_gamg_square_graph 3 > -flow_pc_gamg_sym_graph true > -flow_pc_gamg_type agg > -flow_pc_mg_cycle v > -flow_pc_mg_levels 20 > -flow_pc_mg_type kaskade > -flow_pc_type gamg > -log_summary > > Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). > > > > Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: 2016-07-05 12:04:34 -0400 > > Max Max/Min Avg Total > Time (sec): 6.760e+02 1.00006 6.760e+02 > Objects: 1.284e+03 1.00469 1.279e+03 > Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 > Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 > MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 > MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 > MPI Reductions: 4.023e+03 1.00149 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 99.9% 7.674e+04 99.9% 4.010e+03 99.7% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 > MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 > MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 > MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 > MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 > MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 > MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 > MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 > MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 > MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 > MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 > MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 > MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 > MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 > MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 > VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 > VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 
0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 > VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 > VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 > VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 > VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 > VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 > VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 > VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 > VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 > KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 3.2e+03 22100 96 90 79 22100 96 90 79 91007 > PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 > PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 > PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 > Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 > SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 > GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 > repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 > Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 > Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 > Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 > PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 > PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 > PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 > SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > 
BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 302 299 1992700700 0. > Matrix Partitioning 6 6 3888 0. > Matrix Coarsen 6 6 3768 0. > Vector 600 600 1582204168 0. > Vector Scatter 87 87 5614432 0. > Krylov Solver 11 11 59472 0. > Preconditioner 11 11 11120 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 247 247 9008420 0. > Star Forest Bipartite Graph 12 12 10176 0. > ======================================================================================================================== > > And for petsc 3.6.1: > > Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: 2015-08-06 11:50:34 -0500 > > Max Max/Min Avg Total > Time (sec): 5.515e+02 1.00001 5.515e+02 > Objects: 1.231e+03 1.00490 1.226e+03 > Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 > Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 > MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 > MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 > MPI Reductions: 4.012e+03 1.00150 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 99.9% 5.020e+04 99.9% 3.999e+03 99.7% > > ------------------------------------------------------------------------------------------------------------------------ > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 > MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 > MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 > MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 > MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 > MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 > MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 > MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 > MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 > MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 > MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 > MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 > MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 > MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 > MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 > MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 > MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 
0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 > VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 > VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 > VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 > VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 > VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 > VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 > VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 > VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 > VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 > KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 3.2e+03 25100 95 83 79 25100 95 83 79 94396 > PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 > PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 > PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 > Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 > SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 > GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 > repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 > Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 > Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 > Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 > PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 > PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 > PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 > SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 
0 > SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ------------------------------------------------------------------------------------------------------------------------ > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 246 243 1730595756 0 > Matrix Partitioning 6 6 3816 0 > Matrix Coarsen 6 6 3720 0 > Vector 602 602 1603749672 0 > Vector Scatter 87 87 4291136 0 > Krylov Solver 12 12 60416 0 > Preconditioner 12 12 12040 0 > Viewer 1 0 0 0 > Index Set 247 247 9018060 0 > Star Forest Bipartite Graph 12 12 10080 0 > ======================================================================================================================== > > Any idea why there are more matrix created with version 3.7.2? I only have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others are internally created. > > > Thank you, > > > Hassan Raiesi, PhD > > Advanced Aerodynamics Department > Bombardier Aerospace > > hassan.raiesi at aero.bombardier.com > > 2351 boul. Alfred-Nobel (BAN1) > Ville Saint-Laurent, Qu?bec, H4S 2A9 > > > > T?l. > 514-855-5001 # 62204 > > > > > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. > If you are not the intended recipient or received this communication by error, please notify the sender > and delete the message without copying, forwarding and/or disclosing it. From bsmith at mcs.anl.gov Tue Jul 5 17:21:24 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 17:21:24 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: It should be Mat m_WA_nt_local; > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); ^^^^^^^^^^^^ note the & > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour wrote: > > I faced a problem with my code. The problem is related to MatCreateSeqAIJ(). > I comment the rest of my code and just keeping the below lines cause me the error. > Code: > Mat * m_WA_nt_local; > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, m_WA_nt_local); > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > exit(1); > > Error: > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Pointer: Parameter # 2 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [2]PETSC ERROR: Null argument, when expecting valid pointer > [2]PETSC ERROR: Null Pointer: Parameter # 2 > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [2]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > Null Pointer: Parameter # 2 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [0]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > [1]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > I tried to debug it with -start_in_debugger, but I got another error. > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC -I/home/esfp/tools/libraries/petsc/include -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi -lgcc_s -lpthread -ldl -o ut_main > /bin/rm -f ut_main.o > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display :0 on machine grappelli > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display :0 on machine grappelli > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display :0 on machine grappelli > > > And I got below error in gdb GUI: > > > I appreciate your support. > > Best regards, > Ehsan > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > On all other processes don't pass in 1 pass in 0 since all other processes want 0 sub matrices > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour wrote: > > > > Thanks, the IS problem is solved. > > But now I have another problem to compile the code. > > > > I use below code: > > Mat m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, &m_WA_nt_local); > > IS set; > > if(rank ==0){ > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > } > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > The error I get is : > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > I tried to go around it by define a array of Matrices using "Mat * m_WA_nt_local" > > So, the first 2 lines changed to below and I can compile the code. > > Mat * m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, m_WA_nt_local); > > > > > > > > However, I get errors like below when I run the code with 2 mpi process. 
> > --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Invalid argument > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Wed Jun 29 16:21:04 2016 > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > I think I need to do something for other processes, but I don't know what I need to do. > > > > Best, > > Ehsan > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May wrote: > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour wrote: > > I faced the below error during compiling my code for using MatGetSubMatrices. > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, MAT_INITIAL_MATRIX, &m_local_W); > > > > My code : > > PetscMPIInt rank; > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > if(rank ==0){ > > Mat m_local_W; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, NULL,&m_local_W);// try to reserve space for only number of final non zero entries for each fine node (e.g. 4) > > IS set; > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, MAT_INITIAL_MATRIX, &m_local_W); > > > > } > > > > I followed below example: > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > This code won't work in parallel. > > The man page says this function is collective on Mat. You need to move the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour wrote: > > Thanks a lot for great support. > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith wrote: > > > > MatGetSubmatrices() just have the first process request all the rows and columns and the others request none. You can use ISCreateStride() to create the ISs without having to make an array of all the indices. > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour wrote: > > > > > > Hi, > > > > > > I need to have access to most of elements of a parallel MPIAIJ matrix only from 1 process (rank 0). > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > How can I have a local copy of a matrix which is distributed on multiple process? I don't want to update the matrix, and the read-only version of it would be enough. 
> > > > > > Best, > > > Ehsan > > > > > > > > > > > > > > > > From hengjiew at uci.edu Tue Jul 5 17:23:55 2016 From: hengjiew at uci.edu (frank) Date: Tue, 5 Jul 2016 15:23:55 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner Message-ID: <577C337B.60909@uci.edu> Hi, I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. The petsc options file is attached. The domain is a 3d box. It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. How can I diagnose what exactly cause the error? Thank you so much. Frank -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left -log_summary # Setting dmdarepart on subcomm -repart_da_processors_x 24 -repart_da_processors_y 2 -repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly #-mg_coarse_telescope_ksp_constant_null_space -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type svd #-mg_coarse_telescope_mg_coarse_pc_type telescope #-mg_coarse_telescope_mg_coarse_pc_telescope_reduction_factor 64 # Second subcomm #-mg_coarse_telescope_mg_coarse_telescope_ksp_type preonly #-mg_coarse_telescope_mg_coarse_telescope_pc_type mg #-mg_coarse_telescope_mg_coarse_telescope_pc_mg_galerkin #-mg_coarse_telescope_mg_coarse_telescope_pc_mg_levels 3 #-mg_coarse_telescope_mg_coarse_telescope_mg_levels_ksp_type richardson #-mg_coarse_telescope_mg_coarse_telescope_mg_levels_ksp_max_it 1 #-mg_coarse_telescope_mg_coarse_telescope_mg_coarse_ksp_type richardson #-mg_coarse_telescope_mg_coarse_telescope_mg_coarse_pc_type svd From it.sadr at gmail.com Tue Jul 5 17:26:58 2016 From: it.sadr at gmail.com (ehsan sadrfaridpour) Date: Tue, 5 Jul 2016 18:26:58 -0400 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: Thanks for your prompt reply. Using & solve this problem, but then I have another problem. 
*Rest of the Code:* Mat m_WA_nt_local; MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); IS set; if(rank ==0){ // - - - - - create local matrix - - - - - PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d\n", rank, num_points); ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); ISView(set, PETSC_VIEWER_STDOUT_SELF); MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); }else{ MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); } *Error in compile:* > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* > Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, > cs_info&)?: > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert > ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode > MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, > _p_Mat***)? > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > ^ > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert > ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode > MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, > _p_Mat***)? > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > ^ On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > > It should be > > Mat m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, > NULL, &m_WA_nt_local); > > ^^^^^^^^^^^^ note > the & > > > > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour > wrote: > > > > I faced a problem with my code. The problem is related to > MatCreateSeqAIJ(). > > I comment the rest of my code and just keeping the below lines cause me > the error. > > Code: > > Mat * m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > pre_init_size, NULL, m_WA_nt_local); > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, > p_init:%d\n", rank, num_points, pre_init_size); > > > > exit(1); > > > > Error: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > [1]PETSC ERROR: Null Pointer: Parameter # 2 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > [2]PETSC ERROR: Null Pointer: Parameter # 2 > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [2]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > Null Pointer: Parameter # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [0]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue > Jul 5 18:05:15 2016 > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [1]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > > > I tried to debug it with -start_in_debugger, but I got another error. > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o > -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 > -fPIC -I/home/esfp/tools/libraries/petsc/include > -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include > -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o > ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o > partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o > OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis > -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib > -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran > -lm -lquadmath -lm -lmpi_cxx -lstdc++ > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -ldl > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi > -lgcc_s -lpthread -ldl -o ut_main > > /bin/rm -f ut_main.o > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display > :0 on machine grappelli > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display > :0 on machine grappelli > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display > :0 on machine grappelli > > > > > > And I got below error in gdb GUI: > > > > > > I appreciate your support. > > > > Best regards, > > Ehsan > > > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > > > On all other processes don't pass in 1 pass in 0 since all other > processes want 0 sub matrices > > > > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour > wrote: > > > > > > Thanks, the IS problem is solved. > > > But now I have another problem to compile the code. > > > > > > I use below code: > > > Mat m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > &m_WA_nt_local); > > > IS set; > > > if(rank ==0){ > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > > } > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > > > The error I get is : > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to > ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* > const*, MatReuse, _p_Mat***)? > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > > > > > > I tried to go around it by define a array of Matrices using "Mat * > m_WA_nt_local" > > > So, the first 2 lines changed to below and I can compile the code. > > > Mat * m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > m_WA_nt_local); > > > > > > > > > > > > However, I get errors like below when I run the code with 2 mpi > process. 
> > > --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: Invalid argument > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Wed Jun 29 16:21:04 2016 > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in > /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > > > > I think I need to do something for other processes, but I don't know > what I need to do. > > > > > > Best, > > > Ehsan > > > > > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May > wrote: > > > > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour > wrote: > > > I faced the below error during compiling my code for using > MatGetSubMatrices. > > > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for > argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* > const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, > MAT_INITIAL_MATRIX, &m_local_W); > > > > > > My code : > > > PetscMPIInt rank; > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > > > if(rank ==0){ > > > Mat m_local_W; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, > NULL,&m_local_W);// try to reserve space for only number of final non zero > entries for each fine node (e.g. 4) > > > IS set; > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, > MAT_INITIAL_MATRIX, &m_local_W); > > > > > > } > > > > > > I followed below example: > > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > > > This code won't work in parallel. > > > The man page says this function is collective on Mat. You need to move > the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < > it.sadr at gmail.com> wrote: > > > Thanks a lot for great support. > > > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith > wrote: > > > > > > MatGetSubmatrices() just have the first process request all the > rows and columns and the others request none. You can use ISCreateStride() > to create the ISs without having to make an array of all the indices. > > > > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour > wrote: > > > > > > > > Hi, > > > > > > > > I need to have access to most of elements of a parallel MPIAIJ > matrix only from 1 process (rank 0). > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > > > How can I have a local copy of a matrix which is distributed on > multiple process? I don't want to update the matrix, and the read-only > version of it would be enough. 
> > > > > > > > Best, > > > > Ehsan > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 5 17:36:07 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Jul 2016 17:36:07 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour wrote: > Thanks for your prompt reply. Using & solve this problem, but then I have > another problem. > > *Rest of the Code:* > Mat m_WA_nt_local; > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, > NULL, &m_WA_nt_local); > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, > p_init:%d\n", rank, num_points, pre_init_size); > > IS set; > if(rank ==0){ > // - - - - - create local matrix - - - - - > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, > num_points:%d\n", rank, num_points); > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > ISView(set, PETSC_VIEWER_STDOUT_SELF); > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > }else{ > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > } > This returns an ARRAY of Mat objects, not just one. Matt > > *Error in compile:* > >> /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* >> Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, >> cs_info&)?: >> /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert >> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >> _p_Mat***)? >> MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> >> ^ >> /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert >> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >> _p_Mat***)? >> MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > > ^ > > > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > >> >> It should be >> >> Mat m_WA_nt_local; >> >> > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, >> NULL, &m_WA_nt_local); >> >> ^^^^^^^^^^^^ note >> the & >> >> >> >> > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour >> wrote: >> > >> > I faced a problem with my code. The problem is related to >> MatCreateSeqAIJ(). >> > I comment the rest of my code and just keeping the below lines cause me >> the error. 
>> > Code: >> > Mat * m_WA_nt_local; >> > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> pre_init_size, NULL, m_WA_nt_local); >> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, >> p_init:%d\n", rank, num_points, pre_init_size); >> > >> > exit(1); >> > >> > Error: >> > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > [0]PETSC ERROR: Null argument, when expecting valid pointer >> > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > [1]PETSC ERROR: Null argument, when expecting valid pointer >> > [1]PETSC ERROR: Null Pointer: Parameter # 2 >> > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> > [2]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > [2]PETSC ERROR: Null argument, when expecting valid pointer >> > [2]PETSC ERROR: Null Pointer: Parameter # 2 >> > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > [2]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > Null Pointer: Parameter # 2 >> > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. 
>> > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > [0]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > [1]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > [CS][pCalc_P] rank:1, num_points:10, p_init:300 >> > [CS][pCalc_P] rank:2, num_points:10, p_init:300 >> > [CS][pCalc_P] rank:0, num_points:10, p_init:300 >> > >> > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. >> > >> > I tried to debug it with -start_in_debugger, but I got another error. >> > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger >> > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o >> ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >> -Wno-unknown-pragmas -g -O0 -fPIC >> -I/home/esfp/tools/libraries/petsc/include >> -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include >> -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc >> > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
>> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o >> ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o >> partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o >> OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis >> -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib >> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm >> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran >> -lm -lquadmath -lm -lmpi_cxx -lstdc++ >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -ldl >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi >> -lgcc_s -lpthread -ldl -o ut_main >> > /bin/rm -f ut_main.o >> > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display >> :0 on machine grappelli >> > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display >> :0 on machine grappelli >> > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display >> :0 on machine grappelli >> > >> > >> > And I got below error in gdb GUI: >> > >> > >> > I appreciate your support. >> > >> > Best regards, >> > Ehsan >> > >> > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith >> wrote: >> > >> > On all other processes don't pass in 1 pass in 0 since all other >> processes want 0 sub matrices >> > >> > >> > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour >> wrote: >> > > >> > > Thanks, the IS problem is solved. >> > > But now I have another problem to compile the code. >> > > >> > > I use below code: >> > > Mat m_WA_nt_local; >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >> &m_WA_nt_local); >> > > IS set; >> > > if(rank ==0){ >> > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >> > > ISView(set, PETSC_VIEWER_STDOUT_SELF); >> > > } >> > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, >> &m_WA_nt_local); >> > > >> > > The error I get is : >> > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to >> ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* >> const*, MatReuse, _p_Mat***)? >> > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > > >> > > >> > > I tried to go around it by define a array of Matrices using "Mat * >> m_WA_nt_local" >> > > So, the first 2 lines changed to below and I can compile the code. >> > > Mat * m_WA_nt_local; >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >> m_WA_nt_local); >> > > >> > > >> > > >> > > However, I get errors like below when I run the code with 2 mpi >> process. 
>> > > --------------------- Error Message >> -------------------------------------------------------------- >> > > [1]PETSC ERROR: Invalid argument >> > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 >> > > [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Wed Jun 29 16:21:04 2016 >> > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in >> /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c >> > > >> > > >> > > I think I need to do something for other processes, but I don't know >> what I need to do. >> > > >> > > Best, >> > > Ehsan >> > > >> > > >> > > >> > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May >> wrote: >> > > >> > > >> > > On Wednesday, 29 June 2016, ehsan sadrfaridpour >> wrote: >> > > I faced the below error during compiling my code for using >> MatGetSubMatrices. >> > > >> > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for >> argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* >> const*, _p_IS* const*, MatReuse, _p_Mat***)? >> > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, >> MAT_INITIAL_MATRIX, &m_local_W); >> > > >> > > My code : >> > > PetscMPIInt rank; >> > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); >> > > >> > > if(rank ==0){ >> > > Mat m_local_W; >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> num_nz, NULL,&m_local_W);// try to reserve space for only number of final >> non zero entries for each fine node (e.g. 4) >> > > IS set; >> > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); >> > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, >> MAT_INITIAL_MATRIX, &m_local_W); >> > > >> > > } >> > > >> > > I followed below example: >> > > >> http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html >> > > >> > > This code won't work in parallel. >> > > The man page says this function is collective on Mat. You need to >> move the call to MatGetSubMatrices outside of the if(rank==0) loop. >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < >> it.sadr at gmail.com> wrote: >> > > Thanks a lot for great support. >> > > >> > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith >> wrote: >> > > >> > > MatGetSubmatrices() just have the first process request all the >> rows and columns and the others request none. You can use ISCreateStride() >> to create the ISs without having to make an array of all the indices. >> > > >> > > >> > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour >> wrote: >> > > > >> > > > Hi, >> > > > >> > > > I need to have access to most of elements of a parallel MPIAIJ >> matrix only from 1 process (rank 0). >> > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. >> > > > >> > > > How can I have a local copy of a matrix which is distributed on >> multiple process? I don't want to update the matrix, and the read-only >> version of it would be enough. 
>> > > > >> > > > Best, >> > > > Ehsan >> > > > >> > > > >> > > >> > > >> > > >> > > >> > >> > >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 5 17:37:15 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 17:37:15 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <577C337B.60909@uci.edu> References: <577C337B.60909@uci.edu> Message-ID: <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> Frank, You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. Barry > On Jul 5, 2016, at 5:23 PM, frank wrote: > > Hi, > > I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. > I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. > The petsc options file is attached. > > The domain is a 3d box. > It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. > Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. > The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. > In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. > > How can I diagnose what exactly cause the error? > Thank you so much. > > Frank > From bsmith at mcs.anl.gov Tue Jul 5 17:43:06 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 17:43:06 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> Message-ID: <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: > > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour wrote: > Thanks for your prompt reply. Using & solve this problem, but then I have another problem. 
> > Rest of the Code: > Mat m_WA_nt_local; > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > IS set; > if(rank ==0){ > // - - - - - create local matrix - - - - - > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d\n", rank, num_points); > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > ISView(set, PETSC_VIEWER_STDOUT_SELF); > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > }else{ > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > } > > This returns an ARRAY of Mat objects, not just one. Didn't we just do this email a couple of days ago? You need Mat m_WA_nt_local *m_WA_nt_local; > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > Matt > > > Error in compile: > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, cs_info&)?: > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > ^ > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > ^ > > > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > > It should be > > Mat m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); > ^^^^^^^^^^^^ note the & > > > > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour wrote: > > > > I faced a problem with my code. The problem is related to MatCreateSeqAIJ(). > > I comment the rest of my code and just keeping the below lines cause me the error. > > Code: > > Mat * m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, m_WA_nt_local); > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > > > exit(1); > > > > Error: > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > [1]PETSC ERROR: Null Pointer: Parameter # 2 > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > [2]PETSC ERROR: Null Pointer: Parameter # 2 > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [2]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > Null Pointer: Parameter # 2 > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [0]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > [1]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > > > I tried to debug it with -start_in_debugger, but I got another error. > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC -I/home/esfp/tools/libraries/petsc/include -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi -lgcc_s -lpthread -ldl -o ut_main > > /bin/rm -f ut_main.o > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display :0 on machine grappelli > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display :0 on machine grappelli > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display :0 on machine grappelli > > > > > > And I got below error in gdb GUI: > > > > > > I appreciate your support. > > > > Best regards, > > Ehsan > > > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > > > On all other processes don't pass in 1 pass in 0 since all other processes want 0 sub matrices > > > > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour wrote: > > > > > > Thanks, the IS problem is solved. > > > But now I have another problem to compile the code. > > > > > > I use below code: > > > Mat m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, &m_WA_nt_local); > > > IS set; > > > if(rank ==0){ > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > > } > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > The error I get is : > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > > I tried to go around it by define a array of Matrices using "Mat * m_WA_nt_local" > > > So, the first 2 lines changed to below and I can compile the code. > > > Mat * m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, m_WA_nt_local); > > > > > > > > > > > > However, I get errors like below when I run the code with 2 mpi process. 
> > > --------------------- Error Message -------------------------------------------------------------- > > > [1]PETSC ERROR: Invalid argument > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Wed Jun 29 16:21:04 2016 > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > > > > I think I need to do something for other processes, but I don't know what I need to do. > > > > > > Best, > > > Ehsan > > > > > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May wrote: > > > > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour wrote: > > > I faced the below error during compiling my code for using MatGetSubMatrices. > > > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, MAT_INITIAL_MATRIX, &m_local_W); > > > > > > My code : > > > PetscMPIInt rank; > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > > > if(rank ==0){ > > > Mat m_local_W; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, NULL,&m_local_W);// try to reserve space for only number of final non zero entries for each fine node (e.g. 4) > > > IS set; > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, MAT_INITIAL_MATRIX, &m_local_W); > > > > > > } > > > > > > I followed below example: > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > > > This code won't work in parallel. > > > The man page says this function is collective on Mat. You need to move the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour wrote: > > > Thanks a lot for great support. > > > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith wrote: > > > > > > MatGetSubmatrices() just have the first process request all the rows and columns and the others request none. You can use ISCreateStride() to create the ISs without having to make an array of all the indices. > > > > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour wrote: > > > > > > > > Hi, > > > > > > > > I need to have access to most of elements of a parallel MPIAIJ matrix only from 1 process (rank 0). > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > > > How can I have a local copy of a matrix which is distributed on multiple process? I don't want to update the matrix, and the read-only version of it would be enough. 
> > > > > > > > Best, > > > > Ehsan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From it.sadr at gmail.com Tue Jul 5 17:58:48 2016 From: it.sadr at gmail.com (ehsan sadrfaridpour) Date: Tue, 5 Jul 2016 18:58:48 -0400 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> Message-ID: Sorry, I think your suggestion needs something, since it doesn't compile. error: expected initializer before ?*? token > Mat m_WA_nt_local *m_WA_nt_local; > Yes, this is the same problem that compiled and worked but it has a bug. I faced this problem and I tried to define the array of Matrices to fix this 4 days ago. However, my first email today is the problem that array of matrices caused me. I get a little confused in the logic. Let me review what is happening: As this method is collective, all the processes needs to run it. Therefore, I need to define a local matrix and create it for all of the processes. Only for the process I want to have the local matrix, I request a matrix (matrices) and for the rest of them I pass 0 in the MatGetSubMatrices. I am suspicious about creating only 1 matrix for any process, while I expect an array of matrices in the MatGetSubMatrices. On Tue, Jul 5, 2016 at 6:43 PM, Barry Smith wrote: > > > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: > > > > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour > wrote: > > Thanks for your prompt reply. Using & solve this problem, but then I > have another problem. > > > > Rest of the Code: > > Mat m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > pre_init_size, NULL, &m_WA_nt_local); > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, > p_init:%d\n", rank, num_points, pre_init_size); > > > > IS set; > > if(rank ==0){ > > // - - - - - create local matrix - - - - - > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, > num_points:%d\n", rank, num_points); > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > }else{ > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > } > > > > This returns an ARRAY of Mat objects, not just one. > > Didn't we just do this email a couple of days ago? > > You need > > Mat m_WA_nt_local *m_WA_nt_local; > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, > &m_WA_nt_local); > > > > > > > > > Matt > > > > > > Error in compile: > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* > Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, > cs_info&)?: > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert > ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode > MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, > _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > ^ > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert > ?_p_Mat**? to ?_p_Mat***? for argument ?6? 
to ?PetscErrorCode > MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, > _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > ^ > > > > > > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > > > > It should be > > > > Mat m_WA_nt_local; > > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, > NULL, &m_WA_nt_local); > > > ^^^^^^^^^^^^ note > the & > > > > > > > > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour > wrote: > > > > > > I faced a problem with my code. The problem is related to > MatCreateSeqAIJ(). > > > I comment the rest of my code and just keeping the below lines cause > me the error. > > > Code: > > > Mat * m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > pre_init_size, NULL, m_WA_nt_local); > > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, > num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > > > > > exit(1); > > > > > > Error: > > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > > [1]PETSC ERROR: Null Pointer: Parameter # 2 > > > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [2]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > > [2]PETSC ERROR: Null Pointer: Parameter # 2 > > > [2]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Tue Jul 5 18:05:15 2016 > > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [2]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > Null Pointer: Parameter # 2 > > > [0]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Tue Jul 5 18:05:15 2016 > > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [0]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Tue Jul 5 18:05:15 2016 > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [1]PETSC ERROR: #1 MatCreate() line 79 in > /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in > /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > > > > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > > > > > I tried to debug it with -start_in_debugger, but I got another error. > > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o > ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -g -O0 -fPIC > -I/home/esfp/tools/libraries/petsc/include > -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include > -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall > -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o > ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o > partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o > OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis > -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib > -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran > -lm -lquadmath -lm -lmpi_cxx -lstdc++ > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 > -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu > -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu > -L/usr/lib/x86_64-linux-gnu -ldl > -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi > -lgcc_s -lpthread -ldl -o ut_main > > > /bin/rm -f ut_main.o > > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display > :0 on machine grappelli > > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display > :0 on machine grappelli > > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display > :0 on machine grappelli > > > > > > > > > And I got below error in gdb GUI: > > > > > > > > > I appreciate your support. > > > > > > Best regards, > > > Ehsan > > > > > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith > wrote: > > > > > > On all other processes don't pass in 1 pass in 0 since all other > processes want 0 sub matrices > > > > > > > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour > wrote: > > > > > > > > Thanks, the IS problem is solved. > > > > But now I have another problem to compile the code. > > > > > > > > I use below code: > > > > Mat m_WA_nt_local; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > &m_WA_nt_local); > > > > IS set; > > > > if(rank ==0){ > > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > > > } > > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > The error I get is : > > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to > ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* > const*, MatReuse, _p_Mat***)? > > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, > MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > > > > > I tried to go around it by define a array of Matrices using "Mat * > m_WA_nt_local" > > > > So, the first 2 lines changed to below and I can compile the code. 
> > > > Mat * m_WA_nt_local; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, > m_WA_nt_local); > > > > > > > > > > > > > > > > However, I get errors like below when I run the code with 2 mpi > process. > > > > --------------------- Error Message > -------------------------------------------------------------- > > > > [1]PETSC ERROR: Invalid argument > > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > > > [1]PETSC ERROR: See > http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp > Wed Jun 29 16:21:04 2016 > > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug > --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 > --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 > --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 > --download-superlu=1 --download-metis=1 --download-parmetis=1 > --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in > /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > > > > > > > I think I need to do something for other processes, but I don't know > what I need to do. > > > > > > > > Best, > > > > Ehsan > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May > wrote: > > > > > > > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour > wrote: > > > > I faced the below error during compiling my code for using > MatGetSubMatrices. > > > > > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for > argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* > const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, > MAT_INITIAL_MATRIX, &m_local_W); > > > > > > > > My code : > > > > PetscMPIInt rank; > > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > > > > > if(rank ==0){ > > > > Mat m_local_W; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, > num_nz, NULL,&m_local_W);// try to reserve space for only number of final > non zero entries for each fine node (e.g. 4) > > > > IS set; > > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, > MAT_INITIAL_MATRIX, &m_local_W); > > > > > > > > } > > > > > > > > I followed below example: > > > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > > > > > This code won't work in parallel. > > > > The man page says this function is collective on Mat. You need to > move the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < > it.sadr at gmail.com> wrote: > > > > Thanks a lot for great support. > > > > > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith > wrote: > > > > > > > > MatGetSubmatrices() just have the first process request all the > rows and columns and the others request none. You can use ISCreateStride() > to create the ISs without having to make an array of all the indices. 
> > > > > > > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour < > it.sadr at gmail.com> wrote: > > > > > > > > > > Hi, > > > > > > > > > > I need to have access to most of elements of a parallel MPIAIJ > matrix only from 1 process (rank 0). > > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > > > > > How can I have a local copy of a matrix which is distributed on > multiple process? I don't want to update the matrix, and the read-only > version of it would be enough. > > > > > > > > > > Best, > > > > > Ehsan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 5 19:34:08 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 5 Jul 2016 19:34:08 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> Message-ID: > On Jul 5, 2016, at 5:58 PM, ehsan sadrfaridpour wrote: > > Sorry, I think your suggestion needs something, since it doesn't compile. > > error: expected initializer before ?*? token > Mat m_WA_nt_local *m_WA_nt_local; Mat *m_WA_nt_local; > > > Yes, this is the same problem that compiled and worked but it has a bug. > I faced this problem and I tried to define the array of Matrices to fix this 4 days ago. > > However, my first email today is the problem that array of matrices caused me. > I get a little confused in the logic. > > Let me review what is happening: > As this method is collective, all the processes needs to run it. > Therefore, I need to define a local matrix and create it for all of the processes. > Only for the process I want to have the local matrix, I request a matrix (matrices) and for the rest of them I pass 0 in the MatGetSubMatrices. > I am suspicious about creating only 1 matrix for any process, while I expect an array of matrices in the MatGetSubMatrices. MatGetSubMatrices can return any number of matrices including a different number on different machines. Hence it returns an array containing the matrices. The length of the array is the same as the number of local matrices you requested which could be zero. You should read up a little on the web on using pointers in C; they are confusing at first but once you get the hang of them they are usually straightforward. Barry > > > > > > On Tue, Jul 5, 2016 at 6:43 PM, Barry Smith wrote: > > > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: > > > > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour wrote: > > Thanks for your prompt reply. Using & solve this problem, but then I have another problem. 
> > > > Rest of the Code: > > Mat m_WA_nt_local; > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > > > IS set; > > if(rank ==0){ > > // - - - - - create local matrix - - - - - > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d\n", rank, num_points); > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > }else{ > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > } > > > > This returns an ARRAY of Mat objects, not just one. > > Didn't we just do this email a couple of days ago? > > You need > > Mat m_WA_nt_local *m_WA_nt_local; > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > > Matt > > > > > > Error in compile: > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, cs_info&)?: > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > ^ > > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > ^ > > > > > > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: > > > > It should be > > > > Mat m_WA_nt_local; > > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, &m_WA_nt_local); > > ^^^^^^^^^^^^ note the & > > > > > > > > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour wrote: > > > > > > I faced a problem with my code. The problem is related to MatCreateSeqAIJ(). > > > I comment the rest of my code and just keeping the below lines cause me the error. > > > Code: > > > Mat * m_WA_nt_local; > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, NULL, m_WA_nt_local); > > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); > > > > > > exit(1); > > > > > > Error: > > > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [0]PETSC ERROR: Null argument, when expecting valid pointer > > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [1]PETSC ERROR: Null argument, when expecting valid pointer > > > [1]PETSC ERROR: Null Pointer: Parameter # 2 > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [2]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > > > [2]PETSC ERROR: Null argument, when expecting valid pointer > > > [2]PETSC ERROR: Null Pointer: Parameter # 2 > > > [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> > > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [2]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > Null Pointer: Parameter # 2 > > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [0]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Tue Jul 5 18:05:15 2016 > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > [1]PETSC ERROR: #1 MatCreate() line 79 in /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c > > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c > > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 > > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 > > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 > > > > > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. > > > > > > I tried to debug it with -start_in_debugger, but I got another error. > > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger > > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -fPIC -I/home/esfp/tools/libraries/petsc/include -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc > > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran -lm -lquadmath -lm -lmpi_cxx -lstdc++ -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu -L/usr/lib/x86_64-linux-gnu -ldl -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi -lgcc_s -lpthread -ldl -o ut_main > > > /bin/rm -f ut_main.o > > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on display :0 on machine grappelli > > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on display :0 on machine grappelli > > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on display :0 on machine grappelli > > > > > > > > > And I got below error in gdb GUI: > > > > > > > > > I appreciate your support. > > > > > > Best regards, > > > Ehsan > > > > > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith wrote: > > > > > > On all other processes don't pass in 1 pass in 0 since all other processes want 0 sub matrices > > > > > > > > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour wrote: > > > > > > > > Thanks, the IS problem is solved. > > > > But now I have another problem to compile the code. > > > > > > > > I use below code: > > > > Mat m_WA_nt_local; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, &m_WA_nt_local); > > > > IS set; > > > > if(rank ==0){ > > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); > > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); > > > > } > > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > The error I get is : > > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); > > > > > > > > > > > > I tried to go around it by define a array of Matrices using "Mat * m_WA_nt_local" > > > > So, the first 2 lines changed to below and I can compile the code. > > > > Mat * m_WA_nt_local; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, m_WA_nt_local); > > > > > > > > > > > > > > > > However, I get errors like below when I run the code with 2 mpi process. 
> > > > --------------------- Error Message -------------------------------------------------------------- > > > > [1]PETSC ERROR: Invalid argument > > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 > > > > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown > > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp Wed Jun 29 16:21:04 2016 > > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 --download-superlu=1 --download-metis=1 --download-parmetis=1 --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ > > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c > > > > > > > > > > > > I think I need to do something for other processes, but I don't know what I need to do. > > > > > > > > Best, > > > > Ehsan > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May wrote: > > > > > > > > > > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour wrote: > > > > I faced the below error during compiling my code for using MatGetSubMatrices. > > > > > > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, _p_Mat***)? > > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, MAT_INITIAL_MATRIX, &m_local_W); > > > > > > > > My code : > > > > PetscMPIInt rank; > > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); > > > > > > > > if(rank ==0){ > > > > Mat m_local_W; > > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, num_nz, NULL,&m_local_W);// try to reserve space for only number of final non zero entries for each fine node (e.g. 4) > > > > IS set; > > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); > > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, MAT_INITIAL_MATRIX, &m_local_W); > > > > > > > > } > > > > > > > > I followed below example: > > > > http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html > > > > > > > > This code won't work in parallel. > > > > The man page says this function is collective on Mat. You need to move the call to MatGetSubMatrices outside of the if(rank==0) loop. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour wrote: > > > > Thanks a lot for great support. > > > > > > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith wrote: > > > > > > > > MatGetSubmatrices() just have the first process request all the rows and columns and the others request none. You can use ISCreateStride() to create the ISs without having to make an array of all the indices. > > > > > > > > > > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour wrote: > > > > > > > > > > Hi, > > > > > > > > > > I need to have access to most of elements of a parallel MPIAIJ matrix only from 1 process (rank 0). > > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. > > > > > > > > > > How can I have a local copy of a matrix which is distributed on multiple process? I don't want to update the matrix, and the read-only version of it would be enough. 
> > > > > > > > > > Best, > > > > > Ehsan > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > From knepley at gmail.com Tue Jul 5 20:02:45 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Jul 2016 20:02:45 -0500 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> Message-ID: On Tue, Jul 5, 2016 at 5:58 PM, ehsan sadrfaridpour wrote: > Sorry, I think your suggestion needs something, since it doesn't compile. > > error: expected initializer before ?*? token >> Mat m_WA_nt_local *m_WA_nt_local; >> > > > Yes, this is the same problem that compiled and worked but it has a bug. > I faced this problem and I tried to define the array of Matrices to fix > this 4 days ago. > > However, my first email today is the problem that array of matrices caused > me. > I get a little confused in the logic. > > Let me review what is happening: > As this method is collective, all the processes needs to run it. > Therefore, I need to define a local matrix and create it for all of the > processes. > No no no. Each process extracts a SET of SEQUENTIAL matrices. Each proc choose how many it will extract (could be 0). Thanks, Matt > Only for the process I want to have the local matrix, I request a matrix > (matrices) and for the rest of them I pass 0 in the MatGetSubMatrices. > I am suspicious about creating only 1 matrix for any process, while I > expect an array of matrices in the MatGetSubMatrices. > > > > > > On Tue, Jul 5, 2016 at 6:43 PM, Barry Smith wrote: > >> >> > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: >> > >> > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour >> wrote: >> > Thanks for your prompt reply. Using & solve this problem, but then I >> have another problem. >> > >> > Rest of the Code: >> > Mat m_WA_nt_local; >> > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> pre_init_size, NULL, &m_WA_nt_local); >> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, num_points:%d, >> p_init:%d\n", rank, num_points, pre_init_size); >> > >> > IS set; >> > if(rank ==0){ >> > // - - - - - create local matrix - - - - - >> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >> num_points:%d\n", rank, num_points); >> > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >> > ISView(set, PETSC_VIEWER_STDOUT_SELF); >> > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > }else{ >> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > } >> > >> > This returns an ARRAY of Mat objects, not just one. >> >> Didn't we just do this email a couple of days ago? >> >> You need >> >> Mat m_WA_nt_local *m_WA_nt_local; >> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, >> &m_WA_nt_local); >> >> >> >> >> >> > >> > Matt >> > >> > >> > Error in compile: >> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* >> Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, >> cs_info&)?: >> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert >> ?_p_Mat**? to ?_p_Mat***? for argument ?6? 
to ?PetscErrorCode >> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >> _p_Mat***)? >> > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > >> ^ >> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert >> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >> _p_Mat***)? >> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > >> ^ >> > >> > >> > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith wrote: >> > >> > It should be >> > >> > Mat m_WA_nt_local; >> > >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, pre_init_size, >> NULL, &m_WA_nt_local); >> > >> ^^^^^^^^^^^^ >> note the & >> > >> > >> > >> > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour >> wrote: >> > > >> > > I faced a problem with my code. The problem is related to >> MatCreateSeqAIJ(). >> > > I comment the rest of my code and just keeping the below lines cause >> me the error. >> > > Code: >> > > Mat * m_WA_nt_local; >> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> pre_init_size, NULL, m_WA_nt_local); >> > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >> num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); >> > > >> > > exit(1); >> > > >> > > Error: >> > > [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > [0]PETSC ERROR: Null argument, when expecting valid pointer >> > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > [1]PETSC ERROR: Null argument, when expecting valid pointer >> > > [1]PETSC ERROR: Null Pointer: Parameter # 2 >> > > [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > [2]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> > > [2]PETSC ERROR: Null argument, when expecting valid pointer >> > > [2]PETSC ERROR: Null Pointer: Parameter # 2 >> > > [2]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > [2]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > > Null Pointer: Parameter # 2 >> > > [0]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
>> > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > [0]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >> Tue Jul 5 18:05:15 2016 >> > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > [1]PETSC ERROR: #1 MatCreate() line 79 in >> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >> > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >> > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 >> > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 >> > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 >> > > >> > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. >> > > >> > > I tried to debug it with -start_in_debugger, but I got another error. >> > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger >> > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o >> ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >> -Wno-unknown-pragmas -g -O0 -fPIC >> -I/home/esfp/tools/libraries/petsc/include >> -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include >> -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc >> > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall >> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
>> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o >> ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o >> partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o >> OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis >> -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib >> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm >> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran >> -lm -lquadmath -lm -lmpi_cxx -lstdc++ >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >> -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu >> -L/usr/lib/x86_64-linux-gnu -ldl >> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi >> -lgcc_s -lpthread -ldl -o ut_main >> > > /bin/rm -f ut_main.o >> > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on >> display :0 on machine grappelli >> > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on >> display :0 on machine grappelli >> > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on >> display :0 on machine grappelli >> > > >> > > >> > > And I got below error in gdb GUI: >> > > >> > > >> > > I appreciate your support. >> > > >> > > Best regards, >> > > Ehsan >> > > >> > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith >> wrote: >> > > >> > > On all other processes don't pass in 1 pass in 0 since all other >> processes want 0 sub matrices >> > > >> > > >> > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour >> wrote: >> > > > >> > > > Thanks, the IS problem is solved. >> > > > But now I have another problem to compile the code. >> > > > >> > > > I use below code: >> > > > Mat m_WA_nt_local; >> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >> &m_WA_nt_local); >> > > > IS set; >> > > > if(rank ==0){ >> > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >> > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); >> > > > } >> > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > > > >> > > > The error I get is : >> > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? to >> ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* >> const*, MatReuse, _p_Mat***)? >> > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >> MAT_INITIAL_MATRIX, &m_WA_nt_local); >> > > > >> > > > >> > > > I tried to go around it by define a array of Matrices using "Mat * >> m_WA_nt_local" >> > > > So, the first 2 lines changed to below and I can compile the code. 
>> > > > Mat * m_WA_nt_local; >> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >> m_WA_nt_local); >> > > > >> > > > >> > > > >> > > > However, I get errors like below when I run the code with 2 mpi >> process. >> > > > --------------------- Error Message >> -------------------------------------------------------------- >> > > > [1]PETSC ERROR: Invalid argument >> > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 >> > > > [1]PETSC ERROR: See >> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >> > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by >> esfp Wed Jun 29 16:21:04 2016 >> > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >> --download-superlu=1 --download-metis=1 --download-parmetis=1 >> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >> > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in >> /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c >> > > > >> > > > >> > > > I think I need to do something for other processes, but I don't >> know what I need to do. >> > > > >> > > > Best, >> > > > Ehsan >> > > > >> > > > >> > > > >> > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May >> wrote: >> > > > >> > > > >> > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour >> wrote: >> > > > I faced the below error during compiling my code for using >> MatGetSubMatrices. >> > > > >> > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for >> argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* >> const*, _p_IS* const*, MatReuse, _p_Mat***)? >> > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, >> MAT_INITIAL_MATRIX, &m_local_W); >> > > > >> > > > My code : >> > > > PetscMPIInt rank; >> > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); >> > > > >> > > > if(rank ==0){ >> > > > Mat m_local_W; >> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >> num_nz, NULL,&m_local_W);// try to reserve space for only number of final >> non zero entries for each fine node (e.g. 4) >> > > > IS set; >> > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set_row); >> > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, >> MAT_INITIAL_MATRIX, &m_local_W); >> > > > >> > > > } >> > > > >> > > > I followed below example: >> > > > >> http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html >> > > > >> > > > This code won't work in parallel. >> > > > The man page says this function is collective on Mat. You need to >> move the call to MatGetSubMatrices outside of the if(rank==0) loop. >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > >> > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < >> it.sadr at gmail.com> wrote: >> > > > Thanks a lot for great support. >> > > > >> > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith >> wrote: >> > > > >> > > > MatGetSubmatrices() just have the first process request all the >> rows and columns and the others request none. You can use ISCreateStride() >> to create the ISs without having to make an array of all the indices. 
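For reference, here is a minimal sketch of the pattern described above: rank 0 requests one sequential submatrix covering every row and column of a square parallel matrix A, while the other ranks take part in the collective call but request zero submatrices. The names are illustrative, and it assumes the PETSc 3.6/3.7 interface, where MatGetSubMatrices() fills in an array of Mat that is later freed with MatDestroyMatrices().

#include <petscmat.h>

/* Sketch only: gather a read-only sequential copy of a square parallel
   matrix A onto rank 0.  Every rank must make the collective call;
   rank 0 asks for one submatrix, the other ranks ask for none.        */
PetscErrorCode GatherMatrixOnRankZero(Mat A)
{
  PetscMPIInt    rank;
  PetscInt       M;
  IS             rows;
  Mat            *submats;              /* the result is an ARRAY of Mat */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MPI_Comm_rank(PetscObjectComm((PetscObject)A),&rank);CHKERRQ(ierr);
  ierr = MatGetSize(A,&M,NULL);CHKERRQ(ierr);

  /* rank 0 asks for rows/columns 0..M-1, everyone else for an empty set */
  ierr = ISCreateStride(PETSC_COMM_SELF,rank ? 0 : M,0,1,&rows);CHKERRQ(ierr);
  ierr = MatGetSubMatrices(A,rank ? 0 : 1,&rows,&rows,MAT_INITIAL_MATRIX,&submats);CHKERRQ(ierr);

  if (!rank) {
    /* submats[0] is the sequential copy; it exists only on rank 0 */
    ierr = MatView(submats[0],PETSC_VIEWER_STDOUT_SELF);CHKERRQ(ierr);
  }

  ierr = ISDestroy(&rows);CHKERRQ(ierr);
  ierr = MatDestroyMatrices(rank ? 0 : 1,&submats);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

This mirrors the working code posted later in the thread; the only additions here are the empty index set on the non-root ranks and the cleanup calls at the end.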
>> > > > >> > > > >> > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour < >> it.sadr at gmail.com> wrote: >> > > > > >> > > > > Hi, >> > > > > >> > > > > I need to have access to most of elements of a parallel MPIAIJ >> matrix only from 1 process (rank 0). >> > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. >> > > > > >> > > > > How can I have a local copy of a matrix which is distributed on >> multiple process? I don't want to update the matrix, and the read-only >> version of it would be enough. >> > > > > >> > > > > Best, >> > > > > Ehsan >> > > > > >> > > > > >> > > > >> > > > >> > > > >> > > > >> > > >> > > >> > >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From it.sadr at gmail.com Wed Jul 6 08:12:28 2016 From: it.sadr at gmail.com (ehsan sadrfaridpour) Date: Wed, 6 Jul 2016 09:12:28 -0400 Subject: [petsc-users] How to have a local copy (sequential) of a parallel matrix In-Reply-To: References: <1618DCDA-7859-49BD-BCAF-F4BD08DF1BAF@mcs.anl.gov> <855835AC-78B8-4A8A-993F-2E9060B4BBAF@mcs.anl.gov> Message-ID: Thanks all, Sorry for lots of questions. Thanks to your advice, I didn't create the local matrix and it seems the problem is solved. I mean it seems that I shouldn't create the local matrix at all. And this is my final working code. Mat *m_WA_nt_local; IS set; if(rank ==0){ ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); }else{ MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, &m_WA_nt_local); } if(rank ==0){ PetscInt m_WA_nt_local_start, m_WA_nt_local_end; MatGetOwnershipRange( (*m_WA_nt_local), &m_WA_nt_local_start, &m_WA_nt_local_end); PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, m_WA_nt_local start:%d, end:%d\n", rank, m_WA_nt_local_start,m_WA_nt_local_end); MatView((*m_WA_nt_local), PETSC_VIEWER_STDOUT_SELF); } It compiled and run without any problem. Best regards, Ehsan On Tue, Jul 5, 2016 at 9:02 PM, Matthew Knepley wrote: > On Tue, Jul 5, 2016 at 5:58 PM, ehsan sadrfaridpour > wrote: > >> Sorry, I think your suggestion needs something, since it doesn't compile. >> >> error: expected initializer before ?*? token >>> Mat m_WA_nt_local *m_WA_nt_local; >>> >> >> >> Yes, this is the same problem that compiled and worked but it has a bug. >> I faced this problem and I tried to define the array of Matrices to fix >> this 4 days ago. >> >> However, my first email today is the problem that array of matrices >> caused me. >> I get a little confused in the logic. >> >> Let me review what is happening: >> As this method is collective, all the processes needs to run it. >> Therefore, I need to define a local matrix and create it for all of the >> processes. >> > > No no no. Each process extracts a SET of SEQUENTIAL matrices. Each proc > choose how many > it will extract (could be 0). > > Thanks, > > Matt > > >> Only for the process I want to have the local matrix, I request a matrix >> (matrices) and for the rest of them I pass 0 in the MatGetSubMatrices. 
>> I am suspicious about creating only 1 matrix for any process, while I >> expect an array of matrices in the MatGetSubMatrices. >> >> >> >> >> >> On Tue, Jul 5, 2016 at 6:43 PM, Barry Smith wrote: >> >>> >>> > On Jul 5, 2016, at 5:36 PM, Matthew Knepley wrote: >>> > >>> > On Tue, Jul 5, 2016 at 5:26 PM, ehsan sadrfaridpour >>> wrote: >>> > Thanks for your prompt reply. Using & solve this problem, but then I >>> have another problem. >>> > >>> > Rest of the Code: >>> > Mat m_WA_nt_local; >>> > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> pre_init_size, NULL, &m_WA_nt_local); >>> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >>> num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); >>> > >>> > IS set; >>> > if(rank ==0){ >>> > // - - - - - create local matrix - - - - - >>> > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >>> num_points:%d\n", rank, num_points); >>> > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >>> > ISView(set, PETSC_VIEWER_STDOUT_SELF); >>> > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > }else{ >>> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > } >>> > >>> > This returns an ARRAY of Mat objects, not just one. >>> >>> Didn't we just do this email a couple of days ago? >>> >>> You need >>> >>> Mat m_WA_nt_local *m_WA_nt_local; >>> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, MAT_INITIAL_MATRIX, >>> &m_WA_nt_local); >>> >>> >>> >>> >>> >>> > >>> > Matt >>> > >>> > >>> > Error in compile: >>> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc: In member function ?_p_Mat* >>> Coarsening::pCalc_P(_p_Mat*&, _p_Vec*&, std::vector&, >>> cs_info&)?: >>> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:113:89: error: cannot convert >>> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >>> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >>> _p_Mat***)? >>> > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > >>> ^ >>> > /home/esfp/dev/ws_qt/mlsvm/coarsening.cc:115:89: error: cannot convert >>> ?_p_Mat**? to ?_p_Mat***? for argument ?6? to ?PetscErrorCode >>> MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* const*, MatReuse, >>> _p_Mat***)? >>> > MatGetSubMatrices(m_WA_norm_T, 0, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > >>> ^ >>> > >>> > >>> > On Tue, Jul 5, 2016 at 6:21 PM, Barry Smith >>> wrote: >>> > >>> > It should be >>> > >>> > Mat m_WA_nt_local; >>> > >>> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> pre_init_size, NULL, &m_WA_nt_local); >>> > >>> ^^^^^^^^^^^^ >>> note the & >>> > >>> > >>> > >>> > > On Jul 5, 2016, at 5:13 PM, ehsan sadrfaridpour >>> wrote: >>> > > >>> > > I faced a problem with my code. The problem is related to >>> MatCreateSeqAIJ(). >>> > > I comment the rest of my code and just keeping the below lines cause >>> me the error. 
>>> > > Code: >>> > > Mat * m_WA_nt_local; >>> > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> pre_init_size, NULL, m_WA_nt_local); >>> > > PetscPrintf(PETSC_COMM_SELF, "[CS][pCalc_P] rank:%d, >>> num_points:%d, p_init:%d\n", rank, num_points, pre_init_size); >>> > > >>> > > exit(1); >>> > > >>> > > Error: >>> > > [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > [0]PETSC ERROR: Null argument, when expecting valid pointer >>> > > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > [1]PETSC ERROR: Null argument, when expecting valid pointer >>> > > [1]PETSC ERROR: Null Pointer: Parameter # 2 >>> > > [1]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > [2]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> > > [2]PETSC ERROR: Null argument, when expecting valid pointer >>> > > [2]PETSC ERROR: Null Pointer: Parameter # 2 >>> > > [2]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > [2]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> > > [2]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >>> Tue Jul 5 18:05:15 2016 >>> > > [2]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >>> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >>> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >>> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >>> --download-superlu=1 --download-metis=1 --download-parmetis=1 >>> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >>> > > [2]PETSC ERROR: #1 MatCreate() line 79 in >>> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >>> > > [2]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >>> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >>> > > Null Pointer: Parameter # 2 >>> > > [0]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. 
>>> > > [0]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> > > [0]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >>> Tue Jul 5 18:05:15 2016 >>> > > [0]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >>> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >>> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >>> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >>> --download-superlu=1 --download-metis=1 --download-parmetis=1 >>> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >>> > > [0]PETSC ERROR: #1 MatCreate() line 79 in >>> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >>> > > [0]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >>> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >>> > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by esfp >>> Tue Jul 5 18:05:15 2016 >>> > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >>> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >>> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >>> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >>> --download-superlu=1 --download-metis=1 --download-parmetis=1 >>> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >>> > > [1]PETSC ERROR: #1 MatCreate() line 79 in >>> /home/esfp/tools/libraries/petsc/src/mat/utils/gcreate.c >>> > > [1]PETSC ERROR: #2 MatCreateSeqAIJ() line 3471 in >>> /home/esfp/tools/libraries/petsc/src/mat/impls/aij/seq/aij.c >>> > > [CS][pCalc_P] rank:1, num_points:10, p_init:300 >>> > > [CS][pCalc_P] rank:2, num_points:10, p_init:300 >>> > > [CS][pCalc_P] rank:0, num_points:10, p_init:300 >>> > > >>> > > As you can see nothing is NULL in my call to the MatCreateSeqAIJ. >>> > > >>> > > I tried to debug it with -start_in_debugger, but I got another error. >>> > > $ make ut_main && mpirun -n 3 ut_main -start_in_debugger >>> > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -o >>> ut_main.o -c -Wall -Wwrite-strings -Wno-strict-aliasing >>> -Wno-unknown-pragmas -g -O0 -fPIC >>> -I/home/esfp/tools/libraries/petsc/include >>> -I/home/esfp/tools/libraries/petsc/linux-cxx-debug/include >>> -I/usr/local/hdf5/include -std=c++11 -g -O3 `pwd`/ut_main.cc >>> > > /home/esfp/tools/libraries/petsc/linux-cxx-debug/bin/mpicxx -Wall >>> -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -g -O0 -I. 
>>> svm.o solver.o model_selection.o ut_ms.o ut_common.o ut_kf.o >>> ut_partitioning.o ds_node.o ds_graph.o coarsening.o ut_coarsening.o >>> partitioning.o ut_mr.o pugixml.o config_params.o etimer.o common_funcs.o >>> OptionParser.o loader.o ut_loader.o k_fold.o ut_main.o >>> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >>> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lpetsc >>> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >>> -lsuperlu_4.3 -lsuperlu_dist_4.1 -lf2clapack -lf2cblas -lm -lparmetis >>> -lmetis -lX11 -Wl,-rpath,/usr/local/hdf5/lib -L/usr/local/hdf5/lib >>> -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -lhwloc -lm >>> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >>> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >>> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >>> -L/lib/x86_64-linux-gnu -lmpi_usempi -lmpi_mpifh -lgfortran -lm -lgfortran >>> -lm -lquadmath -lm -lmpi_cxx -lstdc++ >>> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >>> -L/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib >>> -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/4.8 >>> -L/usr/lib/gcc/x86_64-linux-gnu/4.8 -Wl,-rpath,/usr/lib/x86_64-linux-gnu >>> -L/usr/lib/x86_64-linux-gnu -Wl,-rpath,/lib/x86_64-linux-gnu >>> -L/lib/x86_64-linux-gnu -Wl,-rpath,/usr/lib/x86_64-linux-gnu >>> -L/usr/lib/x86_64-linux-gnu -ldl >>> -Wl,-rpath,/home/esfp/tools/libraries/petsc/linux-cxx-debug/lib -lmpi >>> -lgcc_s -lpthread -ldl -o ut_main >>> > > /bin/rm -f ut_main.o >>> > > [0]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2818 on >>> display :0 on machine grappelli >>> > > [1]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2819 on >>> display :0 on machine grappelli >>> > > [2]PETSC ERROR: PETSC: Attaching gdb to ut_main of pid 2820 on >>> display :0 on machine grappelli >>> > > >>> > > >>> > > And I got below error in gdb GUI: >>> > > >>> > > >>> > > I appreciate your support. >>> > > >>> > > Best regards, >>> > > Ehsan >>> > > >>> > > On Wed, Jun 29, 2016 at 4:31 PM, Barry Smith >>> wrote: >>> > > >>> > > On all other processes don't pass in 1 pass in 0 since all other >>> processes want 0 sub matrices >>> > > >>> > > >>> > > > On Jun 29, 2016, at 3:24 PM, ehsan sadrfaridpour < >>> it.sadr at gmail.com> wrote: >>> > > > >>> > > > Thanks, the IS problem is solved. >>> > > > But now I have another problem to compile the code. >>> > > > >>> > > > I use below code: >>> > > > Mat m_WA_nt_local; >>> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >>> &m_WA_nt_local); >>> > > > IS set; >>> > > > if(rank ==0){ >>> > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, &set); >>> > > > ISView(set, PETSC_VIEWER_STDOUT_SELF); >>> > > > } >>> > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > > > >>> > > > The error I get is : >>> > > > error: cannot convert ?_p_Mat**? to ?_p_Mat***? for argument ?6? >>> to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* const*, _p_IS* >>> const*, MatReuse, _p_Mat***)? >>> > > > MatGetSubMatrices(m_WA_norm_T, 1, &set, &set, >>> MAT_INITIAL_MATRIX, &m_WA_nt_local); >>> > > > >>> > > > >>> > > > I tried to go around it by define a array of Matrices using "Mat * >>> m_WA_nt_local" >>> > > > So, the first 2 lines changed to below and I can compile the code. 
>>> > > > Mat * m_WA_nt_local; >>> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> Config_params::getInstance()->get_pre_init_loader_matrix(), NULL, >>> m_WA_nt_local); >>> > > > >>> > > > >>> > > > >>> > > > However, I get errors like below when I run the code with 2 mpi >>> process. >>> > > > --------------------- Error Message >>> -------------------------------------------------------------- >>> > > > [1]PETSC ERROR: Invalid argument >>> > > > [1]PETSC ERROR: Wrong type of object: Parameter # 3 >>> > > > [1]PETSC ERROR: See >>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble >>> shooting. >>> > > > [1]PETSC ERROR: Petsc Release Version 3.6.3, unknown >>> > > > [1]PETSC ERROR: ut_main on a linux-cxx-debug named grappelli by >>> esfp Wed Jun 29 16:21:04 2016 >>> > > > [1]PETSC ERROR: Configure options PETSC_ARCH=linux-cxx-debug >>> --with-cc=gcc --with-cxx=g++ --with-clanguage=c++ --with-gnu-compilers=1 >>> --with-mpi-compilers=1 --with-debugging=1 --with-shared-libraries=1 >>> --download-openmpi=1 --download-f2cblaslapack --download-superlu_dist=1 >>> --download-superlu=1 --download-metis=1 --download-parmetis=1 >>> --download-blacs=1 --with-hdf5 --with-hdf5-dir=/usr/local/hdf5/ >>> > > > [1]PETSC ERROR: #1 MatGetSubMatrices() line 6605 in >>> /home/esfp/tools/libraries/petsc/src/mat/interface/matrix.c >>> > > > >>> > > > >>> > > > I think I need to do something for other processes, but I don't >>> know what I need to do. >>> > > > >>> > > > Best, >>> > > > Ehsan >>> > > > >>> > > > >>> > > > >>> > > > On Wed, Jun 29, 2016 at 4:03 PM, Dave May >>> wrote: >>> > > > >>> > > > >>> > > > On Wednesday, 29 June 2016, ehsan sadrfaridpour >>> wrote: >>> > > > I faced the below error during compiling my code for using >>> MatGetSubMatrices. >>> > > > >>> > > > error: cannot convert ?IS {aka _p_IS*}? to ?_p_IS* const*? for >>> argument ?3? to ?PetscErrorCode MatGetSubMatrices(Mat, PetscInt, _p_IS* >>> const*, _p_IS* const*, MatReuse, _p_Mat***)? >>> > > > MatGetSubMatrices(m_WA_norm_T, 1, set, set, >>> MAT_INITIAL_MATRIX, &m_local_W); >>> > > > >>> > > > My code : >>> > > > PetscMPIInt rank; >>> > > > MPI_Comm_rank(PETSC_COMM_WORLD, &rank); >>> > > > >>> > > > if(rank ==0){ >>> > > > Mat m_local_W; >>> > > > MatCreateSeqAIJ(PETSC_COMM_SELF,num_points,num_points, >>> num_nz, NULL,&m_local_W);// try to reserve space for only number of final >>> non zero entries for each fine node (e.g. 4) >>> > > > IS set; >>> > > > ISCreateStride(PETSC_COMM_SELF, num_points, 0, 1, >>> &set_row); >>> > > > MatGetSubMatrices(m_WA_norm_T, 1, set_row, set_col, >>> MAT_INITIAL_MATRIX, &m_local_W); >>> > > > >>> > > > } >>> > > > >>> > > > I followed below example: >>> > > > >>> http://www.mcs.anl.gov/petsc/petsc-current/src/vec/is/is/examples/tutorials/ex2.c.html >>> > > > >>> > > > This code won't work in parallel. >>> > > > The man page says this function is collective on Mat. You need to >>> move the call to MatGetSubMatrices outside of the if(rank==0) loop. >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > > On Wed, Jun 29, 2016 at 3:19 PM, ehsan sadrfaridpour < >>> it.sadr at gmail.com> wrote: >>> > > > Thanks a lot for great support. >>> > > > >>> > > > On Wed, Jun 29, 2016 at 3:11 PM, Barry Smith >>> wrote: >>> > > > >>> > > > MatGetSubmatrices() just have the first process request all the >>> rows and columns and the others request none. You can use ISCreateStride() >>> to create the ISs without having to make an array of all the indices. 
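A short aside on the declarations that caused the compile and runtime errors quoted above: MatCreateSeqAIJ() expects the address of a single Mat handle, while MatGetSubMatrices() expects the address of a Mat pointer that it will point at a freshly allocated array of sequential matrices. With MAT_INITIAL_MATRIX the routine allocates both the array and the matrices, so no MatCreateSeqAIJ() call is needed beforehand, as the working code at the top of this message also shows. The fragment below is only illustrative; A, n, nz_per_row and nsub are placeholders, and rowis/colis stand for arrays of index sets (for a single submatrix, simply the address of an IS).

/* Creating a sequential matrix yourself: pass &B, where B is a Mat.      */
Mat B;
MatCreateSeqAIJ(PETSC_COMM_SELF,n,n,nz_per_row,NULL,&B);

/* Extracting submatrices: declare a Mat pointer and pass its address.
   MatGetSubMatrices() allocates the array (and the matrices) itself when
   MAT_INITIAL_MATRIX is used, so nothing has to be created in advance.   */
Mat *submats;
MatGetSubMatrices(A,nsub,rowis,colis,MAT_INITIAL_MATRIX,&submats);
/* submats[0] .. submats[nsub-1] are sequential matrices on this rank.    */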
>>> > > > >>> > > > >>> > > > > On Jun 29, 2016, at 1:43 PM, ehsan sadrfaridpour < >>> it.sadr at gmail.com> wrote: >>> > > > > >>> > > > > Hi, >>> > > > > >>> > > > > I need to have access to most of elements of a parallel MPIAIJ >>> matrix only from 1 process (rank 0). >>> > > > > I tried to copy or duplicate it to SEQAIJ, but I faced problems. >>> > > > > >>> > > > > How can I have a local copy of a matrix which is distributed on >>> multiple process? I don't want to update the matrix, and the read-only >>> version of it would be enough. >>> > > > > >>> > > > > Best, >>> > > > > Ehsan >>> > > > > >>> > > > > >>> > > > >>> > > > >>> > > > >>> > > > >>> > > >>> > > >>> > >>> > >>> > >>> > >>> > >>> > -- >>> > What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> > -- Norbert Wiener >>> >>> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Hassan.Raiesi at aero.bombardier.com Wed Jul 6 11:22:11 2016 From: Hassan.Raiesi at aero.bombardier.com (Hassan Raiesi) Date: Wed, 6 Jul 2016 16:22:11 +0000 Subject: [petsc-users] (edit GAMG) petsc 3.7.2 memory usage is much higher when compared to 3.6.1 Message-ID: Barry, Thank you for the detailed instructions, I'll try to figure out what change causes this problem, To answer your question, I re-ran using fgmres/bjacobi for a simple case and there was virtually no difference in memory footprint reported by PETSc (see the log files ends _basic). So it is safe to assume the extra memory was due to GAMG. I ran a series of tests with GAMG, I attached full logs here, but to summarize: PETSc 3.6.1: --- Event Stage 0: Main Stage Matrix 368 365 149426856 0 Matrix Coarsen 16 16 9920 0 Vector 1181 1181 218526896 0 Vector Scatter 99 99 115936 0 Krylov Solver 22 22 72976 0 Preconditioner 22 22 21648 0 Viewer 1 0 0 0 Index Set 267 267 821040 0 Star Forest Bipartite Graph 16 16 13440 0 Using same options, exactly same code (just linked it with petsc-3.7.2) PETSc 3.7.2: --- Event Stage 0: Main Stage Matrix 412 409 180705004 0. Matrix Coarsen 12 12 7536 0. Vector 923 923 214751960 0. Vector Scatter 79 79 95488 0. Krylov Solver 17 17 67152 0. Preconditioner 17 17 16936 0. PetscRandom 1 1 638 0. Viewer 1 0 0 0. Index Set 223 223 790676 0. Star Forest Bipartite Graph 12 12 10176 0. GAMG in 3.7.2 creates less levels, but needs more memory. For next test, I changed the "pc_gamg_square_graph" from 2 to 1, here 3.7.2 makes 19 levels now PETSc 3.7.2: --- Event Stage 0: Main Stage Matrix 601 598 188796452 0. Matrix Coarsen 19 19 11932 0. Vector 1358 1358 216798096 0. Vector Scatter 110 110 128920 0. Krylov Solver 24 24 76112 0. Preconditioner 24 24 23712 0. PetscRandom 1 1 638 0. Viewer 1 0 0 0. Index Set 284 284 857076 0. Star Forest Bipartite Graph 19 19 16112 0. with similar memory usage. If I limit the number of levels to 17, I would get same number of levels as in version 3.6.1, however the memory usage is still higher than version 3.6.1 PETSc 3.7.2: --- Event Stage 0: Main Stage Matrix 506 503 187749632 0. Matrix Coarsen 16 16 10048 0. Vector 1160 1160 216216344 0. Vector Scatter 92 92 100424 0. Krylov Solver 21 21 72272 0. Preconditioner 21 21 20808 0. PetscRandom 1 1 638 0. Viewer 1 0 0 0. Index Set 237 237 818260 0. 
Star Forest Bipartite Graph 16 16 13568 0. Now running version 3.6.1 with the options used for the above run PETSc 3.6.1: --- Event Stage 0: Main Stage Matrix 338 335 153296844 0 Matrix Coarsen 16 16 9920 0 Vector 1156 1156 219112832 0 Vector Scatter 89 89 94696 0 Krylov Solver 22 22 72976 0 Preconditioner 22 22 21648 0 Viewer 1 0 0 0 Index Set 223 223 791548 0 Star Forest Bipartite Graph 16 16 13440 0 It Looks like the GAMG in 3.7.2 makes a lot more matrices for same number of levels and requires about (187749632 - 153296844)/153296844 = 22.5% more memory. I hope the logs help fixing the issue. Best Regards PS: GAMG is great, and by far beats all other AMG libraries we have tried so far :-) -----Original Message----- From: Barry Smith [mailto:bsmith at mcs.anl.gov] Sent: Tuesday, July 05, 2016 6:19 PM To: Hassan Raiesi Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 Hassan, This memory usage increase is not expected. How are you measuring memory usage? Since the problem occurs even with a simple solver you should debug with the simpler solver and only after resolving that move on to GAMG and see if the problem persists. Also do the test on the smallest case that clearly demonstrates the problem; if you have a 1 process run that shows a nontrivial memory usage increase then debug with that, don't run a huge problem unless you absolutely have to. How much code, if any, did you need to change in your application in going from 3.6.1 to 3.7.2 ? Here is the way to track down the problem. It may seem burdensome but requires no guesswork or speculation. Use the bisection capability of git. First obtain PETSc via git if you have not gotten that way http://www.mcs.anl.gov/petsc/download/index.html Then in the PETSc directory run git bisect start git bisect good v3.6.1 git bisect bad v3.7.2 It will then change to a new commit where you need to run configure and make on PETSc and then compile and run your application If the application uses the excessive memory then in the PETSc directory do git bisect bad otherwise type git bisect good if the code won't compile (if the PETSc API changes you may have to adjust your code slightly to get it to compile and you should do that; but if PETSc won't configure to build with the given commit then just do the skip) or crashes then type git bisect skip Now git will switch to another commit where you need again do the same process of configure make and run the application. After a few iterations git bisect will show the EXACT commit (code changes) that resulted in your very different memory usage and we can take a look at the code changes in PETSc and figure out how to reduce the memory usage. I realize this seems like a burdensome process but remember a great deal of changes took place in the PETSc code and this is the ONLY well defined way to figure out exactly which change caused the problem. Otherwise we can guess until the end of time. Barry > On Jul 5, 2016, at 3:42 PM, Hassan Raiesi wrote: > > Hi, > > PETSc 3.7.2 seems to have a much higher memory usage when compared with PETSc- 3.1.1 c, to a point that it crashes our code for large problems that we ran with version 3.6.1 in the past. 
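One note on how the memory in these tables is counted: the per-object totals printed by -log_summary are the sum over everything that was ever created during the run, not a high-water mark, so temporaries that GAMG builds and destroys during setup are included (Barry makes this point further down the thread). Below is a hedged sketch of how the actual peak usage could be queried from the application itself; it assumes the PetscMemory*/PetscMalloc* query routines as they exist in 3.6/3.7, and the PetscMalloc numbers may read zero unless PETSc's malloc tracking is active (for example in a debug build).

/* Right after PetscInitialize(): ask PETSc to keep track of the maximum
   resident set size it observes.                                         */
PetscMemorySetGetMaximumUsage();

/* After the solve (KSPSolve/PCSetUp): report the per-rank peaks, reduced
   to a maximum over all ranks.                                            */
PetscLogDouble rss_max,malloc_max;
double         loc[2],glob[2];
PetscMemoryGetMaximumUsage(&rss_max);      /* peak resident set, this rank */
PetscMallocGetMaximumUsage(&malloc_max);   /* peak PetscMalloc'd bytes     */
loc[0] = (double)rss_max; loc[1] = (double)malloc_max;
MPI_Allreduce(loc,glob,2,MPI_DOUBLE,MPI_MAX,PETSC_COMM_WORLD);
PetscPrintf(PETSC_COMM_WORLD,"peak RSS %g MiB, peak PetscMalloc %g MiB (max over ranks)\n",
            glob[0]/1048576.0,glob[1]/1048576.0);

Comparing these peaks between the 3.6.1 and 3.7.2 builds would show whether the extra matrices reported above actually translate into a larger footprint at run time, or are only transient during the GAMG setup.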
> I have re-compiled the code with same options, and ran the same code linked with the two versions, here are the log-summarie: > > -flow_ksp_max_it 20 > -flow_ksp_monitor_true_residual > -flow_ksp_rtol 0.1 > -flow_ksp_type fgmres > -flow_mg_coarse_pc_factor_mat_solver_package mumps > -flow_mg_coarse_pc_type lu -flow_mg_levels_ksp_type richardson > -flow_mg_levels_pc_type sor -flow_pc_gamg_agg_nsmooths 0 > -flow_pc_gamg_coarse_eq_limit 2000 -flow_pc_gamg_process_eq_limit 2500 > -flow_pc_gamg_repartition true -flow_pc_gamg_reuse_interpolation true > -flow_pc_gamg_square_graph 3 -flow_pc_gamg_sym_graph true > -flow_pc_gamg_type agg -flow_pc_mg_cycle v -flow_pc_mg_levels 20 > -flow_pc_mg_type kaskade -flow_pc_type gamg -log_summary > > Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). > > > > Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: > 2016-07-05 12:04:34 -0400 > > Max Max/Min Avg Total > Time (sec): 6.760e+02 1.00006 6.760e+02 > Objects: 1.284e+03 1.00469 1.279e+03 > Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 > Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 > MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 > MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 > MPI Reductions: 4.023e+03 1.00149 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 99.9% 7.674e+04 99.9% 4.010e+03 99.7% > > ---------------------------------------------------------------------- > -------------------------------------------------- > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ---------------------------------------------------------------------- > -------------------------------------------------- > > --- Event Stage 0: Main Stage > > MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 > MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 > MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 > MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 > MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 > MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 > MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 > MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 > MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 > MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 > MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 > MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 > MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 > MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 > MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 > MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 > VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 > VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 
0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 > VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 > VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 > VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 > VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 > VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 > VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 > VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 > VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 > KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 3.2e+03 22100 96 90 79 22100 96 90 79 91007 > PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 > PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 > PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 > Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 > SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 > GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 > repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 > Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 > Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 > Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 > PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 > PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 > PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 > SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > 
BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 > ---------------------------------------------------------------------- > -------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 302 299 1992700700 0. > Matrix Partitioning 6 6 3888 0. > Matrix Coarsen 6 6 3768 0. > Vector 600 600 1582204168 0. > Vector Scatter 87 87 5614432 0. > Krylov Solver 11 11 59472 0. > Preconditioner 11 11 11120 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 247 247 9008420 0. > Star Forest Bipartite Graph 12 12 10176 0. > ====================================================================== > ================================================== > > And for petsc 3.6.1: > > Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: > 2015-08-06 11:50:34 -0500 > > Max Max/Min Avg Total > Time (sec): 5.515e+02 1.00001 5.515e+02 > Objects: 1.231e+03 1.00490 1.226e+03 > Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 > Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 > MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 > MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 > MPI Reductions: 4.012e+03 1.00150 > > Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) > e.g., VecAXPY() for real vectors of length N --> 2N flops > and VecAXPY() for complex vectors of > length N --> 8N flops > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- > Avg %Total Avg %Total counts %Total Avg %Total counts %Total > 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 99.9% 5.020e+04 99.9% 3.999e+03 99.7% > > ---------------------------------------------------------------------- > -------------------------------------------------- > See the 'Profiling' chapter of the users' manual for details on interpreting output. > Phase summary info: > Count: number of times phase was executed > Time and Flops: Max - maximum over all processors > Ratio - ratio of maximum to minimum over all processors > Mess: number of messages sent > Avg. len: average message length (bytes) > Reduct: number of global reductions > Global: entire computation > Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
> %T - percent time in this phase %F - percent flops in this phase > %M - percent messages in this phase %L - percent message lengths in this phase > %R - percent reductions in this phase > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > over all processors) > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flops --- Global --- --- Stage --- Total > Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > ---------------------------------------------------------------------- > -------------------------------------------------- > > --- Event Stage 0: Main Stage > > MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 > MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 > MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 > MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 > MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 > MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 > MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 > MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 > MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 > MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 > MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 > MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 > MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 > MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 > MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 > MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 > MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 
0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 > VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 > VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 > VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 > VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 > VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 > VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 > VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 > VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 > VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 > KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 3.2e+03 25100 95 83 79 25100 95 83 79 94396 > PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 > PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 > PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 > Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 > SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 > GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 > repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 > Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 > Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 > Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 > PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 > PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 > PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 > SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 
0 > SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > ---------------------------------------------------------------------- > -------------------------------------------------- > > Memory usage is given in bytes: > > Object Type Creations Destructions Memory Descendants' Mem. > Reports information only for process 0. > > --- Event Stage 0: Main Stage > > Matrix 246 243 1730595756 0 > Matrix Partitioning 6 6 3816 0 > Matrix Coarsen 6 6 3720 0 > Vector 602 602 1603749672 0 > Vector Scatter 87 87 4291136 0 > Krylov Solver 12 12 60416 0 > Preconditioner 12 12 12040 0 > Viewer 1 0 0 0 > Index Set 247 247 9018060 0 > Star Forest Bipartite Graph 12 12 10080 0 > ====================================================================== > ================================================== > > Any idea why there are more matrix created with version 3.7.2? I only have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others are internally created. > > > Thank you, > > > Hassan Raiesi, PhD > > Advanced Aerodynamics Department > Bombardier Aerospace > > hassan.raiesi at aero.bombardier.com > > 2351 boul. Alfred-Nobel (BAN1) > Ville Saint-Laurent, Qu?bec, H4S 2A9 > > > > T?l. > 514-855-5001 # 62204 > > > > > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. > If you are not the intended recipient or received this communication > by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.6.1_gamg.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.7.2_gamg_run_with_square_graph_1.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.7.2_gamg_square_graph_1_max_level_17.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.6.1_gamg_run_with_square_graph_1_max_level_17.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.7.2_gamg.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.7.2_basic.txt URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: log_3.6.1_basic.txt URL: From epscodes at gmail.com Wed Jul 6 12:17:01 2016 From: epscodes at gmail.com (Xiangdong) Date: Wed, 6 Jul 2016 13:17:01 -0400 Subject: [petsc-users] snes true and preconditioned residuals for left npc Message-ID: Hello everyone, I am using snes_type aspin, which is actually newtonls + npc (nasm). After each newton iteration, if I call SNESGetFunction, the preconditioned residual is obtained. However, if I use SNESComputeFunction, I get the true (unpreconditioned) residual. If I want to know the preconditioned residual at a point different from current solution, which function should I call? Thanks. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Wed Jul 6 14:35:24 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 6 Jul 2016 14:35:24 -0500 Subject: [petsc-users] Duplicate cells when exporting a distributed dmplex In-Reply-To: References: Message-ID: On Tue, Jul 5, 2016 at 4:17 AM, Morten Nobel-J?rgensen wrote: > Hi all, > > I hope someone can help me with the following: > > I?m having some problems when exporting a distributed DMPlex ? the cells > (+cell types) seems to be duplicated. > > When I?m running the code on a non-distributed system it works as > expected, but when I run it on multiple processors (2 in my case) the > output is invalid. > > I have attached a simple example and the output for np=1 and np=2. > The problem here is VTK output with overlapped meshes. If you change to overlap = 0, it works as expected. I never fixed the VTK output for this, but the HDF5 output works correctly. I will put this on the list of things to do. I am attaching your code with some cleanup from me, including assigning values in parallel. Thanks, Matt > Abbreviated the code essentially does the following: > ' > > PetscInt dim = 3; > PetscInt cells[] = {1, 1, 2}; > PetscInt overlap = 1; > PetscInitialize(&argc, &argv, NULL, help); > DMPlexCreateHexBoxMesh(PETSC_COMM_WORLD, dim, cells, DM_BOUNDARY_NONE, > DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, &dm); > DMPlexDistribute(dm, overlap, NULL, &dist); > dm = dist; > SetupDOFs(dm); > Vec V; > DMCreateGlobalVector(dm, &V); > AssignSomeValues(V); > PetscViewer viewer; > const char* fn = "output.vtk"; > PetscViewerVTKOpen(PETSC_COMM_WORLD,fn,FILE_MODE_WRITE,&viewer); > VecView(V,viewer); > PetscViewerDestroy(&viewer); > > > Kind regards, > Morten > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex_vtk_export.c Type: text/x-csrc Size: 3336 bytes Desc: not available URL: From eduardojourdan92 at gmail.com Wed Jul 6 14:50:39 2016 From: eduardojourdan92 at gmail.com (Eduardo Jourdan) Date: Wed, 6 Jul 2016 16:50:39 -0300 Subject: [petsc-users] What block size means in amg aggregation type Message-ID: Hi, I am kind of new to algebraic multigrid methods. I tried to figure it on my own but I'm not be sure about it. How the block size (bs) of a blocked matrix affects the AMG AGG? I mean, if bs = 4, then in the coarsening phase and setup, blocks of 4x4 matrix elements are considered to remain in the coarse level and a certain quantity of block neighbors are restricted and remain in the finer level? Never a row inside a block matrix is selected and the other elements of this block aren't, am I right? The entire block is interpolated when it comes to the interpolation phase? If the original problem is not a system of equations, then bs=1? Thank you, Eduardo -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 6 16:17:27 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Jul 2016 16:17:27 -0500 Subject: [petsc-users] (edit GAMG) petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: Hassan, My statement "This memory usage increase is not expected." holds only for the fgmres/bjacobi. 
Mark continues to make refinements to the GAMG code that could easily result in more or less memory being used so I am not surprised by the numbers you report below. Do not bother running the git bisect that I suggested before; that was only to find a bug related to fgmres/bjacobi which does not seem to be a problem. In general I think for GAMG that "more memory usage" results mostly from "larger coarse grid problems" (or, of course bugs in the code) so if you could run old and new with -ksp_view and send the output (and look at it yourself) I am guessing we will see larger coarse grid problems with new. You can use the option -pc_gamg_threshold .1 (or something) to cause smaller coarse grid problems (of course this may make the convergence slower). Mark Adams knows much more about this and will hopefully be able to make other suggestions. Note also that the memory usage information for matrices in the -log_summary is NOT a high water mark, rather it is the sum of the memory of the all the matrices that were ever created. Since GAMG creates some temporary matrices and then destroys them during the set up process the actual high water mark is lower. You can run with -memory_view to see the hight water mark level of memory usage in the old and new cases. High water mark is what causes the problem to end prematurely with out of memory. Barry > On Jul 6, 2016, at 11:22 AM, Hassan Raiesi wrote: > > Barry, > > Thank you for the detailed instructions, I'll try to figure out what change causes this problem, > > To answer your question, I re-ran using fgmres/bjacobi for a simple case and there was virtually no difference in memory footprint reported by PETSc (see the log files ends _basic). So it is safe to assume the extra memory was due to GAMG. > > I ran a series of tests with GAMG, I attached full logs here, but to summarize: > > PETSc 3.6.1: > --- Event Stage 0: Main Stage > > Matrix 368 365 149426856 0 > Matrix Coarsen 16 16 9920 0 > Vector 1181 1181 218526896 0 > Vector Scatter 99 99 115936 0 > Krylov Solver 22 22 72976 0 > Preconditioner 22 22 21648 0 > Viewer 1 0 0 0 > Index Set 267 267 821040 0 > Star Forest Bipartite Graph 16 16 13440 0 > > > Using same options, exactly same code (just linked it with petsc-3.7.2) > > PETSc 3.7.2: > --- Event Stage 0: Main Stage > > Matrix 412 409 180705004 0. > Matrix Coarsen 12 12 7536 0. > Vector 923 923 214751960 0. > Vector Scatter 79 79 95488 0. > Krylov Solver 17 17 67152 0. > Preconditioner 17 17 16936 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 223 223 790676 0. > Star Forest Bipartite Graph 12 12 10176 0. > > GAMG in 3.7.2 creates less levels, but needs more memory. > > For next test, I changed the "pc_gamg_square_graph" from 2 to 1, here 3.7.2 makes 19 levels now > > PETSc 3.7.2: > --- Event Stage 0: Main Stage > > Matrix 601 598 188796452 0. > Matrix Coarsen 19 19 11932 0. > Vector 1358 1358 216798096 0. > Vector Scatter 110 110 128920 0. > Krylov Solver 24 24 76112 0. > Preconditioner 24 24 23712 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 284 284 857076 0. > Star Forest Bipartite Graph 19 19 16112 0. > > with similar memory usage. > > If I limit the number of levels to 17, I would get same number of levels as in version 3.6.1, however the memory usage is still higher than version 3.6.1 > > PETSc 3.7.2: > --- Event Stage 0: Main Stage > > Matrix 506 503 187749632 0. > Matrix Coarsen 16 16 10048 0. > Vector 1160 1160 216216344 0. > Vector Scatter 92 92 100424 0. > Krylov Solver 21 21 72272 0. 
> Preconditioner 21 21 20808 0. > PetscRandom 1 1 638 0. > Viewer 1 0 0 0. > Index Set 237 237 818260 0. > Star Forest Bipartite Graph 16 16 13568 0. > > Now running version 3.6.1 with the options used for the above run > > PETSc 3.6.1: > --- Event Stage 0: Main Stage > > Matrix 338 335 153296844 0 > Matrix Coarsen 16 16 9920 0 > Vector 1156 1156 219112832 0 > Vector Scatter 89 89 94696 0 > Krylov Solver 22 22 72976 0 > Preconditioner 22 22 21648 0 > Viewer 1 0 0 0 > Index Set 223 223 791548 0 > Star Forest Bipartite Graph 16 16 13440 0 > > > It Looks like the GAMG in 3.7.2 makes a lot more matrices for same number of levels and requires about (187749632 - 153296844)/153296844 = 22.5% more memory. > > I hope the logs help fixing the issue. > > Best Regards > > PS: GAMG is great, and by far beats all other AMG libraries we have tried so far :-) > > > -----Original Message----- > From: Barry Smith [mailto:bsmith at mcs.anl.gov] > Sent: Tuesday, July 05, 2016 6:19 PM > To: Hassan Raiesi > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 > > > Hassan, > > This memory usage increase is not expected. How are you measuring memory usage? > > Since the problem occurs even with a simple solver you should debug with the simpler solver and only after resolving that move on to GAMG and see if the problem persists. Also do the test on the smallest case that clearly demonstrates the problem; if you have a 1 process run that shows a nontrivial memory usage increase then debug with that, don't run a huge problem unless you absolutely have to. > > How much code, if any, did you need to change in your application in going from 3.6.1 to 3.7.2 ? > > Here is the way to track down the problem. It may seem burdensome but requires no guesswork or speculation. Use the bisection capability of git. > > First obtain PETSc via git if you have not gotten that way http://www.mcs.anl.gov/petsc/download/index.html > > Then in the PETSc directory run > > git bisect start > > git bisect good v3.6.1 > > git bisect bad v3.7.2 > > It will then change to a new commit where you need to run configure and make on PETSc and then compile and run your application > > If the application uses the excessive memory then in the PETSc directory do > > git bisect bad > > otherwise type > > git bisect good > > if the code won't compile (if the PETSc API changes you may have to adjust your code slightly to get it to compile and you should do that; but if PETSc won't configure to build with the given commit then just do the skip) or crashes then type > > git bisect skip > > Now git will switch to another commit > > where you need again do the same process of configure make and run the application. > > After a few iterations git bisect will show the EXACT commit (code changes) that resulted in your very different memory usage and we can take a look at the code changes in PETSc and figure out how to reduce the memory usage. > > I realize this seems like a burdensome process but remember a great deal of changes took place in the PETSc code and this is the ONLY well defined way to figure out exactly which change caused the problem. Otherwise we can guess until the end of time. > > Barry > > > > > > > >> On Jul 5, 2016, at 3:42 PM, Hassan Raiesi wrote: >> >> Hi, >> >> PETSc 3.7.2 seems to have a much higher memory usage when compared with PETSc- 3.1.1 c, to a point that it crashes our code for large problems that we ran with version 3.6.1 in the past. 
>> I have re-compiled the code with same options, and ran the same code linked with the two versions, here are the log-summarie: >> >> -flow_ksp_max_it 20 >> -flow_ksp_monitor_true_residual >> -flow_ksp_rtol 0.1 >> -flow_ksp_type fgmres >> -flow_mg_coarse_pc_factor_mat_solver_package mumps >> -flow_mg_coarse_pc_type lu -flow_mg_levels_ksp_type richardson >> -flow_mg_levels_pc_type sor -flow_pc_gamg_agg_nsmooths 0 >> -flow_pc_gamg_coarse_eq_limit 2000 -flow_pc_gamg_process_eq_limit 2500 >> -flow_pc_gamg_repartition true -flow_pc_gamg_reuse_interpolation true >> -flow_pc_gamg_square_graph 3 -flow_pc_gamg_sym_graph true >> -flow_pc_gamg_type agg -flow_pc_mg_cycle v -flow_pc_mg_levels 20 >> -flow_pc_mg_type kaskade -flow_pc_type gamg -log_summary >> >> Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). >> >> >> >> Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: >> 2016-07-05 12:04:34 -0400 >> >> Max Max/Min Avg Total >> Time (sec): 6.760e+02 1.00006 6.760e+02 >> Objects: 1.284e+03 1.00469 1.279e+03 >> Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 >> Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 >> MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 >> MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 >> MPI Reductions: 4.023e+03 1.00149 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts %Total Avg %Total counts %Total >> 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 99.9% 7.674e+04 99.9% 4.010e+03 99.7% >> >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> Avg. len: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flops in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flops --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> >> --- Event Stage 0: Main Stage >> >> MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 >> MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 >> MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 >> MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 >> MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 >> MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 >> MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 >> MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 >> MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 >> MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 >> MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 >> MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 >> MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 >> MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 >> MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 >> MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 >> MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 >> MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 >> VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 >> VecMDot 
260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 >> VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 >> VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 >> VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 >> VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 >> VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 >> VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 >> VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 >> VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 >> KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 3.2e+03 22100 96 90 79 22100 96 90 79 91007 >> PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 >> PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 >> PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 >> PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 >> Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 >> MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 >> SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 >> SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 >> GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 >> repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 >> Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 >> Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 >> Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 >> PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 >> PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 >> PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 >> SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 >> SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 >> SFReduceEnd 6 
1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Matrix 302 299 1992700700 0. >> Matrix Partitioning 6 6 3888 0. >> Matrix Coarsen 6 6 3768 0. >> Vector 600 600 1582204168 0. >> Vector Scatter 87 87 5614432 0. >> Krylov Solver 11 11 59472 0. >> Preconditioner 11 11 11120 0. >> PetscRandom 1 1 638 0. >> Viewer 1 0 0 0. >> Index Set 247 247 9008420 0. >> Star Forest Bipartite Graph 12 12 10176 0. >> ====================================================================== >> ================================================== >> >> And for petsc 3.6.1: >> >> Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: >> 2015-08-06 11:50:34 -0500 >> >> Max Max/Min Avg Total >> Time (sec): 5.515e+02 1.00001 5.515e+02 >> Objects: 1.231e+03 1.00490 1.226e+03 >> Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 >> Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 >> MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 >> MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 >> MPI Reductions: 4.012e+03 1.00150 >> >> Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract) >> e.g., VecAXPY() for real vectors of length N --> 2N flops >> and VecAXPY() for complex vectors of >> length N --> 8N flops >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions -- >> Avg %Total Avg %Total counts %Total Avg %Total counts %Total >> 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 99.9% 5.020e+04 99.9% 3.999e+03 99.7% >> >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> See the 'Profiling' chapter of the users' manual for details on interpreting output. >> Phase summary info: >> Count: number of times phase was executed >> Time and Flops: Max - maximum over all processors >> Ratio - ratio of maximum to minimum over all processors >> Mess: number of messages sent >> Avg. len: average message length (bytes) >> Reduct: number of global reductions >> Global: entire computation >> Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop(). 
>> %T - percent time in this phase %F - percent flops in this phase >> %M - percent messages in this phase %L - percent message lengths in this phase >> %R - percent reductions in this phase >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> ------------------------------------------------------------------------------------------------------------------------ >> Event Count Time (sec) Flops --- Global --- --- Stage --- Total >> Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> >> --- Event Stage 0: Main Stage >> >> MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 >> MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 >> MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 >> MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 >> MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 >> MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 >> MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 >> MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 >> MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 >> MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 >> MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 >> MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 >> MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 >> MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 >> MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 >> MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 >> MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 >> MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 >> MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 >> MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 >> MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecDot 
320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 >> VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 >> VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 >> VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 >> VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 >> VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 >> VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 >> VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 >> VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 >> VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 >> KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 >> KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 3.2e+03 25100 95 83 79 25100 95 83 79 94396 >> PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 >> PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 >> PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 >> PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 >> Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 >> MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 >> SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 >> SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 >> GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 >> repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 >> Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 >> Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 >> Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 >> PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 >> PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 >> PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 >> SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 >> SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> 
SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 >> SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> ---------------------------------------------------------------------- >> -------------------------------------------------- >> >> Memory usage is given in bytes: >> >> Object Type Creations Destructions Memory Descendants' Mem. >> Reports information only for process 0. >> >> --- Event Stage 0: Main Stage >> >> Matrix 246 243 1730595756 0 >> Matrix Partitioning 6 6 3816 0 >> Matrix Coarsen 6 6 3720 0 >> Vector 602 602 1603749672 0 >> Vector Scatter 87 87 4291136 0 >> Krylov Solver 12 12 60416 0 >> Preconditioner 12 12 12040 0 >> Viewer 1 0 0 0 >> Index Set 247 247 9018060 0 >> Star Forest Bipartite Graph 12 12 10080 0 >> ====================================================================== >> ================================================== >> >> Any idea why there are more matrix created with version 3.7.2? I only have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the others are internally created. >> >> >> Thank you, >> >> >> Hassan Raiesi, PhD >> >> Advanced Aerodynamics Department >> Bombardier Aerospace >> >> hassan.raiesi at aero.bombardier.com >> >> 2351 boul. Alfred-Nobel (BAN1) >> Ville Saint-Laurent, Qu?bec, H4S 2A9 >> >> >> >> T?l. >> 514-855-5001 # 62204 >> >> >> >> >> >> >> CONFIDENTIALITY NOTICE - This communication may contain privileged or confidential information. >> If you are not the intended recipient or received this communication >> by error, please notify the sender and delete the message without copying, forwarding and/or disclosing it. > > > From hengjiew at uci.edu Wed Jul 6 16:19:15 2016 From: hengjiew at uci.edu (frank) Date: Wed, 6 Jul 2016 14:19:15 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> Message-ID: <577D75D3.8010703@uci.edu> Hi Barry, Thank you for you advice. I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. The system gives me the "Out of Memory" error before the linear system is completely solved. The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? In both tests the memory usage is not large. It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. Is there is a way to show how much memory it allocated? Frank On 07/05/2016 03:37 PM, Barry Smith wrote: > Frank, > > You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. 
> > Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. > > Barry > >> On Jul 5, 2016, at 5:23 PM, frank wrote: >> >> Hi, >> >> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >> The petsc options file is attached. >> >> The domain is a 3d box. >> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >> >> How can I diagnose what exactly cause the error? >> Thank you so much. >> >> Frank >> -------------- next part -------------- KSP Object: 18432 MPI processes type: cg maximum iterations=10000 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg PC has not been set up so information may be incomplete MG: type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: redundant PC has not been set up so information may be incomplete Redundant preconditioner: Not yet setup Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. 
maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=603979776, cols=603979776 total: nonzeros=4223139840, allocated nonzeros=4223139840 total number of mallocs used during MatSetValues calls =0 has attached null space [NID 03157] 2016-07-05 18:53:01 Apid 45102172: initiated application termination [NID 00773] 2016-07-05 18:53:02 Apid 45102172: OOM killer terminated this process. [NID 09993] 2016-07-05 18:53:02 Apid 45102172: OOM killer terminated this process. -------------- next part -------------- Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 5 4 3328 0. Vector 384 383 8193712 0. Vector Scatter 27 23 61776 0. Matrix 103 103 11508688 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 4 20288 0. Star Forest Bipartite Graph 16 8 6784 0. Discrete System 8 4 3456 0. Index Set 55 55 277240 0. IS L to G Mapping 8 4 27136 0. Krylov Solver 10 10 12392 0. DMKSP interface 6 3 1944 0. Preconditioner 10 10 9952 0. -------------- next part -------------- Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 5 4 3328 0. Vector 384 383 1590520 0. Vector Scatter 27 23 28568 0. Matrix 103 103 3508664 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 4 20288 0. Star Forest Bipartite Graph 16 8 6784 0. Discrete System 8 4 3456 0. Index Set 55 55 80868 0. IS L to G Mapping 8 4 7080 0. Krylov Solver 10 10 12392 0. DMKSP interface 6 3 1944 0. Preconditioner 10 10 9952 0. 
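On the question above of how to see how much memory the telescope/MG setup actually allocated: besides options such as -memory_view and the per-object tables shown here, the same counters can be read directly from the code. A minimal sketch follows; the helper name is made up, PetscMemorySetGetMaximumUsage() has to be called right after PetscInitialize for the maximum-usage values to be tracked, and the PetscMalloc counters assume PETSc's logging malloc is active (e.g. a debug build or the -malloc option).

#include <petscsys.h>

/* Sketch: report PETSc memory counters, e.g. before and after KSPSolve
   or around PCSetUp, to see where the growth happens. */
static PetscErrorCode ReportMemory(const char *label)
{
  PetscLogDouble rss, rss_max, mal, mal_max;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);      /* resident set size now  */
  ierr = PetscMemoryGetMaximumUsage(&rss_max);CHKERRQ(ierr);  /* high-water mark        */
  ierr = PetscMallocGetCurrentUsage(&mal);CHKERRQ(ierr);      /* bytes from PetscMalloc */
  ierr = PetscMallocGetMaximumUsage(&mal_max);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD,
                     "%s: rss %g (max %g)  PetscMalloc %g (max %g) bytes\n",
                     label, rss, rss_max, mal, mal_max);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Calling ReportMemory("before PCSetUp") and ReportMemory("after PCSetUp") around the solver setup would show how much of the total is attributable to the preconditioner hierarchy, independent of the object table from -log_summary.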
-------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left -log_summary # options for telescope -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type svd From bsmith at mcs.anl.gov Wed Jul 6 16:51:49 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Jul 2016 16:51:49 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <577D75D3.8010703@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> Message-ID: <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> > On Jul 6, 2016, at 4:19 PM, frank wrote: > > Hi Barry, > > Thank you for you advice. > I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. > The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. > The system gives me the "Out of Memory" error before the linear system is completely solved. > The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. > > The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. Are you sure this is right? The total matrix and vector memory usage goes from 2nd test Vector 384 383 8,193,712 0. Matrix 103 103 11,508,688 0. to 3rd test Vector 384 383 1,590,520 0. Matrix 103 103 3,508,664 0. that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. > The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. > I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? Sorry, my mistake the option is -memory_view Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. Barry > In both tests the memory usage is not large. > > It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. > Is there is a way to show how much memory it allocated? > > Frank > > On 07/05/2016 03:37 PM, Barry Smith wrote: >> Frank, >> >> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. 
>> >> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >> >> Barry >> >>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>> >>> Hi, >>> >>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>> The petsc options file is attached. >>> >>> The domain is a 3d box. >>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>> >>> How can I diagnose what exactly cause the error? >>> Thank you so much. >>> >>> Frank >>> > > From bsmith at mcs.anl.gov Wed Jul 6 16:53:44 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 6 Jul 2016 16:53:44 -0500 Subject: [petsc-users] snes true and preconditioned residuals for left npc In-Reply-To: References: Message-ID: <3B9C8C0C-C09F-43DD-BA12-109583633C10@mcs.anl.gov> > On Jul 6, 2016, at 12:17 PM, Xiangdong wrote: > > Hello everyone, > > I am using snes_type aspin, which is actually newtonls + npc (nasm). After each newton iteration, if I call SNESGetFunction, the preconditioned residual is obtained. However, if I use SNESComputeFunction, I get the true (unpreconditioned) residual. > > If I want to know the preconditioned residual at a point different from current solution, which function should I call? I don't think it is possible. Barry > > Thanks. > > Best, > Xiangdong From jychang48 at gmail.com Wed Jul 6 16:56:15 2016 From: jychang48 at gmail.com (Justin Chang) Date: Wed, 6 Jul 2016 16:56:15 -0500 Subject: [petsc-users] Transient poisson example in petsc In-Reply-To: References: Message-ID: Julian, I hand wrote my own time stepping scheme (backward Euler) for SNES ex12.c because I had to enforce TAO's convex optimization solvers at every time level. I am sure Matt or one of the other PETSc developers can tell you how to make today's SNES ex12.c transient. Thanks, Justin On Wednesday, July 6, 2016, Julian Andrej wrote: > Hi, > > i've seen your question on the mailing list regarding a transient > poisson example using PetscFE and TS. Do you have any working > solution? I'm still confused about what to change from snes ex12 for > example to make it transient. > > I hope you could help me out there :) > > regards > Julian > -------------- next part -------------- An HTML attachment was scrubbed... 
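A minimal sketch of the kind of hand-rolled backward Euler loop described above, written for a toy ODE du/dt = -u instead of the ex12 Poisson residual; the vector size, time step, and step count are made up. Each step freezes u_old and asks SNES to solve F(u) = (u - u_old)/dt + u = 0, and swapping in the ex12 residual (or a TAO solve per step, which is what Justin needed) would follow the same structure. PETSc's TS with -ts_type beuler is the built-in alternative when no per-step optimization solve is required.

#include <petscsnes.h>

/* Backward Euler for du/dt = -u: each step solves
   F(u) = (u - u_old)/dt + u = 0 with SNES. */
typedef struct { Vec uold; PetscReal dt; } StepCtx;

static PetscErrorCode FormFunction(SNES snes, Vec u, Vec F, void *ptr)
{
  StepCtx        *ctx = (StepCtx*)ptr;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = VecCopy(u, F);CHKERRQ(ierr);              /* F = u              */
  ierr = VecAXPY(F, -1.0, ctx->uold);CHKERRQ(ierr);/* F = u - uold       */
  ierr = VecScale(F, 1.0/ctx->dt);CHKERRQ(ierr);   /* F = (u - uold)/dt  */
  ierr = VecAXPY(F, 1.0, u);CHKERRQ(ierr);         /* F += u             */
  PetscFunctionReturn(0);
}

static PetscErrorCode FormJacobian(SNES snes, Vec u, Mat J, Mat P, void *ptr)
{
  StepCtx        *ctx = (StepCtx*)ptr;
  PetscInt       i, rstart, rend;
  PetscScalar    v = 1.0/ctx->dt + 1.0;            /* dF/du = (1/dt + 1) I */
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatGetOwnershipRange(P, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    ierr = MatSetValue(P, i, i, v, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  if (J != P) {
    ierr = MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

int main(int argc, char **argv)
{
  SNES           snes;
  Vec            u, r;
  Mat            J;
  StepCtx        ctx;
  PetscInt       n = 8, step, nsteps = 10;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  ctx.dt = 0.1;

  ierr = VecCreate(PETSC_COMM_WORLD, &u);CHKERRQ(ierr);
  ierr = VecSetSizes(u, PETSC_DECIDE, n);CHKERRQ(ierr);
  ierr = VecSetFromOptions(u);CHKERRQ(ierr);
  ierr = VecDuplicate(u, &r);CHKERRQ(ierr);
  ierr = VecDuplicate(u, &ctx.uold);CHKERRQ(ierr);
  ierr = VecSet(u, 1.0);CHKERRQ(ierr);             /* initial condition */

  ierr = MatCreate(PETSC_COMM_WORLD, &J);CHKERRQ(ierr);
  ierr = MatSetSizes(J, PETSC_DECIDE, PETSC_DECIDE, n, n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(J);CHKERRQ(ierr);
  ierr = MatSetUp(J);CHKERRQ(ierr);

  ierr = SNESCreate(PETSC_COMM_WORLD, &snes);CHKERRQ(ierr);
  ierr = SNESSetFunction(snes, r, FormFunction, &ctx);CHKERRQ(ierr);
  ierr = SNESSetJacobian(snes, J, J, FormJacobian, &ctx);CHKERRQ(ierr);
  ierr = SNESSetFromOptions(snes);CHKERRQ(ierr);

  for (step = 0; step < nsteps; step++) {
    ierr = VecCopy(u, ctx.uold);CHKERRQ(ierr);     /* u_old <- u^n    */
    ierr = SNESSolve(snes, NULL, u);CHKERRQ(ierr); /* u     <- u^{n+1} */
  }

  ierr = VecDestroy(&u);CHKERRQ(ierr);
  ierr = VecDestroy(&r);CHKERRQ(ierr);
  ierr = VecDestroy(&ctx.uold);CHKERRQ(ierr);
  ierr = MatDestroy(&J);CHKERRQ(ierr);
  ierr = SNESDestroy(&snes);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}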
URL: From cyrill.von.planta at usi.ch Thu Jul 7 03:37:26 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Thu, 7 Jul 2016 08:37:26 +0000 Subject: [petsc-users] Reordering rows of parallel matrix across processors Message-ID: <4C660C5E-7326-45C2-82DD-3302E09490AA@usi.ch> Dear all, I would like to reorder the rows of a matrix across processors. Is this possible with MatPermute(?)? To illustrate here is how an index set would look like for a matrix with M=35 on 2 CPU?s. Amongst other things I intend to swap the first and last row here. [0] Number of indices in set 24 [0] 0 34 [0] 1 1 [0] 2 2 [0] 3 3 [0] 4 4 [0] 5 5 [0] 6 6 [0] 7 7 [0] 8 15 [0] 9 16 [0] 10 11 [0] 11 8 [0] 12 10 [0] 13 21 [0] 14 9 [0] 15 12 [0] 16 13 [0] 17 14 [0] 18 17 [0] 19 18 [0] 20 19 [0] 21 20 [0] 22 22 [0] 23 23 [1] Number of indices in set 11 [1] 0 24 [1] 1 25 [1] 2 26 [1] 3 27 [1] 4 28 [1] 5 29 [1] 6 30 [1] 7 31 [1] 8 32 [1] 9 33 [1] 10 0 Instead of exchanging the first and last row it seems to replace them with zeros only. If this can?t be done with MatPermute how could it be done? Thanks Cyrill From knepley at gmail.com Thu Jul 7 09:48:06 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 7 Jul 2016 09:48:06 -0500 Subject: [petsc-users] Reordering rows of parallel matrix across processors In-Reply-To: <4C660C5E-7326-45C2-82DD-3302E09490AA@usi.ch> References: <4C660C5E-7326-45C2-82DD-3302E09490AA@usi.ch> Message-ID: On Thu, Jul 7, 2016 at 3:37 AM, Cyrill Vonplanta wrote: > Dear all, > > I would like to reorder the rows of a matrix across processors. Is this > possible with MatPermute(?)? > Yes, this works with MatPermute(). Could you send this small example so I can reproduce it? > To illustrate here is how an index set would look like for a matrix with > M=35 on 2 CPU?s. Amongst other things I intend to swap the first and last > row here. > > [0] Number of indices in set 24 > [0] 0 34 > [0] 1 1 > [0] 2 2 > [0] 3 3 > [0] 4 4 > [0] 5 5 > [0] 6 6 > [0] 7 7 > [0] 8 15 > [0] 9 16 > [0] 10 11 > [0] 11 8 > [0] 12 10 > [0] 13 21 > [0] 14 9 > [0] 15 12 > [0] 16 13 > [0] 17 14 > [0] 18 17 > [0] 19 18 > [0] 20 19 > [0] 21 20 > [0] 22 22 > [0] 23 23 > [1] Number of indices in set 11 > [1] 0 24 > [1] 1 25 > [1] 2 26 > [1] 3 27 > [1] 4 28 > [1] 5 29 > [1] 6 30 > [1] 7 31 > [1] 8 32 > [1] 9 33 > [1] 10 0 > > Instead of exchanging the first and last row it seems to replace them with > zeros only. > If this can?t be done with MatPermute how could it be done? > You could also use MatGetSubMatrix(). Thanks, Matt > Thanks > Cyrill > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
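A minimal sketch of the pattern discussed above, using the same convention as the index set Cyrill printed (each process lists, for every row it will own in the permuted matrix, the global index of the original row that should land there) and the same global size M = 35. The test matrix is a diagonal whose entries equal the row number plus one, so swapping the first and last rows is easy to spot in the MatView output; the swap choice and the identity column permutation are made up for illustration.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A, B;
  IS             rowperm, colperm;
  PetscInt       i, rstart, rend, N = 35, *rows, *cols;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

  /* Diagonal matrix with entry i+1 in row i, so permuted rows are visible. */
  ierr = MatCreate(PETSC_COMM_WORLD, &A);CHKERRQ(ierr);
  ierr = MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, N, N);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A, &rstart, &rend);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    ierr = MatSetValue(A, i, i, (PetscScalar)(i + 1), INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  /* Row permutation: identity except that global rows 0 and N-1 swap.
     Each process supplies the entries for the rows it owns. */
  ierr = PetscMalloc1(rend - rstart, &rows);CHKERRQ(ierr);
  ierr = PetscMalloc1(rend - rstart, &cols);CHKERRQ(ierr);
  for (i = rstart; i < rend; i++) {
    PetscInt src = i;
    if (i == 0)          src = N - 1;
    else if (i == N - 1) src = 0;
    rows[i - rstart] = src;   /* original row that moves to position i */
    cols[i - rstart] = i;     /* columns left untouched                */
  }
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, rows, PETSC_OWN_POINTER, &rowperm);CHKERRQ(ierr);
  ierr = ISCreateGeneral(PETSC_COMM_WORLD, rend - rstart, cols, PETSC_OWN_POINTER, &colperm);CHKERRQ(ierr);
  ierr = ISSetPermutation(rowperm);CHKERRQ(ierr);
  ierr = ISSetPermutation(colperm);CHKERRQ(ierr);

  ierr = MatPermute(A, rowperm, colperm, &B);CHKERRQ(ierr);
  ierr = MatView(B, PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);

  ierr = ISDestroy(&rowperm);CHKERRQ(ierr);
  ierr = ISDestroy(&colperm);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = MatDestroy(&B);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

If the result still comes out zeroed, a small reproducer along these lines (as requested above) is the quickest way to pin down whether it is the index set convention or a MatPermute problem.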
URL: From mfadams at lbl.gov Thu Jul 7 13:30:32 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 7 Jul 2016 20:30:32 +0200 Subject: [petsc-users] (edit GAMG) petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: > > > > GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 > 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 > > Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 > > MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 > 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 > > SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 > 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 > > SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 > 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 > > GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 > 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 > > repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 > 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 > > Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 > > Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 > 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 > > Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 > 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 > THe number of these calls (eg, 6) is the number of grids that are setup. > > PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 > 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 > > PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 > > PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 > 6.0e+01 3 24 22 3 1 3 24 22 3 1 161973 > > SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 > 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > > SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 > 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > > SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 > 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 > > ---------------------------------------------------------------------- > > -------------------------------------------------- > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' > Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 302 299 1992700700 0. > > Matrix Partitioning 6 6 3888 0. > > Matrix Coarsen 6 6 3768 0. > > Vector 600 600 1582204168 0. > > Vector Scatter 87 87 5614432 0. > > Krylov Solver 11 11 59472 0. > > Preconditioner 11 11 11120 0. > > PetscRandom 1 1 638 0. > > Viewer 1 0 0 0. > > Index Set 247 247 9008420 0. > > Star Forest Bipartite Graph 12 12 10176 0. 
> > ====================================================================== > > ================================================== > > > > And for petsc 3.6.1: > > > > Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: > > 2015-08-06 11:50:34 -0500 > > > > Max Max/Min Avg Total > > Time (sec): 5.515e+02 1.00001 5.515e+02 > > Objects: 1.231e+03 1.00490 1.226e+03 > > Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 > > Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 > > MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 > > MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 > > MPI Reductions: 4.012e+03 1.00150 > > > > Flop counting convention: 1 flop = 1 real number operation of type > (multiply/divide/add/subtract) > > e.g., VecAXPY() for real vectors of length N > --> 2N flops > > and VecAXPY() for complex vectors of > > length N --> 8N flops > > > > Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages > --- -- Message Lengths -- -- Reductions -- > > Avg %Total Avg %Total counts > %Total Avg %Total counts %Total > > 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 > 99.9% 5.020e+04 99.9% 3.999e+03 99.7% > > > > ---------------------------------------------------------------------- > > -------------------------------------------------- > > See the 'Profiling' chapter of the users' manual for details on > interpreting output. > > Phase summary info: > > Count: number of times phase was executed > > Time and Flops: Max - maximum over all processors > > Ratio - ratio of maximum to minimum over all > processors > > Mess: number of messages sent > > Avg. len: average message length (bytes) > > Reduct: number of global reductions > > Global: entire computation > > Stage: stages of a computation. Set stages with PetscLogStagePush() > and PetscLogStagePop(). 
> > %T - percent time in this phase %F - percent flops in this > phase > > %M - percent messages in this phase %L - percent message > lengths in this phase > > %R - percent reductions in this phase > > Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time > > over all processors) > > > ------------------------------------------------------------------------------------------------------------------------ > > Event Count Time (sec) Flops > --- Global --- --- Stage --- Total > > Max Ratio Max Ratio Max Ratio Mess Avg len > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ---------------------------------------------------------------------- > > -------------------------------------------------- > > > > --- Event Stage 0: Main Stage > > > > MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 > 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 > > MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 > 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 > > MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 > 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 > > MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 > 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 > > MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 > > MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 > > MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 > > MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 > 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 > > MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 > 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 > > MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 > 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 > 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 > > MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 > > MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 > 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > > MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 > 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 > > MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 > 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 > > MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 > 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 > > MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 > 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 > > MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 > 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 > > MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 > 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 > > MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 > 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 > > MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 > 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 > > 
MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 > 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 > > VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 > 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 > > VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 > 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 > > VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 > > VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 > > VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 > > VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 > > VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 > > VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 > 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 > > VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 > 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 > > KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 > > KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 > 3.2e+03 25100 95 83 79 25100 95 83 79 94396 > > PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > > PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 > 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 > > PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 > 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 > > PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 > 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 > > Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 > 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 > > MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 > 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 > > SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 > 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 > > SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 > 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 > > GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 > 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 > > repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 > 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 > > Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 > > Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 > 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 > > Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 > 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 > > PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 > 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 > > PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 > 0.0e+00 0 1 0 0 0 0 1 0 0 0 64711 > > PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 > 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 > > SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 
0 > > SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 > 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 > > SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 > 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 > > SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 > > ---------------------------------------------------------------------- > > -------------------------------------------------- > > > > Memory usage is given in bytes: > > > > Object Type Creations Destructions Memory Descendants' > Mem. > > Reports information only for process 0. > > > > --- Event Stage 0: Main Stage > > > > Matrix 246 243 1730595756 0 > > Matrix Partitioning 6 6 3816 0 > > Matrix Coarsen 6 6 3720 0 > > Vector 602 602 1603749672 0 > > Vector Scatter 87 87 4291136 0 > > Krylov Solver 12 12 60416 0 > > Preconditioner 12 12 12040 0 > > Viewer 1 0 0 0 > > Index Set 247 247 9018060 0 > > Star Forest Bipartite Graph 12 12 10080 0 > > ====================================================================== > > ================================================== > > > > Any idea why there are more matrix created with version 3.7.2? I only > have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the > others are internally created. > > > > > > Thank you, > > > > > > Hassan Raiesi, PhD > > > > Advanced Aerodynamics Department > > Bombardier Aerospace > > > > hassan.raiesi at aero.bombardier.com > > > > 2351 boul. Alfred-Nobel (BAN1) > > Ville Saint-Laurent, Qu?bec, H4S 2A9 > > > > > > > > T?l. > > 514-855-5001 # 62204 > > > > > > > > > > > > > > CONFIDENTIALITY NOTICE - This communication may contain privileged or > confidential information. > > If you are not the intended recipient or received this communication > > by error, please notify the sender and delete the message without > copying, forwarding and/or disclosing it. > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Jul 7 13:27:46 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 7 Jul 2016 20:27:46 +0200 Subject: [petsc-users] petsc 3.7.2 memory usage is much higher when compared to 3.6.1 In-Reply-To: References: Message-ID: On Tue, Jul 5, 2016 at 11:13 PM, Matthew Knepley wrote: > On Tue, Jul 5, 2016 at 3:42 PM, Hassan Raiesi < > Hassan.Raiesi at aero.bombardier.com> wrote: > >> Hi, >> >> >> >> PETSc 3.7.2 seems to have a much higher memory usage when compared with >> PETSc- 3.1.1 c, to a point that it crashes our code for large problems that >> we ran with version 3.6.1 in the past. >> >> I have re-compiled the code with same options, and ran the same code >> linked with the two versions, here are the log-summarie: >> > > According to the log_summary (which you NEED to send in full if we are to > understand anything), the memory usage is largely the same. > There are more matrices, which leads me to believe that GAMG is not > coarsening as quickly. You might consider a non-zero threshold for > it. > > FYI There are the same number of grids in these two outputs. > The best way to understand what is happening is to run Massif (from > valgrind) on both. 
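A quick way to complement the Massif comparison is to print the same numbers -memory_view reports from inside the application at matching points in the 3.6.1 and 3.7.2 builds. A minimal sketch follows; it is only an illustration, and the helper name, the stage label, and where you call it are placeholders rather than anything from the code being discussed:

#include <petscsys.h>

/* Sketch: report memory as seen by the first rank of comm at a chosen point,
   e.g. right after PCSetUp() or the first KSPSolve(), so the two PETSc builds
   can be compared line for line. */
static PetscErrorCode ReportMemory(MPI_Comm comm,const char stage[])
{
  PetscLogDouble rss,mal;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);  /* resident set size of this process */
  ierr = PetscMallocGetCurrentUsage(&mal);CHKERRQ(ierr);  /* bytes currently PetscMalloc()ed   */
  ierr = PetscPrintf(comm,"[%s] rank 0: rss %g bytes, PetscMalloc %g bytes\n",stage,(double)rss,(double)mal);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

If the extra Matrix objects from slower coarsening turn out to account for the difference, the non-zero threshold Matt suggests is set with -pc_gamg_threshold, which becomes -flow_pc_gamg_threshold with the prefix used in this run.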
> > Thanks, > > Matt > > >> -flow_ksp_max_it 20 >> >> -flow_ksp_monitor_true_residual >> >> -flow_ksp_rtol 0.1 >> >> -flow_ksp_type fgmres >> >> -flow_mg_coarse_pc_factor_mat_solver_package mumps >> >> -flow_mg_coarse_pc_type lu >> >> -flow_mg_levels_ksp_type richardson >> >> -flow_mg_levels_pc_type sor >> >> -flow_pc_gamg_agg_nsmooths 0 >> >> -flow_pc_gamg_coarse_eq_limit 2000 >> >> -flow_pc_gamg_process_eq_limit 2500 >> >> -flow_pc_gamg_repartition true >> >> -flow_pc_gamg_reuse_interpolation true >> >> -flow_pc_gamg_square_graph 3 >> >> -flow_pc_gamg_sym_graph true >> >> -flow_pc_gamg_type agg >> >> -flow_pc_mg_cycle v >> >> -flow_pc_mg_levels 20 >> >> -flow_pc_mg_type kaskade >> >> -flow_pc_type gamg >> >> -log_summary >> >> >> >> Note: it is not specific to PCGAMG, even a bjacobi+fgmres would need more >> memory (4.5GB/core in version 3.6.1 compared to 6.8GB/core for 3.7.2). >> >> >> >> >> >> >> >> Using Petsc Development GIT revision: v3.7.2-812-gc68d048 GIT Date: >> 2016-07-05 12:04:34 -0400 >> >> >> >> Max Max/Min Avg Total >> >> Time (sec): 6.760e+02 1.00006 6.760e+02 >> >> Objects: 1.284e+03 1.00469 1.279e+03 >> >> Flops: 3.563e+10 1.10884 3.370e+10 1.348e+13 >> >> Flops/sec: 5.271e+07 1.10884 4.985e+07 1.994e+10 >> >> MPI Messages: 4.279e+04 7.21359 1.635e+04 6.542e+06 >> >> MPI Message Lengths: 3.833e+09 17.25274 7.681e+04 5.024e+11 >> >> MPI Reductions: 4.023e+03 1.00149 >> >> >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> >> e.g., VecAXPY() for real vectors of length N >> --> 2N flops >> >> and VecAXPY() for complex vectors of length N >> --> 8N flops >> >> >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages >> --- -- Message Lengths -- -- Reductions -- >> >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> >> 0: Main Stage: 6.7600e+02 100.0% 1.3478e+13 100.0% 6.533e+06 >> 99.9% 7.674e+04 99.9% 4.010e+03 99.7% >> >> >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> >> Phase summary info: >> >> Count: number of times phase was executed >> >> Time and Flops: Max - maximum over all processors >> >> Ratio - ratio of maximum to minimum over all processors >> >> Mess: number of messages sent >> >> Avg. len: average message length (bytes) >> >> Reduct: number of global reductions >> >> Global: entire computation >> >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). 
>> >> %T - percent time in this phase %F - percent flops in this >> phase >> >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> >> %R - percent reductions in this phase >> >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> Event Count Time (sec) Flops >> --- Global --- --- Stage --- Total >> >> Max Ratio Max Ratio Max Ratio Mess Avg len >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> --- Event Stage 0: Main Stage >> >> >> >> MatMult 500 1.0 1.0582e+01 1.2 6.68e+09 1.1 1.9e+06 1.0e+04 >> 0.0e+00 1 19 28 4 0 1 19 29 4 0 237625 >> >> MatMultTranspose 120 1.0 7.6262e-01 1.3 3.58e+08 1.1 2.4e+05 1.5e+04 >> 0.0e+00 0 1 4 1 0 0 1 4 1 0 180994 >> >> MatSolve 380 1.0 4.1580e+00 1.1 1.17e+09 1.1 8.6e+03 8.8e+01 >> 6.0e+01 1 3 0 0 1 1 3 0 0 1 105950 >> >> MatSOR 120 1.0 1.4316e+01 1.2 6.75e+09 1.1 9.5e+05 7.4e+03 >> 0.0e+00 2 19 15 1 0 2 19 15 1 0 177298 >> >> MatLUFactorSym 2 1.0 2.3449e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatLUFactorNum 60 1.0 8.8820e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 >> 0.0e+00 1 1 0 0 0 1 1 0 0 0 7877 >> >> MatILUFactorSym 1 1.0 1.9795e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatConvert 6 1.0 2.9893e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatScale 6 1.0 1.8810e-02 1.4 4.52e+06 1.1 2.4e+04 1.5e+03 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 90171 >> >> MatAssemblyBegin 782 1.0 1.8294e+01 2.9 0.00e+00 0.0 9.2e+05 4.1e+05 >> 4.2e+02 2 0 14 75 10 2 0 14 75 10 0 >> >> MatAssemblyEnd 782 1.0 1.4283e+01 3.0 0.00e+00 0.0 4.1e+05 8.7e+02 >> 4.7e+02 1 0 6 0 12 1 0 6 0 12 0 >> >> MatGetRow 6774900 1.1 9.4289e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatGetRowIJ 3 3.0 6.6261e-036948.0 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatGetSubMatrix 12 1.0 2.6783e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 >> 2.0e+02 4 0 2 3 5 4 0 2 3 5 0 >> >> MatGetOrdering 3 3.0 7.7400e-03 7.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatPartitioning 6 1.0 1.8949e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.4e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatCoarsen 6 1.0 9.5692e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 >> 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 >> >> MatZeroEntries 142 1.0 9.7085e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatTranspose 6 1.0 2.1740e-01 1.0 0.00e+00 0.0 1.9e+05 8.5e+02 >> 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 >> >> MatPtAP 120 1.0 6.0157e+01 1.0 1.82e+10 1.1 1.5e+06 2.7e+05 >> 4.2e+02 9 51 22 80 10 9 51 22 80 10 114269 >> >> MatPtAPSymbolic 12 1.0 8.1081e+00 1.0 0.00e+00 0.0 2.2e+05 3.8e+04 >> 8.4e+01 1 0 3 2 2 1 0 3 2 2 0 >> >> MatPtAPNumeric 120 1.0 5.2205e+01 1.0 1.82e+10 1.1 1.2e+06 3.1e+05 >> 3.4e+02 8 51 19 78 8 8 51 19 78 8 131676 >> >> MatTrnMatMult 3 1.0 1.8608e+00 1.0 3.23e+07 1.2 8.3e+04 7.9e+03 >> 5.7e+01 0 0 1 0 1 0 0 1 0 1 6275 >> >> MatTrnMatMultSym 3 1.0 1.3447e+00 1.0 0.00e+00 0.0 6.9e+04 3.8e+03 >> 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 >> >> MatTrnMatMultNum 3 1.0 5.1695e-01 1.0 3.23e+07 1.2 1.3e+04 3.0e+04 >> 6.0e+00 0 0 0 0 0 0 0 0 0 0 22588 >> >> MatGetLocalMat 126 1.0 1.0355e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> 
>> MatGetBrAoCol 120 1.0 9.5921e+0019.2 0.00e+00 0.0 5.7e+05 3.3e+04 >> 0.0e+00 1 0 9 4 0 1 0 9 4 0 0 >> >> VecDot 320 1.0 1.1400e+00 1.6 2.04e+08 1.1 0.0e+00 0.0e+00 >> 3.2e+02 0 1 0 0 8 0 1 0 0 8 68967 >> >> VecMDot 260 1.0 1.9577e+00 2.8 3.70e+08 1.1 0.0e+00 0.0e+00 >> 2.6e+02 0 1 0 0 6 0 1 0 0 6 72792 >> >> VecNorm 440 1.0 2.6273e+00 1.9 5.88e+08 1.1 0.0e+00 0.0e+00 >> 4.4e+02 0 2 0 0 11 0 2 0 0 11 86035 >> >> VecScale 320 1.0 2.1386e-01 1.2 7.91e+07 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 141968 >> >> VecCopy 220 1.0 7.0370e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecSet 862 1.0 7.1000e-01 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecAXPY 440 1.0 8.6790e-01 1.1 3.83e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 169857 >> >> VecAYPX 280 1.0 5.7766e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 127599 >> >> VecMAXPY 300 1.0 9.7396e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 196768 >> >> VecAssemblyBegin 234 1.0 4.6313e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 6.8e+02 0 0 0 0 17 0 0 0 0 17 0 >> >> VecAssemblyEnd 234 1.0 5.1503e-0319.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecScatterBegin 1083 1.0 2.9274e-01 4.5 0.00e+00 0.0 3.8e+06 8.5e+03 >> 2.0e+01 0 0 59 6 0 0 0 59 6 0 0 >> >> VecScatterEnd 1063 1.0 3.9653e+00 5.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> KSPGMRESOrthog 20 1.0 1.7405e+00 3.7 1.28e+08 1.1 0.0e+00 0.0e+00 >> 2.0e+01 0 0 0 0 0 0 0 0 0 0 28232 >> >> KSPSetUp 222 1.0 6.8469e-02 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> KSPSolve 60 1.0 1.4767e+02 1.0 3.55e+10 1.1 6.3e+06 7.2e+04 >> 3.2e+03 22100 96 90 79 22100 96 90 79 91007 >> >> PCGAMGGraph_AGG 6 1.0 6.0792e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 >> 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 >> >> PCGAMGCoarse_AGG 6 1.0 2.0660e+00 1.0 3.23e+07 1.2 4.2e+05 3.1e+03 >> 1.5e+02 0 0 6 0 4 0 0 6 0 4 5652 >> >> PCGAMGProl_AGG 6 1.0 1.8842e+00 1.0 0.00e+00 0.0 7.3e+05 3.3e+03 >> 8.6e+02 0 0 11 0 21 0 0 11 0 22 0 >> >> PCGAMGPOpt_AGG 6 1.0 6.4373e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> GAMG: createProl 6 1.0 1.0036e+01 1.0 3.68e+07 1.2 1.5e+06 2.7e+03 >> 1.3e+03 1 0 23 1 31 1 0 23 1 31 1332 >> >> Graph 12 1.0 6.0783e+00 1.0 4.52e+06 1.1 3.8e+05 9.0e+02 >> 2.5e+02 1 0 6 0 6 1 0 6 0 6 279 >> >> MIS/Agg 6 1.0 9.5831e-02 1.2 0.00e+00 0.0 2.6e+05 1.1e+03 >> 4.1e+01 0 0 4 0 1 0 0 4 0 1 0 >> >> SA: col data 6 1.0 7.7358e-01 1.0 0.00e+00 0.0 6.7e+05 2.9e+03 >> 7.8e+02 0 0 10 0 19 0 0 10 0 19 0 >> >> SA: frmProl0 6 1.0 1.0759e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 >> 6.0e+01 0 0 1 0 1 0 0 1 0 1 0 >> >> GAMG: partLevel 6 1.0 3.8136e+01 1.0 9.09e+08 1.1 3.8e+05 5.0e+04 >> 5.4e+02 6 3 6 4 13 6 3 6 4 14 9013 >> >> repartition 6 1.0 2.7910e+00 1.0 0.00e+00 0.0 4.6e+04 1.3e+02 >> 1.6e+02 0 0 1 0 4 0 0 1 0 4 0 >> >> Invert-Sort 6 1.0 2.5045e+00 2.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 >> >> Move A 6 1.0 1.4832e+01 1.0 0.00e+00 0.0 8.5e+04 1.7e+05 >> 1.1e+02 2 0 1 3 3 2 0 1 3 3 0 >> >> Move P 6 1.0 1.2023e+01 1.0 0.00e+00 0.0 2.4e+04 3.8e+03 >> 1.1e+02 2 0 0 0 3 2 0 0 0 3 0 >> >> PCSetUp 100 1.0 1.1212e+02 1.0 1.84e+10 1.1 3.2e+06 1.3e+05 >> 2.2e+03 17 52 49 84 54 17 52 49 84 54 62052 >> >> PCSetUpOnBlocks 40 1.0 1.0386e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 67368 >> >> PCApply 380 1.0 2.0034e+01 1.1 8.60e+09 1.1 1.5e+06 9.9e+03 >> 6.0e+01 3 24 22 3 1 3 24 22 3 1 
161973 >> >> SFSetGraph 12 1.0 4.9813e-0310.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> SFBcastBegin 47 1.0 3.3110e-02 2.6 0.00e+00 0.0 2.6e+05 1.1e+03 >> 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 >> >> SFBcastEnd 47 1.0 1.3497e-02 3.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> SFReduceBegin 6 1.0 1.8593e-02 4.2 0.00e+00 0.0 7.2e+04 4.9e+02 >> 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 >> >> SFReduceEnd 6 1.0 7.1628e-0318.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> BuildTwoSided 12 1.0 3.5771e-02 2.5 0.00e+00 0.0 5.0e+04 4.0e+00 >> 1.2e+01 0 0 1 0 0 0 0 1 0 0 0 >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> Memory usage is given in bytes: >> >> >> >> Object Type Creations Destructions Memory Descendants' >> Mem. >> >> Reports information only for process 0. >> >> >> >> --- Event Stage 0: Main Stage >> >> >> >> Matrix 302 299 1992700700 0. >> >> Matrix Partitioning 6 6 3888 0. >> >> Matrix Coarsen 6 6 3768 0. >> >> Vector 600 600 1582204168 0. >> >> Vector Scatter 87 87 5614432 0. >> >> Krylov Solver 11 11 59472 0. >> >> Preconditioner 11 11 11120 0. >> >> PetscRandom 1 1 638 0. >> >> Viewer 1 0 0 0. >> >> Index Set 247 247 9008420 0. >> >> Star Forest Bipartite Graph 12 12 10176 0. >> >> >> ======================================================================================================================== >> >> >> >> And for petsc 3.6.1: >> >> >> >> Using Petsc Development GIT revision: v3.6.1-307-g26c82d3 GIT Date: >> 2015-08-06 11:50:34 -0500 >> >> >> >> Max Max/Min Avg Total >> >> Time (sec): 5.515e+02 1.00001 5.515e+02 >> >> Objects: 1.231e+03 1.00490 1.226e+03 >> >> Flops: 3.431e+10 1.12609 3.253e+10 1.301e+13 >> >> Flops/sec: 6.222e+07 1.12609 5.899e+07 2.359e+10 >> >> MPI Messages: 4.432e+04 7.84165 1.504e+04 6.016e+06 >> >> MPI Message Lengths: 2.236e+09 12.61261 5.027e+04 3.024e+11 >> >> MPI Reductions: 4.012e+03 1.00150 >> >> >> >> Flop counting convention: 1 flop = 1 real number operation of type >> (multiply/divide/add/subtract) >> >> e.g., VecAXPY() for real vectors of length N >> --> 2N flops >> >> and VecAXPY() for complex vectors of length N >> --> 8N flops >> >> >> >> Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages >> --- -- Message Lengths -- -- Reductions -- >> >> Avg %Total Avg %Total counts >> %Total Avg %Total counts %Total >> >> 0: Main Stage: 5.5145e+02 100.0% 1.3011e+13 100.0% 6.007e+06 >> 99.9% 5.020e+04 99.9% 3.999e+03 99.7% >> >> >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> See the 'Profiling' chapter of the users' manual for details on >> interpreting output. >> >> Phase summary info: >> >> Count: number of times phase was executed >> >> Time and Flops: Max - maximum over all processors >> >> Ratio - ratio of maximum to minimum over all processors >> >> Mess: number of messages sent >> >> Avg. len: average message length (bytes) >> >> Reduct: number of global reductions >> >> Global: entire computation >> >> Stage: stages of a computation. Set stages with PetscLogStagePush() >> and PetscLogStagePop(). 
>> >> %T - percent time in this phase %F - percent flops in this >> phase >> >> %M - percent messages in this phase %L - percent message >> lengths in this phase >> >> %R - percent reductions in this phase >> >> Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time >> over all processors) >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> Event Count Time (sec) >> Flops --- Global --- --- Stage --- Total >> >> Max Ratio Max Ratio Max Ratio Mess Avg len >> Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> --- Event Stage 0: Main Stage >> >> >> >> MatMult 500 1.0 1.0172e+01 1.2 6.68e+09 1.1 1.9e+06 9.9e+03 >> 0.0e+00 2 19 31 6 0 2 19 31 6 0 247182 >> >> MatMultTranspose 120 1.0 6.9889e-01 1.2 3.56e+08 1.1 2.5e+05 1.4e+04 >> 0.0e+00 0 1 4 1 0 0 1 4 1 0 197492 >> >> MatSolve 380 1.0 3.9310e+00 1.1 1.17e+09 1.1 1.3e+04 5.7e+01 >> 6.0e+01 1 3 0 0 1 1 3 0 0 2 112069 >> >> MatSOR 120 1.0 1.3915e+01 1.1 6.73e+09 1.1 9.5e+05 7.4e+03 >> 0.0e+00 2 20 16 2 0 2 20 16 2 0 182405 >> >> MatLUFactorSym 2 1.0 2.1180e-01 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.0e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatLUFactorNum 60 1.0 7.9378e+00 1.0 1.95e+08 1.2 0.0e+00 0.0e+00 >> 0.0e+00 1 1 0 0 0 1 1 0 0 0 8814 >> >> MatILUFactorSym 1 1.0 2.3076e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatConvert 6 1.0 3.2693e-02 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.8e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> MatScale 6 1.0 2.1923e-02 1.7 4.50e+06 1.1 2.4e+04 1.5e+03 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 77365 >> >> MatAssemblyBegin 266 1.0 1.0337e+01 4.4 0.00e+00 0.0 1.8e+05 3.8e+03 >> 4.2e+02 1 0 3 0 10 1 0 3 0 10 0 >> >> MatAssemblyEnd 266 1.0 3.0336e+00 1.0 0.00e+00 0.0 4.1e+05 8.6e+02 >> 4.7e+02 1 0 7 0 12 1 0 7 0 12 0 >> >> MatGetRow 6730366 1.1 8.6473e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatGetRowIJ 3 3.0 5.2931e-035550.2 0.00e+00 0.0 0.0e+00 >> 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatGetSubMatrix 12 1.0 2.2689e+01 1.0 0.00e+00 0.0 1.1e+05 1.3e+05 >> 1.9e+02 4 0 2 5 5 4 0 2 5 5 0 >> >> MatGetOrdering 3 3.0 6.5000e-03 5.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatPartitioning 6 1.0 2.9801e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.4e+01 1 0 0 0 0 1 0 0 0 0 0 >> >> MatCoarsen 6 1.0 9.5374e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 >> 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 >> >> MatZeroEntries 22 1.0 6.1185e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> MatTranspose 6 1.0 1.9780e-01 1.1 0.00e+00 0.0 1.9e+05 8.6e+02 >> 7.8e+01 0 0 3 0 2 0 0 3 0 2 0 >> >> MatPtAP 120 1.0 5.2996e+01 1.0 1.70e+10 1.1 9.7e+05 2.1e+05 >> 4.2e+02 10 49 16 67 10 10 49 16 67 11 120900 >> >> MatPtAPSymbolic 12 1.0 5.8209e+00 1.0 0.00e+00 0.0 2.2e+05 3.7e+04 >> 8.4e+01 1 0 4 3 2 1 0 4 3 2 0 >> >> MatPtAPNumeric 120 1.0 4.7185e+01 1.0 1.70e+10 1.1 7.6e+05 2.6e+05 >> 3.4e+02 9 49 13 64 8 9 49 13 64 8 135789 >> >> MatTrnMatMult 3 1.0 1.1679e+00 1.0 3.22e+07 1.2 8.2e+04 8.0e+03 >> 5.7e+01 0 0 1 0 1 0 0 1 0 1 9997 >> >> MatTrnMatMultSym 3 1.0 6.8366e-01 1.0 0.00e+00 0.0 6.9e+04 3.9e+03 >> 5.1e+01 0 0 1 0 1 0 0 1 0 1 0 >> >> MatTrnMatMultNum 3 1.0 4.8513e-01 1.0 3.22e+07 1.2 1.3e+04 3.0e+04 >> 6.0e+00 0 0 0 0 0 0 0 0 0 0 24069 >> >> MatGetLocalMat 126 1.0 1.1939e+00 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> 
MatGetBrAoCol 120 1.0 5.9887e-01 2.7 0.00e+00 0.0 5.7e+05 3.3e+04 >> 0.0e+00 0 0 9 6 0 0 0 9 6 0 0 >> >> MatGetSymTrans 24 1.0 1.4878e-01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecDot 320 1.0 1.5860e+00 1.5 2.04e+08 1.1 0.0e+00 0.0e+00 >> 3.2e+02 0 1 0 0 8 0 1 0 0 8 49574 >> >> VecMDot 260 1.0 1.8154e+00 2.5 3.70e+08 1.1 0.0e+00 0.0e+00 >> 2.6e+02 0 1 0 0 6 0 1 0 0 7 78497 >> >> VecNorm 440 1.0 2.8876e+00 1.8 5.88e+08 1.1 0.0e+00 0.0e+00 >> 4.4e+02 0 2 0 0 11 0 2 0 0 11 78281 >> >> VecScale 320 1.0 2.2738e-01 1.2 7.88e+07 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 133517 >> >> VecCopy 220 1.0 7.1162e-01 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecSet 862 1.0 7.0683e-01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecAXPY 440 1.0 9.0657e-01 1.2 3.83e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 162612 >> >> VecAYPX 280 1.0 5.8935e-01 1.5 1.92e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 125070 >> >> VecMAXPY 300 1.0 9.7644e-01 1.2 4.98e+08 1.1 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 196269 >> >> VecAssemblyBegin 234 1.0 5.0308e+00 5.5 0.00e+00 0.0 0.0e+00 0.0e+00 >> 6.8e+02 1 0 0 0 17 1 0 0 0 17 0 >> >> VecAssemblyEnd 234 1.0 1.8253e-03 8.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> VecScatterBegin 1083 1.0 2.8195e-01 4.7 0.00e+00 0.0 3.8e+06 8.4e+03 >> 2.0e+01 0 0 64 11 0 0 0 64 11 1 0 >> >> VecScatterEnd 1063 1.0 3.4924e+00 6.9 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> KSPGMRESOrthog 20 1.0 1.5598e+00 3.2 1.28e+08 1.1 0.0e+00 0.0e+00 >> 2.0e+01 0 0 0 0 0 0 0 0 0 1 31503 >> >> KSPSetUp 222 1.0 9.7521e-02 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 1.2e+01 0 0 0 0 0 0 0 0 0 0 0 >> >> KSPSolve 60 1.0 1.3742e+02 1.0 3.42e+10 1.1 5.7e+06 4.4e+04 >> 3.2e+03 25100 95 83 79 25100 95 83 79 94396 >> >> PCGAMGGraph_AGG 6 1.0 5.7683e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 >> 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 >> >> PCGAMGCoarse_AGG 6 1.0 1.4101e+00 1.0 3.22e+07 1.2 4.0e+05 3.2e+03 >> 1.4e+02 0 0 7 0 4 0 0 7 0 4 8280 >> >> PCGAMGProl_AGG 6 1.0 1.8976e+00 1.0 0.00e+00 0.0 7.2e+05 3.4e+03 >> 8.6e+02 0 0 12 1 22 0 0 12 1 22 0 >> >> PCGAMGPOpt_AGG 6 1.0 5.7220e-06 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> GAMG: createProl 6 1.0 9.0840e+00 1.0 3.67e+07 1.2 1.5e+06 2.7e+03 >> 1.3e+03 2 0 25 1 31 2 0 25 1 31 1472 >> >> Graph 12 1.0 5.7669e+00 1.0 4.50e+06 1.1 3.8e+05 9.1e+02 >> 2.5e+02 1 0 6 0 6 1 0 6 0 6 294 >> >> MIS/Agg 6 1.0 9.5481e-02 1.1 0.00e+00 0.0 2.5e+05 1.1e+03 >> 3.8e+01 0 0 4 0 1 0 0 4 0 1 0 >> >> SA: col data 6 1.0 8.5414e-01 1.0 0.00e+00 0.0 6.6e+05 3.0e+03 >> 7.8e+02 0 0 11 1 19 0 0 11 1 20 0 >> >> SA: frmProl0 6 1.0 1.0123e+00 1.0 0.00e+00 0.0 6.2e+04 7.6e+03 >> 6.0e+01 0 0 1 0 1 0 0 1 0 2 0 >> >> GAMG: partLevel 6 1.0 3.6150e+01 1.0 8.41e+08 1.1 3.5e+05 5.0e+04 >> 5.3e+02 7 2 6 6 13 7 2 6 6 13 8804 >> >> repartition 6 1.0 3.8351e+00 1.0 0.00e+00 0.0 4.7e+04 1.3e+02 >> 1.6e+02 1 0 1 0 4 1 0 1 0 4 0 >> >> Invert-Sort 6 1.0 4.4953e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 2.4e+01 1 0 0 0 1 1 0 0 0 1 0 >> >> Move A 6 1.0 1.0806e+01 1.0 0.00e+00 0.0 8.5e+04 1.6e+05 >> 1.0e+02 2 0 1 5 3 2 0 1 5 3 0 >> >> Move P 6 1.0 1.1953e+01 1.0 0.00e+00 0.0 2.5e+04 3.6e+03 >> 1.0e+02 2 0 0 0 3 2 0 0 0 3 0 >> >> PCSetUp 100 1.0 1.0166e+02 1.0 1.72e+10 1.1 2.7e+06 8.3e+04 >> 2.2e+03 18 50 44 73 54 18 50 44 73 54 63848 >> >> PCSetUpOnBlocks 40 1.0 1.0812e+00 1.2 1.95e+08 1.2 0.0e+00 0.0e+00 >> 0.0e+00 0 1 0 0 0 0 1 0 0 0 
64711 >> >> PCApply 380 1.0 1.9359e+01 1.1 8.58e+09 1.1 1.4e+06 9.6e+03 >> 6.0e+01 3 25 24 5 1 3 25 24 5 2 167605 >> >> SFSetGraph 12 1.0 3.5203e-03 6.8 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> SFBcastBegin 44 1.0 2.4242e-02 3.0 0.00e+00 0.0 2.5e+05 1.1e+03 >> 6.0e+00 0 0 4 0 0 0 0 4 0 0 0 >> >> SFBcastEnd 44 1.0 3.0994e-02 8.6 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> SFReduceBegin 6 1.0 1.6784e-02 3.8 0.00e+00 0.0 7.1e+04 5.0e+02 >> 6.0e+00 0 0 1 0 0 0 0 1 0 0 0 >> >> SFReduceEnd 6 1.0 8.6989e-0332.0 0.00e+00 0.0 0.0e+00 0.0e+00 >> 0.0e+00 0 0 0 0 0 0 0 0 0 0 0 >> >> >> ------------------------------------------------------------------------------------------------------------------------ >> >> >> >> Memory usage is given in bytes: >> >> >> >> Object Type Creations Destructions Memory Descendants' >> Mem. >> >> Reports information only for process 0. >> >> >> >> --- Event Stage 0: Main Stage >> >> >> >> Matrix 246 243 1730595756 0 >> >> Matrix Partitioning 6 6 3816 0 >> >> Matrix Coarsen 6 6 3720 0 >> >> Vector 602 602 1603749672 0 >> >> Vector Scatter 87 87 4291136 0 >> >> Krylov Solver 12 12 60416 0 >> >> Preconditioner 12 12 12040 0 >> >> Viewer 1 0 0 0 >> >> Index Set 247 247 9018060 0 >> >> Star Forest Bipartite Graph 12 12 10080 0 >> >> >> ======================================================================================================================== >> >> >> >> Any idea why there are more matrix created with version 3.7.2? I only >> have 2 MatCreate calls and 4 VecCreate calls in my code!, so I assume the >> others are internally created. >> >> >> >> >> >> Thank you, >> >> >> >> >> >> *Hassan Raiesi, PhD* >> >> >> >> Advanced Aerodynamics Department >> >> Bombardier Aerospace >> >> >> >> hassan.raiesi at aero.bombardier.com >> >> >> >> *2351 boul. Alfred-Nobel (BAN1)* >> >> *Ville Saint-Laurent, Qu?bec, H4S 2A9* >> >> >> >> >> >> >> >> T?l. >> >> 514-855-5001 # 62204 >> >> >> >> >> >> >> >> >> >> >> >> *CONFIDENTIALITY NOTICE* - This communication may contain privileged or >> confidential information. >> If you are not the intended recipient or received this communication by >> error, please notify the sender >> and delete the message without copying, forwarding and/or disclosing it. >> >> >> >> >> >> >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 6402 bytes Desc: not available URL: From dave.mayhem23 at gmail.com Thu Jul 7 15:57:56 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 7 Jul 2016 22:57:56 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <577C337B.60909@uci.edu> References: <577C337B.60909@uci.edu> Message-ID: Hi Frank, On 6 July 2016 at 00:23, frank wrote: > Hi, > > I am using the CG ksp solver and Multigrid preconditioner to solve a > linear system in parallel. > I chose to use the 'Telescope' as the preconditioner on the coarse mesh > for its good performance. > The petsc options file is attached. > > The domain is a 3d box. > It works well when the grid is 1536*128*384 and the process mesh is > 96*8*24. 
When I double the size of grid and keep the same process mesh and > petsc options, I get an "out of memory" error from the super-cluster I am > using. > When you increase the mesh resolution, did you also increasing the number of effective MG levels? If the number of levels was held constant, then your coarse grid is increasing in size. I notice that you coarsest grid solver is PCSVD. This can be become expensive as PCSVD will convert your coarse level operator into a dense matrix and could be the cause of your OOM error. Telescope does have to store a couple of temporary matrices, but generally when used in the context of multigrid coarse level solves these operators represent a very small fraction of the fine level operator. We need to isolate if it's these temporary matrices from telescope causing the OOM error, or if they are caused by something else (e.g. PCSVD). > Each process has access to at least 8G memory, which should be more than > enough for my application. I am sure that all the other parts of my code( > except the linear solver ) do not use much memory. So I doubt if there is > something wrong with the linear solver. > The error occurs before the linear system is completely solved so I don't > have the info from ksp view. I am not able to re-produce the error with a > smaller problem either. > In addition, I tried to use the block jacobi as the preconditioner with > the same grid and same decomposition. The linear solver runs extremely slow > but there is no memory error. > > How can I diagnose what exactly cause the error? > This going to be kinda hard as I notice your configuration uses nested calls to telescope. You need to debug the solver configuration. The only way I know to do this is by invoking telescope one step at a time. By this I mean, use telescope once, check the configuration is what you want. Then add the next instance of telescope. For solver debugging purposes, get rid of PCSVD. The constant null space is propagated with telescope so you can just use an iterative method. Furthermore, for debugging purposes, you don't care about the solve time or even convergence, so set -ksp_max_it 1 everywhere in your solver stack (e.g. outer most KSP and on the coarsest level). If one instance of telescope works, e.g. no OOM error occurs, add the next instance of telescope. If two instance of telescope also works (no OOM), revert back to PCSVD. If now you have an OOM error, you should consider adding more levels, or getting rid of PCSVD as your coarse grid solver. Lastly, the option -repart_da_processors_x 24 has been depreciated. It now inherits the prefix from the solver running on the sub-communicator. For your use case, it should this be something like -mg_coarse_telescope_repart_da_processors_x 24 Use -options_left 1 to verify the option is getting picked up (another useful tool for solver config debugging). Cheers Dave > Thank you so much. > > Frank > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 7 18:25:15 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 7 Jul 2016 18:25:15 -0500 Subject: [petsc-users] Are performance benchmarks available? 
In-Reply-To: <1780114846.4471656.1467296427576.JavaMail.yahoo@mail.yahoo.com> References: <1337334190.2752078.1467075579364.JavaMail.yahoo.ref@mail.yahoo.com> <1337334190.2752078.1467075579364.JavaMail.yahoo@mail.yahoo.com> <781294040.3740649.1467216467855.JavaMail.yahoo@mail.yahoo.com> <213496802.3843456.1467229612250.JavaMail.yahoo@mail.yahoo.com> <1780114846.4471656.1467296427576.JavaMail.yahoo@mail.yahoo.com> Message-ID: <5713E506-7FD6-40D7-9C3F-4D3848D34A18@mcs.anl.gov> While I agree that having this type of information available would be very useful it is surprisingly difficult to do this and keep it up to date, plus we have little time to do it, so unfortunately we don't having thing like this. We should do this! Perhaps pick one or two problems and run them with say a simple preconditioner like ASM and then GAMG on a large problem with a couple of different number of processes, say 1, 32 and 256 then run them once a month to confirm they remain the same performance wise and make the performance numbers available on the web. Maybe using Mark's ex56.c case. I'll try to set something up Barry Always a big pain to try to automate the running on those damn batch systems! > On Jun 30, 2016, at 9:20 AM, Faraz Hussain wrote: > > I am wondering if there are benchmarks available that I can solve on my cluster to compare performance? I want to compare how scaling up-to 240 cores compares to large models already solved on an optimized configuration and hardware. > From cyrill.von.planta at usi.ch Fri Jul 8 02:14:39 2016 From: cyrill.von.planta at usi.ch (Cyrill Vonplanta) Date: Fri, 8 Jul 2016 07:14:39 +0000 Subject: [petsc-users] Reordering rows of parallel matrix across processors In-Reply-To: References: <4C660C5E-7326-45C2-82DD-3302E09490AA@usi.ch> Message-ID: <860F8A06-0C5C-447D-90A5-EB5481947BD6@usi.ch> Trying to make a small example for reproducing I could figure out my mistake in the code (totally unrelated to the question). MatPermute(..) just works fine. My apologies. Cyrill From: Matthew Knepley > Date: Donnerstag, 7. Juli 2016 um 16:48 To: von Planta Cyrill > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] Reordering rows of parallel matrix across processors On Thu, Jul 7, 2016 at 3:37 AM, Cyrill Vonplanta > wrote: Dear all, I would like to reorder the rows of a matrix across processors. Is this possible with MatPermute(?)? Yes, this works with MatPermute(). Could you send this small example so I can reproduce it? To illustrate here is how an index set would look like for a matrix with M=35 on 2 CPU?s. Amongst other things I intend to swap the first and last row here. [0] Number of indices in set 24 [0] 0 34 [0] 1 1 [0] 2 2 [0] 3 3 [0] 4 4 [0] 5 5 [0] 6 6 [0] 7 7 [0] 8 15 [0] 9 16 [0] 10 11 [0] 11 8 [0] 12 10 [0] 13 21 [0] 14 9 [0] 15 12 [0] 16 13 [0] 17 14 [0] 18 17 [0] 19 18 [0] 20 19 [0] 21 20 [0] 22 22 [0] 23 23 [1] Number of indices in set 11 [1] 0 24 [1] 1 25 [1] 2 26 [1] 3 27 [1] 4 28 [1] 5 29 [1] 6 30 [1] 7 31 [1] 8 32 [1] 9 33 [1] 10 0 Instead of exchanging the first and last row it seems to replace them with zeros only. If this can?t be done with MatPermute how could it be done? You could also use MatGetSubMatrix(). Thanks, Matt Thanks Cyrill -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener From mfadams at lbl.gov Fri Jul 8 08:09:07 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 8 Jul 2016 09:09:07 -0400 Subject: [petsc-users] Are performance benchmarks available? In-Reply-To: <5713E506-7FD6-40D7-9C3F-4D3848D34A18@mcs.anl.gov> References: <1337334190.2752078.1467075579364.JavaMail.yahoo.ref@mail.yahoo.com> <1337334190.2752078.1467075579364.JavaMail.yahoo@mail.yahoo.com> <781294040.3740649.1467216467855.JavaMail.yahoo@mail.yahoo.com> <213496802.3843456.1467229612250.JavaMail.yahoo@mail.yahoo.com> <1780114846.4471656.1467296427576.JavaMail.yahoo@mail.yahoo.com> <5713E506-7FD6-40D7-9C3F-4D3848D34A18@mcs.anl.gov> Message-ID: This would be a good idea, Please use SNES ex56 and send me the '-info | grep GAMG' result, and -log_view, so that I can check that it looks OK. Thanks, Mark On Thu, Jul 7, 2016 at 7:25 PM, Barry Smith wrote: > > While I agree that having this type of information available would be > very useful it is surprisingly difficult to do this and keep it up to date, > plus we have little time to do it, so unfortunately we don't having thing > like this. > > We should do this! Perhaps pick one or two problems and run them with > say a simple preconditioner like ASM and then GAMG on a large problem with > a couple of different number of processes, say 1, 32 and 256 then run them > once a month to confirm they remain the same performance wise and make the > performance numbers available on the web. Maybe using Mark's ex56.c case. > > I'll try to set something up > > Barry > > > Always a big pain to try to automate the running on those damn batch > systems! > > > > > > > On Jun 30, 2016, at 9:20 AM, Faraz Hussain > wrote: > > > > I am wondering if there are benchmarks available that I can solve on my > cluster to compare performance? I want to compare how scaling up-to 240 > cores compares to large models already solved on an optimized > configuration and hardware. > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From huyaoyu1986 at gmail.com Fri Jul 8 18:59:41 2016 From: huyaoyu1986 at gmail.com (Yaoyu Hu) Date: Sat, 9 Jul 2016 07:59:41 +0800 Subject: [petsc-users] Need help: Poisson's equation with complex number Message-ID: Hi everyone, I am now trying to solve a partial differential equation which is similar to the three dimensional Poisson?s equation but with complex numbers. The equation is the result of the transformation of a set of fluid dynamic equations from time domain to frequency domain. I have Dirichlet boundary conditions all over the boundaries. The coefficient matrix that obtained by finite volume method (with collocated grid) is made of complex numbers. I would like to know that, for my discretized equation which solver and PC are the most suitable to work with. And BTW, the solution should be done in parallel with about 10^4 - 10^6 unknowns. It is the first time for me to solve equations with complex numbers, however, I am not so good at mathematics involving complex number. I would like to know want should I bear in mind throughout the whole process? Any suggestions or comments are appreciated. Thanks! 
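For what it is worth, a minimal sketch of what such a solve looks like in PETSc, assuming a build configured with --with-scalar-type=complex so that PetscScalar is a complex type; the matrix size, the entries, and the GMRES/GAMG choice below are only placeholders, not a statement about what is best for this particular operator:

#include <petscksp.h>

int main(int argc,char **argv)
{
  Mat            A;
  Vec            x,b;
  KSP            ksp;
  PC             pc;
  PetscInt       i,Istart,Iend,n = 1000;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc,&argv,NULL,NULL);if (ierr) return ierr;

  ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
  ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
  ierr = MatSetFromOptions(A);CHKERRQ(ierr);
  ierr = MatSetUp(A);CHKERRQ(ierr);
  ierr = MatGetOwnershipRange(A,&Istart,&Iend);CHKERRQ(ierr);
  for (i=Istart; i<Iend; i++) {
    /* placeholder complex entry; the imaginary part is what the
       frequency-domain transform introduces (off-diagonals omitted) */
    ierr = MatSetValue(A,i,i,2.0 + 1.0*PETSC_i,INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

  ierr = MatCreateVecs(A,&x,&b);CHKERRQ(ierr);
  ierr = VecSet(b,1.0);CHKERRQ(ierr);

  ierr = KSPCreate(PETSC_COMM_WORLD,&ksp);CHKERRQ(ierr);
  ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
  ierr = KSPSetType(ksp,KSPGMRES);CHKERRQ(ierr);
  ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
  ierr = PCSetType(pc,PCGAMG);CHKERRQ(ierr);
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);  /* lets -ksp_type/-pc_type override these choices */
  ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);

  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&b);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}

The assembly and solver code is the same as in the real-valued case; only the scalar type changes (real and complex scalars cannot be mixed in one PETSc build), and the choice of KSP and PC is then driven by the properties of the resulting complex operator.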
HU Yaoyu From hengjiew at uci.edu Fri Jul 8 20:05:45 2016 From: hengjiew at uci.edu (frank) Date: Fri, 8 Jul 2016 18:05:45 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> Message-ID: <57804DE9.707@uci.edu> Hi Barry and Dave, Thank both of you for the advice. @Barry I made a mistake in the file names in last email. I attached the correct files this time. For all the three tests, 'Telescope' is used as the coarse preconditioner. == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 Part of the memory usage: Vector 125 124 3971904 0. Matrix 101 101 9462372 0 == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 Part of the memory usage: Vector 125 124 681672 0. Matrix 101 101 1462180 0. In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 Here I get the out of memory error. I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? The linear solver didn't work in this case. Petsc output some errors. @Dave In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. On the last coarse mesh of 'Telescope', there is only one grid point per process. I still got the OOM error. The detailed petsc option file is attached. Thank you so much. Frank On 07/06/2016 02:51 PM, Barry Smith wrote: >> On Jul 6, 2016, at 4:19 PM, frank wrote: >> >> Hi Barry, >> >> Thank you for you advice. >> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >> The system gives me the "Out of Memory" error before the linear system is completely solved. >> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >> >> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. > Are you sure this is right? The total matrix and vector memory usage goes from 2nd test > Vector 384 383 8,193,712 0. > Matrix 103 103 11,508,688 0. > to 3rd test > Vector 384 383 1,590,520 0. > Matrix 103 103 3,508,664 0. > that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. > > >> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? 
> Sorry, my mistake the option is -memory_view > > Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. > > Barry > > > >> In both tests the memory usage is not large. >> >> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >> Is there is a way to show how much memory it allocated? >> >> Frank >> >> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> Frank, >>> >>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>> >>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>> >>> Barry >>> >>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>> >>>> Hi, >>>> >>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>>> The petsc options file is attached. >>>> >>>> The domain is a 3d box. >>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>>> >>>> How can I diagnose what exactly cause the error? >>>> Thank you so much. >>>> >>>> Frank >>>> >> -------------- next part -------------- Summary of Memory Usage in PETSc Maximum (over computational time) process memory: total 7.2576e+08 max 3.8216e+05 min 3.1394e+05 Current process memory: total 7.2576e+08 max 3.8216e+05 min 3.1394e+05 Maximum (over computational time) space PetscMalloc()ed: total 6.3903e+11 max 2.7842e+08 min 2.7724e+08 Current space PetscMalloc()ed: total 1.8043e+09 max 8.0275e+05 min 7.6352e+05 ======================================================================================================================== Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 5 4 3328 0. Vector 125 124 3971904 0. Vector Scatter 25 21 60464 0. Matrix 101 101 9462372 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 4 20288 0. Star Forest Bipartite Graph 16 8 6784 0. Discrete System 8 4 3456 0. Index Set 55 55 277272 0. IS L to G Mapping 8 4 27136 0. Krylov Solver 10 10 12392 0. DMKSP interface 6 3 1944 0. 
Preconditioner 10 10 10024 0. ======================================================================================================================== -------------- next part -------------- Summary of Memory Usage in PETSc Maximum (over computational time) process memory: total 5.7481e+09 max 4.5144e+05 min 3.0404e+05 Current process memory: total 5.7481e+09 max 4.5144e+05 min 3.0404e+05 Maximum (over computational time) space PetscMalloc()ed: total 4.9405e+12 max 2.6821e+08 min 2.6800e+08 Current space PetscMalloc()ed: total 5.5180e+09 max 3.0192e+05 min 2.9173e+05 ======================================================================================================================== Memory usage is given in bytes: Object Type Creations Destructions Memory Descendants' Mem. Reports information only for process 0. --- Event Stage 0: Main Stage Viewer 5 4 3328 0. Vector 125 124 681672 0. Vector Scatter 25 21 27256 0. Matrix 101 101 1462180 0. Matrix Null Space 1 1 592 0. Distributed Mesh 8 4 20288 0. Star Forest Bipartite Graph 16 8 6784 0. Discrete System 8 4 3456 0. Index Set 55 55 80872 0. IS L to G Mapping 8 4 7080 0. Krylov Solver 10 10 12392 0. DMKSP interface 6 3 1944 0. Preconditioner 10 10 10024 0. ======================================================================================================================== -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left 1 -log_view -memory_view # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 12 -mg_coarse_telescope_repart_da_processors_y 1 -mg_coarse_telescope_repart_da_processors_z 3 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type lu -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 4 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left -log_view -memory_view # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 24 -mg_coarse_telescope_repart_da_processors_y 2 -mg_coarse_telescope_repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type lu -------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -ksp_max_it 1 -mg_coarse_ksp_type preonly 
-mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left 1 -log_view -memory_view -ksp_view_pre # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 24 -mg_coarse_telescope_repart_da_processors_y 2 -mg_coarse_telescope_repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type lu From bsmith at mcs.anl.gov Fri Jul 8 21:07:40 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 8 Jul 2016 21:07:40 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <57804DE9.707@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> Message-ID: Frank, I don't think we yet have enough information to figure out what is going on. Can you please run the test1 but on the larger number of processes? Our goal is to determine the memory usage scaling as you increase the mesh size with a fixed number of processes, from test 2 to test 3 so it is better to see the memory usage in test 1 with the same number of processes as test 2. > On Jul 8, 2016, at 8:05 PM, frank wrote: > > Hi Barry and Dave, > > Thank both of you for the advice. > > @Barry > I made a mistake in the file names in last email. I attached the correct files this time. > For all the three tests, 'Telescope' is used as the coarse preconditioner. > > == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > Part of the memory usage: Vector 125 124 3971904 0. > Matrix 101 101 9462372 0 > > == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > Part of the memory usage: Vector 125 124 681672 0. > Matrix 101 101 1462180 0. > > In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. > > == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 > Here I get the out of memory error. Please re-send us all the output from this failed case. > > I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > The linear solver didn't work in this case. Petsc output some errors. You better set the options you want because the default options may not be want you want. But it is possible that using jacobi on the coarse grid will result in failed failed convergence so I don't recommend it, better to use the defaults. The one thing I noted is that PETSc requests allocations much larger than are actually used (compare the maximum process memory to the maximum petscmalloc memory) in the test 1 and test 2 cases (likely because in the Galerkin RAR' process it doesn't know how much memory it will actually need). Normally these large requested allocations due no harm because it never actually needs to allocate all the memory pages for the full request. Barry > > @Dave > In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. > If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. > On the last coarse mesh of 'Telescope', there is only one grid point per process. 
> I still got the OOM error. The detailed petsc option file is attached. > > > Thank you so much. > > Frank > > > > On 07/06/2016 02:51 PM, Barry Smith wrote: >>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>> >>> Hi Barry, >>> >>> Thank you for you advice. >>> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >>> The system gives me the "Out of Memory" error before the linear system is completely solved. >>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >>> >>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >> Vector 384 383 8,193,712 0. >> Matrix 103 103 11,508,688 0. >> to 3rd test >> Vector 384 383 1,590,520 0. >> Matrix 103 103 3,508,664 0. >> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >> >> >>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >> Sorry, my mistake the option is -memory_view >> >> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >> >> Barry >> >> >> >>> In both tests the memory usage is not large. >>> >>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >>> Is there is a way to show how much memory it allocated? >>> >>> Frank >>> >>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>> Frank, >>>> >>>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>>> >>>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>>> >>>> Barry >>>> >>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>>> >>>>> Hi, >>>>> >>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>>>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>>>> The petsc options file is attached. >>>>> >>>>> The domain is a 3d box. >>>>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. 
When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>>>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >>>>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>>>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>>>> >>>>> How can I diagnose what exactly cause the error? >>>>> Thank you so much. >>>>> >>>>> Frank >>>>> >>> > > From bsmith at mcs.anl.gov Fri Jul 8 21:11:26 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 8 Jul 2016 21:11:26 -0500 Subject: [petsc-users] Need help: Poisson's equation with complex number In-Reply-To: References: Message-ID: <91BACFC0-05BC-4CA6-99DB-75CA18F56302@mcs.anl.gov> I would start with -pc_type gamg and -ksp_type gmres see how many iterations it requires and how the number of iterations grows when you refine the mesh (if life is good then the iterations will grow only moderately as you refine the mesh). If these options result in very bad convergence then send us the output with -ksp_monitor_true_residual and we'll have to consider other options. Barry > On Jul 8, 2016, at 6:59 PM, Yaoyu Hu wrote: > > Hi everyone, > > I am now trying to solve a partial differential equation which is > similar to the three dimensional Poisson?s equation but with complex > numbers. The equation is the result of the transformation of a set of > fluid dynamic equations from time domain to frequency domain. I have > Dirichlet boundary conditions all over the boundaries. The coefficient > matrix that obtained by finite volume method (with collocated grid) is > made of complex numbers. I would like to know that, for my discretized > equation which solver and PC are the most suitable to work with. And > BTW, the solution should be done in parallel with about 10^4 - 10^6 > unknowns. > > It is the first time for me to solve equations with complex numbers, > however, I am not so good at mathematics involving complex number. I > would like to know want should I bear in mind throughout the whole > process? Any suggestions or comments are appreciated. > > Thanks! > > HU Yaoyu From dave.mayhem23 at gmail.com Sat Jul 9 00:38:12 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Sat, 9 Jul 2016 07:38:12 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <57804DE9.707@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> Message-ID: On Saturday, 9 July 2016, frank > wrote: > Hi Barry and Dave, > > Thank both of you for the advice. > > @Barry > I made a mistake in the file names in last email. I attached the correct > files this time. > For all the three tests, 'Telescope' is used as the coarse preconditioner. > > == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > Part of the memory usage: Vector 125 124 3971904 0. 
> Matrix 101 101 > 9462372 0 > > == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > Part of the memory usage: Vector 125 124 681672 0. > Matrix 101 101 > 1462180 0. > > In theory, the memory usage in Test1 should be 8 times of Test2. In my > case, it is about 6 times. > > == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per > process: 32*32*32 > Here I get the out of memory error. > > I tried to use -mg_coarse jacobi. In this way, I don't need to set > -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > The linear solver didn't work in this case. Petsc output some errors. > > @Dave > In test3, I use only one instance of 'Telescope'. On the coarse mesh of > 'Telescope', I used LU as the preconditioner instead of SVD. > If my set the levels correctly, then on the last coarse mesh of MG where > it calls 'Telescope', the sub-domain per process is 2*2*2. > On the last coarse mesh of 'Telescope', there is only one grid point per > process. > I still got the OOM error. The detailed petsc option file is attached. Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. And please send the result of KSPView so we can see what is actually used in the computations Thanks Dave > > > Thank you so much. > > Frank > > > > On 07/06/2016 02:51 PM, Barry Smith wrote: > >> On Jul 6, 2016, at 4:19 PM, frank wrote: >>> >>> Hi Barry, >>> >>> Thank you for you advice. >>> I tried three test. In the 1st test, the grid is 3072*256*768 and the >>> process mesh is 96*8*24. >>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is >>> used as the preconditioner at the coarse mesh. >>> The system gives me the "Out of Memory" error before the linear system >>> is completely solved. >>> The info from '-ksp_view_pre' is attached. I seems to me that the error >>> occurs when it reaches the coarse mesh. >>> >>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. >>> The 3rd test uses the same grid but a different process mesh 48*4*12. >>> >> Are you sure this is right? The total matrix and vector memory usage >> goes from 2nd test >> Vector 384 383 8,193,712 0. >> Matrix 103 103 11,508,688 0. >> to 3rd test >> Vector 384 383 1,590,520 0. >> Matrix 103 103 3,508,664 0. >> that is the memory usage got smaller but if you have only 1/8th the >> processes and the same grid it should have gotten about 8 times bigger. Did >> you maybe cut the grid by a factor of 8 also? If so that still doesn't >> explain it because the memory usage changed by a factor of 5 something for >> the vectors and 3 something for the matrices. >> >> >> The linear solver and petsc options in 2nd and 3rd tests are the same in >>> 1st test. The linear solver works fine in both test. >>> I attached the memory usage of the 2nd and 3rd tests. The memory info is >>> from the option '-log_summary'. I tried to use '-momery_info' as you >>> suggested, but in my case petsc treated it as an unused option. It output >>> nothing about the memory. Do I need to add sth to my code so I can use >>> '-memory_info'? >>> >> Sorry, my mistake the option is -memory_view >> >> Can you run the one case with -memory_view and -mg_coarse jacobi >> -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory >> is used without the telescope? Also run case 2 the same way. >> >> Barry >> >> >> >> In both tests the memory usage is not large. 
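To make the "about 6 times" above explicit, re-doing only the arithmetic on the numbers quoted in this message: the grid is the same in both tests, so the per-process usage would ideally scale with the ratio of the process counts,

$$ \frac{96\cdot 8\cdot 24}{48\cdot 4\cdot 12} = 8, \qquad \frac{3971904}{681672} \approx 5.8 \;\text{(vectors)}, \qquad \frac{9462372}{1462180} \approx 6.5 \;\text{(matrices)}. $$

One plausible reading (a guess, not something established in this thread) is that the gap to the ideal factor of 8 comes from per-rank overhead such as ghost/halo points and coarse-level objects, which does not shrink in proportion to the local subdomain size.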
>>> >>> It seems to me that it might be the 'telescope' preconditioner that >>> allocated a lot of memory and caused the error in the 1st test. >>> Is there is a way to show how much memory it allocated? >>> >>> Frank >>> >>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> >>>> Frank, >>>> >>>> You can run with -ksp_view_pre to have it "view" the KSP before >>>> the solve so hopefully it gets that far. >>>> >>>> Please run the problem that does fit with -memory_info when the >>>> problem completes it will show the "high water mark" for PETSc allocated >>>> memory and total memory used. We first want to look at these numbers to see >>>> if it is using more memory than you expect. You could also run with say >>>> half the grid spacing to see how the memory usage scaled with the increase >>>> in grid points. Make the runs also with -log_view and send all the output >>>> from these options. >>>> >>>> Barry >>>> >>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>>> >>>>> Hi, >>>>> >>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a >>>>> linear system in parallel. >>>>> I chose to use the 'Telescope' as the preconditioner on the coarse >>>>> mesh for its good performance. >>>>> The petsc options file is attached. >>>>> >>>>> The domain is a 3d box. >>>>> It works well when the grid is 1536*128*384 and the process mesh is >>>>> 96*8*24. When I double the size of grid and keep the same process mesh and >>>>> petsc options, I get an "out of memory" error from the super-cluster I am >>>>> using. >>>>> Each process has access to at least 8G memory, which should be more >>>>> than enough for my application. I am sure that all the other parts of my >>>>> code( except the linear solver ) do not use much memory. So I doubt if >>>>> there is something wrong with the linear solver. >>>>> The error occurs before the linear system is completely solved so I >>>>> don't have the info from ksp view. I am not able to re-produce the error >>>>> with a smaller problem either. >>>>> In addition, I tried to use the block jacobi as the preconditioner >>>>> with the same grid and same decomposition. The linear solver runs extremely >>>>> slow but there is no memory error. >>>>> >>>>> How can I diagnose what exactly cause the error? >>>>> Thank you so much. >>>>> >>>>> Frank >>>>> >>>>> >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Sun Jul 10 04:31:28 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Sun, 10 Jul 2016 12:31:28 +0300 Subject: [petsc-users] How to determine the type of SNESLineSearch? In-Reply-To: References: Message-ID: PetscBool match; ierr = PetscObjectTypeCompare((PetscObject)linesearch, SNESLINESEARCHBASIC,&match);CHKERRQ(ierr); if (!match) { ... } On 6 July 2016 at 00:07, Gautam Bisht wrote: > Hi PETSc, > > After SNESSolve converges, I want to perform few additional operations only > when SNESLineSearchType is not SNESLINESEARCHBASIC. But, there is no > SNESLineSearchGetType routine. Any idea on how I can determine the type of > LineSearch set by a user using command line option? > > Thanks, > -Gautam. 
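For completeness, a slightly fuller sketch of the check suggested above, assuming a SNES object named snes that has already been solved; the SNESGetLineSearch() call and the surrounding declarations are illustrative additions, not part of the original reply:

    SNESLineSearch linesearch;
    PetscBool      isbasic;
    PetscErrorCode ierr;

    /* after SNESSolve() has returned */
    ierr = SNESGetLineSearch(snes,&linesearch);CHKERRQ(ierr);
    ierr = PetscObjectTypeCompare((PetscObject)linesearch,SNESLINESEARCHBASIC,&isbasic);CHKERRQ(ierr);
    if (!isbasic) {
      /* extra post-solve operations, needed only when a non-basic line search
         (e.g. -snes_linesearch_type bt) was selected at run time */
    }

Since PetscObjectTypeCompare() works on any PETSc object, the same pattern can be used wherever a dedicated Get...Type() routine is not provided.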
-- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 0109 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 From davydden at gmail.com Sun Jul 10 11:36:09 2016 From: davydden at gmail.com (Denis Davydov) Date: Sun, 10 Jul 2016 18:36:09 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix Message-ID: Dear developers, Slepc 3.6.3 used to produce the following result of install names: $ otool -lv libslepc.dylib | grep slepc libslepc.dylib: name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) same for libslepc.3.6.dylib and libslepc.3.6.3.dylib Since [3.7.1] the installed libraries have $ otool -lv libslepc.dylib | grep slepc libslepc.dylib: name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. Kind regards, Denis From jroman at dsic.upv.es Sun Jul 10 11:47:58 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 10 Jul 2016 18:47:58 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: References: Message-ID: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> I think this is already fixed in this commit: https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 Version 7.3.2 containing this patch will be released in a week or so. Thanks for reporting this. 
Jose > El 10 jul 2016, a las 18:36, Denis Davydov escribi?: > > Dear developers, > > Slepc 3.6.3 used to produce the following result of install names: > > $ otool -lv libslepc.dylib | grep slepc > libslepc.dylib: > name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) > path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) > path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) > > same for libslepc.3.6.dylib and libslepc.3.6.3.dylib > > > Since [3.7.1] the installed libraries have > > $ otool -lv libslepc.dylib | grep slepc > libslepc.dylib: > name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) > path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) > path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) > > > That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. > > Kind regards, > Denis > From davydden at gmail.com Sun Jul 10 11:56:50 2016 From: davydden at gmail.com (Denis Davydov) Date: Sun, 10 Jul 2016 18:56:50 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> Message-ID: Hi Jose, the patch you mentioned does not solve the problem (i tried it): $ otool -D libslepc.dylib libslepc.dylib: /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jqcVVv/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib Kind regards, Denis > On 10 Jul 2016, at 18:47, Jose E. Roman wrote: > > I think this is already fixed in this commit: > https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 > Version 7.3.2 containing this patch will be released in a week or so. > Thanks for reporting this. 
> Jose > > >> El 10 jul 2016, a las 18:36, Denis Davydov escribi?: >> >> Dear developers, >> >> Slepc 3.6.3 used to produce the following result of install names: >> >> $ otool -lv libslepc.dylib | grep slepc >> libslepc.dylib: >> name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) >> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) >> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) >> >> same for libslepc.3.6.dylib and libslepc.3.6.3.dylib >> >> >> Since [3.7.1] the installed libraries have >> >> $ otool -lv libslepc.dylib | grep slepc >> libslepc.dylib: >> name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) >> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) >> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) >> >> >> That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. >> >> Kind regards, >> Denis >> > From davydden at gmail.com Sun Jul 10 15:26:03 2016 From: davydden at gmail.com (Denis Davydov) Date: Sun, 10 Jul 2016 22:26:03 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> Message-ID: <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> I debuged a bit your code, install_name should be used as follows: install_name_tool -id That is, you need to change around ?installName? variable and ?dst? and then it works as expected. Kind regards, Denis > On 10 Jul 2016, at 18:56, Denis Davydov wrote: > > Hi Jose, > > the patch you mentioned does not solve the problem (i tried it): > > $ otool -D libslepc.dylib > libslepc.dylib: > /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jqcVVv/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib > > Kind regards, > Denis > >> On 10 Jul 2016, at 18:47, Jose E. Roman wrote: >> >> I think this is already fixed in this commit: >> https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 >> Version 7.3.2 containing this patch will be released in a week or so. >> Thanks for reporting this. 
>> Jose >> >> >>> El 10 jul 2016, a las 18:36, Denis Davydov escribi?: >>> >>> Dear developers, >>> >>> Slepc 3.6.3 used to produce the following result of install names: >>> >>> $ otool -lv libslepc.dylib | grep slepc >>> libslepc.dylib: >>> name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) >>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) >>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) >>> >>> same for libslepc.3.6.dylib and libslepc.3.6.3.dylib >>> >>> >>> Since [3.7.1] the installed libraries have >>> >>> $ otool -lv libslepc.dylib | grep slepc >>> libslepc.dylib: >>> name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) >>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) >>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) >>> >>> >>> That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. >>> >>> Kind regards, >>> Denis >>> >> > From davydden at gmail.com Sun Jul 10 17:29:02 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 11 Jul 2016 00:29:02 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> Message-ID: <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> Hi Jose, Please, disregard my last email. The order of arguments is correct. I still have an issue, though. I will debug it further and try to find what?s the cause... Kind regards, Denis > On 10 Jul 2016, at 22:26, Denis Davydov wrote: > > I debuged a bit your code, install_name should be used as follows: > > install_name_tool -id > > That is, you need to change around ?installName? variable and ?dst? and then it works as expected. > > Kind regards, > Denis > >> On 10 Jul 2016, at 18:56, Denis Davydov wrote: >> >> Hi Jose, >> >> the patch you mentioned does not solve the problem (i tried it): >> >> $ otool -D libslepc.dylib >> libslepc.dylib: >> /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jqcVVv/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >> >> Kind regards, >> Denis >> >>> On 10 Jul 2016, at 18:47, Jose E. Roman wrote: >>> >>> I think this is already fixed in this commit: >>> https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 >>> Version 7.3.2 containing this patch will be released in a week or so. >>> Thanks for reporting this. 
>>> Jose >>> >>> >>>> El 10 jul 2016, a las 18:36, Denis Davydov escribi?: >>>> >>>> Dear developers, >>>> >>>> Slepc 3.6.3 used to produce the following result of install names: >>>> >>>> $ otool -lv libslepc.dylib | grep slepc >>>> libslepc.dylib: >>>> name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) >>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) >>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) >>>> >>>> same for libslepc.3.6.dylib and libslepc.3.6.3.dylib >>>> >>>> >>>> Since [3.7.1] the installed libraries have >>>> >>>> $ otool -lv libslepc.dylib | grep slepc >>>> libslepc.dylib: >>>> name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) >>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) >>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) >>>> >>>> >>>> That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. >>>> >>>> Kind regards, >>>> Denis >>>> >>> >> > From davydden at gmail.com Sun Jul 10 17:48:26 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 11 Jul 2016 00:48:26 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> Message-ID: Hi Jose, so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib As you see, installName wasn?t changed from oldname. Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. When SLEPC_DIR==pwd() the patch you referred works. 
Kind regards, Denis > On 11 Jul 2016, at 00:29, Denis Davydov wrote: > > Hi Jose, > > Please, disregard my last email. The order of arguments is correct. > I still have an issue, though. I will debug it further and try to find what?s the cause... > > Kind regards, > Denis > >> On 10 Jul 2016, at 22:26, Denis Davydov wrote: >> >> I debuged a bit your code, install_name should be used as follows: >> >> install_name_tool -id >> >> That is, you need to change around ?installName? variable and ?dst? and then it works as expected. >> >> Kind regards, >> Denis >> >>> On 10 Jul 2016, at 18:56, Denis Davydov wrote: >>> >>> Hi Jose, >>> >>> the patch you mentioned does not solve the problem (i tried it): >>> >>> $ otool -D libslepc.dylib >>> libslepc.dylib: >>> /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-jqcVVv/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>> >>> Kind regards, >>> Denis >>> >>>> On 10 Jul 2016, at 18:47, Jose E. Roman wrote: >>>> >>>> I think this is already fixed in this commit: >>>> https://bitbucket.org/slepc/slepc/commits/7489a3f3d569e2fbf5513ac9dcd769017d9f7eb7 >>>> Version 7.3.2 containing this patch will be released in a week or so. >>>> Thanks for reporting this. >>>> Jose >>>> >>>> >>>>> El 10 jul 2016, a las 18:36, Denis Davydov escribi?: >>>>> >>>>> Dear developers, >>>>> >>>>> Slepc 3.6.3 used to produce the following result of install names: >>>>> >>>>> $ otool -lv libslepc.dylib | grep slepc >>>>> libslepc.dylib: >>>>> name /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib/libslepc.3.6.3.dylib (offset 24) >>>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib (offset 12) >>>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.6.3-b35zhzknp4lrt5r2iksagql2jkya2vfl/lib64 (offset 12) >>>>> >>>>> same for libslepc.3.6.dylib and libslepc.3.6.3.dylib >>>>> >>>>> >>>>> Since [3.7.1] the installed libraries have >>>>> >>>>> $ otool -lv libslepc.dylib | grep slepc >>>>> libslepc.dylib: >>>>> name /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-JwBNAx/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib (offset 24) >>>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib (offset 12) >>>>> path /Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib64 (offset 12) >>>>> >>>>> >>>>> That is, the ?name? is wrong as it corresponds to the path in the temporary build folder. >>>>> >>>>> Kind regards, >>>>> Denis >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From zocca.marco at gmail.com Mon Jul 11 02:57:59 2016 From: zocca.marco at gmail.com (Marco Zocca) Date: Mon, 11 Jul 2016 09:57:59 +0200 Subject: [petsc-users] HDF5 and PETSc Message-ID: Good morning, Does the HDF5 functionality need to be explicitly requested at configure time? I just noticed that my default configuration on a single-node machine does not compile any relevant symbol. I do not have HDF5 installed on my system yet, but I assumed PETSc includes it by default, or automagically pulls the dependency in at config time, since the manual doesn't mention anything about it. 
Do I have to install HDF5 from source and rebuild PETSc then? Thanks in advance, Marco --- config options and architecture : Configure Options: --configModules=PETSc.Configure --optionsModule=config.compilerOptions --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-fblaslapack --download-mpich Working directory: /Users/ocramz/petsc-3.7.2 Machine platform: ('Darwin', 'fermi.local', '13.4.0', 'Darwin Kernel Version 13.4.0: Sun Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64', 'x86_64', 'i386') Python version: 2.7.5 (default, Mar 9 2014, 22:15:05) [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] From zocca.marco at gmail.com Mon Jul 11 03:13:31 2016 From: zocca.marco at gmail.com (Marco Zocca) Date: Mon, 11 Jul 2016 10:13:31 +0200 Subject: [petsc-users] HDF5 and PETSc In-Reply-To: References: Message-ID: Sorry for the previous mail, I hadn't fully read ./configure --help : all external package options are listed there, including HDF5 As far as I can see in https://www.mcs.anl.gov/petsc/miscellaneous/external.html and on the PDF manual, not all external packages are mentioned, and this tripped me initially. So my question becomes: please synchronize the output of ./configure --help with manpages and pdf manual :) Thanks again, Marco On 11 July 2016 at 09:57, Marco Zocca wrote: > Good morning, > > Does the HDF5 functionality need to be explicitly requested at > configure time? I just noticed that my default configuration on a > single-node machine does not compile any relevant symbol. > > I do not have HDF5 installed on my system yet, but I assumed PETSc > includes it by default, or automagically pulls the dependency in at > config time, since the manual doesn't mention anything about it. Do I > have to install HDF5 from source and rebuild PETSc then? > > Thanks in advance, > Marco > > > > --- config options and architecture : > > Configure Options: --configModules=PETSc.Configure > --optionsModule=config.compilerOptions --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-fblaslapack --download-mpich > Working directory: /Users/ocramz/petsc-3.7.2 > Machine platform: > ('Darwin', 'fermi.local', '13.4.0', 'Darwin Kernel Version 13.4.0: Sun > Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64', > 'x86_64', 'i386') > Python version: > 2.7.5 (default, Mar 9 2014, 22:15:05) > [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] From jroman at dsic.upv.es Mon Jul 11 09:53:10 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 11 Jul 2016 16:53:10 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> Message-ID: I cannot reproduce this behaviour. If I do for instance this (on OS X El Capitan): $ cd ~/tmp $ ln -s $SLEPC_DIR . $ cd slepc-3.7.1 $ ./configure $ make $ otool -lv $PETSC_ARCH/lib/libslepc.dylib | grep slepc I don't get a warning, and the output of otool is the same that would result if done on $SLEPC_DIR. Which warning are you getting? Jose > El 11 jul 2016, a las 0:48, Denis Davydov escribi?: > > Hi Jose, > > so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). 
> During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), > but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: > > oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib > > installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib > > archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt > > installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr > > dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib > > As you see, installName wasn?t changed from oldname. > > Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make > sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. > > When SLEPC_DIR==pwd() the patch you referred works. > > Kind regards, > Denis > From knepley at gmail.com Mon Jul 11 09:58:49 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Jul 2016 09:58:49 -0500 Subject: [petsc-users] HDF5 and PETSc In-Reply-To: References: Message-ID: On Mon, Jul 11, 2016 at 3:13 AM, Marco Zocca wrote: > Sorry for the previous mail, I hadn't fully read ./configure --help : > all external package options are listed there, including HDF5 > > As far as I can see in > https://www.mcs.anl.gov/petsc/miscellaneous/external.html and on the > PDF manual, not all external packages are mentioned, and this tripped > me initially. > > So my question becomes: please synchronize the output of ./configure > --help with manpages and pdf manual :) > Done. https://bitbucket.org/petsc/petsc/commits/b6541ed63645a657daaf31a0efc9fb29a825bfaf Matt > Thanks again, > Marco > > > On 11 July 2016 at 09:57, Marco Zocca wrote: > > Good morning, > > > > Does the HDF5 functionality need to be explicitly requested at > > configure time? I just noticed that my default configuration on a > > single-node machine does not compile any relevant symbol. > > > > I do not have HDF5 installed on my system yet, but I assumed PETSc > > includes it by default, or automagically pulls the dependency in at > > config time, since the manual doesn't mention anything about it. Do I > > have to install HDF5 from source and rebuild PETSc then? > > > > Thanks in advance, > > Marco > > > > > > > > --- config options and architecture : > > > > Configure Options: --configModules=PETSc.Configure > > --optionsModule=config.compilerOptions --with-cc=gcc --with-cxx=g++ > > --with-fc=gfortran --download-fblaslapack --download-mpich > > Working directory: /Users/ocramz/petsc-3.7.2 > > Machine platform: > > ('Darwin', 'fermi.local', '13.4.0', 'Darwin Kernel Version 13.4.0: Sun > > Aug 17 19:50:11 PDT 2014; root:xnu-2422.115.4~1/RELEASE_X86_64', > > 'x86_64', 'i386') > > Python version: > > 2.7.5 (default, Mar 9 2014, 22:15:05) > > [GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From davydden at gmail.com Mon Jul 11 10:06:47 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 11 Jul 2016 17:06:47 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> Message-ID: <44EF3239-8AA0-4157-B04A-BC3437409215@gmail.com> Here is the warning: Your SLEPC_DIR may not match the directory you are in SLEPC_DIR /Users/davydden/spack/var/spack/stage/slepc-3.7.1-p7hqqclwqvbvra6j44lka3xuc4eycvdg/slepc-3.7.1 Current directory /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-m7Xg8I/slepc-3.7.1 p.s. this is done within Spack, for a fix see: https://github.com/LLNL/spack/pull/1206 > On 11 Jul 2016, at 16:53, Jose E. Roman wrote: > > I cannot reproduce this behaviour. If I do for instance this (on OS X El Capitan): > > $ cd ~/tmp > $ ln -s $SLEPC_DIR . > $ cd slepc-3.7.1 > $ ./configure > $ make > $ otool -lv $PETSC_ARCH/lib/libslepc.dylib | grep slepc > > I don't get a warning, and the output of otool is the same that would result if done on $SLEPC_DIR. > Which warning are you getting? > > Jose > > >> El 11 jul 2016, a las 0:48, Denis Davydov escribi?: >> >> Hi Jose, >> >> so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). >> During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), >> but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: >> >> oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >> >> installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >> >> archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt >> >> installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr >> >> dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib >> >> As you see, installName wasn?t changed from oldname. >> >> Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make >> sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. >> >> When SLEPC_DIR==pwd() the patch you referred works. >> >> Kind regards, >> Denis >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ketancmaheshwari at gmail.com Mon Jul 11 12:05:52 2016 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 11 Jul 2016 13:05:52 -0400 Subject: [petsc-users] Diagonalization of a 3D dense matrix Message-ID: Hello PETSC-ers, I am a research faculty at Univ of Pittsburgh trying to use PETSC/SLEPC to obtain the diagonalization of a large matrix using Lanczos or Davidson method. 
The matrix is a 3 dimensional dense matrix with a total of 216000 elements. After looking into some of the examples in PETSC as well SLEPC implementations it seems like most of the implementations are with 2 dimensional matrices. So, I was wondering if it is possible to express a 3 dimensional matrix object compatible to PETSC so that the SLEPC API could be used to obtain diagonalization. Any suggestions or pointers to documentation or examples would be of great help. Best, -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 11 12:15:01 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Jul 2016 12:15:01 -0500 Subject: [petsc-users] Diagonalization of a 3D dense matrix In-Reply-To: References: Message-ID: On Mon, Jul 11, 2016 at 12:05 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Hello PETSC-ers, > > I am a research faculty at Univ of Pittsburgh trying to use PETSC/SLEPC to > obtain the diagonalization of a large matrix using Lanczos or Davidson > method. > > The matrix is a 3 dimensional dense matrix with a total of 216000 elements. > > After looking into some of the examples in PETSC as well SLEPC > implementations > it seems like most of the implementations are with 2 dimensional matrices. > You will have to explain what you mean by a "3D matrix". A matrix, by definition, has only rows and columns. You may mean a matrix generated from a 3D problem. That should pose no extra difficulty. You may mean a 3-index tensor, in which case diagonalization is not a clear concept. Thanks, Matt > So, I was wondering if it is possible to express a 3 dimensional matrix > object > compatible to PETSC so that the SLEPC API could be used to obtain > diagonalization. > > Any suggestions or pointers to documentation or examples would be of great > help. > > Best, > -- > Ketan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hengjiew at uci.edu Mon Jul 11 12:14:12 2016 From: hengjiew at uci.edu (frank) Date: Mon, 11 Jul 2016 10:14:12 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> Message-ID: <5783D3E4.4020004@uci.edu> Hi Dave, I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. It seems to me that the error occurred when the decomposition was going to be changed. I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. Thank you. Frank On 07/08/2016 10:38 PM, Dave May wrote: > > > On Saturday, 9 July 2016, frank > wrote: > > Hi Barry and Dave, > > Thank both of you for the advice. > > @Barry > I made a mistake in the file names in last email. I attached the > correct files this time. > For all the three tests, 'Telescope' is used as the coarse > preconditioner. 
> > == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > Part of the memory usage: Vector 125 124 3971904 0. > Matrix 101 101 > 9462372 0 > > == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > Part of the memory usage: Vector 125 124 681672 0. > Matrix 101 101 > 1462180 0. > > In theory, the memory usage in Test1 should be 8 times of Test2. > In my case, it is about 6 times. > > == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain > per process: 32*32*32 > Here I get the out of memory error. > > I tried to use -mg_coarse jacobi. In this way, I don't need to set > -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > The linear solver didn't work in this case. Petsc output some errors. > > @Dave > In test3, I use only one instance of 'Telescope'. On the coarse > mesh of 'Telescope', I used LU as the preconditioner instead of SVD. > If my set the levels correctly, then on the last coarse mesh of MG > where it calls 'Telescope', the sub-domain per process is 2*2*2. > On the last coarse mesh of 'Telescope', there is only one grid > point per process. > I still got the OOM error. The detailed petsc option file is attached. > > > Do you understand the expected memory usage for the > particular parallel LU implementation you are using? I don't > (seriously). Replace LU with bjacobi and re-run this test. My point > about solver debugging is still valid. > > And please send the result of KSPView so we can see what is actually > used in the computations > > Thanks > Dave > > > > Thank you so much. > > Frank > > > > On 07/06/2016 02:51 PM, Barry Smith wrote: > > On Jul 6, 2016, at 4:19 PM, frank wrote: > > Hi Barry, > > Thank you for you advice. > I tried three test. In the 1st test, the grid is > 3072*256*768 and the process mesh is 96*8*24. > The linear solver is 'cg' the preconditioner is 'mg' and > 'telescope' is used as the preconditioner at the coarse mesh. > The system gives me the "Out of Memory" error before the > linear system is completely solved. > The info from '-ksp_view_pre' is attached. I seems to me > that the error occurs when it reaches the coarse mesh. > > The 2nd test uses a grid of 1536*128*384 and process mesh > is 96*8*24. The 3rd test uses the same grid but a > different process mesh 48*4*12. > > Are you sure this is right? The total matrix and vector > memory usage goes from 2nd test > Vector 384 383 8,193,712 0. > Matrix 103 103 11,508,688 0. > to 3rd test > Vector 384 383 1,590,520 0. > Matrix 103 103 3,508,664 0. > that is the memory usage got smaller but if you have only > 1/8th the processes and the same grid it should have gotten > about 8 times bigger. Did you maybe cut the grid by a factor > of 8 also? If so that still doesn't explain it because the > memory usage changed by a factor of 5 something for the > vectors and 3 something for the matrices. > > > The linear solver and petsc options in 2nd and 3rd tests > are the same in 1st test. The linear solver works fine in > both test. > I attached the memory usage of the 2nd and 3rd tests. The > memory info is from the option '-log_summary'. I tried to > use '-momery_info' as you suggested, but in my case petsc > treated it as an unused option. It output nothing about > the memory. Do I need to add sth to my code so I can use > '-memory_info'? > > Sorry, my mistake the option is -memory_view > > Can you run the one case with -memory_view and -mg_coarse > jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to > see how much memory is used without the telescope? 
Also run > case 2 the same way. > > Barry > > > > In both tests the memory usage is not large. > > It seems to me that it might be the 'telescope' > preconditioner that allocated a lot of memory and caused > the error in the 1st test. > Is there is a way to show how much memory it allocated? > > Frank > > On 07/05/2016 03:37 PM, Barry Smith wrote: > > Frank, > > You can run with -ksp_view_pre to have it "view" > the KSP before the solve so hopefully it gets that far. > > Please run the problem that does fit with > -memory_info when the problem completes it will show > the "high water mark" for PETSc allocated memory and > total memory used. We first want to look at these > numbers to see if it is using more memory than you > expect. You could also run with say half the grid > spacing to see how the memory usage scaled with the > increase in grid points. Make the runs also with > -log_view and send all the output from these options. > > Barry > > On Jul 5, 2016, at 5:23 PM, frank > wrote: > > Hi, > > I am using the CG ksp solver and Multigrid > preconditioner to solve a linear system in parallel. > I chose to use the 'Telescope' as the > preconditioner on the coarse mesh for its good > performance. > The petsc options file is attached. > > The domain is a 3d box. > It works well when the grid is 1536*128*384 and > the process mesh is 96*8*24. When I double the > size of grid and keep the same process mesh and > petsc options, I get an "out of memory" error from > the super-cluster I am using. > Each process has access to at least 8G memory, > which should be more than enough for my > application. I am sure that all the other parts of > my code( except the linear solver ) do not use > much memory. So I doubt if there is something > wrong with the linear solver. > The error occurs before the linear system is > completely solved so I don't have the info from > ksp view. I am not able to re-produce the error > with a smaller problem either. > In addition, I tried to use the block jacobi as > the preconditioner with the same grid and same > decomposition. The linear solver runs extremely > slow but there is no memory error. > > How can I diagnose what exactly cause the error? > Thank you so much. > > Frank > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- KSP Object: 18432 MPI processes type: cg maximum iterations=1 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg PC has not been set up so information may be incomplete MG: type is MULTIPLICATIVE, levels=5 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: redundant PC has not been set up so information may be incomplete Redundant preconditioner: Not yet setup Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 4 ------------------------------- KSP Object: (mg_levels_4_) 18432 MPI processes type: chebyshev Chebyshev: eigenvalue estimates: min = 0., max = 0. maximum iterations=2, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_4_) 18432 MPI processes type: sor PC has not been set up so information may be incomplete SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=603979776, cols=603979776 total: nonzeros=4223139840, allocated nonzeros=4223139840 total number of mallocs used during MatSetValues calls =0 has attached null space [NID 00631] 2016-07-10 06:22:58 Apid 45768056: initiated application termination [NID 06277] 2016-07-10 06:23:00 Apid 45768056: OOM killer terminated this process. [NID 06235] 2016-07-10 06:23:00 Apid 45768056: OOM killer terminated this process. 
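As a side note on diagnosing where the memory goes: besides the -memory_view option mentioned earlier in the thread, the high-water marks can be queried directly from the application. The fragment below is only a rough sketch, not taken from the code discussed here; it assumes a KSP named ksp and vectors b and x already exist:

    PetscLogDouble rss,mal;
    PetscErrorCode ierr;

    ierr = PetscMemorySetGetMaximumUsage();CHKERRQ(ierr);  /* call once, right after PetscInitialize() */
    /* ... build the DMDA, the operator and the KSP, then solve ... */
    ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
    ierr = PetscMemoryGetMaximumUsage(&rss);CHKERRQ(ierr); /* resident-set high-water mark on this rank */
    ierr = PetscMallocGetMaximumUsage(&mal);CHKERRQ(ierr); /* PETSc-allocated high-water mark on this rank */
    ierr = MPI_Allreduce(MPI_IN_PLACE,&rss,1,MPI_DOUBLE,MPI_MAX,PETSC_COMM_WORLD);CHKERRQ(ierr);
    ierr = MPI_Allreduce(MPI_IN_PLACE,&mal,1,MPI_DOUBLE,MPI_MAX,PETSC_COMM_WORLD);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD,"max RSS %g bytes/rank, max PetscMalloc %g bytes/rank\n",rss,mal);CHKERRQ(ierr);

Printing these once with the telescope coarse solver and once with, say, -mg_coarse_pc_type bjacobi should make it clearer which component is responsible for the growth.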
-------------- next part -------------- -ksp_type cg -ksp_norm_type unpreconditioned -ksp_lag_norm -ksp_rtol 1e-7 -ksp_initial_guess_nonzero yes -ksp_converged_reason -ppe_max_iter 50 -pc_type mg -pc_mg_galerkin -pc_mg_levels 5 -mg_levels_ksp_type richardson -mg_levels_ksp_max_it 1 -ksp_max_it 1 -mg_coarse_ksp_type preonly -mg_coarse_pc_type telescope -mg_coarse_pc_telescope_reduction_factor 64 -options_left 1 -log_view -memory_view -ksp_view_pre # Setting dmdarepart on subcomm -mg_coarse_telescope_repart_da_processors_x 24 -mg_coarse_telescope_repart_da_processors_y 2 -mg_coarse_telescope_repart_da_processors_z 6 -mg_coarse_telescope_ksp_type preonly -mg_coarse_telescope_pc_type mg -mg_coarse_telescope_pc_mg_galerkin -mg_coarse_telescope_pc_mg_levels 4 -mg_coarse_telescope_mg_levels_ksp_max_it 1 -mg_coarse_telescope_mg_levels_ksp_type richardson -mg_coarse_telescope_mg_coarse_ksp_type preonly -mg_coarse_telescope_mg_coarse_pc_type bjacobi -------------- next part -------------- KSP Object: 18432 MPI processes type: cg maximum iterations=1 tolerances: relative=1e-07, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using UNPRECONDITIONED norm type for convergence test PC Object: 18432 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 18432 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 18432 MPI processes type: telescope Telescope: parent comm size reduction factor = 64 Telescope: comm_size = 18432 , subcomm_size = 288 Telescope: DMDA detected DMDA Object: (mg_coarse_telescope_repart_) 288 MPI processes M 192 N 16 P 48 m 24 n 2 p 6 dof 1 overlap 1 KSP Object: (mg_coarse_telescope_) 288 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_) 288 MPI processes type: mg MG: type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using Galerkin computed coarse grid matrices Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_telescope_mg_coarse_) 288 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_coarse_) 288 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 0., needed 0. 
Factored matrix follows: Mat Object: 288 MPI processes type: mpiaij rows=288, cols=288 package used to perform factorization: superlu_dist total: nonzeros=0, allocated nonzeros=0 total number of mallocs used during MatSetValues calls =0 SuperLU_DIST run parameters: Process grid nprow 18 x npcol 16 Equilibrate matrix TRUE Matrix input mode 1 Replace tiny pivots FALSE Use iterative refinement FALSE Processors in row 18 col partition 16 Row permutation LargeDiag Column permutation METIS_AT_PLUS_A Parallel symbolic factorization FALSE Repeated factorization SamePattern linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=288, cols=288 total: nonzeros=1728, allocated nonzeros=1728 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_1_) 288 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_1_) 288 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=2304, cols=2304 total: nonzeros=14976, allocated nonzeros=14976 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_2_) 288 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_2_) 288 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=18432, cols=18432 total: nonzeros=124416, allocated nonzeros=124416 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_coarse_telescope_mg_levels_3_) 288 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_coarse_telescope_mg_levels_3_) 288 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 288 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=147456, cols=147456 total: nonzeros=1013760, allocated nonzeros=1013760 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 18432 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_1_) 18432 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=1179648, cols=1179648 total: nonzeros=8183808, allocated nonzeros=8183808 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 18432 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_2_) 18432 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=9437184, cols=9437184 total: nonzeros=65765376, allocated nonzeros=65765376 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 18432 MPI processes type: richardson Richardson: damping factor=1. maximum iterations=1 tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using nonzero initial guess using NONE norm type for convergence test PC Object: (mg_levels_3_) 18432 MPI processes type: sor SOR: type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 18432 MPI processes type: mpiaij rows=75497472, cols=75497472 total: nonzeros=527302656, allocated nonzeros=527302656 total number of mallocs used during MatSetValues calls =0 has attached null space From ketancmaheshwari at gmail.com Mon Jul 11 13:22:09 2016 From: ketancmaheshwari at gmail.com (Ketan Maheshwari) Date: Mon, 11 Jul 2016 14:22:09 -0400 Subject: [petsc-users] Diagonalization of a 3D dense matrix In-Reply-To: References: Message-ID: Matthew, I am probably not using the right language but I meant that each element has three indices associated with it: x, y, z. Here is a snapshot: 1 10 55 5.7113635929515209e-03 1 10 56 4.2977490038287334e-03 1 10 57 2.8719519782193204e-03 1 10 58 1.4380140927001712e-03 1 10 59 9.9299930690365083e-17 1 11 0 0.0000000000000000e+00 1 11 1 1.5658614070601917e-03 1 11 2 3.1272842098367562e-03 1 11 3 4.6798423857521204e-03 Where the first three columns are the coordinates and the last one is value. Could you clarify the meaning of "diagonalization is not a clear concept" if it is applicable to this case. Thank you, -- Ketan On Mon, Jul 11, 2016 at 1:15 PM, Matthew Knepley wrote: > On Mon, Jul 11, 2016 at 12:05 PM, Ketan Maheshwari < > ketancmaheshwari at gmail.com> wrote: > >> Hello PETSC-ers, >> >> I am a research faculty at Univ of Pittsburgh trying to use PETSC/SLEPC >> to >> obtain the diagonalization of a large matrix using Lanczos or Davidson >> method. >> >> The matrix is a 3 dimensional dense matrix with a total of 216000 >> elements. >> >> After looking into some of the examples in PETSC as well SLEPC >> implementations >> it seems like most of the implementations are with 2 dimensional matrices. >> > > You will have to explain what you mean by a "3D matrix". A matrix, by > definition, has only > rows and columns. You may mean a matrix generated from a 3D problem. That > should pose > no extra difficulty. You may mean a 3-index tensor, in which case > diagonalization is not a clear > concept. > > Thanks, > > Matt > > >> So, I was wondering if it is possible to express a 3 dimensional matrix >> object >> compatible to PETSC so that the SLEPC API could be used to obtain >> diagonalization. >> >> Any suggestions or pointers to documentation or examples would be of great >> help. >> >> Best, >> -- >> Ketan >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Ketan -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 11 13:24:51 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 11 Jul 2016 13:24:51 -0500 Subject: [petsc-users] Diagonalization of a 3D dense matrix In-Reply-To: References: Message-ID: On Mon, Jul 11, 2016 at 1:22 PM, Ketan Maheshwari < ketancmaheshwari at gmail.com> wrote: > Matthew, > > I am probably not using the right language but I meant that each element > has three indices associated with it: x, y, z. 
> > Here is a snapshot: > > 1 10 55 5.7113635929515209e-03 > 1 10 56 4.2977490038287334e-03 > 1 10 57 2.8719519782193204e-03 > 1 10 58 1.4380140927001712e-03 > 1 10 59 9.9299930690365083e-17 > 1 11 0 0.0000000000000000e+00 > 1 11 1 1.5658614070601917e-03 > 1 11 2 3.1272842098367562e-03 > 1 11 3 4.6798423857521204e-03 > > Where the first three columns are the coordinates and the last one is > value. > This is not a matrix. A matrix is a linear operator on some space with a finite basis: https://en.wikipedia.org/wiki/Matrix_(mathematics) This is just a set of data points. Most people would call this a vector, since you have an index I (which consists of each independent triple) and a value V. > Could you clarify the meaning of "diagonalization is not a clear concept" > if it is applicable to this case. > There is no one definition of tensor diagonalization. Matt > Thank you, > -- > Ketan > > > On Mon, Jul 11, 2016 at 1:15 PM, Matthew Knepley > wrote: > >> On Mon, Jul 11, 2016 at 12:05 PM, Ketan Maheshwari < >> ketancmaheshwari at gmail.com> wrote: >> >>> Hello PETSC-ers, >>> >>> I am a research faculty at Univ of Pittsburgh trying to use PETSC/SLEPC >>> to >>> obtain the diagonalization of a large matrix using Lanczos or Davidson >>> method. >>> >>> The matrix is a 3 dimensional dense matrix with a total of 216000 >>> elements. >>> >>> After looking into some of the examples in PETSC as well SLEPC >>> implementations >>> it seems like most of the implementations are with 2 dimensional >>> matrices. >>> >> >> You will have to explain what you mean by a "3D matrix". A matrix, by >> definition, has only >> rows and columns. You may mean a matrix generated from a 3D problem. That >> should pose >> no extra difficulty. You may mean a 3-index tensor, in which case >> diagonalization is not a clear >> concept. >> >> Thanks, >> >> Matt >> >> >>> So, I was wondering if it is possible to express a 3 dimensional matrix >>> object >>> compatible to PETSC so that the SLEPC API could be used to obtain >>> diagonalization. >>> >>> Any suggestions or pointers to documentation or examples would be of >>> great >>> help. >>> >>> Best, >>> -- >>> Ketan >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > > > -- > Ketan > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Jul 11 14:06:14 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 11 Jul 2016 21:06:14 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <44EF3239-8AA0-4157-B04A-BC3437409215@gmail.com> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> <44EF3239-8AA0-4157-B04A-BC3437409215@gmail.com> Message-ID: <06F337FC-9F59-4633-9A07-A253C33080EE@dsic.upv.es> I don't understand why I don't get this warning. Still I don't see where the problem is. Please tell me exactly what you want me to change, or better make a pull request. Thanks. 
Jose > El 11 jul 2016, a las 17:06, Denis Davydov escribi?: > > Here is the warning: > > Your SLEPC_DIR may not match the directory you are in > SLEPC_DIR /Users/davydden/spack/var/spack/stage/slepc-3.7.1-p7hqqclwqvbvra6j44lka3xuc4eycvdg/slepc-3.7.1 Current directory /private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-m7Xg8I/slepc-3.7.1 > > p.s. this is done within Spack, for a fix see: https://github.com/LLNL/spack/pull/1206 > >> On 11 Jul 2016, at 16:53, Jose E. Roman wrote: >> >> I cannot reproduce this behaviour. If I do for instance this (on OS X El Capitan): >> >> $ cd ~/tmp >> $ ln -s $SLEPC_DIR . >> $ cd slepc-3.7.1 >> $ ./configure >> $ make >> $ otool -lv $PETSC_ARCH/lib/libslepc.dylib | grep slepc >> >> I don't get a warning, and the output of otool is the same that would result if done on $SLEPC_DIR. >> Which warning are you getting? >> >> Jose >> >> >>> El 11 jul 2016, a las 0:48, Denis Davydov escribi?: >>> >>> Hi Jose, >>> >>> so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). >>> During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), >>> but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: >>> >>> oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>> >>> installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>> >>> archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt >>> >>> installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr >>> >>> dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib >>> >>> As you see, installName wasn?t changed from oldname. >>> >>> Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make >>> sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. >>> >>> When SLEPC_DIR==pwd() the patch you referred works. >>> >>> Kind regards, >>> Denis >>> >> > From davydden at gmail.com Mon Jul 11 14:43:47 2016 From: davydden at gmail.com (Denis Davydov) Date: Mon, 11 Jul 2016 21:43:47 +0200 Subject: [petsc-users] [Slepc 3.7.1][macOS] install name is set to build folder instead of prefix In-Reply-To: <06F337FC-9F59-4633-9A07-A253C33080EE@dsic.upv.es> References: <7FE647AB-8FD7-4D8A-980F-87F5F78478D7@dsic.upv.es> <5D82D597-FEE7-48CF-A99E-C5A88956CAAD@gmail.com> <6C60E10D-B52A-4A59-8045-F6672E3F00C7@gmail.com> <44EF3239-8AA0-4157-B04A-BC3437409215@gmail.com> <06F337FC-9F59-4633-9A07-A253C33080EE@dsic.upv.es> Message-ID: > On 11 Jul 2016, at 21:06, Jose E. Roman wrote: > > I don't understand why I don't get this warning. > Still I don't see where the problem is. Please tell me exactly what you want me to change, or better make a pull request. The problem has to do with the assumptions in python scripts. See below values of variables which will not work as expected, i.e. 
installName = oldname.replace(self.archDir, self.installDir) will not do any replace. Why you can?t reproduce it ? i don?t know. In any case, i have a working solution, so it?s not an issue for me and it is up to you if you want to further investigate it. I just wanted to point out that this part of the python code does not work in all circumstances. Regards, Denis. >>>> so here is what happens. The issue appears when SLEPC_DIR is set to a symlink (the one with ?stage below) of a build folder (the one with ?private? below). >>>> During configure there is a warning that SLEPC_DIR is not the same as current dir (string comparison), >>>> but one is symlink of another, so all but install_name_tool work. The latter leads to the following values of variables: >>>> >>>> oldname =/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>>> >>>> installName=/private/var/folders/5k/sqpp24tx3ylds4fgm13pfht00000gn/T/davydden/spack-stage/spack-stage-MziaMV/slepc-3.7.1/installed-arch-darwin-c-opt/lib/libslepc.3.7.dylib >>>> >>>> archDir =/Users/davydden/spack/var/spack/stage/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/slepc-3.7.1/installed-arch-darwin-c-opt >>>> >>>> installDir =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr >>>> >>>> dst =/Users/davydden/spack/opt/spack/darwin-elcapitan-x86_64/clang-7.3.0-apple/slepc-3.7.1-gimrzhb4mozeus3i2hdmrtjp3tha5pgr/lib/libslepc.3.7.1.dylib >>>> >>>> As you see, installName wasn?t changed from oldname. >>>> >>>> Since the python code rely on SLEPC_DIR be pwd(), i would suggest to through an error instead of the warning to make >>>> sure that users won?t get in the situation above. Alternative is to make this part of the code more robust. >>>> >>>> When SLEPC_DIR==pwd() the patch you referred works. From dave.mayhem23 at gmail.com Mon Jul 11 15:18:01 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Mon, 11 Jul 2016 22:18:01 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <5783D3E4.4020004@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: Hi Frank, On 11 July 2016 at 19:14, frank wrote: > Hi Dave, > > I re-run the test using bjacobi as the preconditioner on the coarse mesh > of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The > petsc option file is attached. > I still got the "Out Of Memory" error. The error occurred before the > linear solver finished one step. So I don't have the full info from > ksp_view. The info from ksp_view_pre is attached. > Okay - that is essentially useless (sorry) > > It seems to me that the error occurred when the decomposition was going to > be changed. > Based on what information? Running with -info would give us more clues, but will create a ton of output. Please try running the case which failed with -info > I had another test with a grid of 1536*128*384 and the same process mesh > as above. There was no error. The ksp_view info is attached for comparison. > Thank you. > [3] Here is my crude estimate of your memory usage. 
I'll target the biggest memory hogs only to get an order of magnitude estimate * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) * You use 5 levels of coarsening, so the other operators should represent (collectively) 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. The temporary matrix is now destroyed. * Because a DMDA is detected, a permutation matrix is assembled. This requires 2 doubles per point in the DMDA. Your coarse DMDA contains 92 x 16 x 48 points. Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. >From my rough estimates, the worst case memory foot print for any given core, given your options is approximately 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB This is way below 8 GB. Note this estimate completely ignores: (1) the memory required for the restriction operator, (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) (3) all temporary vectors required by the CG solver, and those required by the smoothers. (4) internal memory allocated by MatPtAP (5) memory associated with IS's used within PCTelescope So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates Since I don't have your code I cannot access the latter. Since I don't have access to the same machine you are running on, I think we need to take a step back. [1] What machine are you running on? Send me a URL if its available [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. Thanks, Dave > > > Frank > > > > > On 07/08/2016 10:38 PM, Dave May wrote: > > > > On Saturday, 9 July 2016, frank wrote: > >> Hi Barry and Dave, >> >> Thank both of you for the advice. >> >> @Barry >> I made a mistake in the file names in last email. I attached the correct >> files this time. >> For all the three tests, 'Telescope' is used as the coarse preconditioner. >> >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >> Part of the memory usage: Vector 125 124 3971904 0. 
>> Matrix 101 101 >> 9462372 0 >> >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> Part of the memory usage: Vector 125 124 681672 0. >> Matrix 101 101 >> 1462180 0. >> >> In theory, the memory usage in Test1 should be 8 times of Test2. In my >> case, it is about 6 times. >> >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per >> process: 32*32*32 >> Here I get the out of memory error. >> >> I tried to use -mg_coarse jacobi. In this way, I don't need to set >> -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >> The linear solver didn't work in this case. Petsc output some errors. >> >> @Dave >> In test3, I use only one instance of 'Telescope'. On the coarse mesh of >> 'Telescope', I used LU as the preconditioner instead of SVD. >> If my set the levels correctly, then on the last coarse mesh of MG where >> it calls 'Telescope', the sub-domain per process is 2*2*2. >> On the last coarse mesh of 'Telescope', there is only one grid point per >> process. >> I still got the OOM error. The detailed petsc option file is attached. > > > Do you understand the expected memory usage for the particular parallel > LU implementation you are using? I don't (seriously). Replace LU with > bjacobi and re-run this test. My point about solver debugging is still > valid. > > And please send the result of KSPView so we can see what is actually used > in the computations > > Thanks > Dave > > >> >> >> Thank you so much. >> >> Frank >> >> >> >> On 07/06/2016 02:51 PM, Barry Smith wrote: >> >>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>>> >>>> Hi Barry, >>>> >>>> Thank you for you advice. >>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the >>>> process mesh is 96*8*24. >>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is >>>> used as the preconditioner at the coarse mesh. >>>> The system gives me the "Out of Memory" error before the linear system >>>> is completely solved. >>>> The info from '-ksp_view_pre' is attached. I seems to me that the error >>>> occurs when it reaches the coarse mesh. >>>> >>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. >>>> The 3rd test uses the same grid but a different process mesh 48*4*12. >>>> >>> Are you sure this is right? The total matrix and vector memory usage >>> goes from 2nd test >>> Vector 384 383 8,193,712 0. >>> Matrix 103 103 11,508,688 0. >>> to 3rd test >>> Vector 384 383 1,590,520 0. >>> Matrix 103 103 3,508,664 0. >>> that is the memory usage got smaller but if you have only 1/8th the >>> processes and the same grid it should have gotten about 8 times bigger. Did >>> you maybe cut the grid by a factor of 8 also? If so that still doesn't >>> explain it because the memory usage changed by a factor of 5 something for >>> the vectors and 3 something for the matrices. >>> >>> >>> The linear solver and petsc options in 2nd and 3rd tests are the same in >>>> 1st test. The linear solver works fine in both test. >>>> I attached the memory usage of the 2nd and 3rd tests. The memory info >>>> is from the option '-log_summary'. I tried to use '-momery_info' as you >>>> suggested, but in my case petsc treated it as an unused option. It output >>>> nothing about the memory. Do I need to add sth to my code so I can use >>>> '-memory_info'? 
>>>> >>> Sorry, my mistake the option is -memory_view >>> >>> Can you run the one case with -memory_view and -mg_coarse jacobi >>> -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory >>> is used without the telescope? Also run case 2 the same way. >>> >>> Barry >>> >>> >>> >>> In both tests the memory usage is not large. >>>> >>>> It seems to me that it might be the 'telescope' preconditioner that >>>> allocated a lot of memory and caused the error in the 1st test. >>>> Is there is a way to show how much memory it allocated? >>>> >>>> Frank >>>> >>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>> >>>>> Frank, >>>>> >>>>> You can run with -ksp_view_pre to have it "view" the KSP before >>>>> the solve so hopefully it gets that far. >>>>> >>>>> Please run the problem that does fit with -memory_info when the >>>>> problem completes it will show the "high water mark" for PETSc allocated >>>>> memory and total memory used. We first want to look at these numbers to see >>>>> if it is using more memory than you expect. You could also run with say >>>>> half the grid spacing to see how the memory usage scaled with the increase >>>>> in grid points. Make the runs also with -log_view and send all the output >>>>> from these options. >>>>> >>>>> Barry >>>>> >>>>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>>>>> >>>>>> Hi, >>>>>> >>>>>> I am using the CG ksp solver and Multigrid preconditioner to solve a >>>>>> linear system in parallel. >>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse >>>>>> mesh for its good performance. >>>>>> The petsc options file is attached. >>>>>> >>>>>> The domain is a 3d box. >>>>>> It works well when the grid is 1536*128*384 and the process mesh is >>>>>> 96*8*24. When I double the size of grid and keep the same process mesh and >>>>>> petsc options, I get an "out of memory" error from the super-cluster I am >>>>>> using. >>>>>> Each process has access to at least 8G memory, which should be more >>>>>> than enough for my application. I am sure that all the other parts of my >>>>>> code( except the linear solver ) do not use much memory. So I doubt if >>>>>> there is something wrong with the linear solver. >>>>>> The error occurs before the linear system is completely solved so I >>>>>> don't have the info from ksp view. I am not able to re-produce the error >>>>>> with a smaller problem either. >>>>>> In addition, I tried to use the block jacobi as the preconditioner >>>>>> with the same grid and same decomposition. The linear solver runs extremely >>>>>> slow but there is no memory error. >>>>>> >>>>>> How can I diagnose what exactly cause the error? >>>>>> Thank you so much. >>>>>> >>>>>> Frank >>>>>> >>>>>> >>>>> >>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ibarletta at inogs.it Tue Jul 12 03:35:01 2016 From: ibarletta at inogs.it (Ivano Barletta) Date: Tue, 12 Jul 2016 10:35:01 +0200 Subject: [petsc-users] Using Petsc with Finite Elements Domain Decomposition Message-ID: Dear Petsc users my aim is to parallelize the solution of a linear system into a finite elements ocean model. The model has been almost entirely parallelized, with a partitioning of the domain made element-wise through the use of Zoltan libraries, so the subdomains share the nodes lying on the edges. 
The linear system includes node-to-node dependencies so my guess is that I need to create an halo surrounding each subdomain, to allow connections of edge nodes with neighbour subdomains ones Apart from that, my question is if Petsc accept a previously made partitioning (maybe taking into account of halo) using the data structures coming out of it Has anybody of you ever faced a similar problem? Thanks in advance Ivano -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 12 04:13:26 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Jul 2016 04:13:26 -0500 Subject: [petsc-users] Using Petsc with Finite Elements Domain Decomposition In-Reply-To: References: Message-ID: On Tue, Jul 12, 2016 at 3:35 AM, Ivano Barletta wrote: > Dear Petsc users > > my aim is to parallelize the solution of a linear > system into a finite elements > ocean model. > > The model has been almost entirely parallelized, with > a partitioning of the domain made element-wise through > the use of Zoltan libraries, so the subdomains > share the nodes lying on the edges. > > The linear system includes node-to-node dependencies > so my guess is that I need to create an halo surrounding > each subdomain, to allow connections of edge nodes with > neighbour subdomains ones > > Apart from that, my question is if Petsc accept a > previously made partitioning (maybe taking into account of halo) > using the data structures coming out of it > > Has anybody of you ever faced a similar problem? > If all you want to do is construct a PETSc Mat and Vec for the linear system, just give PETSc the non-overlapping partition to create those objects. You can input values on off-process partitions automatically using MatSetValues() and VecSetValues(). Thanks, Matt > Thanks in advance > Ivano > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Tue Jul 12 07:42:02 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Tue, 12 Jul 2016 14:42:02 +0200 Subject: [petsc-users] different convergence behaviour Message-ID: Hello I encountered different convergence behaviour of Newton Raphson when using different solver settings with PETSc For the first solver configuration, I used direct solver -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package mumps -mat_mumps_icntl_1 6 -mat_mumps_icntl_4 3 -mat_mumps_icntl_7 4 -mat_mumps_icntl_14 40 -mat_mumps_icntl_23 0 The simulation can run completely and the NR typically converged after 6/7 iterations. Of course, it's very slow. For the second solver configuration: -ksp_type gmres -ksp_max_it 300 -ksp_gmres_restart 300 -ksp_gmres_modifiedgramschmidt -pc_view -pc_fieldsplit_type multiplicative -fieldsplit_u_pc_type hypre -fieldsplit_u_pc_hypre_type boomeramg -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 -fieldsplit_wp_ksp_rtol 1.0e-8 -fieldsplit_wp_pc_type hypre -fieldsplit_wp_pc_hypre_type boomeramg -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 The solver runs much faster, but the NR does not converge in 30 iterations after some time steps. 
I thought setting the solver tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate with tolerance 1e-30 (see sample log file). Can we set the tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it doesn't work. Sorry this problem is run with many time steps and is quite big so I cannot reproduce in a simple test case. Giang -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sample_log_iteration Type: application/octet-stream Size: 14045 bytes Desc: not available URL: From knepley at gmail.com Tue Jul 12 07:49:16 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Jul 2016 07:49:16 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: Message-ID: On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui wrote: > Hello > > I encountered different convergence behaviour of Newton Raphson when using > different solver settings with PETSc > > For the first solver configuration, I used direct solver > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package mumps > -mat_mumps_icntl_1 6 > -mat_mumps_icntl_4 3 > -mat_mumps_icntl_7 4 > -mat_mumps_icntl_14 40 > -mat_mumps_icntl_23 0 > > The simulation can run completely and the NR typically converged after 6/7 > iterations. Of course, it's very slow. For the second solver configuration: > -ksp_type gmres > -ksp_max_it 300 > -ksp_gmres_restart 300 > -ksp_gmres_modifiedgramschmidt > -pc_view > -pc_fieldsplit_type multiplicative > -fieldsplit_u_pc_type hypre > -fieldsplit_u_pc_hypre_type boomeramg > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 > -fieldsplit_wp_ksp_rtol 1.0e-8 > -fieldsplit_wp_pc_type hypre > -fieldsplit_wp_pc_hypre_type boomeramg > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 > > The solver runs much faster, but the NR does not converge in 30 iterations > after some time steps. I thought setting the solver tolerance -ksp_rtol > 1.0e-12 but it doesn't help much because GMRES already terminate with > tolerance 1e-30 (see sample log file). Can we set the tolerance of the > sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it > doesn't work. > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. Thanks, Matt > Sorry this problem is run with many time steps and is quite big so I > cannot reproduce in a simple test case. > > Giang > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
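(For reference, a minimal C sketch of the two things discussed in this thread: setting the tolerance of a fieldsplit sub-KSP programmatically rather than through the options database, and checking inside a hand-written NR loop whether the linear solve actually converged. This is not Giang's code; the outer KSP "ksp", the vectors b and x, the error variable ierr, and the assumption that the wp block is the second split (index 1) are all placeholders.)

PC                 pc;
PetscInt           nsplits;
KSP                *subksp;
KSPConvergedReason reason;

ierr = KSPSetUp(ksp);CHKERRQ(ierr);                   /* the fieldsplit sub-KSPs do not exist before setup */
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCFieldSplitGetSubKSP(pc,&nsplits,&subksp);CHKERRQ(ierr);
ierr = KSPSetType(subksp[1],KSPGMRES);CHKERRQ(ierr);  /* preonly never iterates, so its rtol is ignored */
ierr = KSPSetTolerances(subksp[1],1.0e-8,PETSC_DEFAULT,PETSC_DEFAULT,PETSC_DEFAULT);CHKERRQ(ierr);
ierr = PetscFree(subksp);CHKERRQ(ierr);               /* free the array returned above, not the KSPs themselves */

/* inside the NR loop */
ierr = KSPSolve(ksp,b,x);CHKERRQ(ierr);
ierr = KSPGetConvergedReason(ksp,&reason);CHKERRQ(ierr);
if (reason < 0) {
  /* the linear solve failed; stop or recover instead of continuing the NR iteration */
}

The command-line equivalent of the tolerance part is -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8, which comes up again later in this thread.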
URL: From marius.hensgens at rwth-aachen.de Tue Jul 12 08:37:44 2016 From: marius.hensgens at rwth-aachen.de (Hensgens, Marius) Date: Tue, 12 Jul 2016 13:37:44 +0000 Subject: [petsc-users] petsc4py - Change default line search for SNES Newton Line Search Message-ID: <1468330674098.92345@rwth-aachen.de> Dear all, how can I change the used line search method after setting the SNES Type to 'newtonls' using petsc4py ? In the official PETSc documentation there is a function called SNESLineSearchSetType, however in petsc4py I can't find an equivalent function. Best regards, Marius -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Tue Jul 12 08:44:52 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Tue, 12 Jul 2016 15:44:52 +0200 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: Message-ID: Hi Matt 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES affect the convergence, and it took more iterations. But the simulation still failed at a definite time step. 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. In the log file I sent previously, the line KSP Object: (fieldsplit_wp_) 8 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test impressed me that the rtol for fieldsplit_wp is still 1.0e-5 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view I did not yet use SNES, instead using my NR iterator so I have no view for SNES. 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. Yes it is ill-conditioned Giang On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley wrote: > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui > wrote: > >> Hello >> >> I encountered different convergence behaviour of Newton Raphson when >> using different solver settings with PETSc >> >> For the first solver configuration, I used direct solver >> -ksp_type preonly >> -pc_type lu >> -pc_factor_mat_solver_package mumps >> -mat_mumps_icntl_1 6 >> -mat_mumps_icntl_4 3 >> -mat_mumps_icntl_7 4 >> -mat_mumps_icntl_14 40 >> -mat_mumps_icntl_23 0 >> >> The simulation can run completely and the NR typically converged after >> 6/7 iterations. Of course, it's very slow. For the second solver >> configuration: >> -ksp_type gmres >> -ksp_max_it 300 >> -ksp_gmres_restart 300 >> -ksp_gmres_modifiedgramschmidt >> -pc_view >> -pc_fieldsplit_type multiplicative >> -fieldsplit_u_pc_type hypre >> -fieldsplit_u_pc_hypre_type boomeramg >> -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >> -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 >> -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 >> -fieldsplit_wp_ksp_rtol 1.0e-8 >> -fieldsplit_wp_pc_type hypre >> -fieldsplit_wp_pc_hypre_type boomeramg >> -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS >> -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 >> -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 >> >> The solver runs much faster, but the NR does not converge in 30 >> iterations after some time steps. 
I thought setting the solver >> tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already >> terminate with tolerance 1e-30 (see sample log file). Can we set the >> tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol >> 1.0e-8 but it doesn't work. >> > > 1) In the log you sent, the linear solver converges due to the Relative > Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will > affect the convergence. > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS > send the view output. > > 3) I can't tell you anything about Newton convergence if you do not send > the output, -snes_monitor -snes_view > > 4) If there is a difference between LU and an iterative solver with > residual 1e-9, then your system is very ill-conditioned. > > Thanks, > > Matt > > >> Sorry this problem is run with many time steps and is quite big so I >> cannot reproduce in a simple test case. >> >> Giang >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dalcinl at gmail.com Tue Jul 12 09:06:15 2016 From: dalcinl at gmail.com (Lisandro Dalcin) Date: Tue, 12 Jul 2016 17:06:15 +0300 Subject: [petsc-users] petsc4py - Change default line search for SNES Newton Line Search In-Reply-To: <1468330674098.92345@rwth-aachen.de> References: <1468330674098.92345@rwth-aachen.de> Message-ID: On 12 July 2016 at 16:37, Hensgens, Marius wrote: > how can I change the used line search method after setting the SNES Type to > 'newtonls' using petsc4py ? > Right now, you can either use the command line or programatically insert an option in the database opts = PETSc.Options() opts['snes_linesearch_type'] = lstype ... snes.setFromOptions() > In the official PETSc documentation there is a function called > SNESLineSearchSetType, however in petsc4py I can't find an equivalent > function. The SNESLineSearch type and related routines are not wrapped yet. -- Lisandro Dalcin ============ Research Scientist Computer, Electrical and Mathematical Sciences & Engineering (CEMSE) Extreme Computing Research Center (ECRC) King Abdullah University of Science and Technology (KAUST) http://ecrc.kaust.edu.sa/ 4700 King Abdullah University of Science and Technology al-Khawarizmi Bldg (Bldg 1), Office # 0109 Thuwal 23955-6900, Kingdom of Saudi Arabia http://www.kaust.edu.sa Office Phone: +966 12 808-0459 From knepley at gmail.com Tue Jul 12 09:52:27 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 12 Jul 2016 09:52:27 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: Message-ID: On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui wrote: > Hi Matt > > 1) In the log you sent, the linear solver converges due to the Relative > Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will > affect the convergence. > > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES > affect the convergence, and it took more iterations. But the simulation > still failed at a definite time step. > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS > send the view output. 
> > In the log file I sent previously, the line > > KSP Object: (fieldsplit_wp_) 8 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 > KSP "preonly" does no iterations, so it does not read the tolerance. If you want to lower the tolerance, choose a solver like GMRES -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 > 3) I can't tell you anything about Newton convergence if you do not send > the output, -snes_monitor -snes_view > > I did not yet use SNES, instead using my NR iterator so I have no view for > SNES. > It is hard to debug an iteration which we did not code. It could be you have a bug. If not, then very small changes in the iterates are making a difference, which means your Jacobians are close to singular. A problem reformulation would probably help more than solver tweaking. Thanks, Matt > 4) If there is a difference between LU and an iterative solver with > residual 1e-9, then your system is very ill-conditioned. > Yes it is ill-conditioned > > > > > > > > Giang > > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley > wrote: > >> On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui >> wrote: >> >>> Hello >>> >>> I encountered different convergence behaviour of Newton Raphson when >>> using different solver settings with PETSc >>> >>> For the first solver configuration, I used direct solver >>> -ksp_type preonly >>> -pc_type lu >>> -pc_factor_mat_solver_package mumps >>> -mat_mumps_icntl_1 6 >>> -mat_mumps_icntl_4 3 >>> -mat_mumps_icntl_7 4 >>> -mat_mumps_icntl_14 40 >>> -mat_mumps_icntl_23 0 >>> >>> The simulation can run completely and the NR typically converged after >>> 6/7 iterations. Of course, it's very slow. For the second solver >>> configuration: >>> -ksp_type gmres >>> -ksp_max_it 300 >>> -ksp_gmres_restart 300 >>> -ksp_gmres_modifiedgramschmidt >>> -pc_view >>> -pc_fieldsplit_type multiplicative >>> -fieldsplit_u_pc_type hypre >>> -fieldsplit_u_pc_hypre_type boomeramg >>> -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >>> -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 >>> -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 >>> -fieldsplit_wp_ksp_rtol 1.0e-8 >>> -fieldsplit_wp_pc_type hypre >>> -fieldsplit_wp_pc_hypre_type boomeramg >>> -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS >>> -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 >>> -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 >>> >>> The solver runs much faster, but the NR does not converge in 30 >>> iterations after some time steps. I thought setting the solver >>> tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already >>> terminate with tolerance 1e-30 (see sample log file). Can we set the >>> tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol >>> 1.0e-8 but it doesn't work. >>> >> >> 1) In the log you sent, the linear solver converges due to the Relative >> Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will >> affect the convergence. >> >> 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS >> send the view output. >> >> 3) I can't tell you anything about Newton convergence if you do not send >> the output, -snes_monitor -snes_view >> >> 4) If there is a difference between LU and an iterative solver with >> residual 1e-9, then your system is very ill-conditioned. 
>> >> Thanks, >> >> Matt >> >> >>> Sorry this problem is run with many time steps and is quite big so I >>> cannot reproduce in a simple test case. >>> >>> Giang >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 12 21:33:08 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Jul 2016 21:33:08 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: > On Jul 11, 2016, at 3:18 PM, Dave May wrote: > > Hi Frank, > > > On 11 July 2016 at 19:14, frank wrote: > Hi Dave, > > I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. > I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. > > Okay - that is essentially useless (sorry) > > > It seems to me that the error occurred when the decomposition was going to be changed. > > Based on what information? > Running with -info would give us more clues, but will create a ton of output. > Please try running the case which failed with -info > > I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. > Thank you. > > > [3] Here is my crude estimate of your memory usage. > I'll target the biggest memory hogs only to get an order of magnitude estimate > > * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. > The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) > > * You use 5 levels of coarsening, so the other operators should represent (collectively) > 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. > The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. > > * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. > PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. > This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. > This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. > The temporary matrix is now destroyed. > > * Because a DMDA is detected, a permutation matrix is assembled. > This requires 2 doubles per point in the DMDA. > Your coarse DMDA contains 92 x 16 x 48 points. > Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. > > * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). 
Dave, MatPtAP has to generate some work space. Is it possible the "guess" it uses for needed work space is so absurdly (and unnecessarily) large that it triggers a memory issue? It is possible that other places that require "guesses" for work space produce a problem? Also are all the "guesses" properly -info logged so that we can detected them before the program is killed? Barry > At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. > > From my rough estimates, the worst case memory foot print for any given core, given your options is approximately > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > This is way below 8 GB. > > Note this estimate completely ignores: > (1) the memory required for the restriction operator, > (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) > (3) all temporary vectors required by the CG solver, and those required by the smoothers. > (4) internal memory allocated by MatPtAP > (5) memory associated with IS's used within PCTelescope > > So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates > > Since I don't have your code I cannot access the latter. > Since I don't have access to the same machine you are running on, I think we need to take a step back. > > [1] What machine are you running on? Send me a URL if its available > > [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) > If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. > This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. > > Thanks, > Dave > > > > Frank > > > > > On 07/08/2016 10:38 PM, Dave May wrote: >> >> >> On Saturday, 9 July 2016, frank wrote: >> Hi Barry and Dave, >> >> Thank both of you for the advice. >> >> @Barry >> I made a mistake in the file names in last email. I attached the correct files this time. >> For all the three tests, 'Telescope' is used as the coarse preconditioner. >> >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >> Part of the memory usage: Vector 125 124 3971904 0. >> Matrix 101 101 9462372 0 >> >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> Part of the memory usage: Vector 125 124 681672 0. >> Matrix 101 101 1462180 0. >> >> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. >> >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 >> Here I get the out of memory error. >> >> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >> The linear solver didn't work in this case. Petsc output some errors. >> >> @Dave >> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >> On the last coarse mesh of 'Telescope', there is only one grid point per process. >> I still got the OOM error. 
The detailed petsc option file is attached. >> >> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. >> >> And please send the result of KSPView so we can see what is actually used in the computations >> >> Thanks >> Dave >> >> >> >> Thank you so much. >> >> Frank >> >> >> >> On 07/06/2016 02:51 PM, Barry Smith wrote: >> On Jul 6, 2016, at 4:19 PM, frank wrote: >> >> Hi Barry, >> >> Thank you for you advice. >> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >> The system gives me the "Out of Memory" error before the linear system is completely solved. >> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >> >> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >> Vector 384 383 8,193,712 0. >> Matrix 103 103 11,508,688 0. >> to 3rd test >> Vector 384 383 1,590,520 0. >> Matrix 103 103 3,508,664 0. >> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >> >> >> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >> Sorry, my mistake the option is -memory_view >> >> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >> >> Barry >> >> >> >> In both tests the memory usage is not large. >> >> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >> Is there is a way to show how much memory it allocated? >> >> Frank >> >> On 07/05/2016 03:37 PM, Barry Smith wrote: >> Frank, >> >> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >> >> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >> >> Barry >> >> On Jul 5, 2016, at 5:23 PM, frank wrote: >> >> Hi, >> >> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. 
>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >> The petsc options file is attached. >> >> The domain is a 3d box. >> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. >> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >> >> How can I diagnose what exactly cause the error? >> Thank you so much. >> >> Frank >> >> >> > > From bsmith at mcs.anl.gov Tue Jul 12 22:16:35 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Jul 2016 22:16:35 -0500 Subject: [petsc-users] Using Petsc with Finite Elements Domain Decomposition In-Reply-To: References: Message-ID: > On Jul 12, 2016, at 4:13 AM, Matthew Knepley wrote: > > On Tue, Jul 12, 2016 at 3:35 AM, Ivano Barletta wrote: > Dear Petsc users > > my aim is to parallelize the solution of a linear > system into a finite elements > ocean model. > > The model has been almost entirely parallelized, with > a partitioning of the domain made element-wise through > the use of Zoltan libraries, so the subdomains > share the nodes lying on the edges. > > The linear system includes node-to-node dependencies > so my guess is that I need to create an halo surrounding > each subdomain, to allow connections of edge nodes with > neighbour subdomains ones > > Apart from that, my question is if Petsc accept a > previously made partitioning (maybe taking into account of halo) > using the data structures coming out of it > > Has anybody of you ever faced a similar problem? > > If all you want to do is construct a PETSc Mat and Vec for the linear system, > just give PETSc the non-overlapping partition to create those objects. You > can input values on off-process partitions automatically using MatSetValues() > and VecSetValues(). Note that by just using the VecSetValues() and MatSetValues() PETSc will manage all the halo business needed by the linear algebra system solver automatically. You don't need to provide any halo information to PETSc. It is really straightforward. Barry > > Thanks, > > Matt > > Thanks in advance > Ivano > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Tue Jul 12 22:43:59 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 12 Jul 2016 22:43:59 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: Message-ID: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> It is not uncommon for an iterative linear solver to work fine for some time steps but then start to perform poorly at a later timestep because the physics (mathematically the conditioning or eigenstructure of the Jacobian) changes over time; perhaps becomes singular. 
Another possibility is that the trajectory of the solution is very sensitive to the solution of the nonlinear problem at each time step, so that an iterative linear solver and a direct linear solver result in very different physical solutions after many time steps. In other words, after many time-steps the computed solutions can be very different, and if the computed solution for the iterative linear solver is eventually "non-physical" or ill-conditioned the nonlinear solver could break down. Please run with the iterative solver (that eventually breaks) with the options -ksp_monitor_true_solution -ksp_converged_reason and send ALL the output (it will be very large, don't worry about it). Then we can see if the linear solver is breaking down. Note that by default PETSc linear solvers do not generate an error that stops the program if the linear solve fails, hence your NR code should call KSPGetConvergedReason() after EVERY linear solve and if the reason is negative your code needs to do something different since the linear solve failed and your code should not just keep on running NR. Barry > On Jul 12, 2016, at 9:52 AM, Matthew Knepley wrote: > > On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui wrote: > Hi Matt > > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. > > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES affect the convergence, and it took more iterations. But the simulation still failed at a definite time step. > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. > > In the log file I sent previously, the line > > KSP Object: (fieldsplit_wp_) 8 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 > > KSP "preonly" does no iterations, so it does not read the tolerance. If you want to lower the tolerance, choose a solver like GMRES > > -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 > > 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view > > I did not yet use SNES, instead using my NR iterator so I have no view for SNES. > > It is hard to debug an iteration which we did not code. It could be you have a bug. If not, then very small changes in the iterates are making a difference, which means your Jacobians are close to singular. A problem reformulation would probably help more than solver tweaking. > > Thanks, > > Matt > > 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. > Yes it is ill-conditioned > > > > > > > > Giang > > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley wrote: > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui wrote: > Hello > > I encountered different convergence behaviour of Newton Raphson when using different solver settings with PETSc > > For the first solver configuration, I used direct solver > -ksp_type preonly > -pc_type lu > -pc_factor_mat_solver_package mumps > -mat_mumps_icntl_1 6 > -mat_mumps_icntl_4 3 > -mat_mumps_icntl_7 4 > -mat_mumps_icntl_14 40 > -mat_mumps_icntl_23 0 > > The simulation can run completely and the NR typically converged after 6/7 iterations. Of course, it's very slow.
For the second solver configuration: > -ksp_type gmres > -ksp_max_it 300 > -ksp_gmres_restart 300 > -ksp_gmres_modifiedgramschmidt > -pc_view > -pc_fieldsplit_type multiplicative > -fieldsplit_u_pc_type hypre > -fieldsplit_u_pc_hypre_type boomeramg > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 > -fieldsplit_wp_ksp_rtol 1.0e-8 > -fieldsplit_wp_pc_type hypre > -fieldsplit_wp_pc_hypre_type boomeramg > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 > > The solver runs much faster, but the NR does not converge in 30 iterations after some time steps. I thought setting the solver tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate with tolerance 1e-30 (see sample log file). Can we set the tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it doesn't work. > > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. > > 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view > > 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. > > Thanks, > > Matt > > Sorry this problem is run with many time steps and is quite big so I cannot reproduce in a simple test case. > > Giang > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From mono at dtu.dk Wed Jul 13 03:57:06 2016 From: mono at dtu.dk (=?Windows-1252?Q?Morten_Nobel-J=F8rgensen?=) Date: Wed, 13 Jul 2016 08:57:06 +0000 Subject: [petsc-users] Distribution of DMPlex for FEM Message-ID: I'm having problems distributing a simple FEM model using DMPlex. For the test case I use 1x1x2 hex box elements (/cells) with 12 vertices. Each vertex has one DOF. When I distribute the system to two processors, each gets a single element and the local vector has size 8 (one DOF for each vertex of a hex box) as expected. My problem is that when I manually assemble the global stiffness matrix (a 12x12 matrix) it seems like my ghost values are ignored. I'm sure that I'm missing something obvious but cannot see what it is. In the attached example, I'm assembling the global stiffness matrix using a simple local stiffness matrix of ones. This makes it very easy to see if the matrix is assembled correctly. If I run it on one process, then the global stiffness matrix consists of 0's, 1's and 2's and its trace is 16.0. But if I run it distributed on two processes, then it consists only of 0's and 1's and its trace is 12.0. I hope that somebody can spot my mistake and help me in the right direction :) Kind regards, Morten -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed...
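(Without seeing ex18.cc it is hard to say where the ghost contributions are lost, but for reference here is a minimal C sketch of an assembly loop that does sum contributions into rows owned by the other rank. It is not Morten's code: it assumes the distributed DMPlex "dm" already has a section with 1 dof per vertex attached, that K was obtained from DMCreateMatrix(dm,&K), and that ierr is declared; the 8x8 element matrix of ones mirrors the test described above.)

PetscInt    cStart, cEnd, c, i;
PetscScalar Ke[8*8];

for (i = 0; i < 8*8; ++i) Ke[i] = 1.0;                            /* dummy element stiffness matrix of ones */
ierr = DMPlexGetHeightStratum(dm,0,&cStart,&cEnd);CHKERRQ(ierr);  /* the cells of this rank */
for (c = cStart; c < cEnd; ++c) {
  /* ADD_VALUES accumulates contributions; entries destined for rows owned by the
     other rank are stashed here and communicated during assembly.
     NULL, NULL means: use the DM's default local and global sections. */
  ierr = DMPlexMatSetClosure(dm,NULL,NULL,K,c,Ke,ADD_VALUES);CHKERRQ(ierr);
}
ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);      /* off-process (ghost) sums happen here */
ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

If the trace still comes out as 12 instead of 16 on two ranks, the usual suspects are inserting with INSERT_VALUES (which overwrites instead of adding) or inspecting the matrix before MatAssemblyEnd() has completed.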
Name: ex18.cc Type: application/octet-stream Size: 4631 bytes Desc: ex18.cc URL: From dave.mayhem23 at gmail.com Wed Jul 13 04:17:06 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Wed, 13 Jul 2016 11:17:06 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: Hi Barry, > Dave, > > MatPtAP has to generate some work space. Is it possible the "guess" it > uses for needed work space is so absurdly (and unnecessarily) large that it > triggers a memory issue? It is possible that other places that require > "guesses" for work space produce a problem? This is entirely possible. I've never ever used PtAP at the scale of Franks simulation. I poked around in src/mat/impls/aij/mpi/mpiptap.c In this function, MatPtAPSymbolic_MPIAIJ_MPIAIJ_ptap() I see the following code /* set default scalable */ ptap->scalable = PETSC_FALSE; /* PETSC_TRUE; */ ierr = PetscOptionsGetBool(((PetscObject)Cmpi)->options,((PetscObject)Cmpi)->prefix, "-matptap_scalable",&ptap-> This indicates that the default choice being used (despite the comment,) is to use the faster, but also the more memory hungry variant of MatPtAP for MPIAIJ matrices. Looks like someone has changed the default. The following comment is off topic from the email thread but... This particular file is littered with #ifdefs related to profiling (PTAP_PROFILE). This variable is not defined by default. I would much prefer be if this kind of thing was available all the time via a run time flag rather than a configure flag. Also, it would be great to augment the profiling for PtAP with memory usage as currently only CPU time is logged. Awhile back I proposed a PR for an "operation logger" object (which you absolutely hated). The functionality of this logger would be useful to get rid of the #if defined stuff for PtAP and be able to report meaningful details about both the memory and CPU time. I used this logger for the pctelescope paper and found it immensely useful. But to the topic. Frank, you might want to try running your job with the command line option -matptap_scalable (or -XXX_matptap_scalable if you have given assigned a name to your operator.) As always, run a small job first with -options_left 1 to ensure the option name is spelled correctly and being used. Let us know if this helps. Cheers, Dave Also are all the "guesses" properly -info logged so that we can detected > them before the program is killed? > > > Barry > > > > At any stage in PCTelescope, only 2 operators of size 32 MB are held in > memory when the DMDA is provided. > > > > From my rough estimates, the worst case memory foot print for any given > core, given your options is approximately > > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > > This is way below 8 GB. > > > > Note this estimate completely ignores: > > (1) the memory required for the restriction operator, > > (2) the potential growth in the number of non-zeros per row due to > Galerkin coarsening (I wished -ksp_view_pre reported the output from > MatView so we could see the number of non-zeros required by the coarse > level operators) > > (3) all temporary vectors required by the CG solver, and those required > by the smoothers. 
> > (4) internal memory allocated by MatPtAP > > (5) memory associated with IS's used within PCTelescope > > > > So either I am completely off in my estimates, or you have not carefully > estimated the memory usage of your application code. Hopefully others might > examine/correct my rough estimates > > > > Since I don't have your code I cannot access the latter. > > Since I don't have access to the same machine you are running on, I > think we need to take a step back. > > > > [1] What machine are you running on? Send me a URL if its available > > > > [2] What discretization are you using? (I am guessing a scalar 7 point > FD stencil) > > If it's a 7 point FD stencil, we should be able to examine the memory > usage of your solver configuration using a standard, light weight existing > PETSc example, run on your machine at the same scale. > > This would hopefully enable us to correctly evaluate the actual memory > usage required by the solver configuration you are using. > > > > Thanks, > > Dave > > > > > > > > Frank > > > > > > > > > > On 07/08/2016 10:38 PM, Dave May wrote: > >> > >> > >> On Saturday, 9 July 2016, frank wrote: > >> Hi Barry and Dave, > >> > >> Thank both of you for the advice. > >> > >> @Barry > >> I made a mistake in the file names in last email. I attached the > correct files this time. > >> For all the three tests, 'Telescope' is used as the coarse > preconditioner. > >> > >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > >> Part of the memory usage: Vector 125 124 3971904 0. > >> Matrix 101 101 > 9462372 0 > >> > >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > >> Part of the memory usage: Vector 125 124 681672 0. > >> Matrix 101 101 > 1462180 0. > >> > >> In theory, the memory usage in Test1 should be 8 times of Test2. In my > case, it is about 6 times. > >> > >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per > process: 32*32*32 > >> Here I get the out of memory error. > >> > >> I tried to use -mg_coarse jacobi. In this way, I don't need to set > -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > >> The linear solver didn't work in this case. Petsc output some errors. > >> > >> @Dave > >> In test3, I use only one instance of 'Telescope'. On the coarse mesh of > 'Telescope', I used LU as the preconditioner instead of SVD. > >> If my set the levels correctly, then on the last coarse mesh of MG > where it calls 'Telescope', the sub-domain per process is 2*2*2. > >> On the last coarse mesh of 'Telescope', there is only one grid point > per process. > >> I still got the OOM error. The detailed petsc option file is attached. > >> > >> Do you understand the expected memory usage for the particular parallel > LU implementation you are using? I don't (seriously). Replace LU with > bjacobi and re-run this test. My point about solver debugging is still > valid. > >> > >> And please send the result of KSPView so we can see what is actually > used in the computations > >> > >> Thanks > >> Dave > >> > >> > >> > >> Thank you so much. > >> > >> Frank > >> > >> > >> > >> On 07/06/2016 02:51 PM, Barry Smith wrote: > >> On Jul 6, 2016, at 4:19 PM, frank wrote: > >> > >> Hi Barry, > >> > >> Thank you for you advice. > >> I tried three test. In the 1st test, the grid is 3072*256*768 and the > process mesh is 96*8*24. > >> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is > used as the preconditioner at the coarse mesh. > >> The system gives me the "Out of Memory" error before the linear system > is completely solved. 
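As an aside on Dave's -matptap_scalable suggestion above: if changing the batch script is inconvenient, the same option can be pushed into the options database from the application before the preconditioner is set up. A sketch only, assuming the PETSc 3.7 options API where the first argument is a PetscOptions object (NULL selects the global database):

  PetscErrorCode ierr;
  /* Same effect as passing -matptap_scalable on the command line; it must be set
     before the Galerkin coarse operators are formed, i.e. before KSPSetUp/PCSetUp. */
  ierr = PetscOptionsSetValue(NULL, "-matptap_scalable", "true");CHKERRQ(ierr);

As Dave says, a quick run with -options_left 1 confirms whether the option (with or without an operator prefix) is actually being consumed.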
> >> The info from '-ksp_view_pre' is attached. I seems to me that the error > occurs when it reaches the coarse mesh. > >> > >> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. > The 3rd test uses the same grid but a different process mesh 48*4*12. > >> Are you sure this is right? The total matrix and vector memory > usage goes from 2nd test > >> Vector 384 383 8,193,712 0. > >> Matrix 103 103 11,508,688 0. > >> to 3rd test > >> Vector 384 383 1,590,520 0. > >> Matrix 103 103 3,508,664 0. > >> that is the memory usage got smaller but if you have only 1/8th the > processes and the same grid it should have gotten about 8 times bigger. Did > you maybe cut the grid by a factor of 8 also? If so that still doesn't > explain it because the memory usage changed by a factor of 5 something for > the vectors and 3 something for the matrices. > >> > >> > >> The linear solver and petsc options in 2nd and 3rd tests are the same > in 1st test. The linear solver works fine in both test. > >> I attached the memory usage of the 2nd and 3rd tests. The memory info > is from the option '-log_summary'. I tried to use '-momery_info' as you > suggested, but in my case petsc treated it as an unused option. It output > nothing about the memory. Do I need to add sth to my code so I can use > '-memory_info'? > >> Sorry, my mistake the option is -memory_view > >> > >> Can you run the one case with -memory_view and -mg_coarse jacobi > -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory > is used without the telescope? Also run case 2 the same way. > >> > >> Barry > >> > >> > >> > >> In both tests the memory usage is not large. > >> > >> It seems to me that it might be the 'telescope' preconditioner that > allocated a lot of memory and caused the error in the 1st test. > >> Is there is a way to show how much memory it allocated? > >> > >> Frank > >> > >> On 07/05/2016 03:37 PM, Barry Smith wrote: > >> Frank, > >> > >> You can run with -ksp_view_pre to have it "view" the KSP before > the solve so hopefully it gets that far. > >> > >> Please run the problem that does fit with -memory_info when the > problem completes it will show the "high water mark" for PETSc allocated > memory and total memory used. We first want to look at these numbers to see > if it is using more memory than you expect. You could also run with say > half the grid spacing to see how the memory usage scaled with the increase > in grid points. Make the runs also with -log_view and send all the output > from these options. > >> > >> Barry > >> > >> On Jul 5, 2016, at 5:23 PM, frank wrote: > >> > >> Hi, > >> > >> I am using the CG ksp solver and Multigrid preconditioner to solve a > linear system in parallel. > >> I chose to use the 'Telescope' as the preconditioner on the coarse mesh > for its good performance. > >> The petsc options file is attached. > >> > >> The domain is a 3d box. > >> It works well when the grid is 1536*128*384 and the process mesh is > 96*8*24. When I double the size of grid and keep the same process mesh and > petsc options, I get an "out of memory" error from the super-cluster I am > using. > >> Each process has access to at least 8G memory, which should be more > than enough for my application. I am sure that all the other parts of my > code( except the linear solver ) do not use much memory. So I doubt if > there is something wrong with the linear solver. > >> The error occurs before the linear system is completely solved so I > don't have the info from ksp view. 
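Spelling out the shorthand in the quoted suggestion above, the memory-comparison run could look roughly like the following; the executable name and process count are placeholders, and -mg_coarse_pc_type jacobi (with the coarse KSP left at its preonly default) is one way to write the "-mg_coarse jacobi" shorthand:

  mpiexec -n <np> ./app -memory_view -log_view -ksp_max_it 1 -mg_coarse_pc_type jacobi

The point is only to make the coarse "solve" essentially free so that the -memory_view numbers isolate the memory taken by the rest of the hierarchy.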
I am not able to re-produce the error > with a smaller problem either. > >> In addition, I tried to use the block jacobi as the preconditioner > with the same grid and same decomposition. The linear solver runs extremely > slow but there is no memory error. > >> > >> How can I diagnose what exactly cause the error? > >> Thank you so much. > >> > >> Frank > >> > >> > > >> > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 13 08:16:07 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 08:16:07 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: <0903F73E-E332-48C6-931E-F89B2D7C6676@mcs.anl.gov> > On Jul 13, 2016, at 4:17 AM, Dave May wrote: > > Hi Barry, > > > Dave, > > MatPtAP has to generate some work space. Is it possible the "guess" it uses for needed work space is so absurdly (and unnecessarily) large that it triggers a memory issue? It is possible that other places that require "guesses" for work space produce a problem? > > This is entirely possible. I've never ever used PtAP at the scale of Franks simulation. > I poked around in > src/mat/impls/aij/mpi/mpiptap.c > > In this function, MatPtAPSymbolic_MPIAIJ_MPIAIJ_ptap() > I see the following code > /* set default scalable */ > > ptap->scalable = PETSC_FALSE; /* PETSC_TRUE; */ > > ierr = PetscOptionsGetBool(((PetscObject)Cmpi)->options,((PetscObject)Cmpi)->prefix,"-matptap_scalable",&ptap-> > > This indicates that the default choice being used (despite the comment,) is to use the faster, but also the more memory hungry variant of MatPtAP for MPIAIJ matrices. > Looks like someone has changed the default. > > The following comment is off topic from the email thread but... > > This particular file is littered with #ifdefs related to profiling (PTAP_PROFILE). > This variable is not defined by default. I would much prefer be if this kind of thing was available all the time via a run time flag rather than a configure flag. Dave, I agree the #if def stuff is horrible. I would want this handled with the regular PetscLogEvent() calls; and if they for some reason are not suitable for the task then we should improve them somehow. Note that it is possible to turn off logging of certain events by default via /* Turn off high traffic events by default */ ierr = PetscLogEventSetActiveAll(MAT_SetValues, PETSC_FALSE);CHKERRQ(ierr); so this horrible custom stuff in in mpiptap.c and also in gamg.c doesn't need to exist. Better eyes doing pull requests would have stopped this nonsense from ever getting in the master branch.. > Also, it would be great to augment the profiling for PtAP with memory usage as currently only CPU time is logged. Hmm, is there some generic way we can support this via the PetscLogEvent stuff (but not with your absolutely horrible operation logger event :-). Perhaps at each event begin we could record the memory high water mark and current usage and then at the event end compute the increase in the high water mark and current usage and record those with the event (and stage). Then in the -log_view we could see for example MatPtAP 1 100 secs ...... the usual columns of time etc information .... 
1 G (temp real process memory usage) 5 G (temp malloced) .5 (permanent real process memory usage) 1 (perm malloced). In other words just add more columns of data for each event related to memory usage? Perhaps it can be done better than I suggest above? > > Awhile back I proposed a PR for an "operation logger" object (which you absolutely hated). The functionality of this logger would be useful to get rid of the #if defined stuff for PtAP and be able to report meaningful details about both the memory and CPU time. I used this logger for the pctelescope paper and found it immensely useful. > > But to the topic. Frank, you might want to try running your job with the command line option > -matptap_scalable > (or -XXX_matptap_scalable if you have given assigned a name to your operator.) > As always, run a small job first with -options_left 1 to ensure the option name is spelled correctly and being used. > > Let us know if this helps. > > > Cheers, > Dave > > > Also are all the "guesses" properly -info logged so that we can detected them before the program is killed? > > > Barry > > > > At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. > > > > From my rough estimates, the worst case memory foot print for any given core, given your options is approximately > > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > > This is way below 8 GB. > > > > Note this estimate completely ignores: > > (1) the memory required for the restriction operator, > > (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) > > (3) all temporary vectors required by the CG solver, and those required by the smoothers. > > (4) internal memory allocated by MatPtAP > > (5) memory associated with IS's used within PCTelescope > > > > So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates > > > > Since I don't have your code I cannot access the latter. > > Since I don't have access to the same machine you are running on, I think we need to take a step back. > > > > [1] What machine are you running on? Send me a URL if its available > > > > [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) > > If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. > > This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. > > > > Thanks, > > Dave > > > > > > > > Frank > > > > > > > > > > On 07/08/2016 10:38 PM, Dave May wrote: > >> > >> > >> On Saturday, 9 July 2016, frank wrote: > >> Hi Barry and Dave, > >> > >> Thank both of you for the advice. > >> > >> @Barry > >> I made a mistake in the file names in last email. I attached the correct files this time. > >> For all the three tests, 'Telescope' is used as the coarse preconditioner. > >> > >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 > >> Part of the memory usage: Vector 125 124 3971904 0. > >> Matrix 101 101 9462372 0 > >> > >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 > >> Part of the memory usage: Vector 125 124 681672 0. > >> Matrix 101 101 1462180 0. 
> >> > >> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. > >> > >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 > >> Here I get the out of memory error. > >> > >> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? > >> The linear solver didn't work in this case. Petsc output some errors. > >> > >> @Dave > >> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. > >> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. > >> On the last coarse mesh of 'Telescope', there is only one grid point per process. > >> I still got the OOM error. The detailed petsc option file is attached. > >> > >> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. > >> > >> And please send the result of KSPView so we can see what is actually used in the computations > >> > >> Thanks > >> Dave > >> > >> > >> > >> Thank you so much. > >> > >> Frank > >> > >> > >> > >> On 07/06/2016 02:51 PM, Barry Smith wrote: > >> On Jul 6, 2016, at 4:19 PM, frank wrote: > >> > >> Hi Barry, > >> > >> Thank you for you advice. > >> I tried three test. In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. > >> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. > >> The system gives me the "Out of Memory" error before the linear system is completely solved. > >> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. > >> > >> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. > >> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test > >> Vector 384 383 8,193,712 0. > >> Matrix 103 103 11,508,688 0. > >> to 3rd test > >> Vector 384 383 1,590,520 0. > >> Matrix 103 103 3,508,664 0. > >> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. > >> > >> > >> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. > >> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? > >> Sorry, my mistake the option is -memory_view > >> > >> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. > >> > >> Barry > >> > >> > >> > >> In both tests the memory usage is not large. 
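Until -log_view grows the per-event memory columns Barry sketches above, the same numbers can be collected by hand around a suspect operation. A rough sketch, assuming A and P are already-assembled parallel matrices and that the operation of interest is the explicit Galerkin product:

  Mat            C;
  PetscLogDouble rss0, rss1, mal0, mal1;
  PetscErrorCode ierr;

  ierr = PetscMemoryGetCurrentUsage(&rss0);CHKERRQ(ierr);   /* process resident size    */
  ierr = PetscMallocGetCurrentUsage(&mal0);CHKERRQ(ierr);   /* bytes from PetscMalloc() */
  ierr = MatPtAP(A, P, MAT_INITIAL_MATRIX, 1.0, &C);CHKERRQ(ierr);
  ierr = PetscMemoryGetCurrentUsage(&rss1);CHKERRQ(ierr);
  ierr = PetscMallocGetCurrentUsage(&mal1);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "MatPtAP: process memory +%g bytes, PetscMalloc +%g bytes\n",
                     rss1 - rss0, mal1 - mal0);CHKERRQ(ierr);

Note the PetscMalloc counters are only filled in when PETSc's malloc tracing is active (the default in a debug build; an optimized build may need the -malloc option in the 3.7 era).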
> >> > >> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. > >> Is there is a way to show how much memory it allocated? > >> > >> Frank > >> > >> On 07/05/2016 03:37 PM, Barry Smith wrote: > >> Frank, > >> > >> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. > >> > >> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. > >> > >> Barry > >> > >> On Jul 5, 2016, at 5:23 PM, frank wrote: > >> > >> Hi, > >> > >> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. > >> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. > >> The petsc options file is attached. > >> > >> The domain is a 3d box. > >> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. > >> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. So I doubt if there is something wrong with the linear solver. > >> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. > >> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. > >> > >> How can I diagnose what exactly cause the error? > >> Thank you so much. > >> > >> Frank > >> > >> > >> > > > > From hgbk2008 at gmail.com Wed Jul 13 10:34:58 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Wed, 13 Jul 2016 17:34:58 +0200 Subject: [petsc-users] different convergence behaviour In-Reply-To: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: Thanks Barry This is a good comment. Since material behaviour depends very much on the trajectory of the solution. I suspect that the error may concatenate during time stepping. I have re-run the simulation as you suggested and post the log file here: https://www.dropbox.com/s/d6l8ixme37uh47a/log13Jul16?dl=0 However, I did not get what -ksp_monitor_true_solution used for? I see that I have the same log that I had before. Giang On Wed, Jul 13, 2016 at 5:43 AM, Barry Smith wrote: > > It is not uncommon for an iterative linear solver to work fine for some > time steps but then start to perform poorly at a later timestep because the > physics (mathematically the conditioning or eigenstructure of the Jacobian) > changes over time; perhaps becomes singular. 
Another possibility is the > trajectory of the solution is very sensitive to the solution of the > nonlinear problem at each time step so that an iterative linear solver and > a direct linear solver result in very difficult physical solutions after > many time steps. In other words after many time-steps the computed > solutions can be very different and if the computed solution for the > iterative linear solver is eventually "non-physical" or ill-conditioned the > nonlinear solver could break down. > > Please run with the iterative solver (that eventually breaks) with the > option -ksp_monitor_true_solution -ksp_converged_reason and and send ALL > the output (it will be very large, don't worry about it). Then we can see > if the linear solver is breaking down. Note that by default PETSc linear > solvers do not generate an error that stops the program if the linear solve > fails, hence your NR code should call KSPGetConvergedReason() after EVERY > linear solve and if the reason is negative your code needs to do something > different since the linear solve failed and your code should not just keep > on running NR. > > Barry > > > > On Jul 12, 2016, at 9:52 AM, Matthew Knepley wrote: > > > > On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui > wrote: > > Hi Matt > > > > 1) In the log you sent, the linear solver converges due to the Relative > Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will > affect the convergence. > > > > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES > affect the convergence, and it took more iterations. But the simulation > still failed at a definite time step. > > > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? > ALWAYS send the view output. > > > > In the log file I sent previously, the line > > > > KSP Object: (fieldsplit_wp_) 8 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > > > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 > > > > KSP "preonly" does no iterations, so it does not read the tolerance. If > you want to lower the tolerance, > > choose a solver like GMRES > > > > -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 > > > > 3) I can't tell you anything about Newton convergence if you do not send > the output, -snes_monitor -snes_view > > > > I did not yet use SNES, instead using my NR iterator so I have no view > for SNES. > > > > It is hard to debug an iteration which we did not code. It could be you > have a bug. If not, then very small changes in > > the iterates are making a difference, which means your Jacobians are > close to singular. A problem reformulation would > > probably help more than solver tweaking. > > > > Thanks, > > > > Matt > > > > 4) If there is a difference between LU and an iterative solver with > residual 1e-9, then your system is very ill-conditioned. 
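To make the quoted point about preonly concrete: the tolerance for that split only becomes meaningful once the split has an iterative inner solver. A minimal options sketch (the converged_reason option is there only to verify the settings are being used):

  -fieldsplit_wp_ksp_type gmres
  -fieldsplit_wp_ksp_rtol 1.0e-8
  -fieldsplit_wp_ksp_max_it 100
  -fieldsplit_wp_ksp_converged_reason

With these, the -ksp_view output should show a gmres object with relative=1e-08 under the (fieldsplit_wp_) prefix instead of preonly.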
> > Yes it is ill-conditioned > > > > > > > > > > > > > > > > Giang > > > > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley > wrote: > > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui > wrote: > > Hello > > > > I encountered different convergence behaviour of Newton Raphson when > using different solver settings with PETSc > > > > For the first solver configuration, I used direct solver > > -ksp_type preonly > > -pc_type lu > > -pc_factor_mat_solver_package mumps > > -mat_mumps_icntl_1 6 > > -mat_mumps_icntl_4 3 > > -mat_mumps_icntl_7 4 > > -mat_mumps_icntl_14 40 > > -mat_mumps_icntl_23 0 > > > > The simulation can run completely and the NR typically converged after > 6/7 iterations. Of course, it's very slow. For the second solver > configuration: > > -ksp_type gmres > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_view > > -pc_fieldsplit_type multiplicative > > -fieldsplit_u_pc_type hypre > > -fieldsplit_u_pc_hypre_type boomeramg > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 > > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 > > -fieldsplit_wp_ksp_rtol 1.0e-8 > > -fieldsplit_wp_pc_type hypre > > -fieldsplit_wp_pc_hypre_type boomeramg > > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 > > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 > > > > The solver runs much faster, but the NR does not converge in 30 > iterations after some time steps. I thought setting the solver tolerance > -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate > with tolerance 1e-30 (see sample log file). Can we set the tolerance of the > sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it > doesn't work. > > > > 1) In the log you sent, the linear solver converges due to the Relative > Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will > affect the convergence. > > > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? > ALWAYS send the view output. > > > > 3) I can't tell you anything about Newton convergence if you do not send > the output, -snes_monitor -snes_view > > > > 4) If there is a difference between LU and an iterative solver with > residual 1e-9, then your system is very ill-conditioned. > > > > Thanks, > > > > Matt > > > > Sorry this problem is run with many time steps and is quite big so I > cannot reproduce in a simple test case. > > > > Giang > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Jul 13 11:05:16 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Jul 2016 11:05:16 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: On Wed, Jul 13, 2016 at 10:34 AM, Hoang Giang Bui wrote: > Thanks Barry > > This is a good comment. Since material behaviour depends very much on the > trajectory of the solution. I suspect that the error may concatenate during > time stepping. 
> > I have re-run the simulation as you suggested and post the log file here: > https://www.dropbox.com/s/d6l8ixme37uh47a/log13Jul16?dl=0 > > However, I did not get what -ksp_monitor_true_solution used for? I see > that I have the same log that I had before. > That option is showing the last two numbers in these lines 0 KSP preconditioned resid norm 1.150038785083e+00 true resid norm 8.673040929526e+07 ||r(i)||/||b|| 1.000000000000e+00 Notice that there are 7 orders of magnitude between the apparent residual (using the preconditioner), and the actual residual, Ax - b. You are using Hypre, and this generally means the Hypre coarse grid operator is crap. Please a) Try ML or GAMG and look at the output again b) Try MUMPS, although you have 200 nonzeros/row so that fill-in might be extreme. The consequence is that you solve to what you think is machine precision (1e-13), but all you really get is (1e-4), so I can understand why the trajectory is completely different. Matt 1 KSP preconditioned resid norm 5.202876635759e-01 true resid norm 2.037005052213e+08 ||r(i)||/||b|| 2.348663022307e+00 2 KSP preconditioned resid norm 3.386127782775e-01 true resid norm 1.762196838305e+08 ||r(i)||/||b|| 2.031809664712e+00 3 KSP preconditioned resid norm 2.334102526025e-01 true resid norm 1.027451552306e+08 ||r(i)||/||b|| 1.184649721655e+00 4 KSP preconditioned resid norm 1.791251896569e-01 true resid norm 7.709961160729e+07 ||r(i)||/||b|| 8.889570824556e-01 5 KSP preconditioned resid norm 1.338763110903e-01 true resid norm 7.416954924746e+07 ||r(i)||/||b|| 8.551735181482e-01 6 KSP preconditioned resid norm 8.064262880339e-02 true resid norm 5.164444100149e+07 ||r(i)||/||b|| 5.954594405945e-01 7 KSP preconditioned resid norm 4.635705318709e-02 true resid norm 2.934800965373e+07 ||r(i)||/||b|| 3.383820034081e-01 8 KSP preconditioned resid norm 2.772133866748e-02 true resid norm 1.528356929458e+07 ||r(i)||/||b|| 1.762192686368e-01 9 KSP preconditioned resid norm 1.746753670007e-02 true resid norm 1.011788107951e+07 ||r(i)||/||b|| 1.166589799555e-01 10 KSP preconditioned resid norm 1.090702407895e-02 true resid norm 5.487922954253e+06 ||r(i)||/||b|| 6.327564920823e-02 11 KSP preconditioned resid norm 7.298748576067e-03 true resid norm 3.635843038640e+06 ||r(i)||/||b|| 4.192120235779e-02 12 KSP preconditioned resid norm 5.263606789063e-03 true resid norm 2.556946903793e+06 ||r(i)||/||b|| 2.948155006496e-02 13 KSP preconditioned resid norm 3.653208280595e-03 true resid norm 1.955721190606e+06 ||r(i)||/||b|| 2.254942881623e-02 14 KSP preconditioned resid norm 2.344759624903e-03 true resid norm 1.161259621408e+06 ||r(i)||/||b|| 1.338930175522e-02 15 KSP preconditioned resid norm 1.394564491254e-03 true resid norm 7.455856541894e+05 ||r(i)||/||b|| 8.596588673428e-03 16 KSP preconditioned resid norm 9.523395328600e-04 true resid norm 4.383808867461e+05 ||r(i)||/||b|| 5.054523440028e-03 17 KSP preconditioned resid norm 7.226014371144e-04 true resid norm 2.463564216053e+05 ||r(i)||/||b|| 2.840484941869e-03 18 KSP preconditioned resid norm 5.312593384754e-04 true resid norm 2.332075376781e+05 ||r(i)||/||b|| 2.688878555665e-03 19 KSP preconditioned resid norm 3.987403871945e-04 true resid norm 1.524236218549e+05 ||r(i)||/||b|| 1.757441514383e-03 20 KSP preconditioned resid norm 3.024350484979e-04 true resid norm 1.113568566173e+05 ||r(i)||/||b|| 1.283942477870e-03 21 KSP preconditioned resid norm 2.181724540430e-04 true resid norm 9.095158030900e+04 ||r(i)||/||b|| 1.048670022983e-03 22 KSP preconditioned resid norm 
1.497651066688e-04 true resid norm 7.045647741653e+04 ||r(i)||/||b|| 8.123618692570e-04 23 KSP preconditioned resid norm 1.067332245914e-04 true resid norm 4.317487154207e+04 ||r(i)||/||b|| 4.978054628463e-04 24 KSP preconditioned resid norm 8.206743871631e-05 true resid norm 3.328488127932e+04 ||r(i)||/||b|| 3.837740597534e-04 25 KSP preconditioned resid norm 6.446633932980e-05 true resid norm 2.816657573261e+04 ||r(i)||/||b|| 3.247600923538e-04 26 KSP preconditioned resid norm 5.068725017435e-05 true resid norm 2.427030232896e+04 ||r(i)||/||b|| 2.798361327495e-04 27 KSP preconditioned resid norm 4.056292508453e-05 true resid norm 1.963628903861e+04 ||r(i)||/||b|| 2.264060460243e-04 28 KSP preconditioned resid norm 3.278196251068e-05 true resid norm 1.710046122873e+04 ||r(i)||/||b|| 1.971679987179e-04 29 KSP preconditioned resid norm 2.796514916728e-05 true resid norm 1.500292999274e+04 ||r(i)||/||b|| 1.729835027259e-04 30 KSP preconditioned resid norm 2.469882695602e-05 true resid norm 1.317997814765e+04 ||r(i)||/||b|| 1.519649019847e-04 31 KSP preconditioned resid norm 2.175528107880e-05 true resid norm 1.158572445412e+04 ||r(i)||/||b|| 1.335831866616e-04 32 KSP preconditioned resid norm 1.912573933887e-05 true resid norm 1.001695718951e+04 ||r(i)||/||b|| 1.154953293880e-04 33 KSP preconditioned resid norm 1.647102125210e-05 true resid norm 8.271485921360e+03 ||r(i)||/||b|| 9.537007825249e-05 34 KSP preconditioned resid norm 1.337436641169e-05 true resid norm 6.611637805300e+03 ||r(i)||/||b|| 7.623206046211e-05 35 KSP preconditioned resid norm 9.896966695703e-06 true resid norm 4.752788536204e+03 ||r(i)||/||b|| 5.479956309238e-05 36 KSP preconditioned resid norm 6.766260764791e-06 true resid norm 3.239548441802e+03 ||r(i)||/||b|| 3.735193305468e-05 37 KSP preconditioned resid norm 4.835158711776e-06 true resid norm 2.113941262442e+03 ||r(i)||/||b|| 2.437370329068e-05 38 KSP preconditioned resid norm 3.598894380040e-06 true resid norm 1.653467554688e+03 ||r(i)||/||b|| 1.906445003688e-05 39 KSP preconditioned resid norm 2.522642742745e-06 true resid norm 1.344572919946e+03 ||r(i)||/||b|| 1.550290066507e-05 40 KSP preconditioned resid norm 1.750002168280e-06 true resid norm 1.015690774521e+03 ||r(i)||/||b|| 1.171089566825e-05 41 KSP preconditioned resid norm 1.371380245282e-06 true resid norm 8.480814540622e+02 ||r(i)||/||b|| 9.778363332462e-06 42 KSP preconditioned resid norm 1.174063380270e-06 true resid norm 7.575955225454e+02 ||r(i)||/||b|| 8.735062231359e-06 43 KSP preconditioned resid norm 1.022078284946e-06 true resid norm 6.758159410670e+02 ||r(i)||/||b|| 7.792145183661e-06 44 KSP preconditioned resid norm 8.861345665105e-07 true resid norm 5.913685641420e+02 ||r(i)||/||b|| 6.818468504268e-06 45 KSP preconditioned resid norm 7.574040382433e-07 true resid norm 4.958820201473e+02 ||r(i)||/||b|| 5.717510434653e-06 46 KSP preconditioned resid norm 6.331382122180e-07 true resid norm 3.988451175342e+02 ||r(i)||/||b|| 4.598676759110e-06 47 KSP preconditioned resid norm 5.210644796074e-07 true resid norm 3.077459761874e+02 ||r(i)||/||b|| 3.548305360116e-06 48 KSP preconditioned resid norm 4.285762531134e-07 true resid norm 2.383304155333e+02 ||r(i)||/||b|| 2.747945241696e-06 49 KSP preconditioned resid norm 3.365753654637e-07 true resid norm 1.802176480688e+02 ||r(i)||/||b|| 2.077906117741e-06 50 KSP preconditioned resid norm 2.556504175739e-07 true resid norm 1.322207275993e+02 ||r(i)||/||b|| 1.524502520785e-06 51 KSP preconditioned resid norm 1.929395464892e-07 true resid norm 
1.007938656038e+02 ||r(i)||/||b|| 1.162151388686e-06 52 KSP preconditioned resid norm 1.518353128559e-07 true resid norm 7.979486270816e+01 ||r(i)||/||b|| 9.200332773308e-07 53 KSP preconditioned resid norm 1.206065500213e-07 true resid norm 6.580266981926e+01 ||r(i)||/||b|| 7.587035545427e-07 54 KSP preconditioned resid norm 9.426597887251e-08 true resid norm 5.333098459078e+01 ||r(i)||/||b|| 6.149052566928e-07 55 KSP preconditioned resid norm 7.613592162567e-08 true resid norm 4.265349984159e+01 ||r(i)||/||b|| 4.917940568733e-07 56 KSP preconditioned resid norm 6.268355987149e-08 true resid norm 3.467681120568e+01 ||r(i)||/||b|| 3.998229858184e-07 57 KSP preconditioned resid norm 5.012883291890e-08 true resid norm 2.749870530323e+01 ||r(i)||/||b|| 3.170595587716e-07 58 KSP preconditioned resid norm 3.875711489918e-08 true resid norm 2.037239239206e+01 ||r(i)||/||b|| 2.348933039472e-07 59 KSP preconditioned resid norm 2.803879910778e-08 true resid norm 1.495957468476e+01 ||r(i)||/||b|| 1.724836168342e-07 60 KSP preconditioned resid norm 1.925214804831e-08 true resid norm 1.036952152845e+01 ||r(i)||/||b|| 1.195603896339e-07 61 KSP preconditioned resid norm 1.316807047769e-08 true resid norm 7.239457203086e+00 ||r(i)||/||b|| 8.347080639779e-08 62 KSP preconditioned resid norm 9.095263534284e-09 true resid norm 5.546725364022e+00 ||r(i)||/||b|| 6.395363989508e-08 63 KSP preconditioned resid norm 6.520024982652e-09 true resid norm 4.395022539849e+00 ||r(i)||/||b|| 5.067452783356e-08 64 KSP preconditioned resid norm 5.077084953418e-09 true resid norm 3.613138054874e+00 ||r(i)||/||b|| 4.165941431885e-08 65 KSP preconditioned resid norm 4.181478103167e-09 true resid norm 3.038027368880e+00 ||r(i)||/||b|| 3.502839884610e-08 66 KSP preconditioned resid norm 3.474545560062e-09 true resid norm 2.484725611092e+00 ||r(i)||/||b|| 2.864883990842e-08 67 KSP preconditioned resid norm 2.726294735157e-09 true resid norm 1.845741997810e+00 ||r(i)||/||b|| 2.128137077650e-08 68 KSP preconditioned resid norm 2.081101207644e-09 true resid norm 1.271838867185e+00 ||r(i)||/||b|| 1.466427839462e-08 69 KSP preconditioned resid norm 1.574053677511e-09 true resid norm 8.732579381622e-01 ||r(i)||/||b|| 1.006864772411e-08 70 KSP preconditioned resid norm 1.202717674216e-09 true resid norm 5.849220507056e-01 ||r(i)||/||b|| 6.744140324696e-09 71 KSP preconditioned resid norm 9.075713740333e-10 true resid norm 4.120181311262e-01 ||r(i)||/||b|| 4.750561359898e-09 72 KSP preconditioned resid norm 6.365151508838e-10 true resid norm 3.065749731760e-01 ||r(i)||/||b|| 3.534803717256e-09 73 KSP preconditioned resid norm 4.005974496315e-10 true resid norm 2.122086214944e-01 ||r(i)||/||b|| 2.446761444097e-09 74 KSP preconditioned resid norm 2.374916890000e-10 true resid norm 1.567794082480e-01 ||r(i)||/||b|| 1.807663650177e-09 75 KSP preconditioned resid norm 1.481096397633e-10 true resid norm 1.235242757193e-01 ||r(i)||/||b|| 1.424232592963e-09 76 KSP preconditioned resid norm 1.085014154415e-10 true resid norm 1.047268461651e-01 ||r(i)||/||b|| 1.207498581132e-09 77 KSP preconditioned resid norm 8.764582618532e-11 true resid norm 8.962364559579e-02 ||r(i)||/||b|| 1.033358960531e-09 78 KSP preconditioned resid norm 7.109092680274e-11 true resid norm 7.176047852904e-02 ||r(i)||/||b|| 8.273969777399e-10 79 KSP preconditioned resid norm 5.460763497752e-11 true resid norm 5.069849340150e-02 ||r(i)||/||b|| 5.845526824266e-10 80 KSP preconditioned resid norm 3.799942459039e-11 true resid norm 3.044234442091e-02 ||r(i)||/||b|| 
3.509996628435e-10 81 KSP preconditioned resid norm 2.481109284531e-11 true resid norm 1.726059230919e-02 ||r(i)||/||b|| 1.990143070861e-10 82 KSP preconditioned resid norm 1.569622532234e-11 true resid norm 1.070220060596e-02 ||r(i)||/||b|| 1.233961731867e-10 83 KSP preconditioned resid norm 1.022582071414e-11 true resid norm 7.402265790954e-03 ||r(i)||/||b|| 8.534798637643e-11 84 KSP preconditioned resid norm 7.284827374238e-12 true resid norm 5.658340974708e-03 ||r(i)||/||b|| 6.524056580253e-11 85 KSP preconditioned resid norm 5.402886839508e-12 true resid norm 4.464802757767e-03 ||r(i)||/||b|| 5.147909244343e-11 86 KSP preconditioned resid norm 3.933784995327e-12 true resid norm 3.350654653931e-03 ||r(i)||/||b|| 3.863298560628e-11 87 KSP preconditioned resid norm 2.792049995877e-12 true resid norm 2.402140873006e-03 ||r(i)||/||b|| 2.769663942007e-11 88 KSP preconditioned resid norm 2.058524741199e-12 true resid norm 1.747330249674e-03 ||r(i)||/||b|| 2.014668515774e-11 89 KSP preconditioned resid norm 1.568241303093e-12 true resid norm 1.266336540932e-03 ||r(i)||/||b|| 1.460083667564e-11 90 KSP preconditioned resid norm 1.164779378453e-12 true resid norm 8.484550691359e-04 ||r(i)||/||b|| 9.782671107287e-12 91 KSP preconditioned resid norm 7.995560038101e-13 true resid norm 5.065061038629e-04 ||r(i)||/||b|| 5.840005921551e-12 Linear solve converged due to CONVERGED_RTOL iterations 91 KSP Object: 8 MPI processes type: gmres GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization GMRES: happy breakdown tolerance 1e-30 maximum iterations=300, initial guess is zero tolerances: relative=1e-12, absolute=1e-20, divergence=10000 left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 8 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (fieldsplit_u_) 8 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_u_) 8 MPI processes type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.6 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type PMIS HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_u_) 8 MPI processes type: mpiaij rows=438420, cols=438420, bs=3 total: nonzeros=7.95766e+07, allocated nonzeros=7.95766e+07 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 17349 nodes, limit used is 
5 Split number 1 Defined by IS KSP Object: (fieldsplit_wp_) 8 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000 left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_wp_) 8 MPI processes type: hypre HYPRE BoomerAMG preconditioning HYPRE BoomerAMG: Cycle type V HYPRE BoomerAMG: Maximum number of levels 25 HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 HYPRE BoomerAMG: Threshold for strong coupling 0.6 HYPRE BoomerAMG: Interpolation truncation factor 0 HYPRE BoomerAMG: Interpolation: max elements per row 0 HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 HYPRE BoomerAMG: Maximum row sums 0.9 HYPRE BoomerAMG: Sweeps down 1 HYPRE BoomerAMG: Sweeps up 1 HYPRE BoomerAMG: Sweeps on coarse 1 HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi HYPRE BoomerAMG: Relax on coarse Gaussian-elimination HYPRE BoomerAMG: Relax weight (all) 1 HYPRE BoomerAMG: Outer relax weight (all) 1 HYPRE BoomerAMG: Using CF-relaxation HYPRE BoomerAMG: Measure type local HYPRE BoomerAMG: Coarsen type PMIS HYPRE BoomerAMG: Interpolation type classical linear system matrix = precond matrix: Mat Object: (fieldsplit_wp_) 8 MPI processes type: mpiaij rows=146140, cols=146140 total: nonzeros=596012, allocated nonzeros=596012 total number of mallocs used during MatSetValues calls =0 not using I-node (on process 0) routines linear system matrix = precond matrix: Mat Object: 8 MPI processes type: mpiaij rows=584560, cols=584560, bs=4 total: nonzeros=9.29667e+07, allocated nonzeros=9.29667e+07 total number of mallocs used during MatSetValues calls =0 using I-node (on process 0) routines: found 32431 nodes, limit used is 5 KSPSolve completed > Giang > > On Wed, Jul 13, 2016 at 5:43 AM, Barry Smith wrote: > >> >> It is not uncommon for an iterative linear solver to work fine for some >> time steps but then start to perform poorly at a later timestep because the >> physics (mathematically the conditioning or eigenstructure of the Jacobian) >> changes over time; perhaps becomes singular. Another possibility is the >> trajectory of the solution is very sensitive to the solution of the >> nonlinear problem at each time step so that an iterative linear solver and >> a direct linear solver result in very difficult physical solutions after >> many time steps. In other words after many time-steps the computed >> solutions can be very different and if the computed solution for the >> iterative linear solver is eventually "non-physical" or ill-conditioned the >> nonlinear solver could break down. >> >> Please run with the iterative solver (that eventually breaks) with the >> option -ksp_monitor_true_solution -ksp_converged_reason and and send ALL >> the output (it will be very large, don't worry about it). Then we can see >> if the linear solver is breaking down. Note that by default PETSc linear >> solvers do not generate an error that stops the program if the linear solve >> fails, hence your NR code should call KSPGetConvergedReason() after EVERY >> linear solve and if the reason is negative your code needs to do something >> different since the linear solve failed and your code should not just keep >> on running NR. 
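One way to act on the suggestion above to try an alternative to hypre, while also making the outer GMRES judge convergence by the true residual rather than the left-preconditioned one. The options below are only a sketch, not a tested recipe for this particular problem:

  -ksp_pc_side right
  -ksp_monitor_true_residual
  -ksp_converged_reason
  -fieldsplit_u_pc_type gamg
  -fieldsplit_u_pc_gamg_agg_nsmooths 1
  -fieldsplit_wp_pc_type gamg

With right preconditioning, GMRES bases its convergence test on the unpreconditioned residual, so a large gap between the preconditioned and true norms can no longer hide a poor preconditioner. For the displacement block (block size 3), GAMG usually also benefits from a rigid-body near-null space, set with MatNullSpaceCreateRigidBody() and MatSetNearNullSpace().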
>> >> Barry >> >> >> > On Jul 12, 2016, at 9:52 AM, Matthew Knepley wrote: >> > >> > On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui >> wrote: >> > Hi Matt >> > >> > 1) In the log you sent, the linear solver converges due to the Relative >> Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will >> affect the convergence. >> > >> > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES >> affect the convergence, and it took more iterations. But the simulation >> still failed at a definite time step. >> > >> > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? >> ALWAYS send the view output. >> > >> > In the log file I sent previously, the line >> > >> > KSP Object: (fieldsplit_wp_) 8 MPI processes >> > type: preonly >> > maximum iterations=10000, initial guess is zero >> > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 >> > left preconditioning >> > using NONE norm type for convergence test >> > >> > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 >> > >> > KSP "preonly" does no iterations, so it does not read the tolerance. If >> you want to lower the tolerance, >> > choose a solver like GMRES >> > >> > -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 >> > >> > 3) I can't tell you anything about Newton convergence if you do not >> send the output, -snes_monitor -snes_view >> > >> > I did not yet use SNES, instead using my NR iterator so I have no view >> for SNES. >> > >> > It is hard to debug an iteration which we did not code. It could be you >> have a bug. If not, then very small changes in >> > the iterates are making a difference, which means your Jacobians are >> close to singular. A problem reformulation would >> > probably help more than solver tweaking. >> > >> > Thanks, >> > >> > Matt >> > >> > 4) If there is a difference between LU and an iterative solver with >> residual 1e-9, then your system is very ill-conditioned. >> > Yes it is ill-conditioned >> > >> > >> > >> > >> > >> > >> > >> > Giang >> > >> > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley >> wrote: >> > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui >> wrote: >> > Hello >> > >> > I encountered different convergence behaviour of Newton Raphson when >> using different solver settings with PETSc >> > >> > For the first solver configuration, I used direct solver >> > -ksp_type preonly >> > -pc_type lu >> > -pc_factor_mat_solver_package mumps >> > -mat_mumps_icntl_1 6 >> > -mat_mumps_icntl_4 3 >> > -mat_mumps_icntl_7 4 >> > -mat_mumps_icntl_14 40 >> > -mat_mumps_icntl_23 0 >> > >> > The simulation can run completely and the NR typically converged after >> 6/7 iterations. Of course, it's very slow. For the second solver >> configuration: >> > -ksp_type gmres >> > -ksp_max_it 300 >> > -ksp_gmres_restart 300 >> > -ksp_gmres_modifiedgramschmidt >> > -pc_view >> > -pc_fieldsplit_type multiplicative >> > -fieldsplit_u_pc_type hypre >> > -fieldsplit_u_pc_hypre_type boomeramg >> > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS >> > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 >> > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 >> > -fieldsplit_wp_ksp_rtol 1.0e-8 >> > -fieldsplit_wp_pc_type hypre >> > -fieldsplit_wp_pc_hypre_type boomeramg >> > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS >> > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 >> > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 >> > >> > The solver runs much faster, but the NR does not converge in 30 >> iterations after some time steps. 
I thought setting the solver tolerance >> -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate >> with tolerance 1e-30 (see sample log file). Can we set the tolerance of the >> sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it >> doesn't work. >> > >> > 1) In the log you sent, the linear solver converges due to the Relative >> Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will >> affect the convergence. >> > >> > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? >> ALWAYS send the view output. >> > >> > 3) I can't tell you anything about Newton convergence if you do not >> send the output, -snes_monitor -snes_view >> > >> > 4) If there is a difference between LU and an iterative solver with >> residual 1e-9, then your system is very ill-conditioned. >> > >> > Thanks, >> > >> > Matt >> > >> > Sorry this problem is run with many time steps and is quite big so I >> cannot reproduce in a simple test case. >> > >> > Giang >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> > >> > >> > >> > >> > -- >> > What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> > -- Norbert Wiener >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 13 12:09:00 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 12:09:00 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: <329C1D8D-2EEA-4A1F-AEF2-47F02A2245E9@mcs.anl.gov> > On Jul 13, 2016, at 11:05 AM, Matthew Knepley wrote: > > On Wed, Jul 13, 2016 at 10:34 AM, Hoang Giang Bui wrote: > Thanks Barry > > This is a good comment. Since material behaviour depends very much on the trajectory of the solution. I suspect that the error may concatenate during time stepping. > > I have re-run the simulation as you suggested and post the log file here: https://www.dropbox.com/s/d6l8ixme37uh47a/log13Jul16?dl=0 > > However, I did not get what -ksp_monitor_true_solution used for? I see that I have the same log that I had before. My mistake. I didn't mean that option. > > That option is showing the last two numbers in these lines > > 0 KSP preconditioned resid norm 1.150038785083e+00 true resid norm 8.673040929526e+07 ||r(i)||/||b|| 1.000000000000e+00 > > Notice that there are 7 orders of magnitude between the apparent residual (using the preconditioner), and the actual residual, Ax - b. > You are using Hypre, and this generally means the Hypre coarse grid operator is crap. Please > > a) Try ML or GAMG and look at the output again > > b) Try MUMPS, although you have 200 nonzeros/row so that fill-in might be extreme. > > The consequence is that you solve to what you think is machine precision (1e-13), but all you really get is (1e-4), so I can understand > why the trajectory is completely different. > You can compare the final true residual norm at each iteration when using MUMPS with what you get with hypre to see if MUMPS is able to give you a smaller residual. 
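A sketch of the comparison suggested here: re-run the failing step with MUMPS behind the same outer GMRES so that the true residual is reported in the same form as in the hypre log, for example

  -ksp_type gmres
  -ksp_rtol 1.0e-12
  -ksp_monitor_true_residual
  -ksp_converged_reason
  -pc_type lu
  -pc_factor_mat_solver_package mumps
  -mat_mumps_icntl_14 40

Comparing the final "true resid norm" of the two runs then shows directly whether MUMPS reaches a genuinely smaller residual than the fieldsplit/hypre combination.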
Barry > Matt > > 1 KSP preconditioned resid norm 5.202876635759e-01 true resid norm 2.037005052213e+08 ||r(i)||/||b|| 2.348663022307e+00 > 2 KSP preconditioned resid norm 3.386127782775e-01 true resid norm 1.762196838305e+08 ||r(i)||/||b|| 2.031809664712e+00 > 3 KSP preconditioned resid norm 2.334102526025e-01 true resid norm 1.027451552306e+08 ||r(i)||/||b|| 1.184649721655e+00 > 4 KSP preconditioned resid norm 1.791251896569e-01 true resid norm 7.709961160729e+07 ||r(i)||/||b|| 8.889570824556e-01 > 5 KSP preconditioned resid norm 1.338763110903e-01 true resid norm 7.416954924746e+07 ||r(i)||/||b|| 8.551735181482e-01 > 6 KSP preconditioned resid norm 8.064262880339e-02 true resid norm 5.164444100149e+07 ||r(i)||/||b|| 5.954594405945e-01 > 7 KSP preconditioned resid norm 4.635705318709e-02 true resid norm 2.934800965373e+07 ||r(i)||/||b|| 3.383820034081e-01 > 8 KSP preconditioned resid norm 2.772133866748e-02 true resid norm 1.528356929458e+07 ||r(i)||/||b|| 1.762192686368e-01 > 9 KSP preconditioned resid norm 1.746753670007e-02 true resid norm 1.011788107951e+07 ||r(i)||/||b|| 1.166589799555e-01 > 10 KSP preconditioned resid norm 1.090702407895e-02 true resid norm 5.487922954253e+06 ||r(i)||/||b|| 6.327564920823e-02 > 11 KSP preconditioned resid norm 7.298748576067e-03 true resid norm 3.635843038640e+06 ||r(i)||/||b|| 4.192120235779e-02 > 12 KSP preconditioned resid norm 5.263606789063e-03 true resid norm 2.556946903793e+06 ||r(i)||/||b|| 2.948155006496e-02 > 13 KSP preconditioned resid norm 3.653208280595e-03 true resid norm 1.955721190606e+06 ||r(i)||/||b|| 2.254942881623e-02 > 14 KSP preconditioned resid norm 2.344759624903e-03 true resid norm 1.161259621408e+06 ||r(i)||/||b|| 1.338930175522e-02 > 15 KSP preconditioned resid norm 1.394564491254e-03 true resid norm 7.455856541894e+05 ||r(i)||/||b|| 8.596588673428e-03 > 16 KSP preconditioned resid norm 9.523395328600e-04 true resid norm 4.383808867461e+05 ||r(i)||/||b|| 5.054523440028e-03 > 17 KSP preconditioned resid norm 7.226014371144e-04 true resid norm 2.463564216053e+05 ||r(i)||/||b|| 2.840484941869e-03 > 18 KSP preconditioned resid norm 5.312593384754e-04 true resid norm 2.332075376781e+05 ||r(i)||/||b|| 2.688878555665e-03 > 19 KSP preconditioned resid norm 3.987403871945e-04 true resid norm 1.524236218549e+05 ||r(i)||/||b|| 1.757441514383e-03 > 20 KSP preconditioned resid norm 3.024350484979e-04 true resid norm 1.113568566173e+05 ||r(i)||/||b|| 1.283942477870e-03 > 21 KSP preconditioned resid norm 2.181724540430e-04 true resid norm 9.095158030900e+04 ||r(i)||/||b|| 1.048670022983e-03 > 22 KSP preconditioned resid norm 1.497651066688e-04 true resid norm 7.045647741653e+04 ||r(i)||/||b|| 8.123618692570e-04 > 23 KSP preconditioned resid norm 1.067332245914e-04 true resid norm 4.317487154207e+04 ||r(i)||/||b|| 4.978054628463e-04 > 24 KSP preconditioned resid norm 8.206743871631e-05 true resid norm 3.328488127932e+04 ||r(i)||/||b|| 3.837740597534e-04 > 25 KSP preconditioned resid norm 6.446633932980e-05 true resid norm 2.816657573261e+04 ||r(i)||/||b|| 3.247600923538e-04 > 26 KSP preconditioned resid norm 5.068725017435e-05 true resid norm 2.427030232896e+04 ||r(i)||/||b|| 2.798361327495e-04 > 27 KSP preconditioned resid norm 4.056292508453e-05 true resid norm 1.963628903861e+04 ||r(i)||/||b|| 2.264060460243e-04 > 28 KSP preconditioned resid norm 3.278196251068e-05 true resid norm 1.710046122873e+04 ||r(i)||/||b|| 1.971679987179e-04 > 29 KSP preconditioned resid norm 2.796514916728e-05 true resid norm 1.500292999274e+04 ||r(i)||/||b|| 
1.729835027259e-04 > 30 KSP preconditioned resid norm 2.469882695602e-05 true resid norm 1.317997814765e+04 ||r(i)||/||b|| 1.519649019847e-04 > 31 KSP preconditioned resid norm 2.175528107880e-05 true resid norm 1.158572445412e+04 ||r(i)||/||b|| 1.335831866616e-04 > 32 KSP preconditioned resid norm 1.912573933887e-05 true resid norm 1.001695718951e+04 ||r(i)||/||b|| 1.154953293880e-04 > 33 KSP preconditioned resid norm 1.647102125210e-05 true resid norm 8.271485921360e+03 ||r(i)||/||b|| 9.537007825249e-05 > 34 KSP preconditioned resid norm 1.337436641169e-05 true resid norm 6.611637805300e+03 ||r(i)||/||b|| 7.623206046211e-05 > 35 KSP preconditioned resid norm 9.896966695703e-06 true resid norm 4.752788536204e+03 ||r(i)||/||b|| 5.479956309238e-05 > 36 KSP preconditioned resid norm 6.766260764791e-06 true resid norm 3.239548441802e+03 ||r(i)||/||b|| 3.735193305468e-05 > 37 KSP preconditioned resid norm 4.835158711776e-06 true resid norm 2.113941262442e+03 ||r(i)||/||b|| 2.437370329068e-05 > 38 KSP preconditioned resid norm 3.598894380040e-06 true resid norm 1.653467554688e+03 ||r(i)||/||b|| 1.906445003688e-05 > 39 KSP preconditioned resid norm 2.522642742745e-06 true resid norm 1.344572919946e+03 ||r(i)||/||b|| 1.550290066507e-05 > 40 KSP preconditioned resid norm 1.750002168280e-06 true resid norm 1.015690774521e+03 ||r(i)||/||b|| 1.171089566825e-05 > 41 KSP preconditioned resid norm 1.371380245282e-06 true resid norm 8.480814540622e+02 ||r(i)||/||b|| 9.778363332462e-06 > 42 KSP preconditioned resid norm 1.174063380270e-06 true resid norm 7.575955225454e+02 ||r(i)||/||b|| 8.735062231359e-06 > 43 KSP preconditioned resid norm 1.022078284946e-06 true resid norm 6.758159410670e+02 ||r(i)||/||b|| 7.792145183661e-06 > 44 KSP preconditioned resid norm 8.861345665105e-07 true resid norm 5.913685641420e+02 ||r(i)||/||b|| 6.818468504268e-06 > 45 KSP preconditioned resid norm 7.574040382433e-07 true resid norm 4.958820201473e+02 ||r(i)||/||b|| 5.717510434653e-06 > 46 KSP preconditioned resid norm 6.331382122180e-07 true resid norm 3.988451175342e+02 ||r(i)||/||b|| 4.598676759110e-06 > 47 KSP preconditioned resid norm 5.210644796074e-07 true resid norm 3.077459761874e+02 ||r(i)||/||b|| 3.548305360116e-06 > 48 KSP preconditioned resid norm 4.285762531134e-07 true resid norm 2.383304155333e+02 ||r(i)||/||b|| 2.747945241696e-06 > 49 KSP preconditioned resid norm 3.365753654637e-07 true resid norm 1.802176480688e+02 ||r(i)||/||b|| 2.077906117741e-06 > 50 KSP preconditioned resid norm 2.556504175739e-07 true resid norm 1.322207275993e+02 ||r(i)||/||b|| 1.524502520785e-06 > 51 KSP preconditioned resid norm 1.929395464892e-07 true resid norm 1.007938656038e+02 ||r(i)||/||b|| 1.162151388686e-06 > 52 KSP preconditioned resid norm 1.518353128559e-07 true resid norm 7.979486270816e+01 ||r(i)||/||b|| 9.200332773308e-07 > 53 KSP preconditioned resid norm 1.206065500213e-07 true resid norm 6.580266981926e+01 ||r(i)||/||b|| 7.587035545427e-07 > 54 KSP preconditioned resid norm 9.426597887251e-08 true resid norm 5.333098459078e+01 ||r(i)||/||b|| 6.149052566928e-07 > 55 KSP preconditioned resid norm 7.613592162567e-08 true resid norm 4.265349984159e+01 ||r(i)||/||b|| 4.917940568733e-07 > 56 KSP preconditioned resid norm 6.268355987149e-08 true resid norm 3.467681120568e+01 ||r(i)||/||b|| 3.998229858184e-07 > 57 KSP preconditioned resid norm 5.012883291890e-08 true resid norm 2.749870530323e+01 ||r(i)||/||b|| 3.170595587716e-07 > 58 KSP preconditioned resid norm 3.875711489918e-08 true resid norm 2.037239239206e+01 
||r(i)||/||b|| 2.348933039472e-07 > 59 KSP preconditioned resid norm 2.803879910778e-08 true resid norm 1.495957468476e+01 ||r(i)||/||b|| 1.724836168342e-07 > 60 KSP preconditioned resid norm 1.925214804831e-08 true resid norm 1.036952152845e+01 ||r(i)||/||b|| 1.195603896339e-07 > 61 KSP preconditioned resid norm 1.316807047769e-08 true resid norm 7.239457203086e+00 ||r(i)||/||b|| 8.347080639779e-08 > 62 KSP preconditioned resid norm 9.095263534284e-09 true resid norm 5.546725364022e+00 ||r(i)||/||b|| 6.395363989508e-08 > 63 KSP preconditioned resid norm 6.520024982652e-09 true resid norm 4.395022539849e+00 ||r(i)||/||b|| 5.067452783356e-08 > 64 KSP preconditioned resid norm 5.077084953418e-09 true resid norm 3.613138054874e+00 ||r(i)||/||b|| 4.165941431885e-08 > 65 KSP preconditioned resid norm 4.181478103167e-09 true resid norm 3.038027368880e+00 ||r(i)||/||b|| 3.502839884610e-08 > 66 KSP preconditioned resid norm 3.474545560062e-09 true resid norm 2.484725611092e+00 ||r(i)||/||b|| 2.864883990842e-08 > 67 KSP preconditioned resid norm 2.726294735157e-09 true resid norm 1.845741997810e+00 ||r(i)||/||b|| 2.128137077650e-08 > 68 KSP preconditioned resid norm 2.081101207644e-09 true resid norm 1.271838867185e+00 ||r(i)||/||b|| 1.466427839462e-08 > 69 KSP preconditioned resid norm 1.574053677511e-09 true resid norm 8.732579381622e-01 ||r(i)||/||b|| 1.006864772411e-08 > 70 KSP preconditioned resid norm 1.202717674216e-09 true resid norm 5.849220507056e-01 ||r(i)||/||b|| 6.744140324696e-09 > 71 KSP preconditioned resid norm 9.075713740333e-10 true resid norm 4.120181311262e-01 ||r(i)||/||b|| 4.750561359898e-09 > 72 KSP preconditioned resid norm 6.365151508838e-10 true resid norm 3.065749731760e-01 ||r(i)||/||b|| 3.534803717256e-09 > 73 KSP preconditioned resid norm 4.005974496315e-10 true resid norm 2.122086214944e-01 ||r(i)||/||b|| 2.446761444097e-09 > 74 KSP preconditioned resid norm 2.374916890000e-10 true resid norm 1.567794082480e-01 ||r(i)||/||b|| 1.807663650177e-09 > 75 KSP preconditioned resid norm 1.481096397633e-10 true resid norm 1.235242757193e-01 ||r(i)||/||b|| 1.424232592963e-09 > 76 KSP preconditioned resid norm 1.085014154415e-10 true resid norm 1.047268461651e-01 ||r(i)||/||b|| 1.207498581132e-09 > 77 KSP preconditioned resid norm 8.764582618532e-11 true resid norm 8.962364559579e-02 ||r(i)||/||b|| 1.033358960531e-09 > 78 KSP preconditioned resid norm 7.109092680274e-11 true resid norm 7.176047852904e-02 ||r(i)||/||b|| 8.273969777399e-10 > 79 KSP preconditioned resid norm 5.460763497752e-11 true resid norm 5.069849340150e-02 ||r(i)||/||b|| 5.845526824266e-10 > 80 KSP preconditioned resid norm 3.799942459039e-11 true resid norm 3.044234442091e-02 ||r(i)||/||b|| 3.509996628435e-10 > 81 KSP preconditioned resid norm 2.481109284531e-11 true resid norm 1.726059230919e-02 ||r(i)||/||b|| 1.990143070861e-10 > 82 KSP preconditioned resid norm 1.569622532234e-11 true resid norm 1.070220060596e-02 ||r(i)||/||b|| 1.233961731867e-10 > 83 KSP preconditioned resid norm 1.022582071414e-11 true resid norm 7.402265790954e-03 ||r(i)||/||b|| 8.534798637643e-11 > 84 KSP preconditioned resid norm 7.284827374238e-12 true resid norm 5.658340974708e-03 ||r(i)||/||b|| 6.524056580253e-11 > 85 KSP preconditioned resid norm 5.402886839508e-12 true resid norm 4.464802757767e-03 ||r(i)||/||b|| 5.147909244343e-11 > 86 KSP preconditioned resid norm 3.933784995327e-12 true resid norm 3.350654653931e-03 ||r(i)||/||b|| 3.863298560628e-11 > 87 KSP preconditioned resid norm 2.792049995877e-12 true resid norm 
2.402140873006e-03 ||r(i)||/||b|| 2.769663942007e-11 > 88 KSP preconditioned resid norm 2.058524741199e-12 true resid norm 1.747330249674e-03 ||r(i)||/||b|| 2.014668515774e-11 > 89 KSP preconditioned resid norm 1.568241303093e-12 true resid norm 1.266336540932e-03 ||r(i)||/||b|| 1.460083667564e-11 > 90 KSP preconditioned resid norm 1.164779378453e-12 true resid norm 8.484550691359e-04 ||r(i)||/||b|| 9.782671107287e-12 > 91 KSP preconditioned resid norm 7.995560038101e-13 true resid norm 5.065061038629e-04 ||r(i)||/||b|| 5.840005921551e-12 > Linear solve converged due to CONVERGED_RTOL iterations 91 > KSP Object: 8 MPI processes > type: gmres > GMRES: restart=300, using Modified Gram-Schmidt Orthogonalization > GMRES: happy breakdown tolerance 1e-30 > maximum iterations=300, initial guess is zero > tolerances: relative=1e-12, absolute=1e-20, divergence=10000 > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 8 MPI processes > type: fieldsplit > FieldSplit with MULTIPLICATIVE composition: total splits = 2 > Solver info for each split is in the following KSP objects: > Split number 0 Defined by IS > KSP Object: (fieldsplit_u_) 8 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_u_) 8 MPI processes > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.6 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type PMIS > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > Mat Object: (fieldsplit_u_) 8 MPI processes > type: mpiaij > rows=438420, cols=438420, bs=3 > total: nonzeros=7.95766e+07, allocated nonzeros=7.95766e+07 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 17349 nodes, limit used is 5 > Split number 1 Defined by IS > KSP Object: (fieldsplit_wp_) 8 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > left preconditioning > using NONE norm type for convergence test > PC Object: (fieldsplit_wp_) 8 MPI processes > type: hypre > HYPRE BoomerAMG preconditioning > HYPRE BoomerAMG: Cycle type V > HYPRE BoomerAMG: Maximum number of levels 25 > HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1 > HYPRE BoomerAMG: Convergence tolerance PER hypre call 0 > HYPRE BoomerAMG: Threshold for strong coupling 0.6 > HYPRE BoomerAMG: Interpolation truncation factor 0 > HYPRE BoomerAMG: 
Interpolation: max elements per row 0 > HYPRE BoomerAMG: Number of levels of aggressive coarsening 0 > HYPRE BoomerAMG: Number of paths for aggressive coarsening 1 > HYPRE BoomerAMG: Maximum row sums 0.9 > HYPRE BoomerAMG: Sweeps down 1 > HYPRE BoomerAMG: Sweeps up 1 > HYPRE BoomerAMG: Sweeps on coarse 1 > HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi > HYPRE BoomerAMG: Relax on coarse Gaussian-elimination > HYPRE BoomerAMG: Relax weight (all) 1 > HYPRE BoomerAMG: Outer relax weight (all) 1 > HYPRE BoomerAMG: Using CF-relaxation > HYPRE BoomerAMG: Measure type local > HYPRE BoomerAMG: Coarsen type PMIS > HYPRE BoomerAMG: Interpolation type classical > linear system matrix = precond matrix: > Mat Object: (fieldsplit_wp_) 8 MPI processes > type: mpiaij > rows=146140, cols=146140 > total: nonzeros=596012, allocated nonzeros=596012 > total number of mallocs used during MatSetValues calls =0 > not using I-node (on process 0) routines > linear system matrix = precond matrix: > Mat Object: 8 MPI processes > type: mpiaij > rows=584560, cols=584560, bs=4 > total: nonzeros=9.29667e+07, allocated nonzeros=9.29667e+07 > total number of mallocs used during MatSetValues calls =0 > using I-node (on process 0) routines: found 32431 nodes, limit used is 5 > KSPSolve completed > > > Giang > > On Wed, Jul 13, 2016 at 5:43 AM, Barry Smith wrote: > > It is not uncommon for an iterative linear solver to work fine for some time steps but then start to perform poorly at a later timestep because the physics (mathematically the conditioning or eigenstructure of the Jacobian) changes over time; perhaps becomes singular. Another possibility is the trajectory of the solution is very sensitive to the solution of the nonlinear problem at each time step so that an iterative linear solver and a direct linear solver result in very difficult physical solutions after many time steps. In other words after many time-steps the computed solutions can be very different and if the computed solution for the iterative linear solver is eventually "non-physical" or ill-conditioned the nonlinear solver could break down. > > Please run with the iterative solver (that eventually breaks) with the option -ksp_monitor_true_solution -ksp_converged_reason and and send ALL the output (it will be very large, don't worry about it). Then we can see if the linear solver is breaking down. Note that by default PETSc linear solvers do not generate an error that stops the program if the linear solve fails, hence your NR code should call KSPGetConvergedReason() after EVERY linear solve and if the reason is negative your code needs to do something different since the linear solve failed and your code should not just keep on running NR. > > Barry > > > > On Jul 12, 2016, at 9:52 AM, Matthew Knepley wrote: > > > > On Tue, Jul 12, 2016 at 8:44 AM, Hoang Giang Bui wrote: > > Hi Matt > > > > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. > > > > Sorry i got it wrong in the previous email, the ksp_rtol 1.0e-12 DOES affect the convergence, and it took more iterations. But the simulation still failed at a definite time step. > > > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. 
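Regarding the advice above to call KSPGetConvergedReason() after every linear solve inside a hand-rolled Newton-Raphson driver: a minimal sketch of that check follows. This is untested and only illustrative; the routine FormJacobianAndResidual() and the recovery action taken when the solve fails are placeholders, not part of the code discussed in this thread.

    /* inside the user's Newton-Raphson loop; ksp, A, b, x, dx already created */
    PetscErrorCode     ierr;
    KSPConvergedReason reason;
    PetscInt           it, max_newton_its = 30;

    for (it = 0; it < max_newton_its; it++) {
      ierr = FormJacobianAndResidual(user, x, A, b);CHKERRQ(ierr); /* hypothetical user routine */
      ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
      ierr = KSPSolve(ksp, b, dx);CHKERRQ(ierr);
      ierr = KSPGetConvergedReason(ksp, &reason);CHKERRQ(ierr);
      if (reason < 0) {
        /* the linear solve failed: do not keep iterating as if nothing happened */
        ierr = PetscPrintf(PETSC_COMM_WORLD, "Linear solve failed, KSPConvergedReason = %d\n", (int)reason);CHKERRQ(ierr);
        break; /* placeholder: cut the load/time step, switch to a stronger solver, etc. */
      }
      ierr = VecAXPY(x, -1.0, dx);CHKERRQ(ierr); /* Newton update; the sign depends on how the residual is defined */
      /* test the nonlinear residual for convergence here */
    }

The same effect can be had without code changes by running with -ksp_error_if_not_converged, which makes PETSc raise an error as soon as a linear solve fails instead of returning silently.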
> > > > In the log file I sent previously, the line > > > > KSP Object: (fieldsplit_wp_) 8 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000 > > left preconditioning > > using NONE norm type for convergence test > > > > impressed me that the rtol for fieldsplit_wp is still 1.0e-5 > > > > KSP "preonly" does no iterations, so it does not read the tolerance. If you want to lower the tolerance, > > choose a solver like GMRES > > > > -fieldsplit_wp_ksp_type gmres -fieldsplit_wp_ksp_rtol 1e-8 > > > > 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view > > > > I did not yet use SNES, instead using my NR iterator so I have no view for SNES. > > > > It is hard to debug an iteration which we did not code. It could be you have a bug. If not, then very small changes in > > the iterates are making a difference, which means your Jacobians are close to singular. A problem reformulation would > > probably help more than solver tweaking. > > > > Thanks, > > > > Matt > > > > 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. > > Yes it is ill-conditioned > > > > > > > > > > > > > > > > Giang > > > > On Tue, Jul 12, 2016 at 2:49 PM, Matthew Knepley wrote: > > On Tue, Jul 12, 2016 at 7:42 AM, Hoang Giang Bui wrote: > > Hello > > > > I encountered different convergence behaviour of Newton Raphson when using different solver settings with PETSc > > > > For the first solver configuration, I used direct solver > > -ksp_type preonly > > -pc_type lu > > -pc_factor_mat_solver_package mumps > > -mat_mumps_icntl_1 6 > > -mat_mumps_icntl_4 3 > > -mat_mumps_icntl_7 4 > > -mat_mumps_icntl_14 40 > > -mat_mumps_icntl_23 0 > > > > The simulation can run completely and the NR typically converged after 6/7 iterations. Of course, it's very slow. For the second solver configuration: > > -ksp_type gmres > > -ksp_max_it 300 > > -ksp_gmres_restart 300 > > -ksp_gmres_modifiedgramschmidt > > -pc_view > > -pc_fieldsplit_type multiplicative > > -fieldsplit_u_pc_type hypre > > -fieldsplit_u_pc_hypre_type boomeramg > > -fieldsplit_u_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_u_pc_hypre_boomeramg_strong_threshold 0.6 > > -fieldsplit_u_pc_hypre_boomeramg_max_levels 25 > > -fieldsplit_wp_ksp_rtol 1.0e-8 > > -fieldsplit_wp_pc_type hypre > > -fieldsplit_wp_pc_hypre_type boomeramg > > -fieldsplit_wp_pc_hypre_boomeramg_coarsen_type PMIS > > -fieldsplit_wp_pc_hypre_boomeramg_strong_threshold 0.6 > > -fieldsplit_wp_pc_hypre_boomeramg_max_levels 25 > > > > The solver runs much faster, but the NR does not converge in 30 iterations after some time steps. I thought setting the solver tolerance -ksp_rtol 1.0e-12 but it doesn't help much because GMRES already terminate with tolerance 1e-30 (see sample log file). Can we set the tolerance of the sub-ksp of the Fieldsplit? I tried -fieldsplit_wp_ksp_rtol 1.0e-8 but it doesn't work. > > > > 1) In the log you sent, the linear solver converges due to the Relative Tolerance, 1.0e-9, not the breakdown tolerance 1e-30. Change rtol will affect the convergence. > > > > 2) What do you mean -fieldsplit_wp_ksp_rtol 1.0e-8 does not work? ALWAYS send the view output. 
> > > > 3) I can't tell you anything about Newton convergence if you do not send the output, -snes_monitor -snes_view > > > > 4) If there is a difference between LU and an iterative solver with residual 1e-9, then your system is very ill-conditioned. > > > > Thanks, > > > > Matt > > > > Sorry this problem is run with many time steps and is quite big so I cannot reproduce in a simple test case. > > > > Giang > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From aks084000 at utdallas.edu Wed Jul 13 14:30:13 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Wed, 13 Jul 2016 19:30:13 +0000 Subject: [petsc-users] Multigrid with PML Message-ID: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> Dear PETSc community, I am working on solving a Helmholtz problem with PML. The issue is that I am finding it very hard to deal with the resulting matrix system; I can get the correct solution for coarse meshes, but it takes roughly 2-4 times as long to converge for each successively refined mesh. I've noticed that without PML, I do not have problems with convergence speed. I am using the GMRES solver with GAMG as the preconditioner (with block-Jacobi preconditioner for the multigrid solves). I have also tried to assemble a separate preconditioning matrix with the complex shift 1+0.5i, that does not seem to improve the results. Currently I am running with -ksp_type fgmres \ -pc_type gamg \ -mg_levels_pc_type bjacobi \ -pc_mg_type full \ -ksp_gmres_restart 150 \ Can anyone suggest some way of speeding up the convergence? Any help would be appreciated. I am attaching the output from kspview. Best, Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: kspview Type: application/octet-stream Size: 33747 bytes Desc: kspview URL: From knepley at gmail.com Wed Jul 13 17:03:25 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 13 Jul 2016 17:03:25 -0500 Subject: [petsc-users] Distribution of DMPlex for FEM In-Reply-To: References: Message-ID: On Wed, Jul 13, 2016 at 3:57 AM, Morten Nobel-J?rgensen wrote: > I?m having problems distributing a simple FEM model using DMPlex. For test > case I use 1x1x2 hex box elements (/cells) with 12 vertices. Each vertex > has one DOF. > When I distribute the system to two processors, each get a single element > and the local vector has the size 8 (one DOF for each vertex of a hex box) > as expected. > > My problem is that when I manually assemble the global stiffness matrix (a > 12x12 matrix) it seems like my ghost values are ignored. I?m sure that I?m > missing something obvious but cannot see what it is. > > In the attached example, I?m assembling the global stiffness matrix using > a simple local stiffness matrix of ones. This makes it very easy to see if > the matrix is assembled correctly. If I run it on one process, then global > stiffness matrix consists of 0?s, 1?s and 2?s and its trace is 16.0. 
But if > I run it distributed on on two, then it consists only of 0's and 1?s and > its trace is 12.0. > > I hope that somebody can spot my mistake and help me in the right > direction :) > This is my fault, and Stefano Zampini had already tried to tell me this was broken. I normally use DMPlexMatSetClosure(), which handles global indices correctly. I have fixed this in the branch knepley/fix-plex-l2g which is also merged to 'next'. I am attaching a version of your sample where all objects are freed correctly. Let me know if that works for you. Thanks, Matt > Kind regards, > Morten > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex18.c Type: text/x-csrc Size: 4813 bytes Desc: not available URL: From hengjiew at uci.edu Wed Jul 13 18:07:51 2016 From: hengjiew at uci.edu (frank) Date: Wed, 13 Jul 2016 16:07:51 -0700 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> Message-ID: <5786C9C7.1080309@uci.edu> Hi Dave, Sorry for the late reply. Thank you so much for your detailed reply. I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? Did I do sth wrong here? Because this seems too small. I am running this job on Bluewater I am using the 7 points FD stencil in 3D. I apologize that I made a stupid mistake in computing the memory per core. My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. I would do more test to see if there is still some memory issue. Regards, Frank On 07/11/2016 01:18 PM, Dave May wrote: > Hi Frank, > > > On 11 July 2016 at 19:14, frank > wrote: > > Hi Dave, > > I re-run the test using bjacobi as the preconditioner on the > coarse mesh of telescope. The Grid is 3072*256*768 and process > mesh is 96*8*24. The petsc option file is attached. > I still got the "Out Of Memory" error. The error occurred before > the linear solver finished one step. So I don't have the full info > from ksp_view. The info from ksp_view_pre is attached. > > > Okay - that is essentially useless (sorry) > > > It seems to me that the error occurred when the decomposition was > going to be changed. > > > Based on what information? > Running with -info would give us more clues, but will create a ton of > output. > Please try running the case which failed with -info > > I had another test with a grid of 1536*128*384 and the same > process mesh as above. There was no error. The ksp_view info is > attached for comparison. > Thank you. > > > > [3] Here is my crude estimate of your memory usage. > I'll target the biggest memory hogs only to get an order of magnitude > estimate > > * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per > MPI rank assuming double precision. 
> The indices for the AIJ could amount to another 0.3 GB (assuming 32 > bit integers) > > * You use 5 levels of coarsening, so the other operators should > represent (collectively) > 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the > communicator with 18432 ranks. > The coarse grid should consume ~ 0.5 MB per MPI rank on the > communicator with 18432 ranks. > > * You use a reduction factor of 64, making the new communicator with > 288 MPI ranks. > PCTelescope will first gather a temporary matrix associated with your > coarse level operator assuming a comm size of 288 living on the comm > with size 18432. > This matrix will require approximately 0.5 * 64 = 32 MB per core on > the 288 ranks. > This matrix is then used to form a new MPIAIJ matrix on the subcomm, > thus require another 32 MB per rank. > The temporary matrix is now destroyed. > > * Because a DMDA is detected, a permutation matrix is assembled. > This requires 2 doubles per point in the DMDA. > Your coarse DMDA contains 92 x 16 x 48 points. > Thus the permutation matrix will require < 1 MB per MPI rank on the > sub-comm. > > * Lastly, the matrix is permuted. This uses MatPtAP(), but the > resulting operator will have the same memory footprint as the > unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 > operators of size 32 MB are held in memory when the DMDA is provided. > > From my rough estimates, the worst case memory foot print for any > given core, given your options is approximately > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > This is way below 8 GB. > > Note this estimate completely ignores: > (1) the memory required for the restriction operator, > (2) the potential growth in the number of non-zeros per row due to > Galerkin coarsening (I wished -ksp_view_pre reported the output from > MatView so we could see the number of non-zeros required by the coarse > level operators) > (3) all temporary vectors required by the CG solver, and those > required by the smoothers. > (4) internal memory allocated by MatPtAP > (5) memory associated with IS's used within PCTelescope > > So either I am completely off in my estimates, or you have not > carefully estimated the memory usage of your application code. > Hopefully others might examine/correct my rough estimates > > Since I don't have your code I cannot access the latter. > Since I don't have access to the same machine you are running on, I > think we need to take a step back. > > [1] What machine are you running on? Send me a URL if its available > > [2] What discretization are you using? (I am guessing a scalar 7 point > FD stencil) > If it's a 7 point FD stencil, we should be able to examine the memory > usage of your solver configuration using a standard, light weight > existing PETSc example, run on your machine at the same scale. > This would hopefully enable us to correctly evaluate the actual memory > usage required by the solver configuration you are using. > > Thanks, > Dave > > > > Frank > > > > > On 07/08/2016 10:38 PM, Dave May wrote: >> >> >> On Saturday, 9 July 2016, frank wrote: >> >> Hi Barry and Dave, >> >> Thank both of you for the advice. >> >> @Barry >> I made a mistake in the file names in last email. I attached >> the correct files this time. >> For all the three tests, 'Telescope' is used as the coarse >> preconditioner. >> >> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >> Part of the memory usage: Vector 125 124 3971904 0. 
>> Matrix 101 101 9462372 0 >> >> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >> Part of the memory usage: Vector 125 124 681672 0. >> Matrix 101 101 1462180 0. >> >> In theory, the memory usage in Test1 should be 8 times of >> Test2. In my case, it is about 6 times. >> >> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. >> Sub-domain per process: 32*32*32 >> Here I get the out of memory error. >> >> I tried to use -mg_coarse jacobi. In this way, I don't need >> to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, >> right? >> The linear solver didn't work in this case. Petsc output some >> errors. >> >> @Dave >> In test3, I use only one instance of 'Telescope'. On the >> coarse mesh of 'Telescope', I used LU as the preconditioner >> instead of SVD. >> If my set the levels correctly, then on the last coarse mesh >> of MG where it calls 'Telescope', the sub-domain per process >> is 2*2*2. >> On the last coarse mesh of 'Telescope', there is only one >> grid point per process. >> I still got the OOM error. The detailed petsc option file is >> attached. >> >> >> Do you understand the expected memory usage for the >> particular parallel LU implementation you are using? I don't >> (seriously). Replace LU with bjacobi and re-run this test. My >> point about solver debugging is still valid. >> >> And please send the result of KSPView so we can see what is >> actually used in the computations >> >> Thanks >> Dave >> >> >> >> Thank you so much. >> >> Frank >> >> >> >> On 07/06/2016 02:51 PM, Barry Smith wrote: >> >> On Jul 6, 2016, at 4:19 PM, frank > > wrote: >> >> Hi Barry, >> >> Thank you for you advice. >> I tried three test. In the 1st test, the grid is >> 3072*256*768 and the process mesh is 96*8*24. >> The linear solver is 'cg' the preconditioner is 'mg' >> and 'telescope' is used as the preconditioner at the >> coarse mesh. >> The system gives me the "Out of Memory" error before >> the linear system is completely solved. >> The info from '-ksp_view_pre' is attached. I seems to >> me that the error occurs when it reaches the coarse mesh. >> >> The 2nd test uses a grid of 1536*128*384 and process >> mesh is 96*8*24. The 3rd test uses the same grid but >> a different process mesh 48*4*12. >> >> Are you sure this is right? The total matrix and >> vector memory usage goes from 2nd test >> Vector 384 383 8,193,712 0. >> Matrix 103 103 11,508,688 0. >> to 3rd test >> Vector 384 383 1,590,520 0. >> Matrix 103 103 3,508,664 0. >> that is the memory usage got smaller but if you have only >> 1/8th the processes and the same grid it should have >> gotten about 8 times bigger. Did you maybe cut the grid >> by a factor of 8 also? If so that still doesn't explain >> it because the memory usage changed by a factor of 5 >> something for the vectors and 3 something for the matrices. >> >> >> The linear solver and petsc options in 2nd and 3rd >> tests are the same in 1st test. The linear solver >> works fine in both test. >> I attached the memory usage of the 2nd and 3rd tests. >> The memory info is from the option '-log_summary'. I >> tried to use '-momery_info' as you suggested, but in >> my case petsc treated it as an unused option. It >> output nothing about the memory. Do I need to add sth >> to my code so I can use '-memory_info'? >> >> Sorry, my mistake the option is -memory_view >> >> Can you run the one case with -memory_view and >> -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't >> iterate forever) to see how much memory is used without >> the telescope? 
Also run case 2 the same way. >> >> Barry >> >> >> >> In both tests the memory usage is not large. >> >> It seems to me that it might be the 'telescope' >> preconditioner that allocated a lot of memory and >> caused the error in the 1st test. >> Is there is a way to show how much memory it allocated? >> >> Frank >> >> On 07/05/2016 03:37 PM, Barry Smith wrote: >> >> Frank, >> >> You can run with -ksp_view_pre to have it >> "view" the KSP before the solve so hopefully it >> gets that far. >> >> Please run the problem that does fit with >> -memory_info when the problem completes it will >> show the "high water mark" for PETSc allocated >> memory and total memory used. We first want to >> look at these numbers to see if it is using more >> memory than you expect. You could also run with >> say half the grid spacing to see how the memory >> usage scaled with the increase in grid points. >> Make the runs also with -log_view and send all >> the output from these options. >> >> Barry >> >> On Jul 5, 2016, at 5:23 PM, frank >> > >> wrote: >> >> Hi, >> >> I am using the CG ksp solver and Multigrid >> preconditioner to solve a linear system in >> parallel. >> I chose to use the 'Telescope' as the >> preconditioner on the coarse mesh for its >> good performance. >> The petsc options file is attached. >> >> The domain is a 3d box. >> It works well when the grid is 1536*128*384 >> and the process mesh is 96*8*24. When I >> double the size of grid and keep the same >> process mesh and petsc options, I get an "out >> of memory" error from the super-cluster I am >> using. >> Each process has access to at least 8G >> memory, which should be more than enough for >> my application. I am sure that all the other >> parts of my code( except the linear solver ) >> do not use much memory. So I doubt if there >> is something wrong with the linear solver. >> The error occurs before the linear system is >> completely solved so I don't have the info >> from ksp view. I am not able to re-produce >> the error with a smaller problem either. >> In addition, I tried to use the block jacobi >> as the preconditioner with the same grid and >> same decomposition. The linear solver runs >> extremely slow but there is no memory error. >> >> How can I diagnose what exactly cause the error? >> Thank you so much. >> >> Frank >> >> >> >> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 13 18:28:46 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 18:28:46 -0500 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <5786C9C7.1080309@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> Message-ID: > On Jul 13, 2016, at 6:07 PM, frank wrote: > > Hi Dave, > > Sorry for the late reply. > Thank you so much for your detailed reply. > > I have a question about the estimation of the memory usage. There are 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is used. So the memory per process is: > 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? > Did I do sth wrong here? Because this seems too small. In addition to storing the non-zero values there are several integer arrays that need to be stored. For each nonzero it stores the column index so if integers are 4 bytes that is another 1.7M/2 . 
If PetscInt is 64 bit then the column indices take the same amount of space as the numerical values 1.74 M. In addition there are at least 7 PetscInt Arrays that are of size mlocal where mlocal is the number of rows local to the process. > > I am running this job on Bluewater > I am using the 7 points FD stencil in 3D. > > I apologize that I made a stupid mistake in computing the memory per core. My settings render each core can access only 2G memory on average instead of 8G which I mentioned in previous email. I re-run the job with 8G memory per core on average and there is no "Out Of Memory" error. I would do more test to see if there is still some memory issue. > > Regards, > Frank > > > On 07/11/2016 01:18 PM, Dave May wrote: >> Hi Frank, >> >> >> On 11 July 2016 at 19:14, frank wrote: >> Hi Dave, >> >> I re-run the test using bjacobi as the preconditioner on the coarse mesh of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The petsc option file is attached. >> I still got the "Out Of Memory" error. The error occurred before the linear solver finished one step. So I don't have the full info from ksp_view. The info from ksp_view_pre is attached. >> >> Okay - that is essentially useless (sorry) >> >> >> It seems to me that the error occurred when the decomposition was going to be changed. >> >> Based on what information? >> Running with -info would give us more clues, but will create a ton of output. >> Please try running the case which failed with -info >> >> I had another test with a grid of 1536*128*384 and the same process mesh as above. There was no error. The ksp_view info is attached for comparison. >> Thank you. >> >> >> [3] Here is my crude estimate of your memory usage. >> I'll target the biggest memory hogs only to get an order of magnitude estimate >> >> * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI rank assuming double precision. >> The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit integers) >> >> * You use 5 levels of coarsening, so the other operators should represent (collectively) >> 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the communicator with 18432 ranks. >> The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator with 18432 ranks. >> >> * You use a reduction factor of 64, making the new communicator with 288 MPI ranks. >> PCTelescope will first gather a temporary matrix associated with your coarse level operator assuming a comm size of 288 living on the comm with size 18432. >> This matrix will require approximately 0.5 * 64 = 32 MB per core on the 288 ranks. >> This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus require another 32 MB per rank. >> The temporary matrix is now destroyed. >> >> * Because a DMDA is detected, a permutation matrix is assembled. >> This requires 2 doubles per point in the DMDA. >> Your coarse DMDA contains 92 x 16 x 48 points. >> Thus the permutation matrix will require < 1 MB per MPI rank on the sub-comm. >> >> * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting operator will have the same memory footprint as the unpermuted matrix (32 MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held in memory when the DMDA is provided. >> >> From my rough estimates, the worst case memory foot print for any given core, given your options is approximately >> 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB >> This is way below 8 GB. 
>> >> Note this estimate completely ignores: >> (1) the memory required for the restriction operator, >> (2) the potential growth in the number of non-zeros per row due to Galerkin coarsening (I wished -ksp_view_pre reported the output from MatView so we could see the number of non-zeros required by the coarse level operators) >> (3) all temporary vectors required by the CG solver, and those required by the smoothers. >> (4) internal memory allocated by MatPtAP >> (5) memory associated with IS's used within PCTelescope >> >> So either I am completely off in my estimates, or you have not carefully estimated the memory usage of your application code. Hopefully others might examine/correct my rough estimates >> >> Since I don't have your code I cannot access the latter. >> Since I don't have access to the same machine you are running on, I think we need to take a step back. >> >> [1] What machine are you running on? Send me a URL if its available >> >> [2] What discretization are you using? (I am guessing a scalar 7 point FD stencil) >> If it's a 7 point FD stencil, we should be able to examine the memory usage of your solver configuration using a standard, light weight existing PETSc example, run on your machine at the same scale. >> This would hopefully enable us to correctly evaluate the actual memory usage required by the solver configuration you are using. >> >> Thanks, >> Dave >> >> >> >> Frank >> >> >> >> >> On 07/08/2016 10:38 PM, Dave May wrote: >>> >>> >>> On Saturday, 9 July 2016, frank wrote: >>> Hi Barry and Dave, >>> >>> Thank both of you for the advice. >>> >>> @Barry >>> I made a mistake in the file names in last email. I attached the correct files this time. >>> For all the three tests, 'Telescope' is used as the coarse preconditioner. >>> >>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>> Part of the memory usage: Vector 125 124 3971904 0. >>> Matrix 101 101 9462372 0 >>> >>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>> Part of the memory usage: Vector 125 124 681672 0. >>> Matrix 101 101 1462180 0. >>> >>> In theory, the memory usage in Test1 should be 8 times of Test2. In my case, it is about 6 times. >>> >>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per process: 32*32*32 >>> Here I get the out of memory error. >>> >>> I tried to use -mg_coarse jacobi. In this way, I don't need to set -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>> The linear solver didn't work in this case. Petsc output some errors. >>> >>> @Dave >>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of 'Telescope', I used LU as the preconditioner instead of SVD. >>> If my set the levels correctly, then on the last coarse mesh of MG where it calls 'Telescope', the sub-domain per process is 2*2*2. >>> On the last coarse mesh of 'Telescope', there is only one grid point per process. >>> I still got the OOM error. The detailed petsc option file is attached. >>> >>> Do you understand the expected memory usage for the particular parallel LU implementation you are using? I don't (seriously). Replace LU with bjacobi and re-run this test. My point about solver debugging is still valid. >>> >>> And please send the result of KSPView so we can see what is actually used in the computations >>> >>> Thanks >>> Dave >>> >>> >>> >>> Thank you so much. >>> >>> Frank >>> >>> >>> >>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>> On Jul 6, 2016, at 4:19 PM, frank wrote: >>> >>> Hi Barry, >>> >>> Thank you for you advice. >>> I tried three test. 
In the 1st test, the grid is 3072*256*768 and the process mesh is 96*8*24. >>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' is used as the preconditioner at the coarse mesh. >>> The system gives me the "Out of Memory" error before the linear system is completely solved. >>> The info from '-ksp_view_pre' is attached. I seems to me that the error occurs when it reaches the coarse mesh. >>> >>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. The 3rd test uses the same grid but a different process mesh 48*4*12. >>> Are you sure this is right? The total matrix and vector memory usage goes from 2nd test >>> Vector 384 383 8,193,712 0. >>> Matrix 103 103 11,508,688 0. >>> to 3rd test >>> Vector 384 383 1,590,520 0. >>> Matrix 103 103 3,508,664 0. >>> that is the memory usage got smaller but if you have only 1/8th the processes and the same grid it should have gotten about 8 times bigger. Did you maybe cut the grid by a factor of 8 also? If so that still doesn't explain it because the memory usage changed by a factor of 5 something for the vectors and 3 something for the matrices. >>> >>> >>> The linear solver and petsc options in 2nd and 3rd tests are the same in 1st test. The linear solver works fine in both test. >>> I attached the memory usage of the 2nd and 3rd tests. The memory info is from the option '-log_summary'. I tried to use '-momery_info' as you suggested, but in my case petsc treated it as an unused option. It output nothing about the memory. Do I need to add sth to my code so I can use '-memory_info'? >>> Sorry, my mistake the option is -memory_view >>> >>> Can you run the one case with -memory_view and -mg_coarse jacobi -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory is used without the telescope? Also run case 2 the same way. >>> >>> Barry >>> >>> >>> >>> In both tests the memory usage is not large. >>> >>> It seems to me that it might be the 'telescope' preconditioner that allocated a lot of memory and caused the error in the 1st test. >>> Is there is a way to show how much memory it allocated? >>> >>> Frank >>> >>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>> Frank, >>> >>> You can run with -ksp_view_pre to have it "view" the KSP before the solve so hopefully it gets that far. >>> >>> Please run the problem that does fit with -memory_info when the problem completes it will show the "high water mark" for PETSc allocated memory and total memory used. We first want to look at these numbers to see if it is using more memory than you expect. You could also run with say half the grid spacing to see how the memory usage scaled with the increase in grid points. Make the runs also with -log_view and send all the output from these options. >>> >>> Barry >>> >>> On Jul 5, 2016, at 5:23 PM, frank wrote: >>> >>> Hi, >>> >>> I am using the CG ksp solver and Multigrid preconditioner to solve a linear system in parallel. >>> I chose to use the 'Telescope' as the preconditioner on the coarse mesh for its good performance. >>> The petsc options file is attached. >>> >>> The domain is a 3d box. >>> It works well when the grid is 1536*128*384 and the process mesh is 96*8*24. When I double the size of grid and keep the same process mesh and petsc options, I get an "out of memory" error from the super-cluster I am using. >>> Each process has access to at least 8G memory, which should be more than enough for my application. I am sure that all the other parts of my code( except the linear solver ) do not use much memory. 
So I doubt if there is something wrong with the linear solver. >>> The error occurs before the linear system is completely solved so I don't have the info from ksp view. I am not able to re-produce the error with a smaller problem either. >>> In addition, I tried to use the block jacobi as the preconditioner with the same grid and same decomposition. The linear solver runs extremely slow but there is no memory error. >>> >>> How can I diagnose what exactly cause the error? >>> Thank you so much. >>> >>> Frank >>> >>> >>> >> >> > From dave.mayhem23 at gmail.com Wed Jul 13 19:47:31 2016 From: dave.mayhem23 at gmail.com (Dave May) Date: Thu, 14 Jul 2016 02:47:31 +0200 Subject: [petsc-users] Question about memory usage in Multigrid preconditioner In-Reply-To: <5786C9C7.1080309@uci.edu> References: <577C337B.60909@uci.edu> <94A03A99-4970-4F20-8C79-FEE1DCBD028D@mcs.anl.gov> <577D75D3.8010703@uci.edu> <2F25042C-E6D6-4AC6-9C22-1B63F8065836@mcs.anl.gov> <57804DE9.707@uci.edu> <5783D3E4.4020004@uci.edu> <5786C9C7.1080309@uci.edu> Message-ID: On 14 July 2016 at 01:07, frank wrote: > Hi Dave, > > Sorry for the late reply. > Thank you so much for your detailed reply. > > I have a question about the estimation of the memory usage. There are > 4223139840 allocated non-zeros and 18432 MPI processes. Double precision is > used. So the memory per process is: > 4223139840 * 8bytes / 18432 / 1024 / 1024 = 1.74M ? > Did I do sth wrong here? Because this seems too small. > No - I totally f***ed it up. You are correct. That'll teach me for fumbling around with my iphone calculator and not using my brain. (Note that to convert to MB just divide by 1e6, not 1024^2 - although I apparently cannot convert between units correctly....) >From the PETSc objects associated with the solver, It looks like it _should_ run with 2GB per MPI rank. Sorry for my mistake. Possibilities are: somewhere in your usage of PETSc you've introduced a memory leak; PETSc is doing a huge over allocation (e.g. as per our discussion of MatPtAP); or in your application code there are other objects you have forgotten to log the memory for. > I am running this job on Bluewater > > I am using the 7 points FD stencil in 3D. > I thought so on both counts. > > I apologize that I made a stupid mistake in computing the memory per core. > My settings render each core can access only 2G memory on average instead > of 8G which I mentioned in previous email. I re-run the job with 8G memory > per core on average and there is no "Out Of Memory" error. I would do more > test to see if there is still some memory issue. > Ok. I'd still like to know where the memory was being used since my estimates were off. Thanks, Dave > > Regards, > Frank > > > > On 07/11/2016 01:18 PM, Dave May wrote: > > Hi Frank, > > > On 11 July 2016 at 19:14, frank wrote: > >> Hi Dave, >> >> I re-run the test using bjacobi as the preconditioner on the coarse mesh >> of telescope. The Grid is 3072*256*768 and process mesh is 96*8*24. The >> petsc option file is attached. >> I still got the "Out Of Memory" error. The error occurred before the >> linear solver finished one step. So I don't have the full info from >> ksp_view. The info from ksp_view_pre is attached. >> > > Okay - that is essentially useless (sorry) > > >> >> It seems to me that the error occurred when the decomposition was going >> to be changed. >> > > Based on what information? > Running with -info would give us more clues, but will create a ton of > output. 
> Please try running the case which failed with -info > > >> I had another test with a grid of 1536*128*384 and the same process mesh >> as above. There was no error. The ksp_view info is attached for comparison. >> Thank you. >> > > > [3] Here is my crude estimate of your memory usage. > I'll target the biggest memory hogs only to get an order of magnitude > estimate > > * The Fine grid operator contains 4223139840 non-zeros --> 1.8 GB per MPI > rank assuming double precision. > The indices for the AIJ could amount to another 0.3 GB (assuming 32 bit > integers) > > * You use 5 levels of coarsening, so the other operators should represent > (collectively) > 2.1 / 8 + 2.1/8^2 + 2.1/8^3 + 2.1/8^4 ~ 300 MB per MPI rank on the > communicator with 18432 ranks. > The coarse grid should consume ~ 0.5 MB per MPI rank on the communicator > with 18432 ranks. > > * You use a reduction factor of 64, making the new communicator with 288 > MPI ranks. > PCTelescope will first gather a temporary matrix associated with your > coarse level operator assuming a comm size of 288 living on the comm with > size 18432. > This matrix will require approximately 0.5 * 64 = 32 MB per core on the > 288 ranks. > This matrix is then used to form a new MPIAIJ matrix on the subcomm, thus > require another 32 MB per rank. > The temporary matrix is now destroyed. > > * Because a DMDA is detected, a permutation matrix is assembled. > This requires 2 doubles per point in the DMDA. > Your coarse DMDA contains 92 x 16 x 48 points. > Thus the permutation matrix will require < 1 MB per MPI rank on the > sub-comm. > > * Lastly, the matrix is permuted. This uses MatPtAP(), but the resulting > operator will have the same memory footprint as the unpermuted matrix (32 > MB). At any stage in PCTelescope, only 2 operators of size 32 MB are held > in memory when the DMDA is provided. > > From my rough estimates, the worst case memory foot print for any given > core, given your options is approximately > 2100 MB + 300 MB + 32 MB + 32 MB + 1 MB = 2465 MB > This is way below 8 GB. > > Note this estimate completely ignores: > (1) the memory required for the restriction operator, > (2) the potential growth in the number of non-zeros per row due to > Galerkin coarsening (I wished -ksp_view_pre reported the output from > MatView so we could see the number of non-zeros required by the coarse > level operators) > (3) all temporary vectors required by the CG solver, and those required by > the smoothers. > (4) internal memory allocated by MatPtAP > (5) memory associated with IS's used within PCTelescope > > So either I am completely off in my estimates, or you have not carefully > estimated the memory usage of your application code. Hopefully others might > examine/correct my rough estimates > > Since I don't have your code I cannot access the latter. > Since I don't have access to the same machine you are running on, I think > we need to take a step back. > > [1] What machine are you running on? Send me a URL if its available > > [2] What discretization are you using? (I am guessing a scalar 7 point FD > stencil) > If it's a 7 point FD stencil, we should be able to examine the memory > usage of your solver configuration using a standard, light weight existing > PETSc example, run on your machine at the same scale. > This would hopefully enable us to correctly evaluate the actual memory > usage required by the solver configuration you are using. 
> > Thanks, > Dave > > >> >> >> Frank >> >> >> >> >> On 07/08/2016 10:38 PM, Dave May wrote: >> >> >> >> On Saturday, 9 July 2016, frank wrote: >> >>> Hi Barry and Dave, >>> >>> Thank both of you for the advice. >>> >>> @Barry >>> I made a mistake in the file names in last email. I attached the correct >>> files this time. >>> For all the three tests, 'Telescope' is used as the coarse >>> preconditioner. >>> >>> == Test1: Grid: 1536*128*384, Process Mesh: 48*4*12 >>> Part of the memory usage: Vector 125 124 3971904 0. >>> Matrix 101 101 >>> 9462372 0 >>> >>> == Test2: Grid: 1536*128*384, Process Mesh: 96*8*24 >>> Part of the memory usage: Vector 125 124 681672 0. >>> Matrix 101 101 >>> 1462180 0. >>> >>> In theory, the memory usage in Test1 should be 8 times of Test2. In my >>> case, it is about 6 times. >>> >>> == Test3: Grid: 3072*256*768, Process Mesh: 96*8*24. Sub-domain per >>> process: 32*32*32 >>> Here I get the out of memory error. >>> >>> I tried to use -mg_coarse jacobi. In this way, I don't need to set >>> -mg_coarse_ksp_type and -mg_coarse_pc_type explicitly, right? >>> The linear solver didn't work in this case. Petsc output some errors. >>> >>> @Dave >>> In test3, I use only one instance of 'Telescope'. On the coarse mesh of >>> 'Telescope', I used LU as the preconditioner instead of SVD. >>> If my set the levels correctly, then on the last coarse mesh of MG where >>> it calls 'Telescope', the sub-domain per process is 2*2*2. >>> On the last coarse mesh of 'Telescope', there is only one grid point per >>> process. >>> I still got the OOM error. The detailed petsc option file is attached. >> >> >> Do you understand the expected memory usage for the particular parallel >> LU implementation you are using? I don't (seriously). Replace LU with >> bjacobi and re-run this test. My point about solver debugging is still >> valid. >> >> And please send the result of KSPView so we can see what is actually used >> in the computations >> >> Thanks >> Dave >> >> >>> >>> >>> Thank you so much. >>> >>> Frank >>> >>> >>> >>> On 07/06/2016 02:51 PM, Barry Smith wrote: >>> >>>> On Jul 6, 2016, at 4:19 PM, frank < hengjiew at uci.edu> >>>>> wrote: >>>>> >>>>> Hi Barry, >>>>> >>>>> Thank you for you advice. >>>>> I tried three test. In the 1st test, the grid is 3072*256*768 and the >>>>> process mesh is 96*8*24. >>>>> The linear solver is 'cg' the preconditioner is 'mg' and 'telescope' >>>>> is used as the preconditioner at the coarse mesh. >>>>> The system gives me the "Out of Memory" error before the linear system >>>>> is completely solved. >>>>> The info from '-ksp_view_pre' is attached. I seems to me that the >>>>> error occurs when it reaches the coarse mesh. >>>>> >>>>> The 2nd test uses a grid of 1536*128*384 and process mesh is 96*8*24. >>>>> The 3rd test uses the same grid but a different process mesh 48*4*12. >>>>> >>>> Are you sure this is right? The total matrix and vector memory >>>> usage goes from 2nd test >>>> Vector 384 383 8,193,712 0. >>>> Matrix 103 103 11,508,688 0. >>>> to 3rd test >>>> Vector 384 383 1,590,520 0. >>>> Matrix 103 103 3,508,664 0. >>>> that is the memory usage got smaller but if you have only 1/8th the >>>> processes and the same grid it should have gotten about 8 times bigger. Did >>>> you maybe cut the grid by a factor of 8 also? If so that still doesn't >>>> explain it because the memory usage changed by a factor of 5 something for >>>> the vectors and 3 something for the matrices. 
>>>> >>>> >>>> The linear solver and petsc options in 2nd and 3rd tests are the same >>>>> in 1st test. The linear solver works fine in both test. >>>>> I attached the memory usage of the 2nd and 3rd tests. The memory info >>>>> is from the option '-log_summary'. I tried to use '-momery_info' as you >>>>> suggested, but in my case petsc treated it as an unused option. It output >>>>> nothing about the memory. Do I need to add sth to my code so I can use >>>>> '-memory_info'? >>>>> >>>> Sorry, my mistake the option is -memory_view >>>> >>>> Can you run the one case with -memory_view and -mg_coarse jacobi >>>> -ksp_max_it 1 (just so it doesn't iterate forever) to see how much memory >>>> is used without the telescope? Also run case 2 the same way. >>>> >>>> Barry >>>> >>>> >>>> >>>> In both tests the memory usage is not large. >>>>> >>>>> It seems to me that it might be the 'telescope' preconditioner that >>>>> allocated a lot of memory and caused the error in the 1st test. >>>>> Is there is a way to show how much memory it allocated? >>>>> >>>>> Frank >>>>> >>>>> On 07/05/2016 03:37 PM, Barry Smith wrote: >>>>> >>>>>> Frank, >>>>>> >>>>>> You can run with -ksp_view_pre to have it "view" the KSP before >>>>>> the solve so hopefully it gets that far. >>>>>> >>>>>> Please run the problem that does fit with -memory_info when the >>>>>> problem completes it will show the "high water mark" for PETSc allocated >>>>>> memory and total memory used. We first want to look at these numbers to see >>>>>> if it is using more memory than you expect. You could also run with say >>>>>> half the grid spacing to see how the memory usage scaled with the increase >>>>>> in grid points. Make the runs also with -log_view and send all the output >>>>>> from these options. >>>>>> >>>>>> Barry >>>>>> >>>>>> On Jul 5, 2016, at 5:23 PM, frank < >>>>>>> hengjiew at uci.edu> wrote: >>>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> I am using the CG ksp solver and Multigrid preconditioner to solve >>>>>>> a linear system in parallel. >>>>>>> I chose to use the 'Telescope' as the preconditioner on the coarse >>>>>>> mesh for its good performance. >>>>>>> The petsc options file is attached. >>>>>>> >>>>>>> The domain is a 3d box. >>>>>>> It works well when the grid is 1536*128*384 and the process mesh is >>>>>>> 96*8*24. When I double the size of grid and keep the same process mesh and >>>>>>> petsc options, I get an "out of memory" error from the super-cluster I am >>>>>>> using. >>>>>>> Each process has access to at least 8G memory, which should be more >>>>>>> than enough for my application. I am sure that all the other parts of my >>>>>>> code( except the linear solver ) do not use much memory. So I doubt if >>>>>>> there is something wrong with the linear solver. >>>>>>> The error occurs before the linear system is completely solved so I >>>>>>> don't have the info from ksp view. I am not able to re-produce the error >>>>>>> with a smaller problem either. >>>>>>> In addition, I tried to use the block jacobi as the preconditioner >>>>>>> with the same grid and same decomposition. The linear solver runs extremely >>>>>>> slow but there is no memory error. >>>>>>> >>>>>>> How can I diagnose what exactly cause the error? >>>>>>> Thank you so much. >>>>>>> >>>>>>> Frank >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... 
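To track down where the memory is actually being used, here is a minimal, untested sketch of how the application code could be instrumented. It assumes the assembled fine-grid operator is available as a Mat named A and that a PETSc version >= 3.5 is in use; note that the MatInfo memory field is not filled in by every matrix type.

    PetscErrorCode ierr;
    PetscLogDouble rss, petscmem;
    MatInfo        info;

    /* per-process storage of the operator: allocated/used nonzeros and (if available) bytes */
    ierr = MatGetInfo(A, MAT_LOCAL, &info);CHKERRQ(ierr);
    ierr = PetscSynchronizedPrintf(PETSC_COMM_WORLD,
             "mat: nz_allocated %g nz_used %g memory %g bytes mallocs %g\n",
             info.nz_allocated, info.nz_used, info.memory, info.mallocs);CHKERRQ(ierr);
    ierr = PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT);CHKERRQ(ierr);

    /* overall usage on this process: resident set size vs. memory obtained through PetscMalloc */
    ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);
    ierr = PetscMallocGetCurrentUsage(&petscmem);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_WORLD, "rank 0: rss %g bytes, PetscMalloc'd %g bytes\n", rss, petscmem);CHKERRQ(ierr);

Calling this before and after KSPSetUp()/KSPSolve() separates the solver's footprint from the rest of the application; running with -memory_view and -log_view, as suggested earlier in the thread, gives the same kind of summary without modifying the code.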
URL: From bsmith at mcs.anl.gov Wed Jul 13 20:10:52 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 20:10:52 -0500 Subject: [petsc-users] Multigrid with PML In-Reply-To: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> Message-ID: <32247878-9DC8-4830-9CE3-A1518D23E3D9@mcs.anl.gov> Can you run with the additional option -ksp_view_mat binary and email the resulting file which will be called binaryoutput to petsc-maint at mcs.anl.gov Barry > On Jul 13, 2016, at 2:30 PM, Safin, Artur wrote: > > Dear PETSc community, > > I am working on solving a Helmholtz problem with PML. The issue is that I am finding it very hard to deal with the resulting matrix system; I can get the correct solution for coarse meshes, but it takes roughly 2-4 times as long to converge for each successively refined mesh. I've noticed that without PML, I do not have problems with convergence speed. > > I am using the GMRES solver with GAMG as the preconditioner (with block-Jacobi preconditioner for the multigrid solves). I have also tried to assemble a separate preconditioning matrix with the complex shift 1+0.5i, that does not seem to improve the results. Currently I am running with > > -ksp_type fgmres \ > -pc_type gamg \ > -mg_levels_pc_type bjacobi \ > -pc_mg_type full \ > -ksp_gmres_restart 150 \ > > Can anyone suggest some way of speeding up the convergence? Any help would be appreciated. I am attaching the output from kspview. > > Best, > > Artur > > From bsmith at mcs.anl.gov Wed Jul 13 21:11:53 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 13 Jul 2016 21:11:53 -0500 Subject: [petsc-users] What block size means in amg aggregation type In-Reply-To: References: Message-ID: <568382E0-1BD7-43CD-85E6-6B864A8AC044@mcs.anl.gov> Sorry know one answered it. I had hoped Mark Adams would since he knows much more about it then me. > On Jul 6, 2016, at 2:50 PM, Eduardo Jourdan wrote: > > Hi, > > I am kind of new to algebraic multigrid methods. I tried to figure it on my own but I'm not be sure about it. > > How the block size (bs) of a blocked matrix affects the AMG AGG? I mean, if bs = 4, then > in the coarsening phase and setup, blocks of 4x4 matrix elements are considered to remain in the coarse level and a certain quantity of block neighbors are restricted and remain in the finer level? Never a row inside a block matrix is selected and the other elements of this block aren't, am I right? Correct > The entire block is interpolated when it comes to the interpolation phase? Correct and they all use the same interpolation. > > If the original problem is not a system of equations, then bs=1? Yes. For a Poission operator it is 1 for linear elasticity it is 2 to 6 depending on the dimension and the model. > > Thank you, > > Eduardo > > From mono at dtu.dk Thu Jul 14 02:45:33 2016 From: mono at dtu.dk (=?Windows-1252?Q?Morten_Nobel-J=F8rgensen?=) Date: Thu, 14 Jul 2016 07:45:33 +0000 Subject: [petsc-users] Distribution of DMPlex for FEM In-Reply-To: References: Message-ID: Hi Matthew Thanks for your answer and your fix. It works :))) Kind regards, Morten Fra: Matthew Knepley > Dato: Thursday 14 July 2016 at 00:03 Til: Morten Nobel-Joergensen > Cc: "petsc-users at mcs.anl.gov" > Emne: Re: [petsc-users] Distribution of DMPlex for FEM On Wed, Jul 13, 2016 at 3:57 AM, Morten Nobel-J?rgensen > wrote: I?m having problems distributing a simple FEM model using DMPlex. For test case I use 1x1x2 hex box elements (/cells) with 12 vertices. 
Each vertex has one DOF. When I distribute the system to two processors, each get a single element and the local vector has the size 8 (one DOF for each vertex of a hex box) as expected. My problem is that when I manually assemble the global stiffness matrix (a 12x12 matrix) it seems like my ghost values are ignored. I?m sure that I?m missing something obvious but cannot see what it is. In the attached example, I?m assembling the global stiffness matrix using a simple local stiffness matrix of ones. This makes it very easy to see if the matrix is assembled correctly. If I run it on one process, then global stiffness matrix consists of 0?s, 1?s and 2?s and its trace is 16.0. But if I run it distributed on on two, then it consists only of 0's and 1?s and its trace is 12.0. I hope that somebody can spot my mistake and help me in the right direction :) This is my fault, and Stefano Zampini had already tried to tell me this was broken. I normally use DMPlexMatSetClosure(), which handles global indices correctly. I have fixed this in the branch knepley/fix-plex-l2g which is also merged to 'next'. I am attaching a version of your sample where all objects are freed correctly. Let me know if that works for you. Thanks, Matt Kind regards, Morten -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico_lahaye at yahoo.com Thu Jul 14 12:21:32 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Thu, 14 Jul 2016 17:21:32 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> Message-ID: <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> Dear PETSc team, 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure ? ? and likely not giving it as much time as it deserves. However, I do not see immediately ??? what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. ???? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently ???? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined ???? after calling DMCoarsenHierarchy, but that failed. ???? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform ???? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation ???? using DMDA as well. 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see @Misc{petsc-web-page, author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp and Dinesh Kaushik and Matthew~G. Knepley and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith and Stefano Zampini and Hong Zhang and Hong Zhang}, title = {{PETS}c {W}eb page}, url = {http://www.mcs.anl.gov/petsc}, howpublished = {\url{http://www.mcs.anl.gov/petsc}}, year = {2016} } Is the last author mentioned twice intentionally? 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see @misc{OpenFOAM, | | title | = | "OpenFOAM", | | | howpublished | = | "\url{http://www.openfoam.com}", | | | url | = | {http://www.openfoam.com}, | | | note | = | "OpenFOAM is a free, open source CFD software package. 
It allows PETSc linear algebra and solvers to be used underneath.", | | | key | = | "OpenFOAM 2.2.1" | } Do you have more information on the use of PETSc within OpenFoam? 4) @matt in response to a question he raised in Vienna MIPSE is a BEM solver. Details are on: http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE Cheers, Domenico Lahaye. -------------- next part -------------- An HTML attachment was scrubbed... URL: From amelie.compagna.1 at ulaval.ca Thu Jul 14 14:42:27 2016 From: amelie.compagna.1 at ulaval.ca (=?iso-8859-1?Q?Am=E9lie_Compagna?=) Date: Thu, 14 Jul 2016 19:42:27 +0000 Subject: [petsc-users] Slow convergence using Schur complement Message-ID: <1468525347118.92523@ulaval.ca> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: QuestionPetsc Type: application/octet-stream Size: 63842 bytes Desc: QuestionPetsc URL: From bsmith at mcs.anl.gov Thu Jul 14 15:05:12 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Jul 2016 15:05:12 -0500 Subject: [petsc-users] Slow convergence using Schur complement In-Reply-To: <1468525347118.92523@ulaval.ca> References: <1468525347118.92523@ulaval.ca> Message-ID: <13C19A0A-8EC6-471C-84E6-BD2B30C16E7E@mcs.anl.gov> So refreshing my memory > pc_fieldsplit_schur_precondition self selfp then the preconditioning for the Schur complement is generated from an explicitly-assembled approximation Sp = A11 - A10 inv(diag(A00)) A01 This is only a good preconditioner when diag(A00) is a good preconditioner for A00. Optionally, A00 can be lumped before extracting the diagonal using the additional option -fieldsplit_1_mat_schur_complement_ainv_type lump So first try adding the option (with the correct prefix) -fieldsplit_1_mat_schur_complement_ainv_type lump to see if the lumping helps the convergence. If suddenly it works well great but as the documentation says selfp may not be a good preconditioner at all for your problem and you'll have to consider the other ones. I don't know why it is printing the initial name and residual norm multiple times. What is is showing is the very slow convergence of the preconditioned system inv(Sp) S = inv(Sp) (A11 - A10 inv(A00) A01). Note I wrote inv() here because in both places you are using LU and hence it is a very accurate inverse operation. Barry > On Jul 14, 2016, at 2:42 PM, Am?lie Compagna wrote: > > ?Hi, > > I've been working on a finite element simulation of a 3 ionic species unsteady electrodiffusion model. The concentrations and the electric potential are defined using a unsteady diffusion equations. All the concentration being coupled to the potential giving a non symmetrical global system. > > I know that everything works since I've solved the system using LU. So far I've tried a lot of different things, but I am now trying to solve the system using a Schur complement, splitting the system in two groups [concentrations, potential], and I'm getting slow convergence. Here are the options I'm using. I've also attached a file with the ksp_view and the ksp_monitor. 
> > ====== > ksp_type gcr > pc_type fieldsplit > pc_fieldsplit_type schur > mat_type nest > ksp_monitor > ksp_view > > //Options concentrations block > > fieldsplit_a_00_ksp_type gcr > fieldsplit_a_00_pc_type lu > fieldsplit_a_00_ksp_rtol 1.0e-4 > fieldsplit_a_00_ksp_atol 1.0e-8 > > //Options potential block > fieldsplit_schur_mat_type schurcomplement > fieldsplit_schur_ksp_type gcr > pc_fieldsplit_schur_precondition selfp > pc_fieldsplit_schur_fact_type full > fieldsplit_schur_pc_type lu > fieldsplit_schur_ksp_monitor > fieldsplit_schur_ksp_rtol 1.0e-4 > fieldsplit_schur_ksp_atol 1.0e-8 > > > ksp_rtol 1.0e-5 > ksp_atol 1.0e-5 > ===== > > First of all, I'm wondering what exactly is showing on the screen when I use the fieldsplit_schur_ksp_monitor? > > Also, why is it printing twice each time as you can see in the attached file? When I use pc_fieldsplit_a_00_monitor (which is not included in the file I sent you because it only does one iteration, as it should since it's solving with LU) it prints it 3 times every time which gets pretty annoying. > > Finally, as you can see, it takes a long time to the fieldsplit_schur_ksp to converge, do you have any idea why it takes over 200 iterations to get down to 1e-02? Is there a way to get it to converge faster? > > Thank you for your time, > Am?lie? > > From andrewh0 at uw.edu Thu Jul 14 18:18:40 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 14 Jul 2016 16:18:40 -0700 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? Message-ID: I am trying to solve a simple ionization/recombination ODE using PETSc's quasi-newton SNES. This is a basic non-linear coupled ODE system: delta = -a u^2 + b u v d_t u = delta d_t v = -delta a and b are constants. I wrote a backwards Euler root finding function (yes, I know the TS module has BE implemented, but this is more of a learning exercise). Here is the function evaluation: struct ion_rec_ctx > { > PetscScalar rate_a, rate_b; > PetscScalar dt; > }; > PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) > { > const PetscScalar *xx; > PetscScalar *ff; > ion_rec_ctx& params = *reinterpret_cast(ctx); > CHKERRQ(VecGetArrayRead(x, &xx)); > CHKERRQ(VecGetArray(f,&ff)); > auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); > ff[0] = xx[0]-params.dt*delta; > ff[1] = xx[1]-params.dt*-delta; > CHKERRQ(VecRestoreArrayRead(x,&xx)); > CHKERRQ(VecRestoreArray(f,&ff)); > return 0; > } To setup the solver and solve one time step: // q0, q1, and res are Vec's previously initialized > // initial conditions: q0 = [1e19,1e19] > SNES solver; > CHKERRQ(SNESCreate(comm, &solver)); > CHKERRQ(SNESSetType(solver, SNESQN)); > CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); > ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; > CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); > CHKERRQ(SNESSolve(solver, q0, q1)); When I run this, the solver fails to converge to a solution for this rather large time step. The solution produced when the SNES module finally gives up is: q1 = [-2.72647e142, 2.72647e142] For reference, when I disable the scale and restart types, I get these values: q1 = [1.0279e17, 1.98972e19] This is only a problem when I use the SNES_QN_RESTART_POWELL restart type (seems to be regardless of the scale type type). I get reasonable answers for other combinations of restart/scale type. I've tried every combination of restart type/scale type except for SNES_QN_SCALE_JACOBIAN (my ultimate application doesn't have an available Jacobian), and only cases using SNES_QN_RESTART_POWELL are failing. 
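For reference, the restart/scale combinations described above can be switched without touching the rest of the setup; a minimal sketch, assuming the stock SNESQN interface (the command-line spellings below are a best guess and should be checked against the manual pages):

    SNES solver;
    SNESCreate(comm, &solver);
    SNESSetType(solver, SNESQN);
    SNESQNSetType(solver, SNES_QN_LBFGS);
    /* restart criterion: NONE, POWELL, or PERIODIC */
    SNESQNSetRestartType(solver, SNES_QN_RESTART_POWELL);
    /* initial Jacobian scaling: NONE, SHANNO, LINESEARCH, or JACOBIAN */
    SNESQNSetScaleType(solver, SNES_QN_SCALE_SHANNO);

or at run time with something like -snes_qn_restart_type powell -snes_qn_scale_type shanno.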
I'm unfamiliar with Powell's restart criterion, but is it suppose to work reasonably well with Quasi-Newton methods? I tried it on the simple problem given in this example: http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html And Powell restarts also fails to converge to a meaningful solution (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods do converge properly. Software information: PETSc version 3.7.2 (built from git maint branch) PETSc arch: arch-linux2-c-opt OS: Ubuntu 15.04 x64 Compiler: gcc 4.9.2 -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jul 14 18:22:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Jul 2016 18:22:32 -0500 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: References: Message-ID: On Thu, Jul 14, 2016 at 6:18 PM, Andrew Ho wrote: > I am trying to solve a simple ionization/recombination ODE using PETSc's > quasi-newton SNES. > > This is a basic non-linear coupled ODE system: > > delta = -a u^2 + b u v > d_t u = delta > d_t v = -delta > > a and b are constants. > > I wrote a backwards Euler root finding function (yes, I know the TS module > has BE implemented, but this is more of a learning exercise). > > Here is the function evaluation: > > struct ion_rec_ctx >> { >> PetscScalar rate_a, rate_b; >> PetscScalar dt; >> }; >> PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) >> { >> const PetscScalar *xx; >> PetscScalar *ff; >> ion_rec_ctx& params = *reinterpret_cast(ctx); >> CHKERRQ(VecGetArrayRead(x, &xx)); >> CHKERRQ(VecGetArray(f,&ff)); >> auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); >> ff[0] = xx[0]-params.dt*delta; >> > I do not understand this. Shouldn't it be (xx[0] - xxold[0]) here? Matt > ff[1] = xx[1]-params.dt*-delta; >> CHKERRQ(VecRestoreArrayRead(x,&xx)); >> CHKERRQ(VecRestoreArray(f,&ff)); >> return 0; >> } > > > To setup the solver and solve one time step: > > // q0, q1, and res are Vec's previously initialized >> // initial conditions: q0 = [1e19,1e19] >> SNES solver; >> CHKERRQ(SNESCreate(comm, &solver)); >> CHKERRQ(SNESSetType(solver, SNESQN)); >> CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); >> ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; >> CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); >> CHKERRQ(SNESSolve(solver, q0, q1)); > > > When I run this, the solver fails to converge to a solution for this > rather large time step. > The solution produced when the SNES module finally gives up is: > > q1 = [-2.72647e142, 2.72647e142] > > For reference, when I disable the scale and restart types, I get these > values: > > q1 = [1.0279e17, 1.98972e19] > > This is only a problem when I use the SNES_QN_RESTART_POWELL restart type > (seems to be regardless of the scale type type). I get reasonable answers > for other combinations of restart/scale type. I've tried every combination > of restart type/scale type except for SNES_QN_SCALE_JACOBIAN (my ultimate > application doesn't have an available Jacobian), and only cases using > SNES_QN_RESTART_POWELL are failing. > > I'm unfamiliar with Powell's restart criterion, but is it suppose to work > reasonably well with Quasi-Newton methods? 
I tried it on the simple problem > given in this example: > http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html > > And Powell restarts also fails to converge to a meaningful solution > (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods > do converge properly. > > Software information: > > PETSc version 3.7.2 (built from git maint branch) > PETSc arch: arch-linux2-c-opt > OS: Ubuntu 15.04 x64 > Compiler: gcc 4.9.2 > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Jul 14 18:27:02 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 14 Jul 2016 19:27:02 -0400 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: > > > > Notice that there are 7 orders of magnitude between the apparent residual > (using the preconditioner), and the actual residual, Ax - b. > You are using Hypre, and this generally means the Hypre coarse grid > operator is crap. Please > > Huh?, this data looks fine, both the true and preconditioned residual stay separated by about 9 orders of magnitude. This just tells you that the norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Thu Jul 14 18:28:38 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 14 Jul 2016 16:28:38 -0700 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: References: Message-ID: On Thu, Jul 14, 2016 at 4:22 PM, Matthew Knepley wrote: > On Thu, Jul 14, 2016 at 6:18 PM, Andrew Ho wrote: > >> I am trying to solve a simple ionization/recombination ODE using PETSc's >> quasi-newton SNES. >> >> This is a basic non-linear coupled ODE system: >> >> delta = -a u^2 + b u v >> d_t u = delta >> d_t v = -delta >> >> a and b are constants. >> >> I wrote a backwards Euler root finding function (yes, I know the TS >> module has BE implemented, but this is more of a learning exercise). >> >> Here is the function evaluation: >> >> struct ion_rec_ctx >>> { >>> PetscScalar rate_a, rate_b; >>> PetscScalar dt; >>> }; >>> PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) >>> { >>> const PetscScalar *xx; >>> PetscScalar *ff; >>> ion_rec_ctx& params = *reinterpret_cast(ctx); >>> CHKERRQ(VecGetArrayRead(x, &xx)); >>> CHKERRQ(VecGetArray(f,&ff)); >>> auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); >>> ff[0] = xx[0]-params.dt*delta; >>> >> > I do not understand this. Shouldn't it be (xx[0] - xxold[0]) here? > > Matt > No, the time discretization is as such: xnew = xold + dt*f(xnew) I re-arrange this to be xnew - dt*f(xnew) = xold The left hand side I am defining as g(x), which is what the bdf1 function evaluates. The SNES module solves for g(x) = b, so I simply set b = xold. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Thu Jul 14 18:29:06 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Jul 2016 18:29:06 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: > >> >> Notice that there are 7 orders of magnitude between the apparent residual >> (using the preconditioner), and the actual residual, Ax - b. >> You are using Hypre, and this generally means the Hypre coarse grid >> operator is crap. Please >> >> > Huh?, this data looks fine, both the true and preconditioned residual stay > separated by about 9 orders of magnitude. This just tells you that the norm > of A (or is it A^-1) is 10^9. Am I misunderstanding this? > This is why Barry and I asked for a comparsion with MUMPS. If you are right, and its just the condition number, the LU will not be any more accurate. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 14 19:50:26 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Jul 2016 19:50:26 -0500 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: References: Message-ID: <20513817-993F-41CC-8888-AAD5DF55922C@mcs.anl.gov> > On Jul 14, 2016, at 6:18 PM, Andrew Ho wrote: > > I am trying to solve a simple ionization/recombination ODE using PETSc's quasi-newton SNES. > > This is a basic non-linear coupled ODE system: > > delta = -a u^2 + b u v > d_t u = delta > d_t v = -delta > > a and b are constants. > > I wrote a backwards Euler root finding function (yes, I know the TS module has BE implemented, but this is more of a learning exercise). > > Here is the function evaluation: > > struct ion_rec_ctx > { > PetscScalar rate_a, rate_b; > PetscScalar dt; > }; > PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) > { > const PetscScalar *xx; > PetscScalar *ff; > ion_rec_ctx& params = *reinterpret_cast(ctx); > CHKERRQ(VecGetArrayRead(x, &xx)); > CHKERRQ(VecGetArray(f,&ff)); > auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); > ff[0] = xx[0]-params.dt*delta; > ff[1] = xx[1]-params.dt*-delta; > CHKERRQ(VecRestoreArrayRead(x,&xx)); > CHKERRQ(VecRestoreArray(f,&ff)); > return 0; > } > > To setup the solver and solve one time step: > > // q0, q1, and res are Vec's previously initialized > // initial conditions: q0 = [1e19,1e19] > SNES solver; > CHKERRQ(SNESCreate(comm, &solver)); > CHKERRQ(SNESSetType(solver, SNESQN)); > CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); > ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; > CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); > CHKERRQ(SNESSolve(solver, q0, q1)); > > When I run this, the solver fails to converge to a solution for this rather large time step. > The solution produced when the SNES module finally gives up is: > > q1 = [-2.72647e142, 2.72647e142] > > For reference, when I disable the scale and restart types, I get these values: > > q1 = [1.0279e17, 1.98972e19] > > This is only a problem when I use the SNES_QN_RESTART_POWELL restart type (seems to be regardless of the scale type type). I get reasonable answers for other combinations of restart/scale type. 
I've tried every combination of restart type/scale type except for SNES_QN_SCALE_JACOBIAN (my ultimate application doesn't have an available Jacobian), and only cases using SNES_QN_RESTART_POWELL are failing. > > I'm unfamiliar with Powell's restart criterion, but is it suppose to work reasonably well with Quasi-Newton methods? I tried it on the simple problem given in this example: http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html > > And Powell restarts also fails to converge to a meaningful solution (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods do converge properly. Could you please send the exact options you are using for the ex1.c that both fail and work and we'll see if there is some problem with the Powell restart. Thanks Barry > > Software information: > > PETSc version 3.7.2 (built from git maint branch) > PETSc arch: arch-linux2-c-opt > OS: Ubuntu 15.04 x64 > Compiler: gcc 4.9.2 From mfadams at lbl.gov Thu Jul 14 19:52:09 2016 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 14 Jul 2016 20:52:09 -0400 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: On Thu, Jul 14, 2016 at 7:29 PM, Matthew Knepley wrote: > On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: > >> >>> >>> Notice that there are 7 orders of magnitude between the apparent >>> residual (using the preconditioner), and the actual residual, Ax - b. >>> You are using Hypre, and this generally means the Hypre coarse grid >>> operator is crap. Please >>> >>> >> Huh?, this data looks fine, both the true and preconditioned residual >> stay separated by about 9 orders of magnitude. This just tells you that the >> norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? >> > > This is why Barry and I asked for a comparsion with MUMPS. If you are > right, and its just the condition number, > I said norm not condition number. I trust I'm missing something in this thread. > the LU > will not be any more accurate. > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 14 20:10:27 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 14 Jul 2016 20:10:27 -0500 Subject: [petsc-users] Multigrid with PML In-Reply-To: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> Message-ID: <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> This is a very difficult problem. I am not surprised that GAMG performs poorly, I would be surprised if it performed well at all. I think you need to do some googling of "helmholtz PML linear system solve" to find what other people have used. The first hit I got was this http://www.math.tau.ac.il/services/phd/dissertations/Singer_Ido.pdf and every iterative method he tried ended up requiring MANY iterations with refinement. This is 14 years old so there will be better suggestions out there. One that caught my eye was http://www.sciencedirect.com/science/article/pii/S0022247X11005063 Barry Just looking at the matrix makes it clear to me that conventional iterative methods are not going to work well, many of the diagonal entries are zero and even in rows with a diagonal entry it is much smaller in magnitude than the diagonal entries. 
> On Jul 13, 2016, at 2:30 PM, Safin, Artur wrote: > > Dear PETSc community, > > I am working on solving a Helmholtz problem with PML. The issue is that I am finding it very hard to deal with the resulting matrix system; I can get the correct solution for coarse meshes, but it takes roughly 2-4 times as long to converge for each successively refined mesh. I've noticed that without PML, I do not have problems with convergence speed. > > I am using the GMRES solver with GAMG as the preconditioner (with block-Jacobi preconditioner for the multigrid solves). I have also tried to assemble a separate preconditioning matrix with the complex shift 1+0.5i, that does not seem to improve the results. Currently I am running with > > -ksp_type fgmres \ > -pc_type gamg \ > -mg_levels_pc_type bjacobi \ > -pc_mg_type full \ > -ksp_gmres_restart 150 \ > > Can anyone suggest some way of speeding up the convergence? Any help would be appreciated. I am attaching the output from kspview. > > Best, > > Artur > > From andrewh0 at uw.edu Fri Jul 15 03:14:43 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Fri, 15 Jul 2016 01:14:43 -0700 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: <20513817-993F-41CC-8888-AAD5DF55922C@mcs.anl.gov> References: <20513817-993F-41CC-8888-AAD5DF55922C@mcs.anl.gov> Message-ID: I've attached two modified versions of ex1: ex1_powell.c uses the Powell restart ex1_none.c uses no restart For the default initial guess (x0 = [0.5,0.5]), both converge just fine. However, for the initial guess x0 = [3.,3.], the Powell solution fails to converge, while None and Periodic both still converge. This is with the "easy" equation set (run without -hard). Interestingly enough, the Powell restart still "finishes" in a reasonable number of iterations (7 iterations), but the residual is very large (on the order of 1e254). On Thu, Jul 14, 2016 at 5:50 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 6:18 PM, Andrew Ho wrote: > > > > I am trying to solve a simple ionization/recombination ODE using PETSc's > quasi-newton SNES. > > > > This is a basic non-linear coupled ODE system: > > > > delta = -a u^2 + b u v > > d_t u = delta > > d_t v = -delta > > > > a and b are constants. > > > > I wrote a backwards Euler root finding function (yes, I know the TS > module has BE implemented, but this is more of a learning exercise). > > > > Here is the function evaluation: > > > > struct ion_rec_ctx > > { > > PetscScalar rate_a, rate_b; > > PetscScalar dt; > > }; > > PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) > > { > > const PetscScalar *xx; > > PetscScalar *ff; > > ion_rec_ctx& params = *reinterpret_cast(ctx); > > CHKERRQ(VecGetArrayRead(x, &xx)); > > CHKERRQ(VecGetArray(f,&ff)); > > auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); > > ff[0] = xx[0]-params.dt*delta; > > ff[1] = xx[1]-params.dt*-delta; > > CHKERRQ(VecRestoreArrayRead(x,&xx)); > > CHKERRQ(VecRestoreArray(f,&ff)); > > return 0; > > } > > > > To setup the solver and solve one time step: > > > > // q0, q1, and res are Vec's previously initialized > > // initial conditions: q0 = [1e19,1e19] > > SNES solver; > > CHKERRQ(SNESCreate(comm, &solver)); > > CHKERRQ(SNESSetType(solver, SNESQN)); > > CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); > > ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; > > CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); > > CHKERRQ(SNESSolve(solver, q0, q1)); > > > > When I run this, the solver fails to converge to a solution for this > rather large time step. 
> > The solution produced when the SNES module finally gives up is: > > > > q1 = [-2.72647e142, 2.72647e142] > > > > For reference, when I disable the scale and restart types, I get these > values: > > > > q1 = [1.0279e17, 1.98972e19] > > > > This is only a problem when I use the SNES_QN_RESTART_POWELL restart > type (seems to be regardless of the scale type type). I get reasonable > answers for other combinations of restart/scale type. I've tried every > combination of restart type/scale type except for SNES_QN_SCALE_JACOBIAN > (my ultimate application doesn't have an available Jacobian), and only > cases using SNES_QN_RESTART_POWELL are failing. > > > > I'm unfamiliar with Powell's restart criterion, but is it suppose to > work reasonably well with Quasi-Newton methods? I tried it on the simple > problem given in this example: > http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html > > > > And Powell restarts also fails to converge to a meaningful solution > (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods > do converge properly. > > Could you please send the exact options you are using for the ex1.c > that both fail and work and we'll see if there is some problem with the > Powell restart. > > Thanks > > Barry > > > > > Software information: > > > > PETSc version 3.7.2 (built from git maint branch) > > PETSc arch: arch-linux2-c-opt > > OS: Ubuntu 15.04 x64 > > Compiler: gcc 4.9.2 > > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1_none.c Type: text/x-csrc Size: 9365 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex1_powell.c Type: text/x-csrc Size: 9367 bytes Desc: not available URL: From mfadams at lbl.gov Fri Jul 15 03:46:47 2016 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 15 Jul 2016 04:46:47 -0400 Subject: [petsc-users] Multigrid with PML In-Reply-To: <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> Message-ID: On Thu, Jul 14, 2016 at 9:10 PM, Barry Smith wrote: > > This is a very difficult problem. I am not surprised that GAMG performs > poorly, I would be surprised if it performed well at all. > > I think you need to do some googling of "helmholtz PML linear system > solve" to find what other people have used. The first hit I got was this > http://www.math.tau.ac.il/services/phd/dissertations/Singer_Ido.pdf and > every iterative method he tried ended up requiring MANY iterations with > refinement. This is 14 years old so there will be better suggestions out > there. One that caught my eye was > http://www.sciencedirect.com/science/article/pii/S0022247X11005063 > > > Barry > > Just looking at the matrix makes it clear to me that conventional > iterative methods are not going to work well, many of the diagonal entries > are zero and even in rows with a diagonal entry it is much smaller in > magnitude than the diagonal entries. > Indefinite Helmholtz is hard unless you are not shifting very far. This zero diagonals must come from PML. First get rid of PML and see if you can solve anything to your satisfaction. I have a paper on this, using AMG, and I tried to be inclusive, but I did miss a potentially useful method of adding a complex shift to damp the system. 
You can Google something like 'complex shift helmholtz damp'. If you are shifting deep (high frequency Helmholtz), then use direct solvers. > > > On Jul 13, 2016, at 2:30 PM, Safin, Artur > wrote: > > > > Dear PETSc community, > > > > I am working on solving a Helmholtz problem with PML. The issue is that > I am finding it very hard to deal with the resulting matrix system; I can > get the correct solution for coarse meshes, but it takes roughly 2-4 times > as long to converge for each successively refined mesh. I've noticed that > without PML, I do not have problems with convergence speed. > > > > I am using the GMRES solver with GAMG as the preconditioner (with > block-Jacobi preconditioner for the multigrid solves). I have also tried to > assemble a separate preconditioning matrix with the complex shift 1+0.5i, > that does not seem to improve the results. Currently I am running with > > > > -ksp_type fgmres \ > > -pc_type gamg \ > > -mg_levels_pc_type bjacobi \ > > -pc_mg_type full \ > > -ksp_gmres_restart 150 \ > > > > Can anyone suggest some way of speeding up the convergence? Any help > would be appreciated. I am attaching the output from kspview. > > > > Best, > > > > Artur > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From s_g at berkeley.edu Fri Jul 15 04:02:09 2016 From: s_g at berkeley.edu (Sanjay Govindjee) Date: Fri, 15 Jul 2016 02:02:09 -0700 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> Message-ID: I agree, this is an extra hard problem when you add PML to it. Here is a link to a paper that presents a few tricks applied to some aspects of this problem. Koyama, T. and Govindjee, S., ``Solving generalized complex-symmetriceigenvalue problems arising fromresonant MEMS simulations with PETSc," in Proceedings in AppliedMathematics and Mechanics, 1141701-1141702 (2008) . http://dx.doi.org/10.1002/pamm.200700206 -sg On 7/15/16 1:46 AM, Mark Adams wrote: > > > On Thu, Jul 14, 2016 at 9:10 PM, Barry Smith > wrote: > > > This is a very difficult problem. I am not surprised that GAMG > performs poorly, I would be surprised if it performed well at all. > > I think you need to do some googling of "helmholtz PML linear > system solve" to find what other people have used. The first hit I > got was this > http://www.math.tau.ac.il/services/phd/dissertations/Singer_Ido.pdf > and every iterative method he tried ended up requiring MANY > iterations with refinement. This is 14 years old so there will be > better suggestions out there. One that caught my eye was > http://www.sciencedirect.com/science/article/pii/S0022247X11005063 > > > Barry > > Just looking at the matrix makes it clear to me that conventional > iterative methods are not going to work well, many of the diagonal > entries are zero and even in rows with a diagonal entry it is much > smaller in magnitude than the diagonal entries. > > > Indefinite Helmholtz is hard unless you are not shifting very far. > This zero diagonals must come from PML. > > First get rid of PML and see if you can solve anything to your > satisfaction. > > I have a paper on this, using AMG, and I tried to be inclusive, but I > did miss a potentially useful method of adding a complex shift to damp > the system. You can Google something like 'complex shift helmholtz > damp'. If you are shifting deep (high frequency Helmholtz), then use > direct solvers. 
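As a concrete illustration of the complex-shift idea, here is a rough sketch of how a separately assembled, shifted preconditioning matrix is wired into the solve (this assumes a complex-scalar PETSc build; A, Pshift, b and x are placeholder names, not anything from the codes discussed in this thread):

    Mat A;        /* assembled Helmholtz + PML operator                    */
    Mat Pshift;   /* second copy assembled with a complex (damping) shift  */
    Vec b, x;
    KSP ksp;
    KSPCreate(PETSC_COMM_WORLD, &ksp);
    /* the Krylov method iterates with A, while the preconditioner         */
    /* (e.g. -pc_type gamg) is built from the damped matrix Pshift         */
    KSPSetOperators(ksp, A, Pshift);
    KSPSetFromOptions(ksp);
    KSPSolve(ksp, b, x);

The shift makes the preconditioning matrix much friendlier for multigrid, at the price of it being only an approximation to the true operator.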
> > > > On Jul 13, 2016, at 2:30 PM, Safin, Artur > > wrote: > > > > Dear PETSc community, > > > > I am working on solving a Helmholtz problem with PML. The issue > is that I am finding it very hard to deal with the resulting > matrix system; I can get the correct solution for coarse meshes, > but it takes roughly 2-4 times as long to converge for each > successively refined mesh. I've noticed that without PML, I do not > have problems with convergence speed. > > > > I am using the GMRES solver with GAMG as the preconditioner > (with block-Jacobi preconditioner for the multigrid solves). I > have also tried to assemble a separate preconditioning matrix with > the complex shift 1+0.5i, that does not seem to improve the > results. Currently I am running with > > > > -ksp_type fgmres \ > > -pc_type gamg \ > > -mg_levels_pc_type bjacobi \ > > -pc_mg_type full \ > > -ksp_gmres_restart 150 \ > > > > Can anyone suggest some way of speeding up the convergence? Any > help would be appreciated. I am attaching the output from kspview. > > > > Best, > > > > Artur > > > > > > -- ----------------------------------------------- Sanjay Govindjee, PhD, PE Professor of Civil Engineering 779 Davis Hall University of California Berkeley, CA 94720-1710 Voice: +1 510 642 6060 FAX: +1 510 643 5264 s_g at berkeley.edu http://www.ce.berkeley.edu/~sanjay ----------------------------------------------- Books: Engineering Mechanics of Deformable Solids: A Presentation with Exercises http://www.oup.com/us/catalog/general/subject/Physics/MaterialsScience/?view=usa&ci=9780199651641 http://ukcatalogue.oup.com/product/9780199651641.do http://amzn.com/0199651647 Engineering Mechanics 3 (Dynamics) 2nd Edition http://www.springer.com/978-3-642-53711-0 http://amzn.com/3642537111 Engineering Mechanics 3, Supplementary Problems: Dynamics http://www.amzn.com/B00SOXN8JU ----------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon at arrowtheory.com Fri Jul 15 07:29:13 2016 From: simon at arrowtheory.com (Simon Burton) Date: Fri, 15 Jul 2016 22:29:13 +1000 Subject: [petsc-users] slepc eating all my ram Message-ID: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> Hi, I'm running a slepc eigenvalue solver on a single machine with 198GB of ram, and solution space dimension 2^32. With double precision this means each vector is 32GB. I'm using shell matrices to implement the matrix vector product. I figured the easiest way to get eigenvalues is using the slepc power method, but it is still eating all the ram. Running in gdb I see that slepc is allocating a bunch of vectors in the spectral transform object (in STSetUp), and by this time it has consumed most of the 198GB of ram. I don't see why a spectral transform shift of zero needs to alloc a whole bunch of memory. I'm wondering if there are some other options to slepc that can reduce the memory footprint? A barebones implementation of the power method only needs to keep two vectors, perhaps I should just try doing this using petsc primitives. It's also possible that I could spread the computation over two or more machines but that's a whole other learning curve. The code I am running is essentially the laplacian grid example from slepc (src/eps/examples/tutorials/ex3.c): ./ex3 -eps_hermitian -eps_largest_magnitude -eps_monitor ascii -eps_nev 1 -eps_type power -n 65536 I also put this line in the source: EPSSetDimensions(eps,1,2,1); Cheers, Simon. 
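For what it is worth, a bare-bones power iteration on a shell matrix really does need only two work vectors; a rough, untested sketch with plain PETSc calls (no convergence test or error checking; A is the user's shell matrix and maxit a chosen iteration cap):

    Vec       x, y;        /* current iterate and A*x: two big vectors in total    */
    PetscReal nrm;
    PetscInt  it, maxit = 100;
    MatCreateVecs(A, &x, &y);        /* or create them to match the shell matrix   */
    VecSetRandom(x, NULL);
    VecNormalize(x, NULL);
    for (it = 0; it < maxit; it++) {
      MatMult(A, x, y);              /* user-provided shell mat-vec                */
      VecNorm(y, NORM_2, &nrm);      /* |A x| estimates the dominant |eigenvalue|  */
      VecScale(y, 1.0/nrm);
      VecCopy(y, x);                 /* or simply swap the two Vec handles         */
    }
    PetscPrintf(PETSC_COMM_WORLD, "largest |lambda| approx %g\n", (double)nrm);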
From domenico_lahaye at yahoo.com Fri Jul 15 08:02:00 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Fri, 15 Jul 2016 13:02:00 +0000 (UTC) Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> Message-ID: <1581901516.4078953.1468587720083.JavaMail.yahoo@mail.yahoo.com> Dear Artur, ? Out of a blend of curiosity and healthy naivity: have you tried complex shifted Laplace as a preconditioner? ? Greetings, Domenico Lahaye. From: Sanjay Govindjee To: petsc-users at mcs.anl.gov Sent: Friday, July 15, 2016 11:02 AM Subject: Re: [petsc-users] Multigrid with PML I agree, this is an extra hard problem when you add PML to it.? Here is a link to a paper that presents a few tricks applied to some aspects of this problem. Koyama, T. and Govindjee, S., ``Solving generalized complex-symmetriceigenvalue problems arising fromresonant MEMS simulations with PETSc," in Proceedings in AppliedMathematics and Mechanics, 1141701-1141702 (2008). http://dx.doi.org/10.1002/pamm.200700206 -sg On 7/15/16 1:46 AM, Mark Adams wrote: On Thu, Jul 14, 2016 at 9:10 PM, Barry Smith wrote: ? ?This is a very difficult problem. I am not surprised that GAMG performs poorly, I would be surprised if it performed well at all. ? ?I think you need to do some googling of? ?"helmholtz PML linear system solve" to find what other people have used. The first hit I got was this http://www.math.tau.ac.il/services/phd/dissertations/Singer_Ido.pdf and every iterative method he tried ended up requiring MANY iterations with refinement. This is 14 years old so there will be better suggestions out there. One that caught my eye was http://www.sciencedirect.com/science/article/pii/S0022247X11005063 ? Barry Just looking at the matrix makes it clear to me that conventional iterative methods are not going to work well, many of the diagonal entries are zero and even in rows with a diagonal entry it is much smaller in magnitude than the diagonal entries. Indefinite Helmholtz is hard unless you are not shifting very far. This zero diagonals must come from PML. First get rid of PML and see if you can solve anything to your satisfaction. I have a paper on this, using AMG, and I tried to be inclusive, but I did miss a potentially useful method of adding a complex shift to damp the system. You can Google something like 'complex shift helmholtz damp'.? If you are shifting deep (high frequency Helmholtz), then use direct solvers. ? > On Jul 13, 2016, at 2:30 PM, Safin, Artur wrote: > > Dear PETSc community, > > I am working on solving a Helmholtz problem with PML. The issue is that I am finding it very hard to deal with the resulting matrix system; I can get the correct solution for coarse meshes, but it takes roughly 2-4 times as long to converge for each successively refined mesh. I've noticed that without PML, I do not have problems with convergence speed. > > I am using the GMRES solver with GAMG as the preconditioner (with block-Jacobi preconditioner for the multigrid solves). I have also tried to assemble a separate preconditioning matrix with the complex shift 1+0.5i, that does not seem to improve the results. Currently I am running with > >? ? -ksp_type fgmres \ >? ? -pc_type gamg \ >? ? -mg_levels_pc_type bjacobi \ >? ? -pc_mg_type full \ >? ? -ksp_gmres_restart 150 \ > > Can anyone suggest some way of speeding up the convergence? Any help would be appreciated. I am attaching the output from kspview. 
> > Best, > > Artur > > -- ----------------------------------------------- Sanjay Govindjee, PhD, PE Professor of Civil Engineering 779 Davis Hall University of California Berkeley, CA 94720-1710 Voice: +1 510 642 6060 FAX: +1 510 643 5264 s_g at berkeley.edu http://www.ce.berkeley.edu/~sanjay ----------------------------------------------- Books: Engineering Mechanics of Deformable Solids: A Presentation with Exercises http://www.oup.com/us/catalog/general/subject/Physics/MaterialsScience/?view=usa&ci=9780199651641 http://ukcatalogue.oup.com/product/9780199651641.do http://amzn.com/0199651647 Engineering Mechanics 3 (Dynamics) 2nd Edition http://www.springer.com/978-3-642-53711-0 http://amzn.com/3642537111 Engineering Mechanics 3, Supplementary Problems: Dynamics http://www.amzn.com/B00SOXN8JU ----------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Fri Jul 15 11:13:27 2016 From: hzhang at mcs.anl.gov (Hong) Date: Fri, 15 Jul 2016 11:13:27 -0500 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> Message-ID: Simon : For '-eps_hermitian -eps_largest_magnitude', why do you need 'spectral transform'? Try slepc default method for ex3.c. Hong > > Hi, > > I'm running a slepc eigenvalue solver on a single machine with 198GB of > ram, > and solution space dimension 2^32. With double precision this means > each vector is 32GB. I'm using shell matrices to implement the matrix > vector product. I figured the easiest way to get eigenvalues is using > the slepc power method, but it is still eating all the ram. > > Running in gdb I see that slepc is allocating a bunch of vectors in > the spectral transform object (in STSetUp), and by this time it has > consumed > most of the 198GB of ram. I don't see why a spectral transform > shift of zero needs to alloc a whole bunch of memory. > > I'm wondering if there are some other options to slepc that can > reduce the memory footprint? A barebones implementation of the > power method only needs to keep two vectors, perhaps I should > just try doing this using petsc primitives. It's also possible that > I could spread the computation over two or more machines but > that's a whole other learning curve. > > The code I am running is essentially the laplacian grid > example from slepc (src/eps/examples/tutorials/ex3.c): > > ./ex3 -eps_hermitian -eps_largest_magnitude -eps_monitor ascii -eps_nev 1 > -eps_type power -n 65536 > > I also put this line in the source: > EPSSetDimensions(eps,1,2,1); > > Cheers, > > Simon. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From hgbk2008 at gmail.com Fri Jul 15 11:28:00 2016 From: hgbk2008 at gmail.com (Hoang Giang Bui) Date: Fri, 15 Jul 2016 18:28:00 +0200 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: I used -ksp_monitor_true_residual -ksp_monitor_true_solution -ksp_converged_reason with MUMPS but it does not compute the true residual. Should I compute that myself? 
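For reference, checking the true residual by hand only takes a mat-vec and a norm once the solve has returned; a minimal sketch, assuming the same A, b and computed solution x that were handed to the KSP:

    Vec       r;
    PetscReal rnorm, bnorm;
    VecDuplicate(b, &r);
    MatMult(A, x, r);           /* r = A x      */
    VecAYPX(r, -1.0, b);        /* r = b - A x  */
    VecNorm(r, NORM_2, &rnorm);
    VecNorm(b, NORM_2, &bnorm);
    PetscPrintf(PETSC_COMM_WORLD, "true residual %g, relative %g\n", (double)rnorm, (double)(rnorm/bnorm));
    VecDestroy(&r);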
Below is a sample for a full log of MUMPS https://www.dropbox.com/s/fy5uknooxw77r19/log13Jun16_mumps?dl=0 Giang On Fri, Jul 15, 2016 at 2:52 AM, Mark Adams wrote: > > > On Thu, Jul 14, 2016 at 7:29 PM, Matthew Knepley > wrote: > >> On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: >> >>> >>>> >>>> Notice that there are 7 orders of magnitude between the apparent >>>> residual (using the preconditioner), and the actual residual, Ax - b. >>>> You are using Hypre, and this generally means the Hypre coarse grid >>>> operator is crap. Please >>>> >>>> >>> Huh?, this data looks fine, both the true and preconditioned residual >>> stay separated by about 9 orders of magnitude. This just tells you that the >>> norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? >>> >> >> This is why Barry and I asked for a comparsion with MUMPS. If you are >> right, and its just the condition number, >> > > I said norm not condition number. I trust I'm missing something in this > thread. > > >> the LU >> will not be any more accurate. >> >> Matt >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 15 11:32:55 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Jul 2016 11:32:55 -0500 Subject: [petsc-users] different convergence behaviour In-Reply-To: References: <66004C23-63C9-4A3E-A7DF-1352AC26412F@mcs.anl.gov> Message-ID: <7E31CA9A-7717-4E0D-9E58-3BE243A05AB4@mcs.anl.gov> Use -ksp_type gmres to get it to print the residuals. With preonly it doesn't compute or print them. > On Jul 15, 2016, at 11:28 AM, Hoang Giang Bui wrote: > > I used > > -ksp_monitor_true_residual > -ksp_monitor_true_solution > -ksp_converged_reason > > with MUMPS but it does not compute the true residual. Should I compute that myself? > > Below is a sample for a full log of MUMPS > https://www.dropbox.com/s/fy5uknooxw77r19/log13Jun16_mumps?dl=0 > > > Giang > > On Fri, Jul 15, 2016 at 2:52 AM, Mark Adams wrote: > > > On Thu, Jul 14, 2016 at 7:29 PM, Matthew Knepley wrote: > On Thu, Jul 14, 2016 at 6:27 PM, Mark Adams wrote: > > > Notice that there are 7 orders of magnitude between the apparent residual (using the preconditioner), and the actual residual, Ax - b. > You are using Hypre, and this generally means the Hypre coarse grid operator is crap. Please > > > Huh?, this data looks fine, both the true and preconditioned residual stay separated by about 9 orders of magnitude. This just tells you that the norm of A (or is it A^-1) is 10^9. Am I misunderstanding this? > > This is why Barry and I asked for a comparsion with MUMPS. If you are right, and its just the condition number, > > I said norm not condition number. I trust I'm missing something in this thread. > > the LU > will not be any more accurate. > > Matt > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > From simon at arrowtheory.com Fri Jul 15 12:12:36 2016 From: simon at arrowtheory.com (Simon Burton) Date: Sat, 16 Jul 2016 03:12:36 +1000 Subject: [petsc-users] slepc eating all my ram In-Reply-To: References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> Message-ID: <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> Hi, just like this? 
./ex3 -eps_nev 1 -eps_type power -n 65536 -info I still see: [0] STSetUp(): Setting up new ST and that's when memory usage reaches to 192GB and the machine can't take it. I don't understand why the default behaviour creates a spectral transform object that then needs so much memory. thanks, Simon. On Fri, 15 Jul 2016 11:13:27 -0500 Hong wrote: > Simon : > For '-eps_hermitian -eps_largest_magnitude', why do you need 'spectral > transform'? > Try slepc default method for ex3.c. > > Hong > > > > > Hi, > > > > I'm running a slepc eigenvalue solver on a single machine with 198GB of > > ram, > > and solution space dimension 2^32. With double precision this means > > each vector is 32GB. I'm using shell matrices to implement the matrix > > vector product. I figured the easiest way to get eigenvalues is using > > the slepc power method, but it is still eating all the ram. > > > > Running in gdb I see that slepc is allocating a bunch of vectors in > > the spectral transform object (in STSetUp), and by this time it has > > consumed > > most of the 198GB of ram. I don't see why a spectral transform > > shift of zero needs to alloc a whole bunch of memory. > > > > I'm wondering if there are some other options to slepc that can > > reduce the memory footprint? A barebones implementation of the > > power method only needs to keep two vectors, perhaps I should > > just try doing this using petsc primitives. It's also possible that > > I could spread the computation over two or more machines but > > that's a whole other learning curve. > > > > The code I am running is essentially the laplacian grid > > example from slepc (src/eps/examples/tutorials/ex3.c): > > > > ./ex3 -eps_hermitian -eps_largest_magnitude -eps_monitor ascii -eps_nev 1 > > -eps_type power -n 65536 > > > > I also put this line in the source: > > EPSSetDimensions(eps,1,2,1); > > > > Cheers, > > > > Simon. > > > > From jroman at dsic.upv.es Fri Jul 15 12:53:31 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 15 Jul 2016 19:53:31 +0200 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> Message-ID: <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> > El 15 jul 2016, a las 19:12, Simon Burton escribi?: > > Hi, > > just like this? > ./ex3 -eps_nev 1 -eps_type power -n 65536 -info > > I still see: > [0] STSetUp(): Setting up new ST > > and that's when memory usage reaches to 192GB and the machine can't take it. > > I don't understand why the default behaviour creates a spectral transform > object that then needs so much memory. > > thanks, > > Simon. The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? Do you get the same behaviour with the original ex3 with the same problem size? Do you have the same problem with a smaller problem? 
(half size, say) Jose From simon at arrowtheory.com Fri Jul 15 16:17:44 2016 From: simon at arrowtheory.com (Simon Burton) Date: Sat, 16 Jul 2016 07:17:44 +1000 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> Message-ID: <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> On Fri, 15 Jul 2016 19:53:31 +0200 "Jose E. Roman" wrote: > > The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? Yes I think you are right. I can get beyond STSetUp with the right settings. Now the solver runs out of memory inside EPSGetStartVector. > > Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? good question. I didn't change much, let me try again the original. > Do you get the same behaviour with the original ex3 with the same problem size? Yes > > Do you have the same problem with a smaller problem? (half size, say) Halving n gives a quarter of the dimension, which is 8gb vector sizes. It works fine and uses a total of 48gb ram. Oh, I see at one point during initialization it hits a maximum of 56gb. So I guess it needs to keep 6 vectors in total. With the original problem size this becomes 192gb which is just a few gb too much to crunch. I guess I can still try it, but it doesn't feel good hitting the harddrive that much. Thanks for the suggestions. Simon. From aks084000 at utdallas.edu Fri Jul 15 18:29:58 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Fri, 15 Jul 2016 23:29:58 +0000 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov>, Message-ID: Barry, Thank you for taking a look at my problem. I will see if I can implement some of the methods available in literature. Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 15 22:26:04 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Jul 2016 22:26:04 -0500 Subject: [petsc-users] SNES_QN_RESTART_POWELL fails to converge? In-Reply-To: References: <20513817-993F-41CC-8888-AAD5DF55922C@mcs.anl.gov> Message-ID: Andrew, Thanks for your code. I look through the QN code and it seems ok, the one funny thing is that it applies the Powell criteria after the first iterations (before the L-BFGS has properly started) which is why the solution just continues to grow and grow. Essentially with the Powell test it is never starting L-BFSG. I have made two changes 1) branch barry/fix-snes-qn-powell/maint that changes the code so that the Powel check is not done until the first full iteration of L-BFGS has been completed. This now gets the ex1_powell.c code to converge (with 18 iterations). Of course waiting for one full iteration of L-BFGS is arbitrary, perhaps 2 is better, I do not know. 2) barry/add-snes-divtol this adds a divergence test to SNES; it was goofy that even though residual norm was increasing without bound the SNES iteration continued to iterate. I added a new convergence test that if the residual grows (default) by 1e4 then the iteration is stopped with a divergence error. 
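A sketch of how the new test is meant to be driven once the branch is merged (the exact function and option names below are an assumption based on the branch name, so check the branch itself):

    /* cap the allowed growth of the residual norm before SNES flags divergence */
    SNESSetDivergenceTolerance(snes, 1.e4);
    /* or on the command line: -snes_divergence_tolerance 1e4 */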
Thanks for reporting these problems, Barry > On Jul 15, 2016, at 3:14 AM, Andrew Ho wrote: > > I've attached two modified versions of ex1: > > ex1_powell.c uses the Powell restart > ex1_none.c uses no restart > > For the default initial guess (x0 = [0.5,0.5]), both converge just fine. However, for the initial guess x0 = [3.,3.], the Powell solution fails to converge, while None and Periodic both still converge. This is with the "easy" equation set (run without -hard). > > Interestingly enough, the Powell restart still "finishes" in a reasonable number of iterations (7 iterations), but the residual is very large (on the order of 1e254). > > On Thu, Jul 14, 2016 at 5:50 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 6:18 PM, Andrew Ho wrote: > > > > I am trying to solve a simple ionization/recombination ODE using PETSc's quasi-newton SNES. > > > > This is a basic non-linear coupled ODE system: > > > > delta = -a u^2 + b u v > > d_t u = delta > > d_t v = -delta > > > > a and b are constants. > > > > I wrote a backwards Euler root finding function (yes, I know the TS module has BE implemented, but this is more of a learning exercise). > > > > Here is the function evaluation: > > > > struct ion_rec_ctx > > { > > PetscScalar rate_a, rate_b; > > PetscScalar dt; > > }; > > PetscErrorCode bdf1(SNES snes, Vec x, Vec f, void *ctx) > > { > > const PetscScalar *xx; > > PetscScalar *ff; > > ion_rec_ctx& params = *reinterpret_cast(ctx); > > CHKERRQ(VecGetArrayRead(x, &xx)); > > CHKERRQ(VecGetArray(f,&ff)); > > auto delta = (-params.rate_a*xx[0]*xx[0]+params.rate_b*xx[1]*xx[0]); > > ff[0] = xx[0]-params.dt*delta; > > ff[1] = xx[1]-params.dt*-delta; > > CHKERRQ(VecRestoreArrayRead(x,&xx)); > > CHKERRQ(VecRestoreArray(f,&ff)); > > return 0; > > } > > > > To setup the solver and solve one time step: > > > > // q0, q1, and res are Vec's previously initialized > > // initial conditions: q0 = [1e19,1e19] > > SNES solver; > > CHKERRQ(SNESCreate(comm, &solver)); > > CHKERRQ(SNESSetType(solver, SNESQN)); > > CHKERRQ(SNESQNSetType(solver, SNES_QN_LBFGS)); > > ion_rec_ctx params = {9.59e-16, 1.15e-19, 1.}; > > CHKERRQ(SNESSetFunction(solver, res, &bdf1, ¶ms)); > > CHKERRQ(SNESSolve(solver, q0, q1)); > > > > When I run this, the solver fails to converge to a solution for this rather large time step. > > The solution produced when the SNES module finally gives up is: > > > > q1 = [-2.72647e142, 2.72647e142] > > > > For reference, when I disable the scale and restart types, I get these values: > > > > q1 = [1.0279e17, 1.98972e19] > > > > This is only a problem when I use the SNES_QN_RESTART_POWELL restart type (seems to be regardless of the scale type type). I get reasonable answers for other combinations of restart/scale type. I've tried every combination of restart type/scale type except for SNES_QN_SCALE_JACOBIAN (my ultimate application doesn't have an available Jacobian), and only cases using SNES_QN_RESTART_POWELL are failing. > > > > I'm unfamiliar with Powell's restart criterion, but is it suppose to work reasonably well with Quasi-Newton methods? I tried it on the simple problem given in this example: http://www.mcs.anl.gov/petsc/petsc-current/src/snes/examples/tutorials/ex1.c.html > > > > And Powell restarts also fails to converge to a meaningful solution (solving for f(x) = [1,1], for x0 = [1,1]), but the other restart methods do converge properly. > > Could you please send the exact options you are using for the ex1.c that both fail and work and we'll see if there is some problem with the Powell restart. 
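For reference, the restart and scale variants under discussion can also be selected at run time rather than in code; the option names below are as of PETSc 3.7 and are worth re-checking with -help:

./ex1 -snes_type qn -snes_qn_restart_type powell -snes_monitor -snes_converged_reason
./ex1 -snes_type qn -snes_qn_restart_type none -snes_monitor -snes_converged_reason
./ex1 -snes_type qn -snes_qn_restart_type periodic -snes_qn_scale_type diagonal -snes_monitor -snes_converged_reason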
> > Thanks > > Barry > > > > > Software information: > > > > PETSc version 3.7.2 (built from git maint branch) > > PETSc arch: arch-linux2-c-opt > > OS: Ubuntu 15.04 x64 > > Compiler: gcc 4.9.2 > > > > > -- > Andrew Ho > From bsmith at mcs.anl.gov Fri Jul 15 22:48:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 15 Jul 2016 22:48:14 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> Message-ID: > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c This example is really written as only a one level solver, making it work with geometric multigrid is not clean > I am still new to the DMDA structure > and likely not giving it as much time as it deserves. However, I do not see immediately > what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined > after calling DMCoarsenHierarchy, but that failed. > > I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform > a multigrid solve on the preconditioner. In a next stage I want to implement the deflation > using DMDA as well. You should look at ex25.c in the same directory. Here ierr = KSPSetDM(ksp,da);CHKERRQ(ierr); ierr = KSPSetComputeRHS(ksp,ComputeRHS,&user);CHKERRQ(ierr); ierr = KSPSetComputeOperators(ksp,ComputeMatrix,&user);CHKERRQ(ierr); make it straight forward to work with multigrid. The KSP object can mange the hierarchy of grids since it is provided with the DM and the ComputeRHS and ComputeMatrix provide a way for the multigrid preconditioner to automatically generate the needed matrix on each level without you having to manage it yourself. For example the rule in the makefile runex25: -@${MPIEXEC} -n 1 ./ex25 -pc_type mg -ksp_type fgmres -da_refine 2 -ksp_monitor_short -mg_levels_ksp_monitor_short -mg_levels_ksp_norm_type unpreconditioned -ksp_view -pc_mg_type full > ex25_1.tmp 2>&1; \ if (${DIFF} output/ex25_1.out ex25_1.tmp) then true; \ else printf "${PWD}\nPossible problem with ex25_1, diffs above\n=========================================\n"; fi; \ ${RM} -f ex25_1.tmp shows how to run with two levels. etc. > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, > author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune > and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp > and Dinesh Kaushik and Matthew~G. Knepley > and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith > and Stefano Zampini and Hong Zhang and Hong Zhang}, > title = {{PETS}c {W}eb page}, > url = {http://www.mcs.anl.gov/petsc}, > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > year = {2016} > } > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title = "OpenFOAM", > > howpublished = "\url{http://www.openfoam.com}", > > url = {http://www.openfoam.com}, > > note = "OpenFOAM is a free, open source CFD software package. 
It allows PETSc linear algebra and solvers to be used underneath.", > > key = "OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > From knepley at gmail.com Fri Jul 15 22:54:48 2016 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 15 Jul 2016 22:54:48 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Fri, Jul 15, 2016 at 10:48 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye > wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c > > This example is really written as only a one level solver, making it > work with geometric multigrid is not clean > > > I am still new to the DMDA structure > > and likely not giving it as much time as it deserves. However, I do > not see immediately > > what function is responsible for calling PCMGSetSmoother and > PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator > is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit > DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to > implement the deflation > > using DMDA as well. > > You should look at ex25.c in the same directory. Here > > ierr = KSPSetDM(ksp,da);CHKERRQ(ierr); > ierr = KSPSetComputeRHS(ksp,ComputeRHS,&user);CHKERRQ(ierr); > ierr = KSPSetComputeOperators(ksp,ComputeMatrix,&user);CHKERRQ(ierr); > > make it straight forward to work with multigrid. The KSP object can mange > the hierarchy of grids since it is provided with the DM > and the ComputeRHS and ComputeMatrix provide a way for the multigrid > preconditioner to automatically generate the needed matrix on each level > without you having to manage it yourself. For example the rule in the > makefile > > runex25: > -@${MPIEXEC} -n 1 ./ex25 -pc_type mg -ksp_type fgmres -da_refine 2 > -ksp_monitor_short -mg_levels_ksp_monitor_short -mg_levels_ksp_norm_type > unpreconditioned -ksp_view -pc_mg_type full > ex25_1.tmp 2>&1; \ > if (${DIFF} output/ex25_1.out ex25_1.tmp) then true; \ > else printf "${PWD}\nPossible problem with ex25_1, diffs > above\n=========================================\n"; fi; \ > ${RM} -f ex25_1.tmp > > shows how to run with two levels. etc. > > > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. > Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor > Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and > Barry~F. Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? 
> That is actually two different people with the same name. > > 3) On > http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 > I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It > allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > They only use solvers, and not the DM stuff as far as I know. > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE >From what I can tell, the code is not open source. Is that right? Thanks, Matt > > > Cheers, Domenico Lahaye. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon at arrowtheory.com Sat Jul 16 08:40:24 2016 From: simon at arrowtheory.com (Simon Burton) Date: Sat, 16 Jul 2016 23:40:24 +1000 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> Message-ID: <20160716234024.5d13e6ec0021548c2022bbe0@arrowtheory.com> Hi again, I found another machine with enough ram to run this (i think). Running into another problem now, with dgemv: [0] EPSSetUp_Power(): Warning: parameter mpd ignored [0] STSetUp(): Setting up new ST Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . 
[0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used I dug into this in gdb a bit: Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so (gdb) bt #0 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 #2 0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150 #3 0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191 #4 0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28, norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81 #5 0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214 #6 0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371 #7 0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758 #8 0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103 #9 0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101 #10 0x0000000000401430 in main () (gdb) up #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 274 if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one)); (gdb) print n $1 = 4294967296 (gdb) print sizeof(n) $2 = 8 (gdb) step Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . It looks to me like slepc is doing it right, but with error messages like this who knows. It's a bit beyond me debugging assembly. Originally I built petsc with --download-fblaslapack but i don't think it was working with 64bit indexes (?) Maybe I should try another blas. Simon. On Sat, 16 Jul 2016 07:17:44 +1000 Simon Burton wrote: > On Fri, 15 Jul 2016 19:53:31 +0200 > "Jose E. Roman" wrote: > > > > > The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? > > Yes I think you are right. > I can get beyond STSetUp with the right settings. > Now the solver runs out of memory inside EPSGetStartVector. > > > > > Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? > > good question. I didn't change much, let me try again the original. > > > Do you get the same behaviour with the original ex3 with the same problem size? > > Yes > > > > > Do you have the same problem with a smaller problem? (half size, say) > > Halving n gives a quarter of the dimension, which is 8gb vector sizes. 
> It works fine and uses a total of 48gb ram. Oh, I see at one point during > initialization it hits a maximum of 56gb. > > So I guess it needs to keep 6 vectors in total. > With the original problem size this becomes 192gb which is > just a few gb too much to crunch. I guess I can still try it, > but it doesn't feel good hitting the harddrive that much. > > Thanks for the suggestions. > > Simon. From bsmith at mcs.anl.gov Sat Jul 16 10:00:58 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 16 Jul 2016 10:00:58 -0500 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <20160716234024.5d13e6ec0021548c2022bbe0@arrowtheory.com> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> <20160716234024.5d13e6ec0021548c2022bbe0@arrowtheory.com> Message-ID: <27AC55B0-C1E7-4181-9ECD-A3CE6F795EAC@mcs.anl.gov> Send configure.log to petsc-maint at mcs.anl.gov > On Jul 16, 2016, at 8:40 AM, Simon Burton wrote: > > > Hi again, > > I found another machine with enough ram to run this (i think). > > Running into another problem now, with dgemv: > > [0] EPSSetUp_Power(): Warning: parameter mpd ignored > [0] STSetUp(): Setting up new ST > Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . > [0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used > > > I dug into this in gdb a bit: > > > Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ () > from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so > (gdb) bt > #0 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so > #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, > y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 > #2 0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) > at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150 > #3 0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) > at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191 > #4 0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28, > norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81 > #5 0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) > at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214 > #6 0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) > at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371 > #7 0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0) > at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758 > #8 0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103 > #9 0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101 > #10 0x0000000000401430 in main () > (gdb) up > #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, > y=0x75a3b0, 
mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 > 274 if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one)); > (gdb) print n > $1 = 4294967296 > (gdb) print sizeof(n) > $2 = 8 > (gdb) step > Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . > > > It looks to me like slepc is doing it right, but with error messages > like this who knows. It's a bit beyond me debugging assembly. > > Originally I built petsc with --download-fblaslapack but i don't think > it was working with 64bit indexes (?) > > Maybe I should try another blas. > > Simon. > > > On Sat, 16 Jul 2016 07:17:44 +1000 > Simon Burton wrote: > >> On Fri, 15 Jul 2016 19:53:31 +0200 >> "Jose E. Roman" wrote: >> >>> >>> The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? >> >> Yes I think you are right. >> I can get beyond STSetUp with the right settings. >> Now the solver runs out of memory inside EPSGetStartVector. >> >>> >>> Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? >> >> good question. I didn't change much, let me try again the original. >> >>> Do you get the same behaviour with the original ex3 with the same problem size? >> >> Yes >> >>> >>> Do you have the same problem with a smaller problem? (half size, say) >> >> Halving n gives a quarter of the dimension, which is 8gb vector sizes. >> It works fine and uses a total of 48gb ram. Oh, I see at one point during >> initialization it hits a maximum of 56gb. >> >> So I guess it needs to keep 6 vectors in total. >> With the original problem size this becomes 192gb which is >> just a few gb too much to crunch. I guess I can still try it, >> but it doesn't feel good hitting the harddrive that much. >> >> Thanks for the suggestions. >> >> Simon. From bsmith at mcs.anl.gov Sat Jul 16 22:11:23 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 16 Jul 2016 22:11:23 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> Message-ID: <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure > and likely not giving it as much time as it deserves. However, I do not see immediately > what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined > after calling DMCoarsenHierarchy, but that failed. > > I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform > a multigrid solve on the preconditioner. In a next stage I want to implement the deflation > using DMDA as well. > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, > author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune > and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp > and Dinesh Kaushik and Matthew~G. Knepley > and Lois Curfman McInnes and Karl Rupp and Barry~F. 
Smith > and Stefano Zampini and Hong Zhang and Hong Zhang}, > title = {{PETS}c {W}eb page}, > url = {http://www.mcs.anl.gov/petsc}, > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > year = {2016} > } > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title = "OpenFOAM", > > howpublished = "\url{http://www.openfoam.com}", > > url = {http://www.openfoam.com}, > > note = "OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > key = "OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. Barry > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > From knepley at gmail.com Sun Jul 17 07:29:59 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sun, 17 Jul 2016 07:29:59 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> Message-ID: On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye > wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the > DMDA structure > > and likely not giving it as much time as it deserves. However, I do > not see immediately > > what function is responsible for calling PCMGSetSmoother and > PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator > is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit > DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to > implement the deflation > > using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. > Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor > Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and > Barry~F. Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? 
> > > > 3) On > http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 > I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It > allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > > Very good question. It seems that this citation is wrong or no longer > valid; I have removed it from the PETSc repository. I could find no mention > of PETSc usage in the OpenFoam and its third party packages. I think we > should not have been listing this citation. This suggests that people are using it with OpenFOAM: http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf Matt > > Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jul 17 12:40:52 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 17 Jul 2016 19:40:52 +0200 Subject: [petsc-users] slepc eating all my ram In-Reply-To: <27AC55B0-C1E7-4181-9ECD-A3CE6F795EAC@mcs.anl.gov> References: <20160715222913.df7b3dd606ec173f7cac6a8e@arrowtheory.com> <20160716031236.4f52e3e02814cfe42d83a7b6@arrowtheory.com> <4455A442-710F-412A-9B7F-32D690B4E1F0@dsic.upv.es> <20160716071744.50ec5af125d99abc4c0ffd7c@arrowtheory.com> <20160716234024.5d13e6ec0021548c2022bbe0@arrowtheory.com> <27AC55B0-C1E7-4181-9ECD-A3CE6F795EAC@mcs.anl.gov> Message-ID: Simon: I have made a few optimizations regarding memory management in EPS. In your case, these changes will allocate 1 vector less (maybe 2). If you are using the repository version, just pull and try again. Otherwise, wait until slepc-3.7.2 is released (in a few days). Jose > El 16 jul 2016, a las 17:00, Barry Smith escribi?: > > > Send configure.log to petsc-maint at mcs.anl.gov > > >> On Jul 16, 2016, at 8:40 AM, Simon Burton wrote: >> >> >> Hi again, >> >> I found another machine with enough ram to run this (i think). >> >> Running into another problem now, with dgemv: >> >> [0] EPSSetUp_Power(): Warning: parameter mpd ignored >> [0] STSetUp(): Setting up new ST >> Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . 
>> [0] BV_SafeSqrt(): Zero norm, either the vector is zero or a semi-inner product is being used >> >> >> I dug into this in gdb a bit: >> >> >> Breakpoint 2, 0x00007ffff4f4cbd0 in dgemv_ () >> from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so >> (gdb) bt >> #0 0x00007ffff4f4cbd0 in dgemv_ () from /usr/physics/ic15/composer_xe_2015.0.090/mkl/lib/intel64/libmkl_intel_lp64.so >> #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, >> y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 >> #2 0x00007ffff5dcbd86 in BVDotVec_Svec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) >> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/impls/svec/svec.c:150 >> #3 0x00007ffff5dffd58 in BVDotVec (X=0x6ba6b0, y=0x74dbc0, m=0x75a3b0) >> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvglobal.c:191 >> #4 0x00007ffff5e1aad9 in BVOrthogonalizeCGS1 (bv=0x6ba6b0, j=0, v=0x0, H=0x75a3b0, onorm=0x7fffffffdc28, >> norm=0x7fffffffdc20) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:81 >> #5 0x00007ffff5e1c1bb in BVOrthogonalizeCGS (bv=0x6ba6b0, j=0, v=0x0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) >> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:214 >> #6 0x00007ffff5e1ddfd in BVOrthogonalizeColumn (bv=0x6ba6b0, j=0, H=0x0, norm=0x7fffffffddb0, lindep=0x7fffffffddac) >> at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvorthog.c:371 >> #7 0x00007ffff6050986 in EPSGetStartVector (eps=0x6a3ee0, i=0, breakdown=0x0) >> at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:758 >> #8 0x00007ffff5f52812 in EPSSolve_Power (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/impls/power/power.c:103 >> #9 0x00007ffff6049b28 in EPSSolve (eps=0x6a3ee0) at /suphys/sburton/local/slepc-3.7.1/src/eps/interface/epssolve.c:101 >> #10 0x0000000000401430 in main () >> (gdb) up >> #1 0x00007ffff5e14b4b in BVDotVec_BLAS_Private (bv=0x6ba6b0, n_=4294967296, k_=1, A=0x7fe7f23b3650, x=0x7fe7f23b3650, >> y=0x75a3b0, mpi=PETSC_FALSE) at /suphys/sburton/local/slepc-3.7.1/src/sys/classes/bv/interface/bvblas.c:274 >> 274 if (n) PetscStackCallBLAS("BLASgemv",BLASgemv_("C",&n,&k,&done,A,&n,x,&one,&zero,y,&one)); >> (gdb) print n >> $1 = 4294967296 >> (gdb) print sizeof(n) >> $2 = 8 >> (gdb) step >> Intel MKL ERROR: Parameter 6 was incorrect on entry to DGEMV . >> >> >> It looks to me like slepc is doing it right, but with error messages >> like this who knows. It's a bit beyond me debugging assembly. >> >> Originally I built petsc with --download-fblaslapack but i don't think >> it was working with 64bit indexes (?) >> >> Maybe I should try another blas. >> >> Simon. >> >> >> On Sat, 16 Jul 2016 07:17:44 +1000 >> Simon Burton wrote: >> >>> On Fri, 15 Jul 2016 19:53:31 +0200 >>> "Jose E. Roman" wrote: >>> >>>> >>>> The default spectral transformation (STSHIFT) will allocate just one vector. At which exact point are you seeing that it allocates a bunch of vectors? >>> >>> Yes I think you are right. >>> I can get beyond STSetUp with the right settings. >>> Now the solver runs out of memory inside EPSGetStartVector. >>> >>>> >>>> Is this the unmodified ex3.c? Or did you change anything like EPSSetOperators(eps,A,B) ? >>> >>> good question. I didn't change much, let me try again the original. >>> >>>> Do you get the same behaviour with the original ex3 with the same problem size? 
>>> >>> Yes >>> >>>> >>>> Do you have the same problem with a smaller problem? (half size, say) >>> >>> Halving n gives a quarter of the dimension, which is 8gb vector sizes. >>> It works fine and uses a total of 48gb ram. Oh, I see at one point during >>> initialization it hits a maximum of 56gb. >>> >>> So I guess it needs to keep 6 vectors in total. >>> With the original problem size this becomes 192gb which is >>> just a few gb too much to crunch. I guess I can still try it, >>> but it doesn't feel good hitting the harddrive that much. >>> >>> Thanks for the suggestions. >>> >>> Simon. > From domenico_lahaye at yahoo.com Mon Jul 18 00:59:30 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Mon, 18 Jul 2016 05:59:30 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> Message-ID: <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> Thanks for?all the?pointers.? I am happy to switch to?ksp/examples/tutorials/ex25.c in a first instance as you suggest. ? ? I am still stuck with the same issue as before though. I am trying to extract the hierarchy?? ? of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would?? ? like to modify these operators and define a multigrid cycle with the modified operators.? ? ? Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving?? ? both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me?? ? in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase?? Thanks, Domenico.? From: Matthew Knepley To: Barry Smith Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" Sent: Sunday, July 17, 2016 2:29 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure >? ? ?and likely not giving it as much time as it deserves. However, I do not see immediately >? ? ?what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > >? ? ? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently >? ? ? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined >? ? ? after calling DMCoarsenHierarchy, but that failed. > >? ? ? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform >? ? ? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation >? ? ? using DMDA as well. > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, >? ? ? ? ? ? ?author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune >? ? ? ? ? ? ? ? ? ? ? ?and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp >? ? ? ? ? ? ? ? ? ? ? ?and Dinesh Kaushik and Matthew~G. Knepley >? ? ? ? ? ? ? ? ? ? ? ?and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith >? ? ? ? ? ? ? ? ? ? ? ?and Stefano Zampini and Hong Zhang and Hong Zhang}, >? ? ? ? ? ? ?title =? {{PETS}c {W}eb page}, >? ? ? ? ? ? ?url =? ? {http://www.mcs.anl.gov/petsc}, >? ? ? ? ? ? 
?howpublished = {\url{http://www.mcs.anl.gov/petsc}}, >? ? ? ? ? ? ?year = {2016} >? ? ? ? ? ?} > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title =? ? ? ?"OpenFOAM", > > howpublished? =? ? ? ?"\url{http://www.openfoam.com}", > > url? ?=? ? ? ?{http://www.openfoam.com}, > > note? =? ? ? ?"OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > key? ?=? ? ? ?"OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? ? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. This suggests that people are using it with OpenFOAM:?http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: ??http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf ? ? ?Matt? ? ?Barry > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 18 01:16:59 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Jul 2016 01:16:59 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: > Thanks for all the pointers. > > I am happy to switch to ksp/examples/tutorials/ex25.c in a first instance > as you suggest. > > I am still stuck with the same issue as before though. I am trying to > extract the hierarchy > of coarser grid matrices and the intergrid transfer operators from the > DMDA data structure. I would > like to modify these operators and define a multigrid cycle with the > modified operators. > > Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to > define a multigrid cycle involving > both A^H and M^H. Can I rely on the multilevel DMDA structure to > construct A^H and M^H for me > in a set-up phase, plug them into a user-defined context, and plug > them back out in a solve phase? > If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator. The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks to attach different contexts to each MG level. Do you need this? Thanks, Matt > Thanks, Domenico. 
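For the "plug them back out" part of the question, a rough sketch of the query side is below (untested; it assumes the KSP has been set up with -pc_type mg, e.g. via KSPSetDM()/KSPSetComputeOperators() as in ex25.c, and that your PETSc version provides the PCMGGet* query routines; variable names are illustrative).

PC             pc;
PetscInt       nlevels,l;
PetscErrorCode ierr;

ierr = KSPSetUp(ksp);CHKERRQ(ierr);                 /* the level structure exists only after setup */
ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
ierr = PCMGGetLevels(pc,&nlevels);CHKERRQ(ierr);
for (l=0; l<nlevels; l++) {
  KSP smoother;
  Mat Alevel,Plevel;
  ierr = PCMGGetSmoother(pc,l,&smoother);CHKERRQ(ierr);
  ierr = KSPGetOperators(smoother,&Alevel,&Plevel);CHKERRQ(ierr);   /* level operator, e.g. A^H or M^H on level l */
  if (l) {
    Mat interp;
    ierr = PCMGGetInterpolation(pc,l,&interp);CHKERRQ(ierr);        /* prolongation from level l-1 to level l */
    /* inspect or modify, then push back with PCMGSetInterpolation(pc,l,interp) if needed */
  }
}

The matrices obtained this way can then be modified and reset with PCMGSetInterpolation()/PCMGSetRestriction(), or the smoother operators replaced with KSPSetOperators(), before the actual KSPSolve().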
> > > ------------------------------ > *From:* Matthew Knepley > *To:* Barry Smith > *Cc:* domenico lahaye ; " > petsc-users at mcs.anl.gov" > *Sent:* Sunday, July 17, 2016 2:29 PM > *Subject:* Re: [petsc-users] Regarding ksp ex42 - Citations > > On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye > wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the > DMDA structure > > and likely not giving it as much time as it deserves. However, I do > not see immediately > > what function is responsible for calling PCMGSetSmoother and > PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator > is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit > DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to > implement the deflation > > using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. > Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor > Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and > Barry~F. Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? > > > > 3) On > http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 > I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It > allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > > Very good question. It seems that this citation is wrong or no longer > valid; I have removed it from the PETSc repository. I could find no mention > of PETSc usage in the OpenFoam and its third party packages. I think we > should not have been listing this citation. > > > This suggests that people are using it with OpenFOAM: > http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf > > In fact, they use PETSc in the dynamic overset grid implementation for > OpenFOAM, which I think is an approved extension: > > > http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf > > Matt > > > > Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico_lahaye at yahoo.com Mon Jul 18 01:41:24 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Mon, 18 Jul 2016 06:41:24 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> Message-ID: <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> Dear Matthew,? ? I would like to place the FormJacobian statement in ex25.c in such a way that I can view?the result on the different levels. Can you please point me to an example?? ? I would like to do above with Galerkin coarsening as well. So yes, I do expect that I will need the?hooks attached to the different MG levels. I appreciate more pointers here as well.? ? ?Thanks, Domenico. ? From: Matthew Knepley To: domenico lahaye Cc: PETSc Users List Sent: Monday, July 18, 2016 8:16 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: Thanks for?all the?pointers.? I am happy to switch to?ksp/examples/tutorials/ex25.c in a first instance as you suggest. ? ? I am still stuck with the same issue as before though. I am trying to extract the hierarchy?? ? of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would?? ? like to modify these operators and define a multigrid cycle with the modified operators.? ? ? Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving?? ? both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me?? ? in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase?? If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator.The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks toattach different contexts to each MG level. Do you need this? ? Thanks, ? ? ?Matt? Thanks, Domenico.? From: Matthew Knepley To: Barry Smith Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" Sent: Sunday, July 17, 2016 2:29 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure >? ? ?and likely not giving it as much time as it deserves. However, I do not see immediately >? ? ?what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > >? ? ? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently >? ? ? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined >? ? ? after calling DMCoarsenHierarchy, but that failed. > >? ? ? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform >? ? ? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation >? ? ? using DMDA as well. 
> > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, >? ? ? ? ? ? ?author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune >? ? ? ? ? ? ? ? ? ? ? ?and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp >? ? ? ? ? ? ? ? ? ? ? ?and Dinesh Kaushik and Matthew~G. Knepley >? ? ? ? ? ? ? ? ? ? ? ?and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith >? ? ? ? ? ? ? ? ? ? ? ?and Stefano Zampini and Hong Zhang and Hong Zhang}, >? ? ? ? ? ? ?title =? {{PETS}c {W}eb page}, >? ? ? ? ? ? ?url =? ? {http://www.mcs.anl.gov/petsc}, >? ? ? ? ? ? ?howpublished = {\url{http://www.mcs.anl.gov/petsc}}, >? ? ? ? ? ? ?year = {2016} >? ? ? ? ? ?} > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title =? ? ? ?"OpenFOAM", > > howpublished? =? ? ? ?"\url{http://www.openfoam.com}", > > url? ?=? ? ? ?{http://www.openfoam.com}, > > note? =? ? ? ?"OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > key? ?=? ? ? ?"OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? ? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. This suggests that people are using it with OpenFOAM:?http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: ??http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf ? ? ?Matt? ? ?Barry > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 18 02:11:48 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 18 Jul 2016 02:11:48 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> Message-ID: On Mon, Jul 18, 2016 at 1:41 AM, domenico lahaye wrote: > Dear Matthew, > > I would like to place the FormJacobian statement in ex25.c in such a way > that I can view > the result on the different levels. Can you please point me to an example? > You can use options to do this. 
For any KSP solve, you can use -ksp_view_mat draw for whatever viewer you want. In the mg cycle, you can use -mg_level_2_ksp_view_mat draw or for all levels -mg_levels_ksp_view_mat draw I would like to do above with Galerkin coarsening as well. So yes, I do > expect that I will need the > hooks attached to the different MG levels. I appreciate more pointers here > as well. > The above should work with either method. Thanks, Matt > Thanks, Domenico. > > > *From:* Matthew Knepley > > > *To:* domenico lahaye > *Cc:* PETSc Users List > *Sent:* Monday, July 18, 2016 8:16 AM > > *Subject:* Re: [petsc-users] Regarding ksp ex42 - Citations > > On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye < > domenico_lahaye at yahoo.com> wrote: > > Thanks for all the pointers. > > I am happy to switch to ksp/examples/tutorials/ex25.c in a first instance > as you suggest. > > I am still stuck with the same issue as before though. I am trying to > extract the hierarchy > of coarser grid matrices and the intergrid transfer operators from the > DMDA data structure. I would > like to modify these operators and define a multigrid cycle with the > modified operators. > > Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to > define a multigrid cycle involving > both A^H and M^H. Can I rely on the multilevel DMDA structure to > construct A^H and M^H for me > in a set-up phase, plug them into a user-defined context, and plug > them back out in a solve phase? > > > If you are not using -pc_mg_galerkin, then the FormJacobian is called > separately on each level to rediscretize the operator. > The only thing that changes is the DMDA that is passed to the call. If you > need more information, there are hooks to > attach different contexts to each MG level. Do you need this? > > Thanks, > > Matt > > > Thanks, Domenico. > > > ------------------------------ > *From:* Matthew Knepley > *To:* Barry Smith > *Cc:* domenico lahaye ; " > petsc-users at mcs.anl.gov" > *Sent:* Sunday, July 17, 2016 2:29 PM > *Subject:* Re: [petsc-users] Regarding ksp ex42 - Citations > > On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye > wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the > DMDA structure > > and likely not giving it as much time as it deserves. However, I do > not see immediately > > what function is responsible for calling PCMGSetSmoother and > PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator > is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit > DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to > implement the deflation > > using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. > Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor > Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and > Barry~F. 
Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? > > > > 3) On > http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 > I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It > allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > > Very good question. It seems that this citation is wrong or no longer > valid; I have removed it from the PETSc repository. I could find no mention > of PETSc usage in the OpenFoam and its third party packages. I think we > should not have been listing this citation. > > > This suggests that people are using it with OpenFOAM: > http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf > > In fact, they use PETSc in the dynamic overset grid implementation for > OpenFOAM, which I think is an approved extension: > > > http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf > > Matt > > > > Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From domenico_lahaye at yahoo.com Mon Jul 18 02:29:51 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Mon, 18 Jul 2016 07:29:51 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <872779534.685616.1468826653246.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <872779534.685616.1468826653246.JavaMail.yahoo@mail.yahoo.com> Message-ID: <1309408705.665690.1468826991415.JavaMail.yahoo@mail.yahoo.com> That is wonderful.? Given however that is a subsequent stage I would like to manipulate the grid?hierarchy in my code, I would like to know what the equivalent function calls?are (at least in my limited understanding).? I saw that snes/ex58.c has a FormJacobian using DMDA. 
I am looking for?something similar that *gets* the ?Jacobian (instead on forming it) on the?different levels (instead of on the finest level only).? Thanks again, Domenico.? From: Matthew Knepley To: domenico lahaye Cc: PETSc Users List Sent: Monday, July 18, 2016 9:11 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Mon, Jul 18, 2016 at 1:41 AM, domenico lahaye wrote: Dear Matthew,? ? I would like to place the FormJacobian statement in ex25.c in such a way that I can view?the result on the different levels. Can you please point me to an example?? You can use options to do this. For any KSP solve, you can use ? -ksp_view_mat draw for whatever viewer you want. In the mg cycle, you can use ? -mg_level_2_ksp_view_mat draw or for all levels ? -mg_levels_ksp_view_mat draw ? I would like to do above with Galerkin coarsening as well. So yes, I do expect that I will need the?hooks attached to the different MG levels. I appreciate more pointers here as well.? The above should work with either method. ? Thanks, ? ? Matt? ? ?Thanks, Domenico. ? From: Matthew Knepley To: domenico lahaye Cc: PETSc Users List Sent: Monday, July 18, 2016 8:16 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: Thanks for?all the?pointers.? I am happy to switch to?ksp/examples/tutorials/ex25.c in a first instance as you suggest. ? ? I am still stuck with the same issue as before though. I am trying to extract the hierarchy?? ? of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would?? ? like to modify these operators and define a multigrid cycle with the modified operators.? ? ? Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving?? ? both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me?? ? in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase?? If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator.The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks toattach different contexts to each MG level. Do you need this? ? Thanks, ? ? ?Matt? Thanks, Domenico.? From: Matthew Knepley To: Barry Smith Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" Sent: Sunday, July 17, 2016 2:29 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > Dear PETSc team, > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure >? ? ?and likely not giving it as much time as it deserves. However, I do not see immediately >? ? ?what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > >? ? ? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently >? ? ? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined >? ? ? after calling DMCoarsenHierarchy, but that failed. > >? ? ? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform >? ? ? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation >? ? ? using DMDA as well. > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > @Misc{petsc-web-page, >? ? ? ? ? ? ?author = {Satish Balay and Shrirang Abhyankar and Mark~F. 
Adams and Jed Brown and Peter Brune >? ? ? ? ? ? ? ? ? ? ? ?and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp >? ? ? ? ? ? ? ? ? ? ? ?and Dinesh Kaushik and Matthew~G. Knepley >? ? ? ? ? ? ? ? ? ? ? ?and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith >? ? ? ? ? ? ? ? ? ? ? ?and Stefano Zampini and Hong Zhang and Hong Zhang}, >? ? ? ? ? ? ?title =? {{PETS}c {W}eb page}, >? ? ? ? ? ? ?url =? ? {http://www.mcs.anl.gov/petsc}, >? ? ? ? ? ? ?howpublished = {\url{http://www.mcs.anl.gov/petsc}}, >? ? ? ? ? ? ?year = {2016} >? ? ? ? ? ?} > > > > Is the last author mentioned twice intentionally? > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > @misc{OpenFOAM > , > > > title =? ? ? ?"OpenFOAM", > > howpublished? =? ? ? ?"\url{http://www.openfoam.com}", > > url? ?=? ? ? ?{http://www.openfoam.com}, > > note? =? ? ? ?"OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > key? ?=? ? ? ?"OpenFOAM 2.2.1" > > } > > > Do you have more information on the use of PETSc within OpenFoam? ? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. This suggests that people are using it with OpenFOAM:?http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: ??http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf ? ? ?Matt? ? ?Barry > > 4) @matt in response to a question he raised in Vienna > > MIPSE is a BEM solver. Details are on: > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > Cheers, Domenico Lahaye. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From mhassan at miners.utep.edu Mon Jul 18 14:01:28 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Mon, 18 Jul 2016 19:01:28 +0000 Subject: [petsc-users] Incorrect eigenvalues Message-ID: Hi all, I have been trying to solve generalized eigenvalue problem using matrices of size 10K. Sparsity of the matrix is 6%. I am using the following command ./solver -f1 hamold.petsc -f2 ovlbaby.petsc -st_ksp_type preonly -st_pc_type jacobi -st_pc_factor_mat_solver_package mumps -eps_interval -2,0 -eps_nev 1000 * solver is the program * f1 and f2 are the input file for both matrices in petsc binary (mpiaij) I am getting the following output: Generalized eigenproblem stored in file. Reading REAL matrices from binary files... TYPE OF MATRIX A: mpiaij TYPE OF MATRIX B: mpiaij Solving for Eigen values... Solved! 
1: -9771.8339 0 2: -9559.8347 0 3: -9408.5603 0 4: -9387.423 0 5: -9235.9137 0 6: -9102.5334 0 7: -9098.1307 0 8: -8970.3594 0 9: -8854.4964 0 10: -8850.3629 0 11: -8736.6619 0 12: -8637.1749 0 13: -8628.214 0 14: -8524.2494 0 15: -8440.0801 0 16: -8424.1789 0 17: -8327.5389 0 18: -8257.7763 0 19: -8233.9564 0 20: -8143.1251 0 21: -8086.9865 0 22: -8054.7899 0 23: -7968.7355 0 24: -7925.5421 0 25: -7884.7777 0 26: -7802.7577 0 27: -7771.913 0 28: -7722.537 0 29: -7643.9943 0 30: -7624.9684 0 ......................... ......................... 541: -24.947288 0 542: -24.945875 0 543: -24.94017 0 First column is the eigenvalues. My concern is, * Eigenvalues are not right * I defined the interval but still it's giving me eigenvalues outside of that interval Please help me out. M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Jul 18 14:43:18 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 18 Jul 2016 21:43:18 +0200 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: Message-ID: [Please do not send queries to both petsc-users and slepc-maint, only one of them is enough.] It seems that you are mixing a random number of options that make little sense. You cannot use preonly+jacobi to solve linear systems, even less in the case of the eps_interval option. For computing eigenvalues in an interval, follow the instructions in section 3.4.5 of the users manual. In particular, preonly+cholesky is required. Also, if done with MUMPS the option -mat_mumps_icntl_13 1 is also needed. Furthermore, I don't see that you are using -st_type sinvert, so I guess some options are inserted in the source code, which you did not show. Jose > El 18 jul 2016, a las 21:01, Hassan Md Mahmudulla escribi?: > > Hi all, > I have been trying to solve generalized eigenvalue problem using matrices of size 10K. Sparsity of the matrix is 6%. I am using the following command > > ./solver -f1 hamold.petsc -f2 ovlbaby.petsc -st_ksp_type preonly -st_pc_type jacobi -st_pc_factor_mat_solver_package mumps -eps_interval -2,0 -eps_nev 1000 > > ? solver is the program > ? f1 and f2 are the input file for both matrices in petsc binary (mpiaij) > I am getting the following output: > > > Generalized eigenproblem stored in file. > > Reading REAL matrices from binary files... > TYPE OF MATRIX A: mpiaij > TYPE OF MATRIX B: mpiaij > Solving for Eigen values... > Solved! > 1: -9771.8339 0 > 2: -9559.8347 0 > 3: -9408.5603 0 > 4: -9387.423 0 > 5: -9235.9137 0 > 6: -9102.5334 0 > 7: -9098.1307 0 > 8: -8970.3594 0 > 9: -8854.4964 0 > 10: -8850.3629 0 > 11: -8736.6619 0 > 12: -8637.1749 0 > 13: -8628.214 0 > 14: -8524.2494 0 > 15: -8440.0801 0 > 16: -8424.1789 0 > 17: -8327.5389 0 > 18: -8257.7763 0 > 19: -8233.9564 0 > 20: -8143.1251 0 > 21: -8086.9865 0 > 22: -8054.7899 0 > 23: -7968.7355 0 > 24: -7925.5421 0 > 25: -7884.7777 0 > 26: -7802.7577 0 > 27: -7771.913 0 > 28: -7722.537 0 > 29: -7643.9943 0 > 30: -7624.9684 0 > > ......................... > ......................... > > 541: -24.947288 0 > 542: -24.945875 0 > 543: -24.94017 0 > > > First column is the eigenvalues. > My concern is, > ? Eigenvalues are not right > ? I defined the interval but still it's giving me eigenvalues outside of that interval > Please help me out. 
> > M Hassan From mhassan at miners.utep.edu Mon Jul 18 14:48:53 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Mon, 18 Jul 2016 19:48:53 +0000 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: , Message-ID: Would you please give me an idea what combination of ksp solver and preconditioner I should use to solve this generalized symmetric hermitian problem? To get the convergence faster, do I need to use external solvers like mumps and superlu_dist? Thanks M Hassan ________________________________ From: Jose E. Roman Sent: Monday, July 18, 2016 1:43:18 PM To: Hassan Md Mahmudulla Cc: petsc-users at mcs.anl.gov; slepc-maint at upv.es Subject: Re: [petsc-users] Incorrect eigenvalues [Please do not send queries to both petsc-users and slepc-maint, only one of them is enough.] It seems that you are mixing a random number of options that make little sense. You cannot use preonly+jacobi to solve linear systems, even less in the case of the eps_interval option. For computing eigenvalues in an interval, follow the instructions in section 3.4.5 of the users manual. In particular, preonly+cholesky is required. Also, if done with MUMPS the option -mat_mumps_icntl_13 1 is also needed. Furthermore, I don't see that you are using -st_type sinvert, so I guess some options are inserted in the source code, which you did not show. Jose > El 18 jul 2016, a las 21:01, Hassan Md Mahmudulla escribi?: > > Hi all, > I have been trying to solve generalized eigenvalue problem using matrices of size 10K. Sparsity of the matrix is 6%. I am using the following command > > ./solver -f1 hamold.petsc -f2 ovlbaby.petsc -st_ksp_type preonly -st_pc_type jacobi -st_pc_factor_mat_solver_package mumps -eps_interval -2,0 -eps_nev 1000 > > ? solver is the program > ? f1 and f2 are the input file for both matrices in petsc binary (mpiaij) > I am getting the following output: > > > Generalized eigenproblem stored in file. > > Reading REAL matrices from binary files... > TYPE OF MATRIX A: mpiaij > TYPE OF MATRIX B: mpiaij > Solving for Eigen values... > Solved! > 1: -9771.8339 0 > 2: -9559.8347 0 > 3: -9408.5603 0 > 4: -9387.423 0 > 5: -9235.9137 0 > 6: -9102.5334 0 > 7: -9098.1307 0 > 8: -8970.3594 0 > 9: -8854.4964 0 > 10: -8850.3629 0 > 11: -8736.6619 0 > 12: -8637.1749 0 > 13: -8628.214 0 > 14: -8524.2494 0 > 15: -8440.0801 0 > 16: -8424.1789 0 > 17: -8327.5389 0 > 18: -8257.7763 0 > 19: -8233.9564 0 > 20: -8143.1251 0 > 21: -8086.9865 0 > 22: -8054.7899 0 > 23: -7968.7355 0 > 24: -7925.5421 0 > 25: -7884.7777 0 > 26: -7802.7577 0 > 27: -7771.913 0 > 28: -7722.537 0 > 29: -7643.9943 0 > 30: -7624.9684 0 > > ......................... > ......................... > > 541: -24.947288 0 > 542: -24.945875 0 > 543: -24.94017 0 > > > First column is the eigenvalues. > My concern is, > ? Eigenvalues are not right > ? I defined the interval but still it's giving me eigenvalues outside of that interval > Please help me out. > > M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Jul 18 15:00:16 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 18 Jul 2016 22:00:16 +0200 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: Message-ID: > El 18 jul 2016, a las 21:48, Hassan Md Mahmudulla escribi?: > > Would you please give me an idea what combination of ksp solver and preconditioner I should use to solve this generalized symmetric hermitian problem? 
To get the convergence faster, do I need to use external solvers like mumps and superlu_dist? > > Thanks > M Hassan For computing eigenvalues in an interval, you have to follow exactly what is written in section 3.4.5 of SLEPc's users manual. It is not possible to use preconditioners in that case. Also, superlu_dist cannot be used for this, only MUMPS or PETSc's cholesky (sequential). Jose From mhassan at miners.utep.edu Mon Jul 18 15:09:51 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Mon, 18 Jul 2016 20:09:51 +0000 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: , Message-ID: Thank you very much for your reply. Well, I actually can avoid using eps_interval since I don't really need that. I want to request 10% eigenvalues and I need them very fast. That's why I was trying with different combinations. My system size can be bigger. So, I was trying iterative solver like mumps as well. But the problem is almost all the preconditioners are giving me wrong answers. Would you suggest me any way so that I can solve my problem? I will try with section 3.4.5 though. M Hassan ________________________________ From: Jose E. Roman Sent: Monday, July 18, 2016 2:00:16 PM To: Hassan Md Mahmudulla Cc: petsc-users at mcs.anl.gov; slepc-maint at upv.es Subject: Re: [petsc-users] Incorrect eigenvalues > El 18 jul 2016, a las 21:48, Hassan Md Mahmudulla escribi?: > > Would you please give me an idea what combination of ksp solver and preconditioner I should use to solve this generalized symmetric hermitian problem? To get the convergence faster, do I need to use external solvers like mumps and superlu_dist? > > Thanks > M Hassan For computing eigenvalues in an interval, you have to follow exactly what is written in section 3.4.5 of SLEPc's users manual. It is not possible to use preconditioners in that case. Also, superlu_dist cannot be used for this, only MUMPS or PETSc's cholesky (sequential). Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Mon Jul 18 15:32:20 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Mon, 18 Jul 2016 22:32:20 +0200 Subject: [petsc-users] Incorrect eigenvalues In-Reply-To: References: Message-ID: <7DDE1F06-993D-4365-9290-C1010EDC6289@dsic.upv.es> > El 18 jul 2016, a las 22:09, Hassan Md Mahmudulla escribi?: > > Thank you very much for your reply. Well, I actually can avoid using eps_interval since I don't really need that. I want to request 10% eigenvalues and I need them very fast. That's why I was trying with different combinations. My system size can be bigger. So, I was trying iterative solver like mumps as well. But the problem is almost all the preconditioners are giving me wrong answers. Would you suggest me any way so that I can solve my problem? I will try with section 3.4.5 though. > > M Hassan MUMPS is not an iterative solver, but a direct solver. For solving linear systems you first need to understand PETSc's KSP and PC objects. You cannot use preonly with jacobi because it won't give you the solution of the linear system (just one preconditioning step, which is enough for some SLEPc solvers but not for the default one). You can try an iterative method such as GMRES together with a preconditioner such as Jacobi. Again, this is discussed in SLEPc's documentation, for instance in section 3.4.1 of the manual. But eps_interval is an exception which supports direct solvers only. 
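For the spectrum-slicing case, putting together the options already mentioned in this thread, the invocation would look roughly like the line below (only a sketch; the executable and matrix file names are the ones from your first message, and the exact option names should be checked against the manual of the installed SLEPc/PETSc version):

  ./solver -f1 hamold.petsc -f2 ovlbaby.petsc -eps_interval -2,0 \
      -st_type sinvert -st_ksp_type preonly -st_pc_type cholesky \
      -st_pc_factor_mat_solver_package mumps -mat_mumps_icntl_13 1
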
Computing 10% of eigenvalues of a large matrix is generally a very expensive task, it cannot be done "very fast". Using eps_interval could be a good option if you know the interval containing the eigenvalues, but it will take time since it requires factorizing large matrices. Jose From mhassan at miners.utep.edu Tue Jul 19 05:42:21 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Tue, 19 Jul 2016 10:42:21 +0000 Subject: [petsc-users] Spectrum slicing with MUMPS (Segmentation fault) Message-ID: Hi all, I have been trying spectrum slicing with MUMPS external solver. The error output is the following: [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015 [0]PETSC ERROR: /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS on a sandybridge named nid00281 by mhassan Tue Jul 19 02:54 :00 2016 [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-do uble=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mp i-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --kn own-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetec t=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with- fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 -- with-dependencies=0 --with-dependencies=0 --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 --with-superlu-incl ude=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu_dist-lib=/opt/cray /tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandy bridge/include --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a --with-metis=1 --with-metis-include=/o pt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a --with-pts cotch=1 --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/1 
40/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/ INTEL/140/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib -lsci_intel_mpi_mp -lsci_intel_mp" --wit h-mumps=1 --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/s andybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLA GS="-xavx -openmp -O3 " --FFLAGS="-xavx -openmp -O3 " --LIBS=-lstdc++ --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cra y/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --wi th-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/ INTEL/140/sandybridge/include --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib -lsundials_cvode -lsundials_cvodes -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial" [0]PETSC ERROR: #1 User provided function() line 0 in unknown file Rank 0 [Tue Jul 19 02:54:04 2016] [c1-0c1s6n1] application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 srun: error: nid00281: task 0: Aborted srun: Terminating job step 1330433.0 slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT 2016-07-19T02:54:04 *** srun: Job step aborted: Waiting up to 32 seconds for job step to finish. srun: error: nid00281: tasks 1-17: Killed srun: error: nid00282: tasks 18-35: Killed I ran the same code in my pc with 8 processor. It had no issues. But when I tried in a different machine, I am getting this. Any idea? Can I use Superlu_dist instead of MUMPS? I got INFOG(1)=-22 error from MUMPS in another run. Thanks, M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From loiseau.jc at gmail.com Tue Jul 19 06:51:00 2016 From: loiseau.jc at gmail.com (JC) Date: Tue, 19 Jul 2016 13:51:00 +0200 Subject: [petsc-users] petscviewerhdf5open undefined reference Message-ID: Hi everyone, I am a rather recent user of petsc. I have installed it on my mac using home-brew and have been to develop my CFD code quite efficiently thanks to that. I am now porting the code onto another machine which has linux mint 18 installed. I have installed petsc and its dependancies as follow: apt install --install-recommends --install-suggests pets-dev Though most of the code compiles correctly, I get the following error at some point: /home/jean-christophe/Codes/PETSc_LS/SOURCES/io.f90:162: undefined reference to `petscviewerhdf5open_? I have made sure that apt install the hdf5 library. All of the versions are exactly the same I use on my mac, yet I cannot compile correctly. Anyone has ever encountered the same problem? Thanks a lot anyway for this amazing library. Regards, JC From lixin_chu at yahoo.com Tue Jul 19 09:01:35 2016 From: lixin_chu at yahoo.com (lixin chu) Date: Tue, 19 Jul 2016 14:01:35 +0000 (UTC) Subject: [petsc-users] some beginner questions : matrix multiplication References: <932627683.1480276.1468936895326.JavaMail.yahoo.ref@mail.yahoo.com> Message-ID: <932627683.1480276.1468936895326.JavaMail.yahoo@mail.yahoo.com> Hello,I am new to PETsc, and I am looking for a library to support matrix multiplication. I have several questions and would like to confirm: 1. 
From MatMatMult API, for C=A*B, I assume we can support mixed sparse and dense matrix, i.e., either A or B can be dense; similarly, MatMatMatMult (A*B*C) can support A and C sparse, and B is dense. 2. We can also use mixed data type for MatMatMult/MatMatMatMult, for example, A is complex, double, and B is double. 3. Is there a way to estimate the total working memory required for MatMatMult/MatMatMatMult, given A,B and C information (like dimensions, and total none zero elements, data type)?4. do we have any performance/memory usage data when compared with other sparse matrix multiplication solutions. for example. PSBLAS ? thank you very much, lixin -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 19 09:37:11 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Jul 2016 16:37:11 +0200 Subject: [petsc-users] petscviewerhdf5open undefined reference In-Reply-To: References: Message-ID: On Tue, Jul 19, 2016 at 1:51 PM, JC wrote: > Hi everyone, > > I am a rather recent user of petsc. I have installed it on my mac using > home-brew and have been to develop my CFD code quite efficiently thanks to > that. I am now porting the code onto another machine which has linux mint > 18 installed. I have installed petsc and its dependancies as follow: > > apt install --install-recommends --install-suggests pets-dev > > Though most of the code compiles correctly, I get the following error at > some point: > > /home/jean-christophe/Codes/PETSc_LS/SOURCES/io.f90:162: undefined > reference to `petscviewerhdf5open_? > > I have made sure that apt install the hdf5 library. All of the versions > are exactly the same I use on my mac, yet I cannot compile correctly. > Anyone has ever encountered the same problem? > Its possible that the packager did not configure PETSc to use HDF5. Check $PETSC_DIR/include/petscconf.h for the lines #ifndef PETSC_HAVE_HDF5 #define PETSC_HAVE_HDF5 1 #endif If they are not there, you will have to install yourself using --download-hdf5, which should not be hard. Thanks, Matt > Thanks a lot anyway for this amazing library. > Regards, > JC -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Jul 19 09:38:35 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Jul 2016 16:38:35 +0200 Subject: [petsc-users] Spectrum slicing with MUMPS (Segmentation fault) In-Reply-To: References: Message-ID: On Tue, Jul 19, 2016 at 12:42 PM, Hassan Md Mahmudulla < mhassan at miners.utep.edu> wrote: > Hi all, > > I have been trying spectrum slicing with MUMPS external solver. The error > output is the following: > A stack trace in the debugger would help, but it sounds like an error in MUMPS. You can try SuperLU_dist instead. 
Thanks, Matt > [0]PETSC ERROR: > ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015 > [0]PETSC ERROR: /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS > on a sandybridge named nid00281 by mhassan Tue Jul 19 02:54 > :00 2016 > [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 > --known-bits-per-byte=8 --known-sdot-returns-double=0 > --known-snrm2-returns-do > uble=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 > --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mp > i-c-double-complex=1 --known-mpi-long-double=1 > --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 > --known-sizeof-MPI_Fint=4 --kn > own-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 > --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 > --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 > --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetec > t=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 > --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with- > fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib > --with-scalar-type=real --with-shared-ld=ar --with-etags=0 -- > with-dependencies=0 --with-dependencies=0 > --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 > --with-superlu-incl > ude=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a > --with-superlu_dist=1 > --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-superlu_dist-lib=/opt/cray > /tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a --with-parmetis=1 > --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandy > bridge/include > --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a > --with-metis=1 --with-metis-include=/o > pt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a > --with-pts > cotch=1 > --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/1 > 40/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" > --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/ > INTEL/140/x86_64/include > --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib > -lsci_intel_mpi_mp -lsci_intel_mp" --wit > h-mumps=1 > --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/s > andybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps > -lmumps_common 
-lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLA > GS="-xavx -openmp -O3 " --FFLAGS="-xavx -openmp -O3 " --LIBS=-lstdc++ > --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cra > y/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 > --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --wi > th-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a > --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/ > INTEL/140/sandybridge/include > --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib > -lsundials_cvode -lsundials_cvodes > -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel > -lsundials_nvecserial" > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > Rank 0 [Tue Jul 19 02:54:04 2016] [c1-0c1s6n1] application called > MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > srun: error: nid00281: task 0: Aborted > srun: Terminating job step 1330433.0 > slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT > 2016-07-19T02:54:04 *** > srun: Job step aborted: Waiting up to 32 seconds for job step to finish. > srun: error: nid00281: tasks 1-17: Killed > srun: error: nid00282: tasks 18-35: Killed > > > I ran the same code in my pc with 8 processor. It had no issues. But when > I tried in a different machine, I am getting this. Any idea? Can I use > Superlu_dist instead of MUMPS? I got INFOG(1)=-22 error from MUMPS in > another run. > > > Thanks, > > > *M Hassan* > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Jul 19 09:42:23 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 19 Jul 2016 16:42:23 +0200 Subject: [petsc-users] Spectrum slicing with MUMPS (Segmentation fault) In-Reply-To: References: Message-ID: <95BF1DF1-BC04-4F6C-93D4-591C5E7E36F3@dsic.upv.es> SuperLU_dist can be used in general with shift-and-invert, but for spectrum slicint (eps_interval) it does not work because it does not provide inertia (MatGetInertia) which is required in that case. Jose > El 19 jul 2016, a las 16:38, Matthew Knepley escribi?: > > On Tue, Jul 19, 2016 at 12:42 PM, Hassan Md Mahmudulla wrote: > Hi all, > > I have been trying spectrum slicing with MUMPS external solver. The error output is the following: > > A stack trace in the debugger would help, but it sounds like an error in MUMPS. You can try SuperLU_dist instead. > > Thanks, > > Matt > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run > [0]PETSC ERROR: to get more information on the crash. > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Signal received > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015 > [0]PETSC ERROR: /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS on a sandybridge named nid00281 by mhassan Tue Jul 19 02:54 > :00 2016 > [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 --known-bits-per-byte=8 --known-sdot-returns-double=0 --known-snrm2-returns-do > uble=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mp > i-c-double-complex=1 --known-mpi-long-double=1 --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 --known-sizeof-MPI_Fint=4 --kn > own-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 > --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetec > t=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with- > fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib --with-scalar-type=real --with-shared-ld=ar --with-etags=0 -- > with-dependencies=0 --with-dependencies=0 --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 --with-superlu-incl > ude=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a > --with-superlu_dist=1 --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-superlu_dist-lib=/opt/cray > /tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a --with-parmetis=1 --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandy > bridge/include --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a --with-metis=1 --with-metis-include=/o > pt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a --with-pts > cotch=1 --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/1 > 40/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/ > INTEL/140/x86_64/include --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib -lsci_intel_mpi_mp -lsci_intel_mp" --wit > h-mumps=1 --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/s > andybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps -lmumps_common -lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLA > GS="-xavx -openmp -O3 " --FFLAGS="-xavx -openmp -O3 " --LIBS=-lstdc++ --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cra > y/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --wi > th-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/ > INTEL/140/sandybridge/include --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib -lsundials_cvode -lsundials_cvodes > -lsundials_ida -lsundials_idas -lsundials_kinsol -lsundials_nvecparallel -lsundials_nvecserial" > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > Rank 0 [Tue Jul 19 02:54:04 2016] [c1-0c1s6n1] application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > srun: error: nid00281: task 0: Aborted > srun: 
Terminating job step 1330433.0 > slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT 2016-07-19T02:54:04 *** > srun: Job step aborted: Waiting up to 32 seconds for job step to finish. > srun: error: nid00281: tasks 1-17: Killed > srun: error: nid00282: tasks 18-35: Killed > > > > I ran the same code in my pc with 8 processor. It had no issues. But when I tried in a different machine, I am getting this. Any idea? Can I use Superlu_dist instead of MUMPS? I got INFOG(1)=-22 error from MUMPS in another run. > > > > Thanks, > > > > M Hassan > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From hzhang at mcs.anl.gov Tue Jul 19 09:50:09 2016 From: hzhang at mcs.anl.gov (Hong) Date: Tue, 19 Jul 2016 09:50:09 -0500 Subject: [petsc-users] Spectrum slicing with MUMPS (Segmentation fault) In-Reply-To: <95BF1DF1-BC04-4F6C-93D4-591C5E7E36F3@dsic.upv.es> References: <95BF1DF1-BC04-4F6C-93D4-591C5E7E36F3@dsic.upv.es> Message-ID: "I got INFOG(1)=-22 error from MUMPS in another run. " does not tell much about the error (check MUMPS's user manual). Suggest building petsc in debugging mode, then you may get more error info. Hong On Tue, Jul 19, 2016 at 9:42 AM, Jose E. Roman wrote: > SuperLU_dist can be used in general with shift-and-invert, but for > spectrum slicint (eps_interval) it does not work because it does not > provide inertia (MatGetInertia) which is required in that case. > > Jose > > > > El 19 jul 2016, a las 16:38, Matthew Knepley > escribi?: > > > > On Tue, Jul 19, 2016 at 12:42 PM, Hassan Md Mahmudulla < > mhassan at miners.utep.edu> wrote: > > Hi all, > > > > I have been trying spectrum slicing with MUMPS external solver. The > error output is the following: > > > > A stack trace in the debugger would help, but it sounds like an error in > MUMPS. You can try SuperLU_dist instead. > > > > Thanks, > > > > Matt > > > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or > -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac > OS X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, > and run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
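Regarding the pointer-free Fortran interface mentioned in the helpdesk reply above: the idiom used by the examples it cites (ex4f etc.) is roughly the sketch below. This is only an illustration with made-up names; it assumes the usual PETSc Fortran include files for Vec are in effect, and it covers plain VecGetArray/VecRestoreArray access rather than the multi-dimensional indexing that DMDAVecGetArrayF90 provides (for a DMDA vector the local indices from DMDAGetCorners would have to be mapped onto the flat array by hand).

      subroutine zero_local_part(x, ierr)
      implicit none
!     Assumes the PETSc Fortran definitions for Vec, PetscScalar, etc.
!     are included; the exact include file names depend on the version.
      Vec            x
      PetscErrorCode ierr
      PetscScalar    x_array(1)
      PetscOffset    i_x
      PetscInt       i, nlocal

      call VecGetLocalSize(x, nlocal, ierr)
!     Legacy binding: entries are addressed as x_array(i_x + i), so no
!     Fortran pointer is involved.
      call VecGetArray(x, x_array, i_x, ierr)
      do i = 1, nlocal
         x_array(i_x + i) = 0.0
      end do
      call VecRestoreArray(x, x_array, i_x, ierr)
      end subroutine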
> > [0]PETSC ERROR: Petsc Release Version 3.5.3, Jan, 31, 2015 > > [0]PETSC ERROR: > /scratch1/scratchdirs/mhassan/dSLEPc/d540/../eigenSolverSS on a sandybridge > named nid00281 by mhassan Tue Jul 19 02:54 > > :00 2016 > > [0]PETSC ERROR: Configure options --known-mpi-int64_t=0 > --known-bits-per-byte=8 --known-sdot-returns-double=0 > --known-snrm2-returns-do > > uble=0 --known-level1-dcache-assoc=0 --known-level1-dcache-linesize=32 > --known-level1-dcache-size=32768 --known-memcmp-ok=1 --known-mp > > i-c-double-complex=1 --known-mpi-long-double=1 > --known-mpi-shared-libraries=0 --known-sizeof-MPI_Comm=4 > --known-sizeof-MPI_Fint=4 --kn > > own-sizeof-char=1 --known-sizeof-double=8 --known-sizeof-float=4 > --known-sizeof-int=4 --known-sizeof-long-long=8 --known-sizeof-long=8 > > --known-sizeof-short=2 --known-sizeof-size_t=8 --known-sizeof-void-p=8 > --with-ar=ar --with-batch=1 --with-cc=cc --with-clib-autodetec > > t=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 > --with-dependencies=0 --with-fc=ftn --with-fortran-datatypes=0 --with- > > fortran-interfaces=0 --with-fortranlib-autodetect=0 --with-ranlib=ranlib > --with-scalar-type=real --with-shared-ld=ar --with-etags=0 -- > > with-dependencies=0 --with-dependencies=0 > --with-mpi-dir=/opt/cray/mpt/7.0.0/gni/mpich2-intel/140 --with-superlu=1 > --with-superlu-incl > > ude=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-superlu-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu.a > > --with-superlu_dist=1 > --with-superlu_dist-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-superlu_dist-lib=/opt/cray > > /tpsl/1.4.4/INTEL/140/sandybridge/lib/libsuperlu_dist.a > --with-parmetis=1 > --with-parmetis-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandy > > bridge/include > --with-parmetis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libparmetis.a > --with-metis=1 --with-metis-include=/o > > pt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-metis-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libmetis.a > --with-pts > > cotch=1 > --with-ptscotch-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-ptscotch-lib="-L/opt/cray/tpsl/1.4.4/INTEL/1 > > 40/sandybridge/lib -lptscotch -lscotch -lptscotcherr -lscotcherr" > --with-scalapack=1 --with-scalapack-include=/opt/cray/libsci/13.0.3/ > > INTEL/140/x86_64/include > --with-scalapack-lib="-L/opt/cray/libsci/13.0.3/INTEL/140/x86_64/lib > -lsci_intel_mpi_mp -lsci_intel_mp" --wit > > h-mumps=1 > --with-mumps-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include > --with-mumps-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/s > > andybridge/lib -lcmumps -ldmumps -lesmumps -lsmumps -lzmumps > -lmumps_common -lptesmumps -lpord" --CFLAGS="-xavx -openmp -O3 " --CXXFLA > > GS="-xavx -openmp -O3 " --FFLAGS="-xavx -openmp -O3 " --LIBS=-lstdc++ > --CXX_LINKER_FLAGS= --PETSC_ARCH=sandybridge --prefix=/opt/cra > > y/petsc/3.5.3.0/real/INTEL/140/sandybridge --with-hypre=1 > --with-hypre-include=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/include --wi > > th-hypre-lib=/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib/libHYPRE.a > --with-sundials=1 --with-sundials-include=/opt/cray/tpsl/1.4.4/ > > INTEL/140/sandybridge/include > --with-sundials-lib="-L/opt/cray/tpsl/1.4.4/INTEL/140/sandybridge/lib > -lsundials_cvode -lsundials_cvodes > > -lsundials_ida -lsundials_idas -lsundials_kinsol > -lsundials_nvecparallel -lsundials_nvecserial" > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > Rank 0 [Tue Jul 19 02:54:04 2016] 
[c1-0c1s6n1] application called > MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > srun: error: nid00281: task 0: Aborted > > srun: Terminating job step 1330433.0 > > slurmstepd: *** STEP 1330433.0 ON nid00281 CANCELLED AT > 2016-07-19T02:54:04 *** > > srun: Job step aborted: Waiting up to 32 seconds for job step to finish. > > srun: error: nid00281: tasks 1-17: Killed > > srun: error: nid00282: tasks 18-35: Killed > > > > > > > > I ran the same code in my pc with 8 processor. It had no issues. But > when I tried in a different machine, I am getting this. Any idea? Can I use > Superlu_dist instead of MUMPS? I got INFOG(1)=-22 error from MUMPS in > another run. > > > > > > > > Thanks, > > > > > > > > M Hassan > > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From eduardojourdan92 at gmail.com Tue Jul 19 13:17:52 2016 From: eduardojourdan92 at gmail.com (Eduardo Jourdan) Date: Tue, 19 Jul 2016 15:17:52 -0300 Subject: [petsc-users] Questions for MatSolve Message-ID: Hi all, I would like to perform a specific number (for instance 4 of forward and backward sweeps with a seqaij matrix with block size 4, vectors b and x. Also, I need to do this same procedure with another matrix seqaij block size 16. I would appreciate if someone knows the best way to do it. 1 - I've been trying to use MatSolve. For the bs=4 it seems to work, but with the other matrix with bs=16 the residue diverges. When I call matConvert to convert the later matrix for a seqbaij with bs=16 the result changes and the linear residue is reduced. It is supposed to happen or it is more possible that i am doing something wrong? 2 - MatSolve for seqbaij and seqaij with the same block sizes gives the same results in terms of solution (not performace, memory) ? 3 - Can do I do a specific number of sweeps as told before with the KSP/PC interface? 4 - I saw the manual for the MatSolve and It says that it is for factored matrix. Can I use a matrix just after the MatAssembly calls? Best regards, Eduardo Jourdan -------------- next part -------------- An HTML attachment was scrubbed... URL: From aks084000 at utdallas.edu Tue Jul 19 14:53:22 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Tue, 19 Jul 2016 19:53:22 +0000 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> Message-ID: <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> Hello, In order to achieve reasonable performance for Helmholtz with PML, Erlangga in his paper used 1) Matrix dependent interpolation in the multigrid. The operators are nonlinear, for example an intermediate computation reads something like d = max(|a+c|, |b|, ?) 2) Full weighting (This is linear, so I believe I can achieve that with PCMGSetRestriction). 3) F-cycle with one pre- and postsmoothing with the Jacobi iteration and relaxation factor ? = 0.5. I am not sure how to do 1 & 3 in PETSc. Can anyone suggest a way of implementing these? Thanks, Artur PS. for anyone curious, the paper is "Advances in Iterative Methods and Preconditioners for the Helmholtz Equation" -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Tue Jul 19 14:58:42 2016 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 19 Jul 2016 21:58:42 +0200 Subject: [petsc-users] Multigrid with PML In-Reply-To: <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> Message-ID: On Tue, Jul 19, 2016 at 9:53 PM, Safin, Artur wrote: > Hello, > > In order to achieve reasonable performance for Helmholtz with PML, > Erlangga in his paper used > > 1) Matrix dependent interpolation in the multigrid. The operators are > nonlinear, for example an intermediate computation reads something like > d = max(|a+c|, |b|, ?) > You can use this http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGSetInterpolation.html to set your own interpolation operators. > 2) Full weighting (This is linear, so I believe I can achieve that with > *PCMGSetRestriction*). > > 3) F-cycle with one pre- and postsmoothing with the Jacobi iteration and > relaxation factor ? = 0.5. > -pc_mg_type full -pc_mg_smoothup 1 -pc_mg_smoothdown 1 -mg_levels_pc_type sor -mg_leves_pc_sor_omega 0.5 and use -ksp_view to check that you have what you want. Matt > I am not sure how to do 1 & 3 in PETSc. Can anyone suggest a way of > implementing these? > > Thanks, > > Artur > > PS. for anyone curious, the paper is "Advances in Iterative Methods and > Preconditioners for the Helmholtz Equation" > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 19 18:20:19 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Jul 2016 16:20:19 -0700 Subject: [petsc-users] some beginner questions : matrix multiplication In-Reply-To: <932627683.1480276.1468936895326.JavaMail.yahoo@mail.yahoo.com> References: <932627683.1480276.1468936895326.JavaMail.yahoo.ref@mail.yahoo.com> <932627683.1480276.1468936895326.JavaMail.yahoo@mail.yahoo.com> Message-ID: > On Jul 19, 2016, at 7:01 AM, lixin chu wrote: > > Hello, > I am new to PETsc, and I am looking for a library to support matrix multiplication. I have several questions and would like to confirm: > > 1. From MatMatMult API, for C=A*B, I assume we can support mixed sparse and dense matrix, i.e., either A or B can be dense; similarly, MatMatMatMult (A*B*C) can support A and C sparse, and B is dense. We do not have code for all combinations. > > 2. We can also use mixed data type for MatMatMult/MatMatMatMult, for example, A is complex, double, and B is double. PETSc only supports all real or all complex, not missing. > > 3. Is there a way to estimate the total working memory required for MatMatMult/MatMatMatMult, given A,B and C information (like dimensions, and total none zero elements, data type) Whenever one of the matrices is dense the result is dense so it is easy to compute in that case. If all the matrices are sparse it is difficult to predict the sparsity of the final result (generally is is a bit denser than the most dense of the sparse matrices). We make some estimates before we start the symbolic multiple and if we need more space we allocate more. > > 4. do we have any performance/memory usage data when compared with other sparse matrix multiplication solutions. for example. PSBLAS No > ? 
> > thank you very much, > > lixin From bsmith at mcs.anl.gov Tue Jul 19 18:38:54 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 19 Jul 2016 16:38:54 -0700 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> Message-ID: <834BCA08-682E-4141-B23C-D0E3D259B5E0@mcs.anl.gov> For jacobi smoothing with a .5 damping you need -ksp_type richardson -pc_type jacobi -ksp_richardson_scale .5 but instead of the scale you can try -ksp_richardson_self_scale which claims to use the optimal scale factor for each iteration (at a cost of some vector operations). Barry > On Jul 19, 2016, at 12:58 PM, Matthew Knepley wrote: > > On Tue, Jul 19, 2016 at 9:53 PM, Safin, Artur wrote: > Hello, > > In order to achieve reasonable performance for Helmholtz with PML, Erlangga in his paper used > > 1) Matrix dependent interpolation in the multigrid. The operators are nonlinear, for example an intermediate computation reads something like > d = max(|a+c|, |b|, ?) > > You can use this http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGSetInterpolation.html > to set your own interpolation operators. > > 2) Full weighting (This is linear, so I believe I can achieve that with PCMGSetRestriction). > > 3) F-cycle with one pre- and postsmoothing with the Jacobi iteration and relaxation factor ? = 0.5. > > -pc_mg_type full > -pc_mg_smoothup 1 > -pc_mg_smoothdown 1 > -mg_levels_pc_type sor > -mg_leves_pc_sor_omega 0.5 > > and use -ksp_view to check that you have what you want. > > Matt > > I am not sure how to do 1 & 3 in PETSc. Can anyone suggest a way of implementing these? > > Thanks, > > Artur > > PS. for anyone curious, the paper is "Advances in Iterative Methods and Preconditioners for the Helmholtz Equation" > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From knepley at gmail.com Tue Jul 19 22:03:55 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 20 Jul 2016 05:03:55 +0200 Subject: [petsc-users] Questions for MatSolve In-Reply-To: References: Message-ID: On Tue, Jul 19, 2016 at 8:17 PM, Eduardo Jourdan wrote: > Hi all, > > I would like to perform a specific number (for instance 4 of forward and > backward sweeps with a seqaij matrix with block size 4, vectors b and x. > Also, I need to do this same procedure with another matrix seqaij block > size 16. I would appreciate if someone knows the best way to do it. > It sounds like you want PCSOR and PCApply, not MatSolve. Thanks, Matt > 1 - I've been trying to use MatSolve. For the bs=4 it seems to work, but > with the other matrix with bs=16 the residue diverges. When I call > matConvert to convert the later matrix for a seqbaij with bs=16 the result > changes and the linear residue is reduced. It is supposed to happen or it > is more possible that i am doing something wrong? > > 2 - MatSolve for seqbaij and seqaij with the same block sizes gives the > same results in terms of solution (not performace, memory) ? > > 3 - Can do I do a specific number of sweeps as told before with the KSP/PC > interface? > > 4 - I saw the manual for the MatSolve and It says that it is for factored > matrix. Can I use a matrix just after the MatAssembly calls? 
> > Best regards, > > Eduardo Jourdan > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From lixin_chu at yahoo.com Wed Jul 20 09:35:13 2016 From: lixin_chu at yahoo.com (lixin chu) Date: Wed, 20 Jul 2016 14:35:13 +0000 (UTC) Subject: [petsc-users] some beginner questions : matrix multiplication In-Reply-To: References: <932627683.1480276.1468936895326.JavaMail.yahoo.ref@mail.yahoo.com> <932627683.1480276.1468936895326.JavaMail.yahoo@mail.yahoo.com> Message-ID: <1936335791.1984783.1469025313270.JavaMail.yahoo@mail.yahoo.com> Thank you very much for the quick reply. Sent from Yahoo Mail on Android On Wed, 20 Jul, 2016 at 7:20, Barry Smith wrote: > On Jul 19, 2016, at 7:01 AM, lixin chu wrote: > > Hello, > I am new to PETsc, and I am looking for a library to support matrix multiplication. I have several questions and would like to confirm: > > 1. From MatMatMult API, for C=A*B, I assume we can support mixed sparse and dense matrix, i.e., either A or B can be dense; similarly, MatMatMatMult (A*B*C) can support A and C sparse, and B is dense. ? We do not have code for all combinations. > > 2. We can also use mixed data type for MatMatMult/MatMatMatMult, for example, A is complex, double, and B is double. ? PETSc only supports all real or all complex, not missing. > > 3. Is there a way to estimate the total working memory required for MatMatMult/MatMatMatMult, given A,B and C information (like dimensions, and total none zero elements, data type) ? Whenever one of the matrices is dense the result is dense so it is easy to compute in that case. ? If all the matrices are sparse it is difficult to predict the sparsity of the final result (generally is is a bit denser than the most dense of the sparse matrices). We make some estimates before we start the symbolic multiple and if we need more space we allocate more. >? > 4. do we have any performance/memory usage data when compared with other sparse matrix multiplication solutions. for example. PSBLAS ? No > ? > > thank you very much, > > lixin -------------- next part -------------- An HTML attachment was scrubbed... URL: From zonexo at gmail.com Wed Jul 20 22:24:32 2016 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 21 Jul 2016 11:24:32 +0800 Subject: [petsc-users] Fwd: Re: Error with PETSc on K computer In-Reply-To: <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> References: <7423eeed-4b95-28e7-c55d-08e515911935@gmail.com> <3436e085-071a-db3f-3438-84e2536af2d5@gmail.com> <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> Message-ID: Dear all, I have emailed the K computer helpdesk and they have given their reply: /*This is HPCI helpdesk. *//* */ /* *//*Sorry for making you wait. *//* *//*We have received the investigation results from Operation Division. *//* */ /* *//*The cause of SIGSEGV by the Fujitsu compiler is that the implementation of *//* *//*the Fortran pointer is different from the Intel/GNU compiler. *//* */ /* *//*In the Fujitsu compiler, interoperability of the Fortran pointer and C language *//* *//*is implemented by the Fortran pointer interface of Fujitsu. *//* *//** The implementation of the Fortran pointer is processor-dependent.* *//* */ /* *//*On the other hand, PETSc is implemented assuming of the Fortran pointer interface of *//* *//*the Intel/GNU compiler. 
From zonexo at gmail.com Wed Jul 20 22:24:32 2016 From: zonexo at gmail.com (TAY wee-beng) Date: Thu, 21 Jul 2016 11:24:32 +0800 Subject: [petsc-users] Fwd: Re: Error with PETSc on K computer In-Reply-To: <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> References: <7423eeed-4b95-28e7-c55d-08e515911935@gmail.com> <3436e085-071a-db3f-3438-84e2536af2d5@gmail.com> <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> Message-ID:
Dear all, I have emailed the K computer helpdesk and they have given their reply:
"This is HPCI helpdesk. Sorry for making you wait. We have received the investigation results from Operation Division. The cause of the SIGSEGV with the Fujitsu compiler is that the implementation of the Fortran pointer is different from the Intel/GNU compiler. In the Fujitsu compiler, interoperability of the Fortran pointer and C is implemented by the Fortran pointer interface of Fujitsu. (The implementation of the Fortran pointer is processor-dependent.) On the other hand, PETSc is implemented assuming the Fortran pointer interface of the Intel/GNU compilers. The PETSc routine cannot correctly interpret the Fortran pointer of Fujitsu because the implementation of the Fortran pointer in the Fujitsu compiler and in the Intel/GNU compilers differs, and it terminates abnormally at execution. Please avoid the use of the Fortran pointer as a workaround. The sample programs of PETSc which do not use the Fortran pointer (ex4f etc.) run normally without getting SIGSEGV."
Hence, they advise avoiding the use of pointers. I made use of VecGetArrayF90, but I believe I can also use VecGetArray. But what about DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90? Can I use DMDAVecGetArray and DMDAVecRestoreArray instead in Fortran, thus avoiding pointers? I remember my segmentation fault always happens when calling DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90. In other words, can I use DMDA in Fortran without using any pointer? Thank you Yours sincerely, TAY wee-beng
On 10/6/2016 11:00 AM, Barry Smith wrote: > Without knowing the specifics of how this machine's Fortran compiler passes Fortran pointers to subroutines we cannot resolve this problem. This information can only be obtained from the experts on this machine. > > Barry > >> On Jun 9, 2016, at 9:28 PM, TAY wee-beng wrote: >> >> Hi, >> >> The current solution cannot work. May I know if there's any other solution to try? Meanwhile, I've also emailed the K computer helpdesk for help. >> >> Thank you >> >> Yours sincerely, >> >> TAY wee-beng >> >> On 3/6/2016 10:33 PM, Satish Balay wrote: >>> Sorry - I'm not sure what's happening with this compiler. >>> >>> [for a build without the patch I sent] - can you edit >>> PETSC_ARCH/include/petscconf.h and remove the lines >>> >>> #ifndef PETSC_HAVE_F90_2PTR_ARG >>> #define PETSC_HAVE_F90_2PTR_ARG 1 >>> #endif >>> >>> And then build the libraries [do not run configure again]. >>> >>> Does this make a difference for this example? >>> >>> Satish >>> >>> On Fri, 3 Jun 2016, TAY wee-beng wrote: >>>> Hi, >>>> >>>> Is there any update on the issue below? >>>> >>>> No hurry, just to make sure that the email was sent successfully. >>>> >>>> Thanks >>>> >>>> -------- Forwarded Message -------- >>>> Subject: Re: [petsc-users] Error with PETSc on K computer >>>> Date: Thu, 2 Jun 2016 10:25:22 +0800 >>>> From: TAY wee-beng >>>> To: petsc-users >>>> >>>> Hi Satish, >>>> >>>> The X9 option is: Provides a different interpretation under Fortran 95 specifications for any parts not conforming to the language specifications of this compiler. >>>> >>>> I just patched and re-compiled but it still can't work. I've attached the >>>> configure.log for both builds.
>>>> >>>> FYI, some parts of the PETSc 3.6.3 code were initially patch to make it work >>>> with the K computer system: >>>> >>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/package.py.org >>>> petsc-3.6.3/config/BuildSystem/config/package.py >>>> --- petsc-3.6.3/config/BuildSystem/config/package.py.org 2015-12-04 >>>> 14:06:42.000000000 +0900 >>>> +++ petsc-3.6.3/config/BuildSystem/config/package.py 2016-01-22 >>>> 11:09:37.000000000 +0900 >>>> @@ -174,7 +174,7 @@ >>>> return '' >>>> >>>> def getSharedFlag(self,cflags): >>>> - for flag in ['-PIC', '-fPIC', '-KPIC', '-qpic']: >>>> + for flag in ['-KPIC', '-fPIC', '-PIC', '-qpic']: >>>> if cflags.find(flag) >=0: return flag >>>> return '' >>>> >>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/setCompilers.py.org >>>> petsc-3.6.3/config/BuildSystem/config/setCompilers.py >>>> --- petsc-3.6.3/config/BuildSystem/config/setCompilers.py.org 2015-07-23 >>>> 00:22:46.000000000 +0900 >>>> +++ petsc-3.6.3/config/BuildSystem/config/setCompilers.py 2016-01-22 >>>> 11:10:05.000000000 +0900 >>>> @@ -1017,7 +1017,7 @@ >>>> self.pushLanguage(language) >>>> #different compilers are sensitive to the order of testing these >>>> flags. So separete out GCC test. >>>> if config.setCompilers.Configure.isGNU(self.getCompiler()): testFlags = >>>> ['-fPIC'] >>>> - else: testFlags = ['-PIC', '-fPIC', '-KPIC','-qpic'] >>>> + else: testFlags = ['-KPIC', '-fPIC', '-PIC','-qpic'] >>>> for testFlag in testFlags: >>>> try: >>>> self.logPrint('Trying '+language+' compiler flag '+testFlag) >>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/packages/openmp.py.org >>>> petsc-3.6.3/config/BuildSystem/config/packages/openmp.py >>>> --- petsc-3.6.3/config/BuildSystem/config/packages/openmp.py.org 2016-01-25 >>>> 15:42:23.000000000+0900 >>>> +++ petsc-3.6.3/config/BuildSystem/config/packages/openmp.py 2016-01-22 >>>> 17:13:52.000000000 +0900 >>>> @@ -19,7 +19,8 @@ >>>> self.found = 0 >>>> self.setCompilers.pushLanguage('C') >>>> # >>>> - for flag in ["-fopenmp", # Gnu >>>> + for flag in ["-Kopenmp", # Fujitsu >>>> + "-fopenmp", # Gnu >>>> "-qsmp=omp",# IBM XL C/C++ >>>> "-h omp", # Cray. Must come after XL because XL >>>> interprets this option as meaning"-soname omp" >>>> "-mp", # Portland Group >>>> >>>> $ diff -u ./petsc-3.6.3/config/BuildSystem/config/compilers.py.org >>>> ./petsc-3.6.3/config/BuildSystem/config/compilers.py >>>> --- ./petsc-3.6.3/config/BuildSystem/config/compilers.py.org 2015-06-10 >>>> 06:24:49.000000000 +0900 >>>> +++ ./petsc-3.6.3/config/BuildSystem/config/compilers.py 2016-02-19 >>>> 11:56:12.000000000 +0900 >>>> @@ -164,7 +164,7 @@ >>>> def checkCLibraries(self): >>>> '''Determines the libraries needed to link with C''' >>>> oldFlags = self.setCompilers.LDFLAGS >>>> - self.setCompilers.LDFLAGS += ' -v' >>>> + self.setCompilers.LDFLAGS += ' -###' >>>> self.pushLanguage('C') >>>> (output, returnCode) = self.outputLink('', '') >>>> self.setCompilers.LDFLAGS = oldFlags >>>> @@ -413,7 +413,7 @@ >>>> def checkCxxLibraries(self): >>>> '''Determines the libraries needed to link with C++''' >>>> oldFlags = self.setCompilers.LDFLAGS >>>> - self.setCompilers.LDFLAGS += ' -v' >>>> + self.setCompilers.LDFLAGS += ' -###' >>>> self.pushLanguage('Cxx') >>>> (output, returnCode) = self.outputLink('', '') >>>> self.setCompilers.LDFLAGS = oldFlags >>>> >>>> >>>> >>>> Thank you >>>> >>>> Yours sincerely, >>>> >>>> TAY wee-beng >>>> >>>> On 2/6/2016 3:18 AM, Satish Balay wrote: >>>>> What does -X9 in --FFLAGS="-X9 -O0" do? 
>>>>> >>>>> can you send configure.log for this build? >>>>> >>>>> And does the attached patch make a difference with this example? >>>>> [suggest doing a separate temporary build of PETSc - in a different source >>>>> location - to check this.] >>>>> >>>>> Satish >>>>> >>>>> On Wed, 1 Jun 2016, TAY wee-beng wrote: >>>>> >>>>>> Hi Satish, >>>>>> >>>>>> Only partially working: >>>>>> >>>>>> [t00196 at b04-036 tutorials]$ mpiexec -n 2 ./ex4f90 >>>>>> jwe1050i-w The hardware barrier couldn't be used and continues processing >>>>>> using the software barrier. >>>>>> taken to (standard) corrective action, execution continuing. >>>>>> jwe1050i-w The hardware barrier couldn't be used and continues processing >>>>>> using the software barrier. >>>>>> taken to (standard) corrective action, execution continuing. >>>>>> Vec Object:Vec Object:initial vector:initial vector: 1 MPI processes >>>>>> type: seq >>>>>> 10 >>>>>> 20 >>>>>> 30 >>>>>> 40 >>>>>> 50 >>>>>> 60 >>>>>> 1 MPI processes >>>>>> type: seq >>>>>> 10 >>>>>> 20 >>>>>> 30 >>>>>> 40 >>>>>> 50 >>>>>> 60 >>>>>> [1]PETSC ERROR: >>>>>> ------------------------------------------------------------------------ >>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>>> probably >>>>>> memory access out of range >>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>> [1]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> [1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>> to >>>>>> find memory corruption errors >>>>>> [1]PETSC ERROR: likely location of problem given in stack below >>>>>> [1]PETSC ERROR: --------------------- Stack Frames >>>>>> ------------------------------------ >>>>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>> available, >>>>>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>> [1]PETSC ERROR: is given. >>>>>> [1]PETSC ERROR: [1] F90Array1dCreate line 50 >>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>> ------------------------------------------[0]PETSC ERROR: >>>>>> ------------------------------------------------------------------------ >>>>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>>> probably >>>>>> memory access out of range >>>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>> [0]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> [0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>> to >>>>>> find memory corruption errors >>>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>>> ------------------------------------ >>>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>> available, >>>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>> [0]PETSC ERROR: is given. 
>>>>>> [0]PETSC ERROR: [0] F90Array1dCreate line 50 >>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [1]PETSC ERROR: Signal received >>>>>> [1]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. >>>>>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>> [1]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>> Wed >>>>>> Jun 1 13:23:41 2016 >>>>>> [1]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>> --LD_SHARED= >>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>> --with-scalapack-lib=-SCALAPACK >>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>> --with-hypre=1 >>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >>>>>> -------------------------------------------------------------------------- >>>>>> [mpi::mpi-api::mpi-abort] >>>>>> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD >>>>>> with errorcode 59. >>>>>> >>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>>> You may or may not see output from other processes, depending on >>>>>> exactly when Open MPI kills them. >>>>>> -------------------------------------------------------------------------- >>>>>> [b04-036:28998] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>> [0xffffffff11360404] >>>>>> [b04-036:28998] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>> [0xffffffff1110391c] >>>>>> [b04-036:28998] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>>> [0xffffffff1111b5ec] >>>>>> [b04-036:28998] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>>> [0xffffffff00281bf0] >>>>>> [b04-036:28998] ./ex4f90 [0x292548] >>>>>> [b04-036:28998] ./ex4f90 [0x29165c] >>>>>> [b04-036:28998] >>>>>> /opt/FJSVxosmmm/lib64/libmpgpthread.so.1(_IO_funlockfile+0x5c) >>>>>> [0xffffffff121e1974] >>>>>> [b04-036:28998] ./ex4f90 [0x9f6748] >>>>>> [b04-036:28998] ./ex4f90 [0x9f0ea4] >>>>>> [b04-036:28998] ./ex4f90 [0x2c76a0] >>>>>> [b04-036:28998] ./ex4f90(MAIN__+0x38c) [0x10688c] >>>>>> [b04-036:28998] ./ex4f90(main+0xec) [0x268e91c] >>>>>> [b04-036:28998] /lib64/libc.so.6(__libc_start_main+0x194) >>>>>> [0xffffffff138cb81c] >>>>>> [b04-036:28998] ./ex4f90 [0x1063ac] >>>>>> [1]PETSC ERROR: >>>>>> ------------------------------------------------------------------------ >>>>>> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >>>>>> batch >>>>>> system) has told this process to end >>>>>> [1]PETSC ERROR: Tr-------------------- >>>>>> [0]PETSC ERROR: Signal received >>>>>> [0]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. 
>>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>> [0]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>> Wed >>>>>> Jun 1 13:23:41 2016 >>>>>> [0]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>> --LD_SHARED= >>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>> --with-scalapack-lib=-SCALAPACK >>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>> --with-hypre=1 >>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>>>>> -------------------------------------------------------------------------- >>>>>> [mpi::mpi-api::mpi-abort] >>>>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>>>> with errorcode 59. >>>>>> >>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>>> You may or may not see output from other processes, depending on >>>>>> exactly when Open MPI kills them. >>>>>> -------------------------------------------------------------------------- >>>>>> [b04-036:28997] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>> [0xffffffff11360404] >>>>>> [b04-036:28997] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>> [0xffffffff1110391c] >>>>>> [b04-036:28997] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>>> [0xffffffff1111b5ec] >>>>>> [b04-036:28997] >>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>>> [0xffffffff00281bf0] >>>>>> [b04-036:28997] ./ex4f90 [0x292548] >>>>>> [b04-036:28997] ./ex4f90 [0x29165c] >>>>>> [b04-036:28997] >>>>>> /opt/FJSVxosmmm/lib64/libmpgpthread.so.1(_IO_funlockfile+0x5c) >>>>>> [0xffffffff121e1974] >>>>>> [b04-036:28997] ./ex4f90 [0x9f6748] >>>>>> [b04-036:28997] ./ex4f90 [0x9f0ea4] >>>>>> [b04-036:28997] ./ex4f90 [0x2c76a0] >>>>>> [b04-036:28997] ./ex4f90(MAIN__+0x38c) [0x10688c] >>>>>> [b04-036:28997] ./ex4f90(main+0xec) [0x268e91c] >>>>>> [b04-036:28997] /lib64/libc.so.6(__libc_start_main+0x194) >>>>>> [0xffffffff138cb81c] >>>>>> [b04-036:28997] ./ex4f90 [0x1063ac] >>>>>> [0]PETSC ERROR: >>>>>> ------------------------------------------------------------------------ >>>>>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >>>>>> batch >>>>>> system) has told this process to end >>>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>> [0]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> [0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>> to >>>>>> find memory corruption errors >>>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>>> ------------------------------------ >>>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>> available, >>>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>> [0]PETSC ERROR: is given. 
>>>>>> [0]PETSC ERROR: [0] F90Array1dCreate line 50 >>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [0]PETSC ERROR: Signal received >>>>>> [0]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. >>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>> [0]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>> Wed >>>>>> Jun 1 13:23:41 2016 >>>>>> [0]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>> --LD_SHARED= >>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>> --with-scalapack-lib=-SCALAPACK >>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>> --with-fortran-interfaces=1 --with-debuy option -start_in_debugger or >>>>>> -on_error_attach_debugger >>>>>> [1]PETSC ERROR: or see >>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>> [1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>> to >>>>>> find memory corruption errors >>>>>> [1]PETSC ERROR: likely location of problem given in stack below >>>>>> [1]PETSC ERROR: --------------------- Stack Frames >>>>>> ------------------------------------ >>>>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>> available, >>>>>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>> [1]PETSC ERROR: is given. >>>>>> [1]PETSC ERROR: [1] F90Array1dCreate line 50 >>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>> -------------------------------------------------------------- >>>>>> [1]PETSC ERROR: Signal received >>>>>> [1]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>> for >>>>>> trouble shooting. >>>>>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>> [1]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>> Wed >>>>>> Jun 1 13:23:41 2016 >>>>>> [1]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>> --LD_SHARED= >>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>> --with-scalapack-lib=-SCALAPACK >>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>> --with-hypre=1 >>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>> [1]PETSC ERROR: #2 User provided function() line 0 in unknown file >>>>>> gging=1 --useThreads=0 --with-hypre=1 >>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>> [0]PETSC ERROR: #2 User provided function() line 0 in unknown file >>>>>> [ERR.] PLE 0019 plexec One of MPI processes was >>>>>> aborted.(rank=0)(nid=0x04180034)(CODE=1938,793745140674134016,15104) >>>>>> [t00196 at b04-036 tutorials]$ >>>>>> [ERR.] 
PLE 0021 plexec The interactive job has aborted with the >>>>>> signal.(sig=24) >>>>>> [INFO] PJM 0083 pjsub Interactive job 5211401 completed. >>>>>> >>>>>> Thank you >>>>>> >>>>>> Yours sincerely, >>>>>> >>>>>> TAY wee-beng >>>>>> >>>>>> On 1/6/2016 12:21 PM, Satish Balay wrote: >>>>>>> Do PETSc examples using VecGetArrayF90() work? >>>>>>> >>>>>>> say src/vec/vec/examples/tutorials/ex4f90.F >>>>>>> >>>>>>> Satish >>>>>>> >>>>>>> On Tue, 31 May 2016, TAY wee-beng wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I'm trying to run my MPI CFD code on Japan's K computer. My code can >>>>>>>> run >>>>>>>> if I >>>>>>>> didn't make use of the PETSc DMDAVecGetArrayF90 subroutine. If it's >>>>>>>> called >>>>>>>> >>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>> >>>>>>>> I get the error below. I have no problem with my code on other >>>>>>>> clusters >>>>>>>> using >>>>>>>> the new Intel compilers. I used to have problems with DM when using >>>>>>>> the >>>>>>>> old >>>>>>>> Intel compilers. Now on the K computer, I'm using Fujitsu's Fortran >>>>>>>> compiler. >>>>>>>> How can I troubleshoot? >>>>>>>> >>>>>>>> Btw, I also tested on the ex13f90 example and it didn't work too. The >>>>>>>> error is >>>>>>>> below. >>>>>>>> >>>>>>>> >>>>>>>> My code error: >>>>>>>> >>>>>>>> /* size_x,size_y,size_z 76x130x136*//* >>>>>>>> *//* total grid size = 1343680*//* >>>>>>>> *//* recommended cores (50k / core) = 26.87360000000000*//* >>>>>>>> *//* 0*//* >>>>>>>> *//* 1*//* >>>>>>>> *//* 1*//* >>>>>>>> *//*[3]PETSC ERROR: [1]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>> Violation, >>>>>>>> probably memory access out of range*//* >>>>>>>> *//*[1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[1]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[1]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[1]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[1]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[1]PETSC ERROR: is given.*//* >>>>>>>> *//*[1]PETSC ERROR: [1] F90Array3dCreate line 244 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//* 1*//* >>>>>>>> *//*------------------------------------------------------------------------*//* >>>>>>>> *//*[3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>> Violation, >>>>>>>> probably memory access out of range*//* >>>>>>>> *//*[3]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[3]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[3]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[3]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[3]PETSC ERROR: --------------------- Stack Frames >>>>>>>> 
------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>> Violation, >>>>>>>> probably memory access out of range*//* >>>>>>>> *//*[0]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[0]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[0]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[0]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[0]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[0]PETSC ERROR: is given.*//* >>>>>>>> *//*[0]PETSC ERROR: [0] F90Array3dCreate line 244 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[0]PETSC ERROR: --------------------- Error Message >>>>>>>> ----------------------------------------- 1*//* >>>>>>>> *//*[2]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>> Violation, >>>>>>>> probably memory access out of range*//* >>>>>>>> *//*[2]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[2]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[2]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[2]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[2]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[2]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[2]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[2]PETSC ERROR: is given.*//* >>>>>>>> *//*[2]PETSC ERROR: [2] F90Array3dCreate line 244 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[2]PETSC ERROR: --------------------- Error Message >>>>>>>> -----------------------------------------[3]PETSC ERROR: Note: The >>>>>>>> EXACT >>>>>>>> line >>>>>>>> numbers in the stack are not available,*//* >>>>>>>> *//*[3]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[3]PETSC ERROR: is given.*//* >>>>>>>> *//*[3]PETSC ERROR: [3] F90Array3dCreate line 244 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[3]PETSC ERROR: --------------------- Error Message >>>>>>>> --------------------------------------------------------------*//* >>>>>>>> *//*[3]PETSC ERROR: Signal received*//* >>>>>>>> *//*[3]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[3]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[3]PETSC ERROR: 
./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>> by >>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>> *//*[3]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared----------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Signal received*//* >>>>>>>> *//*[0]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[0]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>> by >>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>> *//*[0]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>> *//*[0]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>> file*//* >>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>> *//*[m---------------------*//* >>>>>>>> *//*[2]PETSC ERROR: Signal received*//* >>>>>>>> *//*[2]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[2]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[2]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>> by >>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>> *//*[2]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>> *//*[2]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>> file*//* >>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>> *//*[m[1]PETSC ERROR: --------------------- Error Message >>>>>>>> --------------------------------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Signal received*//* >>>>>>>> *//*[1]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[1]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug 
named b04-036 >>>>>>>> by >>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>> *//*[1]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>> *//*[1]PETSC ERROR: #1 User provided function() line 0 ilibraries=0 >>>>>>>> --with-blas-lapack-lib=-SSL2 --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>> *//*[3]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>> file*//* >>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>> *//*[mpi::mpi-api::mpi-abort]*//* >>>>>>>> *//*MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD*//* >>>>>>>> *//*with errorcode 59.*//* >>>>>>>> *//* >>>>>>>> *//*NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI >>>>>>>> processes.*//* >>>>>>>> *//*You may or may not see output from other processes, depending >>>>>>>> on*//* >>>>>>>> *//*exactly when Open MPI kills them.*//* >>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>> *//*[b04-036:28416] >>>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>>>> [0xffffffff11360404]*//* >>>>>>>> *//*[b04-036:28416] >>>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>>>> [0xffffffff1110391c]*//* >>>>>>>> *//*[b04-036:28416] >>>>>>>> /opt/FJSVtclang/GM-1.2.0-2pi::mpi-api::mpi-abort]*//* >>>>>>>> *//*MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD*//* >>>>>>>> *//*with errorcode 59.*/ >>>>>>>> >>>>>>>> ex13f90 error: >>>>>>>> >>>>>>>> >>>>>>>> /*[t00196 at b04-036 tutorials]$ mpiexec -np 2 ./ex13f90*//* >>>>>>>> *//*jwe1050i-w The hardware barrier couldn't be used and continues >>>>>>>> processing >>>>>>>> using the software barrier.*//* >>>>>>>> *//*taken to (standard) corrective action, execution continuing.*//* >>>>>>>> *//*jwe1050i-w The hardware barrier couldn't be used and continues >>>>>>>> processing >>>>>>>> using the software barrier.*//* >>>>>>>> *//*taken to (standard) corrective action, execution continuing.*//* >>>>>>>> *//* Hi! 
We're solving van der Pol using 2 processes.*//* >>>>>>>> *//* >>>>>>>> *//* t x1 x2*//* >>>>>>>> *//*[1]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>>>>>>> illegal >>>>>>>> memory access*//* >>>>>>>> *//*[1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[0]PETSC ERROR: >>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>>>>>>> illegal >>>>>>>> memory access*//* >>>>>>>> *//*[0]PETSC ERROR: Try option -start_in_debugger or >>>>>>>> -on_error_attach_debugger*//* >>>>>>>> *//*[0]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[1]PETSC ERROR: or see >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>> *//*[1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>> Mac >>>>>>>> OS X >>>>>>>> to find memory corruption errors*//* >>>>>>>> *//*[1]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[1]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[1]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[1]PETSC ERROR: is given.*//* >>>>>>>> *//*[1]PETSC ERROR: [1] F90Array4dCreate line 337 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[0]PETSC ERROR: likely location of problem given in stack >>>>>>>> below*//* >>>>>>>> *//*[0]PETSC ERROR: --------------------- Stack Frames >>>>>>>> ------------------------------------*//* >>>>>>>> *//*[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>> available,*//* >>>>>>>> *//*[0]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>> function*//* >>>>>>>> *//*[0]PETSC ERROR: is given.*//* >>>>>>>> *//*[0]PETSC ERROR: [0] F90Array4dCreate line 337 >>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>> *//*[1]PETSC ERROR: --------------------- Error Message >>>>>>>> --------------------------------------------------------------*//* >>>>>>>> *//*[1]PETSC ERROR: Signal received*//* >>>>>>>> *//*[1]PETSC ERROR: See >>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>> for trouble shooting.*//* >>>>>>>> *//*[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>> *//*[1]PETSC ERROR: ./ex13f90 on a petsc-3.6.3_debug named b04-036 by >>>>>>>> Unknown >>>>>>>> Wed Jun 1 13:04:34 2016*//* >>>>>>>> *//*[1]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>> --with-cxx=mpiFCC >>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>> -O0" >>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>> --LD_SHARED= >>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>> 
--with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>> --with-hypre=1 >>>>>>>> --with-hyp*//* >>>>>>>> */ >>>>>>>> >>>>>>>> >>>>>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Wed Jul 20 22:37:37 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Jul 2016 22:37:37 -0500 Subject: [petsc-users] Fwd: Re: Error with PETSc on K computer In-Reply-To: References: <7423eeed-4b95-28e7-c55d-08e515911935@gmail.com> <3436e085-071a-db3f-3438-84e2536af2d5@gmail.com> <6E84C554-39F0-4BB6-92D0-D2443BA79989@mcs.anl.gov> Message-ID: There is no way to implement DMDAVecGetArray() to be used from Fortran. The only way we can support DMDAVecGetArrayF90() is that we be given the information about how the Fortran pointers are implemented in Fujitsu compiler and access to the machine to test the interface. Barry > On Jul 20, 2016, at 10:24 PM, TAY wee-beng wrote: > > Dear all, > > I have emailed the K computer helpdesk and they have given their reply: > > This is HPCI helpdesk. > > Sorry for making you wait. > We have received the investigation results from Operation Division. > > The cause of SIGSEGV by the Fujitsu compiler is that the implementation of > the Fortran pointer is different from the Intel/GNU compiler. > > In the Fujitsu compiler, interoperability of the Fortran pointer and C language > is implemented by the Fortran pointer interface of Fujitsu. > * The implementation of the Fortran pointer is processor-dependent.* > > On the other hand, PETSc is implemented assuming of the Fortran pointer interface of > the Intel/GNU compiler. > > The PETSc routine cannot correctly interpret the Fortran pointer of Fujitsu > because the implementation of the Fortran pointer of the Fujitsu compiler > and the Intel/GNU compiler is different, and it terminates abnormally at execution. > > Please avoid the use of the Fortran pointer as a workaround. > > The sample program of PETSc which does not use the Fotran pointer (ex4f etc.) > runs normaly without getting SIGSEGV. > > Hence, they advice avoiding the use of pointers. I made use of VecGetArrayF90 but I believe I can also use VecGetArray. > > But what about DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90? Can I use DMDAVecGetArray and DMDAVecRestoreArray instead in Fortran, thus avoiding using pointers? I remember my segmentation fault always happens when calling DMDAVecGetArrayF90 and DMDAVecRestoreArrayF90. > > In other words, can I use DMDA in Fortran w/o using any pointer? > > Thank you > > Yours sincerely, > > TAY wee-beng > > On 10/6/2016 11:00 AM, Barry Smith wrote: >> Without knowing the specifics of how this machine's Fortran compiler passes Fortran pointers to subroutines we cannot resolve this problem. This information can only be obtained from the experts on the this machine. >> >> Barry >> >> >>> On Jun 9, 2016, at 9:28 PM, TAY wee-beng >>> wrote: >>> >>> Hi, >>> >>> The current solution cannot work. May I know if there's any other solution to try. Meanwhile, I've also email the K computer helpdesk for help. >>> >>> Thank you >>> >>> Yours sincerely, >>> >>> TAY wee-beng >>> >>> On 3/6/2016 10:33 PM, Satish Balay wrote: >>> >>>> Sorry - I'm not sure whats hapenning with this compiler. >>>> >>>> [for a build without the patch I sent ] - can you edit >>>> PETSC_ARCH/include/petscconf.h and remove the lines >>>> >>>> #ifndef PETSC_HAVE_F90_2PTR_ARG >>>> #define PETSC_HAVE_F90_2PTR_ARG 1 >>>> #endif >>>> >>>> And then build the libraries [do not run configure again]. 
>>>> >>>> Does this make a difference for this example? >>>> >>>> Satish >>>> >>>> On Fri, 3 Jun 2016, TAY wee-beng wrote: >>>> >>>> >>>>> Hi, >>>>> >>>>> Is there any update to the issue below? >>>>> >>>>> No hurry, just to make sure that the email is sent successfully. >>>>> >>>>> >>>>> Thanks >>>>> >>>>> >>>>> >>>>> -------- Forwarded Message -------- >>>>> Subject: Re: [petsc-users] Error with PETSc on K computer >>>>> Date: Thu, 2 Jun 2016 10:25:22 +0800 >>>>> From: TAY wee-beng >>>>> >>>>> >>>>> To: petsc-users >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> Hi Satish, >>>>> >>>>> The X9 option is : >>>>> >>>>> Provides a different interpretation under Fortran 95 specifications >>>>> for any parts not conforming to the language specifications of this >>>>> compiler >>>>> >>>>> I just patched and re-compiled but it still can't work. I've attached the >>>>> configure.log for both builds. >>>>> >>>>> FYI, some parts of the PETSc 3.6.3 code were initially patch to make it work >>>>> with the K computer system: >>>>> >>>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/package.py.org >>>>> petsc-3.6.3/config/BuildSystem/config/package.py >>>>> --- petsc-3.6.3/config/BuildSystem/config/package.py.org 2015-12-04 >>>>> 14:06:42.000000000 +0900 >>>>> +++ petsc-3.6.3/config/BuildSystem/config/package.py 2016-01-22 >>>>> 11:09:37.000000000 +0900 >>>>> @@ -174,7 +174,7 @@ >>>>> return '' >>>>> >>>>> def getSharedFlag(self,cflags): >>>>> - for flag in ['-PIC', '-fPIC', '-KPIC', '-qpic']: >>>>> + for flag in ['-KPIC', '-fPIC', '-PIC', '-qpic']: >>>>> if cflags.find(flag) >=0: return flag >>>>> return '' >>>>> >>>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/setCompilers.py.org >>>>> petsc-3.6.3/config/BuildSystem/config/setCompilers.py >>>>> --- petsc-3.6.3/config/BuildSystem/config/setCompilers.py.org 2015-07-23 >>>>> 00:22:46.000000000 +0900 >>>>> +++ petsc-3.6.3/config/BuildSystem/config/setCompilers.py 2016-01-22 >>>>> 11:10:05.000000000 +0900 >>>>> @@ -1017,7 +1017,7 @@ >>>>> self.pushLanguage(language) >>>>> #different compilers are sensitive to the order of testing these >>>>> flags. So separete out GCC test. >>>>> if config.setCompilers.Configure.isGNU(self.getCompiler()): testFlags = >>>>> ['-fPIC'] >>>>> - else: testFlags = ['-PIC', '-fPIC', '-KPIC','-qpic'] >>>>> + else: testFlags = ['-KPIC', '-fPIC', '-PIC','-qpic'] >>>>> for testFlag in testFlags: >>>>> try: >>>>> self.logPrint('Trying '+language+' compiler flag '+testFlag) >>>>> $ diff -u petsc-3.6.3/config/BuildSystem/config/packages/openmp.py.org >>>>> petsc-3.6.3/config/BuildSystem/config/packages/openmp.py >>>>> --- petsc-3.6.3/config/BuildSystem/config/packages/openmp.py.org 2016-01-25 >>>>> 15:42:23.000000000+0900 >>>>> +++ petsc-3.6.3/config/BuildSystem/config/packages/openmp.py 2016-01-22 >>>>> 17:13:52.000000000 +0900 >>>>> @@ -19,7 +19,8 @@ >>>>> self.found = 0 >>>>> self.setCompilers.pushLanguage('C') >>>>> # >>>>> - for flag in ["-fopenmp", # Gnu >>>>> + for flag in ["-Kopenmp", # Fujitsu >>>>> + "-fopenmp", # Gnu >>>>> "-qsmp=omp",# IBM XL C/C++ >>>>> "-h omp", # Cray. 
Must come after XL because XL >>>>> interprets this option as meaning"-soname omp" >>>>> "-mp", # Portland Group >>>>> >>>>> $ diff -u ./petsc-3.6.3/config/BuildSystem/config/compilers.py.org >>>>> ./petsc-3.6.3/config/BuildSystem/config/compilers.py >>>>> --- ./petsc-3.6.3/config/BuildSystem/config/compilers.py.org 2015-06-10 >>>>> 06:24:49.000000000 +0900 >>>>> +++ ./petsc-3.6.3/config/BuildSystem/config/compilers.py 2016-02-19 >>>>> 11:56:12.000000000 +0900 >>>>> @@ -164,7 +164,7 @@ >>>>> def checkCLibraries(self): >>>>> '''Determines the libraries needed to link with C''' >>>>> oldFlags = self.setCompilers.LDFLAGS >>>>> - self.setCompilers.LDFLAGS += ' -v' >>>>> + self.setCompilers.LDFLAGS += ' -###' >>>>> self.pushLanguage('C') >>>>> (output, returnCode) = self.outputLink('', '') >>>>> self.setCompilers.LDFLAGS = oldFlags >>>>> @@ -413,7 +413,7 @@ >>>>> def checkCxxLibraries(self): >>>>> '''Determines the libraries needed to link with C++''' >>>>> oldFlags = self.setCompilers.LDFLAGS >>>>> - self.setCompilers.LDFLAGS += ' -v' >>>>> + self.setCompilers.LDFLAGS += ' -###' >>>>> self.pushLanguage('Cxx') >>>>> (output, returnCode) = self.outputLink('', '') >>>>> self.setCompilers.LDFLAGS = oldFlags >>>>> >>>>> >>>>> >>>>> Thank you >>>>> >>>>> Yours sincerely, >>>>> >>>>> TAY wee-beng >>>>> >>>>> On 2/6/2016 3:18 AM, Satish Balay wrote: >>>>> >>>>>> What does -X9 in --FFLAGS="-X9 -O0" do? >>>>>> >>>>>> can you send configure.log for this build? >>>>>> >>>>>> And does the attached patch make a difference with this example? >>>>>> [suggest doing a separate temporary build of PETSc - in a different source >>>>>> location - to check this.] >>>>>> >>>>>> Satish >>>>>> >>>>>> On Wed, 1 Jun 2016, TAY wee-beng wrote: >>>>>> >>>>>> >>>>>>> Hi Satish, >>>>>>> >>>>>>> Only partially working: >>>>>>> >>>>>>> [t00196 at b04-036 tutorials]$ mpiexec -n 2 ./ex4f90 >>>>>>> jwe1050i-w The hardware barrier couldn't be used and continues processing >>>>>>> using the software barrier. >>>>>>> taken to (standard) corrective action, execution continuing. >>>>>>> jwe1050i-w The hardware barrier couldn't be used and continues processing >>>>>>> using the software barrier. >>>>>>> taken to (standard) corrective action, execution continuing. >>>>>>> Vec Object:Vec Object:initial vector:initial vector: 1 MPI processes >>>>>>> type: seq >>>>>>> 10 >>>>>>> 20 >>>>>>> 30 >>>>>>> 40 >>>>>>> 50 >>>>>>> 60 >>>>>>> 1 MPI processes >>>>>>> type: seq >>>>>>> 10 >>>>>>> 20 >>>>>>> 30 >>>>>>> 40 >>>>>>> 50 >>>>>>> 60 >>>>>>> [1]PETSC ERROR: >>>>>>> ------------------------------------------------------------------------ >>>>>>> [1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>>>> probably >>>>>>> memory access out of range >>>>>>> [1]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [1]PETSC ERROR: or see >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> [1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>>> to >>>>>>> find memory corruption errors >>>>>>> [1]PETSC ERROR: likely location of problem given in stack below >>>>>>> [1]PETSC ERROR: --------------------- Stack Frames >>>>>>> ------------------------------------ >>>>>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>> available, >>>>>>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>>> [1]PETSC ERROR: is given. 
>>>>>>> [1]PETSC ERROR: [1] F90Array1dCreate line 50 >>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>>> ------------------------------------------[0]PETSC ERROR: >>>>>>> ------------------------------------------------------------------------ >>>>>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>>>>>> probably >>>>>>> memory access out of range >>>>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [0]PETSC ERROR: or see >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> [0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>>> to >>>>>>> find memory corruption errors >>>>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>>>> ------------------------------------ >>>>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>> available, >>>>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>>> [0]PETSC ERROR: is given. >>>>>>> [0]PETSC ERROR: [0] F90Array1dCreate line 50 >>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>>> -------------------------------------------------------------- >>>>>>> [1]PETSC ERROR: Signal received >>>>>>> [1]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>> for >>>>>>> trouble shooting. >>>>>>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>>> [1]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>>> Wed >>>>>>> Jun 1 13:23:41 2016 >>>>>>> [1]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>> --LD_SHARED= >>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>> --with-hypre=1 >>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>>> [1]PETSC ERROR: #1 User provided function() line 0 in unknown file >>>>>>> -------------------------------------------------------------------------- >>>>>>> [mpi::mpi-api::mpi-abort] >>>>>>> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD >>>>>>> with errorcode 59. >>>>>>> >>>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>>>> You may or may not see output from other processes, depending on >>>>>>> exactly when Open MPI kills them. 
>>>>>>> -------------------------------------------------------------------------- >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>>> [0xffffffff11360404] >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>>> [0xffffffff1110391c] >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>>>> [0xffffffff1111b5ec] >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>>>> [0xffffffff00281bf0] >>>>>>> [b04-036:28998] ./ex4f90 [0x292548] >>>>>>> [b04-036:28998] ./ex4f90 [0x29165c] >>>>>>> [b04-036:28998] >>>>>>> /opt/FJSVxosmmm/lib64/libmpgpthread.so.1(_IO_funlockfile+0x5c) >>>>>>> [0xffffffff121e1974] >>>>>>> [b04-036:28998] ./ex4f90 [0x9f6748] >>>>>>> [b04-036:28998] ./ex4f90 [0x9f0ea4] >>>>>>> [b04-036:28998] ./ex4f90 [0x2c76a0] >>>>>>> [b04-036:28998] ./ex4f90(MAIN__+0x38c) [0x10688c] >>>>>>> [b04-036:28998] ./ex4f90(main+0xec) [0x268e91c] >>>>>>> [b04-036:28998] /lib64/libc.so.6(__libc_start_main+0x194) >>>>>>> [0xffffffff138cb81c] >>>>>>> [b04-036:28998] ./ex4f90 [0x1063ac] >>>>>>> [1]PETSC ERROR: >>>>>>> ------------------------------------------------------------------------ >>>>>>> [1]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >>>>>>> batch >>>>>>> system) has told this process to end >>>>>>> [1]PETSC ERROR: Tr-------------------- >>>>>>> [0]PETSC ERROR: Signal received >>>>>>> [0]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>> for >>>>>>> trouble shooting. >>>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>>> [0]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>>> Wed >>>>>>> Jun 1 13:23:41 2016 >>>>>>> [0]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>> --LD_SHARED= >>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>> --with-hypre=1 >>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>>>>>> -------------------------------------------------------------------------- >>>>>>> [mpi::mpi-api::mpi-abort] >>>>>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>>>>> with errorcode 59. >>>>>>> >>>>>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>>>>> You may or may not see output from other processes, depending on >>>>>>> exactly when Open MPI kills them. 
>>>>>>> -------------------------------------------------------------------------- >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>>> [0xffffffff11360404] >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>>> [0xffffffff1110391c] >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(MPI_Abort+0x6c) >>>>>>> [0xffffffff1111b5ec] >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libtrtmet_c.so.1(MPI_Abort+0x2c) >>>>>>> [0xffffffff00281bf0] >>>>>>> [b04-036:28997] ./ex4f90 [0x292548] >>>>>>> [b04-036:28997] ./ex4f90 [0x29165c] >>>>>>> [b04-036:28997] >>>>>>> /opt/FJSVxosmmm/lib64/libmpgpthread.so.1(_IO_funlockfile+0x5c) >>>>>>> [0xffffffff121e1974] >>>>>>> [b04-036:28997] ./ex4f90 [0x9f6748] >>>>>>> [b04-036:28997] ./ex4f90 [0x9f0ea4] >>>>>>> [b04-036:28997] ./ex4f90 [0x2c76a0] >>>>>>> [b04-036:28997] ./ex4f90(MAIN__+0x38c) [0x10688c] >>>>>>> [b04-036:28997] ./ex4f90(main+0xec) [0x268e91c] >>>>>>> [b04-036:28997] /lib64/libc.so.6(__libc_start_main+0x194) >>>>>>> [0xffffffff138cb81c] >>>>>>> [b04-036:28997] ./ex4f90 [0x1063ac] >>>>>>> [0]PETSC ERROR: >>>>>>> ------------------------------------------------------------------------ >>>>>>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the >>>>>>> batch >>>>>>> system) has told this process to end >>>>>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>>>>> [0]PETSC ERROR: or see >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> [0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>>> to >>>>>>> find memory corruption errors >>>>>>> [0]PETSC ERROR: likely location of problem given in stack below >>>>>>> [0]PETSC ERROR: --------------------- Stack Frames >>>>>>> ------------------------------------ >>>>>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>> available, >>>>>>> [0]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>>> [0]PETSC ERROR: is given. >>>>>>> [0]PETSC ERROR: [0] F90Array1dCreate line 50 >>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>>> [0]PETSC ERROR: --------------------- Error Message >>>>>>> -------------------------------------------------------------- >>>>>>> [0]PETSC ERROR: Signal received >>>>>>> [0]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>> for >>>>>>> trouble shooting. 
>>>>>>> [0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>>> [0]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>>> Wed >>>>>>> Jun 1 13:23:41 2016 >>>>>>> [0]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>> --LD_SHARED= >>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>> --with-fortran-interfaces=1 --with-debuy option -start_in_debugger or >>>>>>> -on_error_attach_debugger >>>>>>> [1]PETSC ERROR: or see >>>>>>> >>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>>>>>> >>>>>>> [1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple Mac OS X >>>>>>> to >>>>>>> find memory corruption errors >>>>>>> [1]PETSC ERROR: likely location of problem given in stack below >>>>>>> [1]PETSC ERROR: --------------------- Stack Frames >>>>>>> ------------------------------------ >>>>>>> [1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>> available, >>>>>>> [1]PETSC ERROR: INSTEAD the line number of the start of the function >>>>>>> [1]PETSC ERROR: is given. >>>>>>> [1]PETSC ERROR: [1] F90Array1dCreate line 50 >>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c >>>>>>> [1]PETSC ERROR: --------------------- Error Message >>>>>>> -------------------------------------------------------------- >>>>>>> [1]PETSC ERROR: Signal received >>>>>>> [1]PETSC ERROR: Seehttp://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>> for >>>>>>> trouble shooting. >>>>>>> [1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015 >>>>>>> [1]PETSC ERROR: ./ex4f90 on a petsc-3.6.3_debug named b04-036 by Unknown >>>>>>> Wed >>>>>>> Jun 1 13:23:41 2016 >>>>>>> [1]PETSC ERROR: Configure options --with-cc=mpifcc --with-cxx=mpiFCC >>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg -O0" >>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>> --LD_SHARED= >>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>> --with-hypre=1 >>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>>> [1]PETSC ERROR: #2 User provided function() line 0 in unknown file >>>>>>> gging=1 --useThreads=0 --with-hypre=1 >>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4 >>>>>>> [0]PETSC ERROR: #2 User provided function() line 0 in unknown file >>>>>>> [ERR.] PLE 0019 plexec One of MPI processes was >>>>>>> aborted.(rank=0)(nid=0x04180034)(CODE=1938,793745140674134016,15104) >>>>>>> [t00196 at b04-036 tutorials]$ >>>>>>> [ERR.] PLE 0021 plexec The interactive job has aborted with the >>>>>>> signal.(sig=24) >>>>>>> [INFO] PJM 0083 pjsub Interactive job 5211401 completed. >>>>>>> >>>>>>> Thank you >>>>>>> >>>>>>> Yours sincerely, >>>>>>> >>>>>>> TAY wee-beng >>>>>>> >>>>>>> On 1/6/2016 12:21 PM, Satish Balay wrote: >>>>>>> >>>>>>>> Do PETSc examples using VecGetArrayF90() work? 
>>>>>>>> >>>>>>>> say src/vec/vec/examples/tutorials/ex4f90.F >>>>>>>> >>>>>>>> Satish >>>>>>>> >>>>>>>> On Tue, 31 May 2016, TAY wee-beng wrote: >>>>>>>> >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> I'm trying to run my MPI CFD code on Japan's K computer. My code can >>>>>>>>> run >>>>>>>>> if I >>>>>>>>> didn't make use of the PETSc DMDAVecGetArrayF90 subroutine. If it's >>>>>>>>> called >>>>>>>>> >>>>>>>>> call DMDAVecGetArrayF90(da_u,u_local,u_array,ierr) >>>>>>>>> >>>>>>>>> I get the error below. I have no problem with my code on other >>>>>>>>> clusters >>>>>>>>> using >>>>>>>>> the new Intel compilers. I used to have problems with DM when using >>>>>>>>> the >>>>>>>>> old >>>>>>>>> Intel compilers. Now on the K computer, I'm using Fujitsu's Fortran >>>>>>>>> compiler. >>>>>>>>> How can I troubleshoot? >>>>>>>>> >>>>>>>>> Btw, I also tested on the ex13f90 example and it didn't work too. The >>>>>>>>> error is >>>>>>>>> below. >>>>>>>>> >>>>>>>>> >>>>>>>>> My code error: >>>>>>>>> >>>>>>>>> /* size_x,size_y,size_z 76x130x136*//* >>>>>>>>> *//* total grid size = 1343680*//* >>>>>>>>> *//* recommended cores (50k / core) = 26.87360000000000*//* >>>>>>>>> *//* 0*//* >>>>>>>>> *//* 1*//* >>>>>>>>> *//* 1*//* >>>>>>>>> *//*[3]PETSC ERROR: [1]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>> Violation, >>>>>>>>> probably memory access out of range*//* >>>>>>>>> *//*[1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[1]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[1]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[1]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[1]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[1]PETSC ERROR: is given.*//* >>>>>>>>> *//*[1]PETSC ERROR: [1] F90Array3dCreate line 244 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//* 1*//* >>>>>>>>> *//*------------------------------------------------------------------------*//* >>>>>>>>> *//*[3]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>> Violation, >>>>>>>>> probably memory access out of range*//* >>>>>>>>> *//*[3]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[3]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[3]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[3]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[3]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation 
>>>>>>>>> Violation, >>>>>>>>> probably memory access out of range*//* >>>>>>>>> *//*[0]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[0]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[0]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[0]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[0]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[0]PETSC ERROR: is given.*//* >>>>>>>>> *//*[0]PETSC ERROR: [0] F90Array3dCreate line 244 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[0]PETSC ERROR: --------------------- Error Message >>>>>>>>> ----------------------------------------- 1*//* >>>>>>>>> *//*[2]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[2]PETSC ERROR: Caught signal number 11 SEGV: Segmentation >>>>>>>>> Violation, >>>>>>>>> probably memory access out of range*//* >>>>>>>>> *//*[2]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[2]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[2]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[2]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[2]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[2]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[2]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[2]PETSC ERROR: is given.*//* >>>>>>>>> *//*[2]PETSC ERROR: [2] F90Array3dCreate line 244 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[2]PETSC ERROR: --------------------- Error Message >>>>>>>>> -----------------------------------------[3]PETSC ERROR: Note: The >>>>>>>>> EXACT >>>>>>>>> line >>>>>>>>> numbers in the stack are not available,*//* >>>>>>>>> *//*[3]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[3]PETSC ERROR: is given.*//* >>>>>>>>> *//*[3]PETSC ERROR: [3] F90Array3dCreate line 244 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[3]PETSC ERROR: --------------------- Error Message >>>>>>>>> --------------------------------------------------------------*//* >>>>>>>>> *//*[3]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[3]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[3]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[3]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>>> by >>>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* 
>>>>>>>>> *//*[3]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared----------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[0]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[0]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[0]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>>> by >>>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>>> *//*[0]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>>> *//*[0]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>>> file*//* >>>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>>> *//*[m---------------------*//* >>>>>>>>> *//*[2]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[2]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[2]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[2]PETSC ERROR: ./a-debug.out on a petsc-3.6.3_debug named b04-036 >>>>>>>>> by >>>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>>> *//*[2]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>>> *//*[2]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>>> file*//* >>>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>>> *//*[m[1]PETSC ERROR: --------------------- Error Message >>>>>>>>> --------------------------------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[1]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[1]PETSC ERROR: ./a-debug.out on a 
petsc-3.6.3_debug named b04-036 >>>>>>>>> by >>>>>>>>> Unknown Wed Jun 1 12:54:34 2016*//* >>>>>>>>> *//*[1]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared-libraries=0 --with-blas-lapack-lib=-SSL2 >>>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>>> *//*[1]PETSC ERROR: #1 User provided function() line 0 ilibraries=0 >>>>>>>>> --with-blas-lapack-lib=-SSL2 --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hypre-dir=/home/hp150306/t00196/lib/hypre-2.10.0b-p4*//* >>>>>>>>> *//*[3]PETSC ERROR: #1 User provided function() line 0 in unknown >>>>>>>>> file*//* >>>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>>> *//*[mpi::mpi-api::mpi-abort]*//* >>>>>>>>> *//*MPI_ABORT was invoked on rank 3 in communicator MPI_COMM_WORLD*//* >>>>>>>>> *//*with errorcode 59.*//* >>>>>>>>> *//* >>>>>>>>> *//*NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI >>>>>>>>> processes.*//* >>>>>>>>> *//*You may or may not see output from other processes, depending >>>>>>>>> on*//* >>>>>>>>> *//*exactly when Open MPI kills them.*//* >>>>>>>>> *//*--------------------------------------------------------------------------*//* >>>>>>>>> *//*[b04-036:28416] >>>>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(orte_errmgr_base_error_abort+0x84) >>>>>>>>> [0xffffffff11360404]*//* >>>>>>>>> *//*[b04-036:28416] >>>>>>>>> /opt/FJSVtclang/GM-1.2.0-20/lib64/libmpi.so.0(ompi_mpi_abort+0x51c) >>>>>>>>> [0xffffffff1110391c]*//* >>>>>>>>> *//*[b04-036:28416] >>>>>>>>> /opt/FJSVtclang/GM-1.2.0-2pi::mpi-api::mpi-abort]*//* >>>>>>>>> *//*MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD*//* >>>>>>>>> *//*with errorcode 59.*/ >>>>>>>>> >>>>>>>>> ex13f90 error: >>>>>>>>> >>>>>>>>> >>>>>>>>> /*[t00196 at b04-036 tutorials]$ mpiexec -np 2 ./ex13f90*//* >>>>>>>>> *//*jwe1050i-w The hardware barrier couldn't be used and continues >>>>>>>>> processing >>>>>>>>> using the software barrier.*//* >>>>>>>>> *//*taken to (standard) corrective action, execution continuing.*//* >>>>>>>>> *//*jwe1050i-w The hardware barrier couldn't be used and continues >>>>>>>>> processing >>>>>>>>> using the software barrier.*//* >>>>>>>>> *//*taken to (standard) corrective action, execution continuing.*//* >>>>>>>>> *//* Hi! 
We're solving van der Pol using 2 processes.*//* >>>>>>>>> *//* >>>>>>>>> *//* t x1 x2*//* >>>>>>>>> *//*[1]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>>>>>>>> illegal >>>>>>>>> memory access*//* >>>>>>>>> *//*[1]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[0]PETSC ERROR: >>>>>>>>> ------------------------------------------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Caught signal number 10 BUS: Bus Error, possibly >>>>>>>>> illegal >>>>>>>>> memory access*//* >>>>>>>>> *//*[0]PETSC ERROR: Try option -start_in_debugger or >>>>>>>>> -on_error_attach_debugger*//* >>>>>>>>> *//*[0]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[0]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[1]PETSC ERROR: or see >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind*//* >>>>>>>>> >>>>>>>>> *//*[1]PETSC ERROR: or tryhttp://valgrind.org on GNU/linux and Apple >>>>>>>>> Mac >>>>>>>>> OS X >>>>>>>>> to find memory corruption errors*//* >>>>>>>>> *//*[1]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[1]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[1]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[1]PETSC ERROR: is given.*//* >>>>>>>>> *//*[1]PETSC ERROR: [1] F90Array4dCreate line 337 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[0]PETSC ERROR: likely location of problem given in stack >>>>>>>>> below*//* >>>>>>>>> *//*[0]PETSC ERROR: --------------------- Stack Frames >>>>>>>>> ------------------------------------*//* >>>>>>>>> *//*[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not >>>>>>>>> available,*//* >>>>>>>>> *//*[0]PETSC ERROR: INSTEAD the line number of the start of the >>>>>>>>> function*//* >>>>>>>>> *//*[0]PETSC ERROR: is given.*//* >>>>>>>>> *//*[0]PETSC ERROR: [0] F90Array4dCreate line 337 >>>>>>>>> /.global/volume2/home/hp150306/t00196/source/petsc-3.6.3/src/sys/f90-src/f90_cwrap.c*//* >>>>>>>>> *//*[1]PETSC ERROR: --------------------- Error Message >>>>>>>>> --------------------------------------------------------------*//* >>>>>>>>> *//*[1]PETSC ERROR: Signal received*//* >>>>>>>>> *//*[1]PETSC ERROR: See >>>>>>>>> >>>>>>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html >>>>>>>>> >>>>>>>>> for trouble shooting.*//* >>>>>>>>> *//*[1]PETSC ERROR: Petsc Release Version 3.6.3, Dec, 03, 2015*//* >>>>>>>>> *//*[1]PETSC ERROR: ./ex13f90 on a petsc-3.6.3_debug named b04-036 by >>>>>>>>> Unknown >>>>>>>>> Wed Jun 1 13:04:34 2016*//* >>>>>>>>> *//*[1]PETSC ERROR: Configure options --with-cc=mpifcc >>>>>>>>> --with-cxx=mpiFCC >>>>>>>>> --with-fc=mpifrt --with-64-bit-pointers=1 --CC=mpifcc --CFLAGS="-Xg >>>>>>>>> -O0" >>>>>>>>> --CXX=mpiFCC --CXXFLAGS="-Xg -O0" --FC=mpifrt --FFLAGS="-X9 -O0" >>>>>>>>> --LD_SHARED= >>>>>>>>> --LDDFLAGS= --with-openmp=1 --with-mpiexec=mpiexec --known-endian=big >>>>>>>>> --with-shared-libraries=0 
--with-blas-lapack-lib=-SSL2 >>>>>>>>> --with-scalapack-lib=-SCALAPACK >>>>>>>>> --prefix=/home/hp150306/t00196/lib/petsc-3.6.3_debug >>>>>>>>> --with-fortran-interfaces=1 --with-debugging=1 --useThreads=0 >>>>>>>>> --with-hypre=1 >>>>>>>>> --with-hyp*//* >>>>>>>>> */ >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> > From bsmith at mcs.anl.gov Wed Jul 20 22:49:11 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 20 Jul 2016 22:49:11 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> Message-ID: > On Jul 18, 2016, at 1:41 AM, domenico lahaye wrote: > > Dear Matthew, > > I would like to place the FormJacobian statement in ex25.c in such a way that I can view > the result on the different levels. Can you please point me to an example? > > I would like to do above with Galerkin coarsening as well. So yes, I do expect that I will need the > hooks attached to the different MG levels. I appreciate more pointers here as well. The thing is some parts of the solver may not be constructed on each level until the actual solve is performed so it may not be possible to view/change things before the solve starts. You can try calling KSPSetUp() and then do as Matt suggested "You can always call http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGGetSmoother.html and then http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPGetPC.html and then http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGetOperators.html I would caution you against this, since it is very fragile in the code." When using SNES there is really no good time to call KSPSetUp() and then access the PCMGGetSmoother(). This is why PETSc is designed around callbacks, so rather than having you look over MG levels and get some object and modify it, you provide callbacks that SNES or KSP calls at the appropriate time with a single object and then your callback function does what you want it to do. If there are additional callbacks you think we should add please let us know. Barry > > Thanks, Domenico. > > > From: Matthew Knepley > > > To: domenico lahaye > Cc: PETSc Users List > Sent: Monday, July 18, 2016 8:16 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: > Thanks for all the pointers. > > I am happy to switch to ksp/examples/tutorials/ex25.c in a first instance as you suggest. > > I am still stuck with the same issue as before though. I am trying to extract the hierarchy > of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would > like to modify these operators and define a multigrid cycle with the modified operators. > > Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving > both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me > in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase? > > If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator. 
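For the KSP-only case that per-level rediscretization is usually wired up through KSPSetComputeOperators(); a bare-bones sketch follows, where the name ComputeMatrix and the user context are placeholders and not code taken from ex25.c or ex42.c:

   static PetscErrorCode ComputeMatrix(KSP ksp, Mat A, Mat P, void *ctx)
   {
     DM             da;
     PetscErrorCode ierr;

     PetscFunctionBeginUser;
     /* PCMG calls this once per level; only the DM handed back here changes */
     ierr = KSPGetDM(ksp, &da);CHKERRQ(ierr);
     /* ... query DMDAGetInfo(da, ...) for this level's grid and fill P accordingly ... */
     ierr = MatAssemblyBegin(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
     ierr = MatAssemblyEnd(P, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
     PetscFunctionReturn(0);
   }

   /* in main(), instead of calling KSPSetOperators() yourself: */
   ierr = KSPSetDM(ksp, da);CHKERRQ(ierr);
   ierr = KSPSetComputeOperators(ksp, ComputeMatrix, &user);CHKERRQ(ierr);
   ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* e.g. -pc_type mg -pc_mg_levels 4 */
   ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);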
> The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks to > attach different contexts to each MG level. Do you need this? > > Thanks, > > Matt > > Thanks, Domenico. > > > From: Matthew Knepley > To: Barry Smith > Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" > Sent: Sunday, July 17, 2016 2:29 PM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure > > and likely not giving it as much time as it deserves. However, I do not see immediately > > what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > > > > I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > > KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined > > after calling DMCoarsenHierarchy, but that failed. > > > > I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform > > a multigrid solve on the preconditioner. In a next stage I want to implement the deflation > > using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > > author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune > > and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp > > and Dinesh Kaushik and Matthew~G. Knepley > > and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith > > and Stefano Zampini and Hong Zhang and Hong Zhang}, > > title = {{PETS}c {W}eb page}, > > url = {http://www.mcs.anl.gov/petsc}, > > howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > > year = {2016} > > } > > > > > > > > Is the last author mentioned twice intentionally? > > > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > > > @misc{OpenFOAM > > , > > > > > > title = "OpenFOAM", > > > > howpublished = "\url{http://www.openfoam.com}", > > > > url = {http://www.openfoam.com}, > > > > note = "OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > > > key = "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > > Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. > > This suggests that people are using it with OpenFOAM: http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf > > In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: > > http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf > > Matt > > > Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > From domenico_lahaye at yahoo.com Thu Jul 21 03:41:04 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Thu, 21 Jul 2016 08:41:04 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> Message-ID: <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> Thanks.? KSPSetOperators() allows to precondition A^h with M^h.?This is lovely and great as it allows to implement the shifted Laplace?preconditioner for the Helmholtz equation.? Recently I managed to implement shifted Laplace using the DMDA?infrastructure in 2D. This implementation avoids having to construct?the hierarchy in Matlab as we did previously.? In next stage we would like to precondition A^H with M^H on a sequence?of coarser grids. This is what Calandra does on two levels and what we doon multiple levels.? We currently have an implement in which we construct the hierarchy on A^h?and M^h in Matlab, we read the hierarchy in PETSc, traverse the hierarchy and?do SetOperators and do a lot more of dark magic and witch craft by combining?preconditioners in a additive and multiplicative fashion.? It would be lovely to obtain a more readable piece of code. ? ? I am not sure what kind of additional callbacks I need. My first guess here?would be a multilevel extension of SetOperators allowing to define M^H?a preconditioner for A^H on a sequence of coarser levels. But I currently?fail to oversee the whole matter.? An alternative is to build a fragile code on top of DMDA first and get back?to you with more informed guesses on what kind of call backs I precisely need.?I think I prefer to go with this option.? Does this sound reasonable?? Domenico.? From: Barry Smith To: domenico lahaye Cc: PETSc Users List Sent: Thursday, July 21, 2016 5:49 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > On Jul 18, 2016, at 1:41 AM, domenico lahaye wrote: > > Dear Matthew, > >? I would like to place the FormJacobian statement in ex25.c in such a way that I can view > the result on the different levels. Can you please point me to an example? > >? I would like to do above with Galerkin coarsening as well. So yes, I do expect that I will need the > hooks attached to the different MG levels. I appreciate more pointers here as well. ? The thing is some parts of the solver may not be constructed on each level until the actual solve is performed so it may not be possible to view/change things before the solve starts. You can try calling KSPSetUp() and then do as Matt suggested "You can always call ? http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCMGGetSmoother.html and then ? http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/KSP/KSPGetPC.html and then ? http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/PC/PCGetOperators.html I would caution you against this, since it is very fragile in the code." ? When using SNES there is really no good time to call KSPSetUp() and then access the PCMGGetSmoother(). 
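For the plain KSP case that fragile inspection looks roughly like the sketch below; it assumes the PC really is PCMG, that KSPSetUp() has already built the hierarchy, and the viewer option name is arbitrary:

   PetscInt nlevels, l;
   PC       pc;

   ierr = KSPSetUp(ksp);CHKERRQ(ierr);                /* forces the MG levels to exist */
   ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
   ierr = PCMGGetLevels(pc, &nlevels);CHKERRQ(ierr);
   for (l = 0; l < nlevels; l++) {
     KSP smoother;
     PC  subpc;
     Mat Amat, Pmat;

     ierr = PCMGGetSmoother(pc, l, &smoother);CHKERRQ(ierr);
     ierr = KSPGetPC(smoother, &subpc);CHKERRQ(ierr);
     ierr = PCGetOperators(subpc, &Amat, &Pmat);CHKERRQ(ierr);
     ierr = MatViewFromOptions(Pmat, NULL, "-level_pmat_view");CHKERRQ(ierr);  /* inspect or modify here */
   }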
This is why PETSc is designed around callbacks, so rather than having you look over MG levels and get some object and modify it, you provide callbacks that SNES or KSP calls at the appropriate time with a single object and then your callback function does what you want it to do. If there are additional callbacks you think we should add please let us know. ? Barry > >? ? Thanks, Domenico.? > > > From: Matthew Knepley > > > To: domenico lahaye > Cc: PETSc Users List > Sent: Monday, July 18, 2016 8:16 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > On Mon, Jul 18, 2016 at 12:59 AM, domenico lahaye wrote: > Thanks for all the pointers. > > I am happy to switch to ksp/examples/tutorials/ex25.c in a first instance as you suggest. > >? ? I am still stuck with the same issue as before though. I am trying to extract the hierarchy >? ? of coarser grid matrices and the intergrid transfer operators from the DMDA data structure. I would >? ? like to modify these operators and define a multigrid cycle with the modified operators. > >? ? Given A^h (Helmholtz) and M^h (shifted Laplace), I would like to define a multigrid cycle involving >? ? both A^H and M^H. Can I rely on the multilevel DMDA structure to construct A^H and M^H for me >? ? in a set-up phase, plug them into a user-defined context, and plug them back out in a solve phase? > > If you are not using -pc_mg_galerkin, then the FormJacobian is called separately on each level to rediscretize the operator. > The only thing that changes is the DMDA that is passed to the call. If you need more information, there are hooks to > attach different contexts to each MG level. Do you need this? > >? Thanks, > >? ? ? Matt >? > Thanks, Domenico. > > > From: Matthew Knepley > To: Barry Smith > Cc: domenico lahaye ; "petsc-users at mcs.anl.gov" > Sent: Sunday, July 17, 2016 2:29 PM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > On Sat, Jul 16, 2016 at 10:11 PM, Barry Smith wrote: > > > On Jul 14, 2016, at 12:21 PM, domenico lahaye wrote: > > > > Dear PETSc team, > > > > 1) I am looking into ks/examples/tutorials/ex42.c I am still new to the DMDA structure > >? ? and likely not giving it as much time as it deserves. However, I do not see immediately > >? ? what function is responsible for calling PCMGSetSmoother and PCMGSetResidual. > > > >? ? ? I tried to call PCMGGetCoarseSolve(pc, &kcpc) and subsequently > >? ? ? KSPGetOperators (kspc, ... ) to check how the coarse grid operator is defined > >? ? ? after calling DMCoarsenHierarchy, but that failed. > > > >? ? ? I am solving Helmholtz with shifted Laplace, and managed to exploit DMDA to perform > >? ? ? a multigrid solve on the preconditioner. In a next stage I want to implement the deflation > >? ? ? using DMDA as well. > > > > 2) On http://www.mcs.anl.gov/petsc/documentation/referencing.html I see > > > > @Misc{petsc-web-page, > >? ? ? ? ? ? author = {Satish Balay and Shrirang Abhyankar and Mark~F. Adams and Jed Brown and Peter Brune > >? ? ? ? ? ? ? ? ? ? ? and Kris Buschelman and Lisandro Dalcin and Victor Eijkhout and William~D. Gropp > >? ? ? ? ? ? ? ? ? ? ? and Dinesh Kaushik and Matthew~G. Knepley > >? ? ? ? ? ? ? ? ? ? ? and Lois Curfman McInnes and Karl Rupp and Barry~F. Smith > >? ? ? ? ? ? ? ? ? ? ? and Stefano Zampini and Hong Zhang and Hong Zhang}, > >? ? ? ? ? ? title =? {{PETS}c {W}eb page}, > >? ? ? ? ? ? url =? ? {http://www.mcs.anl.gov/petsc}, > >? ? ? ? ? ? howpublished = {\url{http://www.mcs.anl.gov/petsc}}, > >? ? ? ? ? ? year = {2016} > >? ? ? ? ? 
} > > > > > > > > Is the last author mentioned twice intentionally? > > > > 3) On http://www.mcs.anl.gov/petsc/publications/petscapps-bib.html#OpenFOAM%202.2.1 I see > > > > @misc{OpenFOAM > > , > > > > > > title =? ? ? "OpenFOAM", > > > > howpublished? =? ? ? "\url{http://www.openfoam.com}", > > > > url? =? ? ? {http://www.openfoam.com}, > > > > note? =? ? ? "OpenFOAM is a free, open source CFD software package. It allows PETSc linear algebra and solvers to be used underneath.", > > > > key? =? ? ? "OpenFOAM 2.2.1" > > > > } > > > > > > Do you have more information on the use of PETSc within OpenFoam? > >? Very good question. It seems that this citation is wrong or no longer valid; I have removed it from the PETSc repository. I could find no mention of PETSc usage in the OpenFoam and its third party packages. I think we should not have been listing this citation. > > This suggests that people are using it with OpenFOAM: http://powerlab.fsb.hr/ped/kturbo/OpenFOAM/slides/PatersonNuTTS2009.pdf > > In fact, they use PETSc in the dynamic overset grid implementation for OpenFOAM, which I think is an approved extension: > >? http://web.student.chalmers.se/groups/ofw5/Abstracts/DavidBogerAbstractOFW5.pdf > >? ? ? Matt >? > >? ? Barry > > > > > 4) @matt in response to a question he raised in Vienna > > > > MIPSE is a BEM solver. Details are on: > > http://www.g2elab.grenoble-inp.fr/plateforms/mipse-modeling-of-interconnected-power-systems-632862.kjsp?RH=G2ELAB_R-MAGE > > > > Cheers, Domenico Lahaye. > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lawrence.mitchell at imperial.ac.uk Thu Jul 21 04:00:19 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 21 Jul 2016 10:00:19 +0100 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> Message-ID: > On 21 Jul 2016, at 09:41, domenico lahaye wrote: > > Thanks. > > KSPSetOperators() allows to precondition A^h with M^h. > This is lovely and great as it allows to implement the shifted Laplace > preconditioner for the Helmholtz equation. > > Recently I managed to implement shifted Laplace using the DMDA > infrastructure in 2D. This implementation avoids having to construct > the hierarchy in Matlab as we did previously. > > In next stage we would like to precondition A^H with M^H on a sequence > of coarser grids. This is what Calandra does on two levels and what we do > on multiple levels. 
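Spelled out in PETSc calls, and keeping in mind the caveat earlier in the thread that touching the smoothers directly is fragile, that per-level pairing might be sketched as follows; Alevel[] and Mlevel[] stand for Helmholtz and shifted-Laplace hierarchies that are assumed to exist already:

   PC       pc;
   PetscInt nlevels, l;

   ierr = KSPSetUp(ksp);CHKERRQ(ierr);
   ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
   ierr = PCMGGetLevels(pc, &nlevels);CHKERRQ(ierr);
   for (l = 0; l < nlevels; l++) {
     KSP smoother;

     ierr = PCMGGetSmoother(pc, l, &smoother);CHKERRQ(ierr);
     /* Helmholtz operator A^H as Amat, shifted Laplacian M^H as Pmat on every level */
     ierr = KSPSetOperators(smoother, Alevel[l], Mlevel[l]);CHKERRQ(ierr);
   }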
> > We currently have an implement in which we construct the hierarchy on A^h > and M^h in Matlab, we read the hierarchy in PETSc, traverse the hierarchy and > do SetOperators and do a lot more of dark magic and witch craft by combining > preconditioners in a additive and multiplicative fashion. > > It would be lovely to obtain a more readable piece of code. > > I am not sure what kind of additional callbacks I need. My first guess here > would be a multilevel extension of SetOperators allowing to define M^H > a preconditioner for A^H on a sequence of coarser levels. But I currently > fail to oversee the whole matter. > > An alternative is to build a fragile code on top of DMDA first and get back > to you with more informed guesses on what kind of call backs I precisely need. > I think I prefer to go with this option. > > Does this sound reasonable? It sounds like what you need is that the coarse DM should have a way of building the operators via a callback. I think this is already available. Rather than doing KSPSetOperators. You do KSPSetComputeOperators, providing the function to be called to build the operator. Now, you need a way for the coarse grids to allocate the matrices that will be used for your operators. If you have a DMDA, this is set up for you because the KSP calls DMCreateMatrix and the DMDA knows how to create a matrix. One wrinkle here is that the interface doesn't currently support making separate matrices for A and M. The code currently does (in KSPSetUp): if (using_dm) { DMCreateMatrix(ksp->dm, &A); KSPSetOperators(ksp, A, A); ... } For your needs you'd need this to be: if (using_dm) { DMCreateMatrices(ksp->dm, &A, &P); KSPSetOperators(ksp, A, P) ... } I think. Adding this call should not be too hard, there have been discussions before about it. See, for example, the thread here: http://lists.mcs.anl.gov/pipermail/petsc-dev/2015-March/017130.html (which started here http://lists.mcs.anl.gov/pipermail/petsc-dev/2015-February/017008.html) I note I never got round to making the suggested changes there. Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: Message signed with OpenPGP using GPGMail URL: From lawrence.mitchell at imperial.ac.uk Thu Jul 21 04:25:19 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 21 Jul 2016 10:25:19 +0100 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> Message-ID: <579094FF.9020708@imperial.ac.uk> [Reintroducing petsc-users in cc] On 21/07/16 10:18, domenico lahaye wrote: > Thanks Lauwrence. > > Does the fact that the coarse level preconditioner M^H should be > constructed > by Galerkin coarse (rather then rediscretization) cause additional > wrinkles? Do you want to rediscretise A, but use a galerkin coarse grid M? 
If so, that is currently unsupported in PCMG: In PCSetUp_MG (mg.c, line 660 or so): if (mg->galerkin == 1) { /* Currently only handle case where mat and pmat are the same on coarser levels */ ... } I guess if you're managing the creation of the coarse grid operators yourself via KSPSetComputeOperators and a putative (new) DMCreateMatrices then you'd have the flexibility to do separate things for A and M (including, I think, galerkin coarse M). Since you have access to the DM hierarchy inside your compute operators. Make sense? Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: OpenPGP digital signature URL: From domenico_lahaye at yahoo.com Thu Jul 21 04:55:45 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Thu, 21 Jul 2016 09:55:45 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <579094FF.9020708@imperial.ac.uk> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> Message-ID: <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> Apologies for being not sufficient clear in my previous message.? I would like to be able to Galerkin coarsen A^h to obtain A^H?and to separately Galerkin coarsen M^h to obtain M^H.? So, yes, the way in which I currently (partially) understand your?description of the new DMCreateMatrices would do the job.? What is a sensible way to proceed?? Thanks, Domenico.? From: Lawrence Mitchell To: domenico lahaye Cc: petsc-users at mcs.anl.gov Sent: Thursday, July 21, 2016 11:25 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations [Reintroducing petsc-users in cc] On 21/07/16 10:18, domenico lahaye wrote: > Thanks Lauwrence. > > Does the fact that the coarse level preconditioner M^H should be > constructed > by Galerkin coarse (rather then rediscretization) cause additional > wrinkles? Do you want to rediscretise A, but use a galerkin coarse grid M? If so, that is currently unsupported in PCMG:? In PCSetUp_MG (mg.c, line 660 or so): if (mg->galerkin == 1) { ? /* Currently only handle case where mat and pmat are the same on coarser levels */ ? ... } I guess if you're managing the creation of the coarse grid operators yourself via KSPSetComputeOperators and a putative (new) DMCreateMatrices then you'd have the flexibility to do separate things for A and M (including, I think, galerkin coarse M).? Since you have access to the DM hierarchy inside your compute operators. Make sense? Lawrence -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lawrence.mitchell at imperial.ac.uk Thu Jul 21 06:09:21 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 21 Jul 2016 12:09:21 +0100 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> Message-ID: <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> > On 21 Jul 2016, at 10:55, domenico lahaye wrote: > > Apologies for being not sufficient clear in my previous message. > > I would like to be able to Galerkin coarsen A^h to obtain A^H > and to separately Galerkin coarsen M^h to obtain M^H. > > So, yes, the way in which I currently (partially) understand your > description of the new DMCreateMatrices would do the job. If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: Message signed with OpenPGP using GPGMail URL: From knepley at gmail.com Thu Jul 21 08:04:57 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Jul 2016 15:04:57 +0200 Subject: [petsc-users] Mass matrix with PetscFE In-Reply-To: <556a1fbacdac5b3b1e98e9f06955f71a@tf.uni-kiel.de> References: <56CDA67F.6000906@tf.uni-kiel.de> <56CDB469.806@tf.uni-kiel.de> <56CDB84F.8020309@tf.uni-kiel.de> <518cc2f74e6b2267660acaf3871d52f9@tf.uni-kiel.de> <56CEB565.5010203@tf.uni-kiel.de> <556a1fbacdac5b3b1e98e9f06955f71a@tf.uni-kiel.de> Message-ID: On Mon, Mar 7, 2016 at 6:21 PM, Julian Andrej wrote: > Any news about this? I've seen you merged the dmforest branch into next. I am going through my mail, and see that this might have been dropped. Has your problem been solved? Sorry for the delay, Matt > > On 2016-02-26 01:22, Matthew Knepley wrote: > >> I am sorry about the delay. I have your example working but it exposed >> a bug in Plex so I need to push the fix first. I should have >> everything for you early next week. >> >> Thanks >> >> Matt >> >> On Feb 25, 2016 2:04 AM, "Julian Andrej" wrote: >> >> After a bit of rethinking the problem, the discrepancy between the >>> size of matrix A and the mass matrix M arises because of the >>> Dirichlet boundary conditions. So why aren't the BCs not imposed on >>> the mass matrix? Do I need to handle Dirichlet BCs differently in >>> this context (like zero rows and put one the diagonal?) >>> >>> On 24.02.2016 20 [1]:54, juan wrote: >>> I attached another example which creates the correct mass matrix >>> but also overwrites the DM for the SNES solve. Somehow i cannot >>> manage >>> to really copy the DM to dm_mass and use that. 
If i try to do that >>> with >>> DMClone(dm, &dm_mass) i get a smaller mass matrix (which is not of >>> size A). >>> >>> Maybe this helps in the discussion. >>> >>> Relevant code starts at line 455. >>> >>> On 2016-02-24 15:03, Julian Andrej wrote: >>> Thanks Matt, >>> >>> I attached the modified example. >>> >>> the corresponding code (and only changes to ex12) is starting at >>> line >>> 832. >>> >>> It also seems that the mass matrix is of size 169x169 and the >>> stiffness matrix is of dimension 225x225. I'd assume that if i >>> multiply test and trial function i'd get a matrix of same size (if >>> the >>> space/quadrature is the same for the stiffness matrix) >>> >>> On 24.02.2016 14 [2]:56, Matthew Knepley wrote: >>> On Wed, Feb 24, 2016 at 7:47 AM, Julian Andrej >> > wrote: >>> >>> I'm now using the petsc git master branch. >>> >>> I tried adding my code to the ex12 >>> >>> DM dm_mass; >>> PetscDS prob_mass; >>> PetscFE fe; >>> Mat M; >>> PetscFECreateDefault(dm, user.dim, 1, PETSC_TRUE, NULL, -1, >>> &fe); >>> >>> DMClone(dm, &dm_mass); >>> DMGetDS(dm_mass, &prob_mass); >>> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) fe); >>> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, NULL, >>> NULL); >>> DMCreateMatrix(dm_mass, &M); >>> >>> MatSetOptionsPrefix(M, "M_";) >>> >>> and receive the error on running >>> ./exe -interpolate -refinement_limit 0.0125 -petscspace_order 2 >>> -M_mat_view binary >>> >>> WARNING! There are options you set that were not used! >>> WARNING! could be spelling mistake, etc! >>> Option left: name:-M_mat_view value: binary >>> >>> I don't know if the matrix is actually there and assembled or if >>> the >>> option is ommitted because something is wrong. >>> >>> Its difficult to know when I cannot see the whole code. You can >>> always >>> insert >>> >>> MatViewFromOptions(M, NULL, "-mat_view"); >>> >>> Using >>> MatView(M, PETSC_VIEWER_STDOUT_WORLD); >>> >>> gives me a reasonable output to stdout. >>> >>> Good. >>> >>> But saving the matrix and analysing it in matlab, results in an >>> all >>> zero matrix. >>> >>> PetscViewerBinaryOpen(PETSC_COMM_WORLD, "Mout",FILE_MODE_WRITE, >>> &viewer); >>> MatView(M, viewer); >>> >>> I cannot explain this, but it has to be something like you are >>> viewing >>> the matrix before it is >>> actually assembled. Feel free to send the code. It sounds like it is >>> mostly working. >>> >>> Matt >>> >>> Any hints? >>> >>> On 24.02.2016 13 [3] :58, Matthew Knepley >>> >>> wrote: >>> >>> On Wed, Feb 24, 2016 at 6:47 AM, Julian Andrej >>> >>> >> >>> wrote: >>> >>> Hi, >>> >>> i'm trying to assemble a mass matrix with the >>> PetscFE/DMPlex >>> interface. I found something in the examples of TAO >>> >>> >>> >> https://bitbucket.org/petsc/petsc/src/da8116b0e8d067e39fd79740a8a864b0fe207998/src/tao/examples/tutorials/ex3.c?at=master&fileviewer=file-view-default >> >>> >>> but using the lines >>> >>> DMClone(dm, &dm_mass); >>> DMSetNumFields(dm_mass, 1); >>> DMPlexCopyCoordinates(dm, dm_mass); >>> DMGetDS(dm_mass, &prob_mass); >>> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, >>> NULL, NULL); >>> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) >>> fe); >>> DMPlexSNESComputeJacobianFEM(dm_mass, u, M, M, NULL); >>> DMCreateMatrix(dm_mass, &M); >>> >>> leads to errors in DMPlexSNESComputeJacobianFEM (u is a >>> global vector). >>> >>> I don't can understand the necessary commands until >>> DMPlexSNESComputeJacobianFEM. What does it do and why >>> is it >>> necessary? 
(especially why does the naming involve >>> SNES?) >>> >>> Is there another/easier/better way to create a mass >>> matrix (the >>> inner product of the function space and the test >>> space)? >>> >>> 1) That example needs updating. First, look at SNES ex12 >>> which >>> is up to >>> date. >>> >>> 2) I assume you are using 3.6. If you use the development >>> version, you >>> can remove DMPlexCopyCoordinates(). >>> >>> 3) You need to create the matrix BEFORE calling the assembly >>> >>> 4) Always always always send the entire error messge >>> >>> Matt >>> >>> Regards >>> Julian Andrej >>> >>> -- >>> What most experimenters take for granted before they begin >>> their >>> experiments is infinitely more interesting than any results >>> to which >>> their experiments lead. >>> -- Norbert Wiener >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >>> >> >> >> Links: >> ------ >> [1] tel:24.02.2016%2020 >> [2] tel:24.02.2016%2014 >> [3] tel:24.02.2016%2013 >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From juan at tf.uni-kiel.de Thu Jul 21 08:59:06 2016 From: juan at tf.uni-kiel.de (Julian Andrej) Date: Thu, 21 Jul 2016 15:59:06 +0200 Subject: [petsc-users] Mass matrix with PetscFE In-Reply-To: References: <56CDA67F.6000906@tf.uni-kiel.de> <56CDB469.806@tf.uni-kiel.de> <56CDB84F.8020309@tf.uni-kiel.de> <518cc2f74e6b2267660acaf3871d52f9@tf.uni-kiel.de> <56CEB565.5010203@tf.uni-kiel.de> <556a1fbacdac5b3b1e98e9f06955f71a@tf.uni-kiel.de> Message-ID: Hey, yes, this issue was resolved a few weeks after the mail. I just tried again after some DMPlex commits ;). Thanks! On Thu, Jul 21, 2016 at 3:04 PM, Matthew Knepley wrote: > On Mon, Mar 7, 2016 at 6:21 PM, Julian Andrej wrote: >> >> Any news about this? I've seen you merged the dmforest branch into next. > > > I am going through my mail, and see that this might have been dropped. Has > your > problem been solved? > > Sorry for the delay, > > Matt > >> >> >> On 2016-02-26 01:22, Matthew Knepley wrote: >>> >>> I am sorry about the delay. I have your example working but it exposed >>> a bug in Plex so I need to push the fix first. I should have >>> everything for you early next week. >>> >>> Thanks >>> >>> Matt >>> >>> On Feb 25, 2016 2:04 AM, "Julian Andrej" wrote: >>> >>>> After a bit of rethinking the problem, the discrepancy between the >>>> size of matrix A and the mass matrix M arises because of the >>>> Dirichlet boundary conditions. So why aren't the BCs not imposed on >>>> the mass matrix? Do I need to handle Dirichlet BCs differently in >>>> this context (like zero rows and put one the diagonal?) >>>> >>>> On 24.02.2016 20 [1]:54, juan wrote: >>>> I attached another example which creates the correct mass matrix >>>> but also overwrites the DM for the SNES solve. Somehow i cannot >>>> manage >>>> to really copy the DM to dm_mass and use that. If i try to do that >>>> with >>>> DMClone(dm, &dm_mass) i get a smaller mass matrix (which is not of >>>> size A). >>>> >>>> Maybe this helps in the discussion. >>>> >>>> Relevant code starts at line 455. >>>> >>>> On 2016-02-24 15:03, Julian Andrej wrote: >>>> Thanks Matt, >>>> >>>> I attached the modified example. 
>>>> >>>> the corresponding code (and only changes to ex12) is starting at >>>> line >>>> 832. >>>> >>>> It also seems that the mass matrix is of size 169x169 and the >>>> stiffness matrix is of dimension 225x225. I'd assume that if i >>>> multiply test and trial function i'd get a matrix of same size (if >>>> the >>>> space/quadrature is the same for the stiffness matrix) >>>> >>>> On 24.02.2016 14 [2]:56, Matthew Knepley wrote: >>>> On Wed, Feb 24, 2016 at 7:47 AM, Julian Andrej >>> > wrote: >>>> >>>> I'm now using the petsc git master branch. >>>> >>>> I tried adding my code to the ex12 >>>> >>>> DM dm_mass; >>>> PetscDS prob_mass; >>>> PetscFE fe; >>>> Mat M; >>>> PetscFECreateDefault(dm, user.dim, 1, PETSC_TRUE, NULL, -1, >>>> &fe); >>>> >>>> DMClone(dm, &dm_mass); >>>> DMGetDS(dm_mass, &prob_mass); >>>> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) fe); >>>> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, NULL, >>>> NULL); >>>> DMCreateMatrix(dm_mass, &M); >>>> >>>> MatSetOptionsPrefix(M, "M_";) >>>> >>>> and receive the error on running >>>> ./exe -interpolate -refinement_limit 0.0125 -petscspace_order 2 >>>> -M_mat_view binary >>>> >>>> WARNING! There are options you set that were not used! >>>> WARNING! could be spelling mistake, etc! >>>> Option left: name:-M_mat_view value: binary >>>> >>>> I don't know if the matrix is actually there and assembled or if >>>> the >>>> option is ommitted because something is wrong. >>>> >>>> Its difficult to know when I cannot see the whole code. You can >>>> always >>>> insert >>>> >>>> MatViewFromOptions(M, NULL, "-mat_view"); >>>> >>>> Using >>>> MatView(M, PETSC_VIEWER_STDOUT_WORLD); >>>> >>>> gives me a reasonable output to stdout. >>>> >>>> Good. >>>> >>>> But saving the matrix and analysing it in matlab, results in an >>>> all >>>> zero matrix. >>>> >>>> PetscViewerBinaryOpen(PETSC_COMM_WORLD, "Mout",FILE_MODE_WRITE, >>>> &viewer); >>>> MatView(M, viewer); >>>> >>>> I cannot explain this, but it has to be something like you are >>>> viewing >>>> the matrix before it is >>>> actually assembled. Feel free to send the code. It sounds like it is >>>> mostly working. >>>> >>>> Matt >>>> >>>> Any hints? >>>> >>>> On 24.02.2016 13 [3] :58, Matthew Knepley >>>> >>>> wrote: >>>> >>>> On Wed, Feb 24, 2016 at 6:47 AM, Julian Andrej >>>> >>>> >> >>>> wrote: >>>> >>>> Hi, >>>> >>>> i'm trying to assemble a mass matrix with the >>>> PetscFE/DMPlex >>>> interface. I found something in the examples of TAO >>>> >>>> >>> >>> https://bitbucket.org/petsc/petsc/src/da8116b0e8d067e39fd79740a8a864b0fe207998/src/tao/examples/tutorials/ex3.c?at=master&fileviewer=file-view-default >>>> >>>> >>>> but using the lines >>>> >>>> DMClone(dm, &dm_mass); >>>> DMSetNumFields(dm_mass, 1); >>>> DMPlexCopyCoordinates(dm, dm_mass); >>>> DMGetDS(dm_mass, &prob_mass); >>>> PetscDSSetJacobian(prob_mass, 0, 0, mass_kernel, NULL, >>>> NULL, NULL); >>>> PetscDSSetDiscretization(prob_mass, 0, (PetscObject) >>>> fe); >>>> DMPlexSNESComputeJacobianFEM(dm_mass, u, M, M, NULL); >>>> DMCreateMatrix(dm_mass, &M); >>>> >>>> leads to errors in DMPlexSNESComputeJacobianFEM (u is a >>>> global vector). >>>> >>>> I don't can understand the necessary commands until >>>> DMPlexSNESComputeJacobianFEM. What does it do and why >>>> is it >>>> necessary? (especially why does the naming involve >>>> SNES?) >>>> >>>> Is there another/easier/better way to create a mass >>>> matrix (the >>>> inner product of the function space and the test >>>> space)? 
>>>> >>>> 1) That example needs updating. First, look at SNES ex12 >>>> which >>>> is up to >>>> date. >>>> >>>> 2) I assume you are using 3.6. If you use the development >>>> version, you >>>> can remove DMPlexCopyCoordinates(). >>>> >>>> 3) You need to create the matrix BEFORE calling the assembly >>>> >>>> 4) Always always always send the entire error messge >>>> >>>> Matt >>>> >>>> Regards >>>> Julian Andrej >>>> >>>> -- >>>> What most experimenters take for granted before they begin >>>> their >>>> experiments is infinitely more interesting than any results >>>> to which >>>> their experiments lead. >>>> -- Norbert Wiener >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which >>>> their experiments lead. >>>> -- Norbert Wiener >>> >>> >>> >>> Links: >>> ------ >>> [1] tel:24.02.2016%2020 >>> [2] tel:24.02.2016%2014 >>> [3] tel:24.02.2016%2013 > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener From domenico_lahaye at yahoo.com Thu Jul 21 09:09:02 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Thu, 21 Jul 2016 14:09:02 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> Message-ID: <517029271.2456281.1469110142974.JavaMail.yahoo@mail.yahoo.com> Thank you for sharing the additional insight. The separate Galerkin coarsening of A and M will be part of the overall algorithm only. I think it is wise to implement in two stages: first a fragile implementation and later a more stable one. Kind wishes, Domenico. From: Lawrence Mitchell To: domenico lahaye Cc: PETSc Users List Sent: Thursday, July 21, 2016 1:09 PM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > On 21 Jul 2016, at 10:55, domenico lahaye wrote: > > Apologies for being not sufficient clear in my previous message. > > I would like to be able to Galerkin coarsen A^h to obtain A^H > and to separately Galerkin coarsen M^h to obtain M^H. > > So, yes, the way in which I currently (partially) understand your > description of the new DMCreateMatrices would do the job. If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels.? Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. Cheers, Lawrence -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eduardojourdan92 at gmail.com Thu Jul 21 11:38:43 2016 From: eduardojourdan92 at gmail.com (Eduardo Jourdan) Date: Thu, 21 Jul 2016 13:38:43 -0300 Subject: [petsc-users] Questions for MatSolve In-Reply-To: References: Message-ID: Thank you for the quick answer. I didn't realize that I could use PC without KSP interface. I also think that it is what I wanted. Nevertheless, as long as I figured out from the source code, PcApply for PCSOR basically do some interface and preparations and then calls MatSOR. I saw that depending on the matrix ('BAIJ, SBAIJ, and AIJ matrices with Inodes') it does SOR smoothing or block SOR smoothing. I think that in my case the seqaij matrix with bs=4 had Inodes with size 4. That is why calling MatSOR with seqaij or calling with seqbaij converted from the seqaij seem to give the same result. However, with the matrix seqaij with bs = 16 I can guess that the rows inside a block dont have the same nonzero pattern, so Inodes size are different from block size. I happened to see the follow note in the MatSOR website page: "Developer Note: We should add block SOR support for AIJ matrices with block size set to great than one and no inodes ". This may be the reason why seqaij and seqbaij are leading to different results with my matrix of bs = 16. I think that answer all may previous questions. I am sorry, I've got confused and wrote MatSolve instead of MatSOR in my previous email, what changes it completely. Best Regards Eduardo 2016-07-20 0:03 GMT-03:00 Matthew Knepley : > On Tue, Jul 19, 2016 at 8:17 PM, Eduardo Jourdan < > eduardojourdan92 at gmail.com> wrote: > >> Hi all, >> >> I would like to perform a specific number (for instance 4 of forward and >> backward sweeps with a seqaij matrix with block size 4, vectors b and x. >> Also, I need to do this same procedure with another matrix seqaij block >> size 16. I would appreciate if someone knows the best way to do it. >> > > It sounds like you want PCSOR and PCApply, not MatSolve. > > Thanks, > > Matt > > >> 1 - I've been trying to use MatSolve. For the bs=4 it seems to work, but >> with the other matrix with bs=16 the residue diverges. When I call >> matConvert to convert the later matrix for a seqbaij with bs=16 the result >> changes and the linear residue is reduced. It is supposed to happen or it >> is more possible that i am doing something wrong? >> >> 2 - MatSolve for seqbaij and seqaij with the same block sizes gives the >> same results in terms of solution (not performace, memory) ? >> >> 3 - Can do I do a specific number of sweeps as told before with the >> KSP/PC interface? >> >> 4 - I saw the manual for the MatSolve and It says that it is for factored >> matrix. Can I use a matrix just after the MatAssembly calls? >> >> Best regards, >> >> Eduardo Jourdan >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 21 13:15:31 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 21 Jul 2016 13:15:31 -0500 Subject: [petsc-users] Questions for MatSolve In-Reply-To: References: Message-ID: <65228947-04A4-46AE-9023-861EDF530D87@mcs.anl.gov> > On Jul 21, 2016, at 11:38 AM, Eduardo Jourdan wrote: > > Thank you for the quick answer. > > I didn't realize that I could use PC without KSP interface. I also think that it is what I wanted. 
Nevertheless, as long as I figured out from the source code, PcApply for PCSOR basically do some interface and preparations and then calls MatSOR. I saw that depending on the matrix ('BAIJ, SBAIJ, and AIJ matrices with Inodes') it does SOR smoothing or block SOR smoothing. > > I think that in my case the seqaij matrix with bs=4 had Inodes with size 4. That is why calling MatSOR with seqaij or calling with seqbaij converted from the seqaij seem to give the same result. > However, with the matrix seqaij with bs = 16 I can guess that the rows inside a block dont have the same nonzero pattern, so Inodes size are different from block size. I happened to see the follow note in the MatSOR website page: "Developer Note: We should add block SOR support for AIJ matrices with block size set to great than one and no inodes ". This may be the reason why seqaij and seqbaij are leading to different results with my matrix of bs = 16. I think that answer all may previous questions. I am sorry, I've got confused and wrote MatSolve instead of MatSOR in my previous email, what changes it completely. Your analysis is correct. In general PCSOR will produce different convergence histories for AIJ and BIJ block size > 1. The BAIJ may convergence (due to the blocking) when the AIJ does not; I suppose the opposite may be possible but seems unlikely. Barry > > Best Regards > > Eduardo > > > > > > 2016-07-20 0:03 GMT-03:00 Matthew Knepley : > On Tue, Jul 19, 2016 at 8:17 PM, Eduardo Jourdan wrote: > Hi all, > > I would like to perform a specific number (for instance 4 of forward and backward sweeps with a seqaij matrix with block size 4, vectors b and x. Also, I need to do this same procedure with another matrix seqaij block size 16. I would appreciate if someone knows the best way to do it. > > It sounds like you want PCSOR and PCApply, not MatSolve. > > Thanks, > > Matt > > 1 - I've been trying to use MatSolve. For the bs=4 it seems to work, but with the other matrix with bs=16 the residue diverges. When I call matConvert to convert the later matrix for a seqbaij with bs=16 the result changes and the linear residue is reduced. It is supposed to happen or it is more possible that i am doing something wrong? > > 2 - MatSolve for seqbaij and seqaij with the same block sizes gives the same results in terms of solution (not performace, memory) ? > > 3 - Can do I do a specific number of sweeps as told before with the KSP/PC interface? > > 4 - I saw the manual for the MatSolve and It says that it is for factored matrix. Can I use a matrix just after the MatAssembly calls? > > Best regards, > > Eduardo Jourdan > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > From overholt at capesim.com Thu Jul 21 16:00:57 2016 From: overholt at capesim.com (Matthew Overholt) Date: Thu, 21 Jul 2016 17:00:57 -0400 Subject: [petsc-users] PC Direct Solution failure Message-ID: <006001d1e392$fbaf6000$f30e2000$@capesim.com> PETSc Users, I am doing a KSPPREONLY solution (of the heat transfer equation using FEA) and comparing several packages like PARDISO and MUMPS, and I am encountering a MatSolve() failure that I am having trouble diagnosing. The matrix inversion fails and I get "nan". The failure only happens for certain input files, and its not (just) related to problem size. By making a slight change to the geometry of the problem I can get it to solve. 
The SuperLu solver is the only one that will give me any error message: -ksp_type preonly -pc_type lu -pc_mat_solver_package superlu -info I get the error message: [0] MatSolve(): MatFactorError 2 Is that a PCFailedReason of PC_FACTOR_NUMERIC_ZEROPIVOT? If so, is there a way to perturb the pivot in some way? In another (non-PETSc) code which uses MKL PARDISO I am able to solve the exact same problem by the same approach without any issues, and that code gives PARDISO a pivot perturbation flag value. Is there a better way to figure out what is happening? I have been running the code in TotalView with extreme memory checks and everything appears to be ok. Thanks, Matt Overholt CapeSym, Inc. --- This email has been checked for viruses by Avast antivirus software. https://www.avast.com/antivirus -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Thu Jul 21 16:26:12 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 21 Jul 2016 16:26:12 -0500 Subject: [petsc-users] PC Direct Solution failure In-Reply-To: <006001d1e392$fbaf6000$f30e2000$@capesim.com> References: <006001d1e392$fbaf6000$f30e2000$@capesim.com> Message-ID: <2E47CB3E-E8D2-4EBB-83D9-05A46DA0A7E4@mcs.anl.gov> > On Jul 21, 2016, at 4:00 PM, Matthew Overholt wrote: > > PETSc Users, > > I am doing a KSPPREONLY solution (of the heat transfer equation using FEA) and comparing several packages like PARDISO and MUMPS, and I am encountering a MatSolve() failure that I am having trouble diagnosing. The matrix inversion fails and I get ?nan?. The failure only happens for certain input files, and its not (just) related to problem size. By making a slight change to the geometry of the problem I can get it to solve. > > The SuperLu solver is the only one that will give me any error message: > -ksp_type preonly -pc_type lu ?pc_mat_solver_package superlu ?info > I get the error message: > [0] MatSolve(): MatFactorError 2 > Is that a PCFailedReason of PC_FACTOR_NUMERIC_ZEROPIVOT? Yes typedef enum {MAT_FACTOR_NOERROR,MAT_FACTOR_STRUCT_ZEROPIVOT,MAT_FACTOR_NUMERIC_ZEROPIVOT,MAT_FACTOR_OUTMEMORY,MAT_FACTOR_OTHER} MatFactorError; There are a few SuperLU options that could potentially alleviate the problem of the zero pivot: Run with -help to see them all or look at the manual page for MATSOLVERSUPERLU + -mat_superlu_equil - Equil (None) . -mat_superlu_colperm - (choose one of) NATURAL MMD_ATA MMD_AT_PLUS_A COLAMD . -mat_superlu_iterrefine - (choose one of) NOREFINE SINGLE DOUBLE EXTRA . -mat_superlu_symmetricmode: - SymmetricMode (None) . -mat_superlu_diagpivotthresh <1> - DiagPivotThresh (None) . -mat_superlu_pivotgrowth - PivotGrowth (None) . -mat_superlu_conditionnumber - ConditionNumber (None) . -mat_superlu_rowperm - (choose one of) NOROWPERM LargeDiag . -mat_superlu_replacetinypivot - ReplaceTinyPivot (None) but they may introduce a different problem for a different matrix. The thing with sparse direct solvers is they can work fine for some matrices but when you change the matrix slightly they don't work, they can also work for some orderings and not for others and if you change the matrix it may be a different ordering is better for that matrix than the ordering for a different matrix. So generally for a particular matrix you might be able to get things to run but I know of no way to bullet proof the direct solver so it will always work when you throw different matrices at it unless you manually change some options. > If so, is there a way to perturb the pivot in some way? 
> > In another (non-PETSc) code which uses MKL PARDISO I am able to solve the exact same problem by the same approach without any issues, and that code gives PARDISO a pivot perturbation flag value. I'm not surprised. Different sparse solvers will work better on some classes of matrices than others but it is not easy to predict in advance which solver will be best. Generally for each type of simulation we do we try out the different sparse direct solvers and then pick the one that seems the most robust for that simulation. Note that since each solver package has its own tuning options this can be a annoying because you need to find the tuning options for each package and see if they help. Barry > > Is there a better way to figure out what is happening? I have been running the code in TotalView with extreme memory checks and everything appears to be ok. > > Thanks, > Matt Overholt > CapeSym, Inc. > > > > Virus-free. www.avast.com From bsmith at mcs.anl.gov Thu Jul 21 18:41:48 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Thu, 21 Jul 2016 18:41:48 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> Message-ID: I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult. I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE } PCMGGalerkinType; Barry > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: >> >> Apologies for being not sufficient clear in my previous message. >> >> I would like to be able to Galerkin coarsen A^h to obtain A^H >> and to separately Galerkin coarsen M^h to obtain M^H. >> >> So, yes, the way in which I currently (partially) understand your >> description of the new DMCreateMatrices would do the job. > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. 
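A minimal usage sketch of the interface proposed above, assuming the PCMGGalerkinType enum lands as written (ksp, A, M, b, x and nlevels are placeholders supplied by the application), would be:

PC pc;

KSPSetOperators(ksp, A, M);               /* A defines the operator, M the preconditioning matrix */
KSPGetPC(ksp, &pc);
PCSetType(pc, PCMG);
PCMGSetLevels(pc, nlevels, NULL);
PCMGSetGalerkin(pc, PC_MG_GALERKIN_BOTH); /* Galerkin-coarsen both: A^H = R A^h P and M^H = R M^h P */
/* interpolation operators come from an attached DM or from PCMGSetInterpolation() on each level */
KSPSetFromOptions(ksp);
KSPSolve(ksp, b, x);

The command-line equivalent mentioned later in this thread is -pc_type mg -pc_mg_galerkin both.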
> > Cheers, > > Lawrence > From domenico_lahaye at yahoo.com Fri Jul 22 03:42:00 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Fri, 22 Jul 2016 08:42:00 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> Message-ID: <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> Dear Barry, Thank you for your suggestion. I will be happy to test drive the new code when available. Kind wishes, Domenico. From: Barry Smith To: Lawrence Mitchell Cc: domenico lahaye ; PETSc Users List Sent: Friday, July 22, 2016 1:41 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult. I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE } PCMGGalerkinType; Barry > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: >> >> Apologies for being not sufficient clear in my previous message. >> >> I would like to be able to Galerkin coarsen A^h to obtain A^H >> and to separately Galerkin coarsen M^h to obtain M^H. >> >> So, yes, the way in which I currently (partially) understand your >> description of the new DMCreateMatrices would do the job. > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. > > Cheers, > > Lawrence > -------------- next part -------------- An HTML attachment was scrubbed... URL:
From zocca.marco at gmail.com Sat Jul 23 04:25:47 2016 From: zocca.marco at gmail.com (Marco Zocca) Date: Sat, 23 Jul 2016 11:25:47 +0200 Subject: [petsc-users] [RFC] Docs: TeX -> HTML Message-ID: Dear all, following the discussion at PETSc'16, I have tried to render the TeX-based manual into HTML with latex2html [1] and pandoc [2] . Neither attempt was successful, because of the presence of certain external TeX packages used for rendering various custom aspects of the manual. There is no 1:1 way of converting such a document. However there are a number of templates for rendering static websites that use LaTeX math and verbatim source code (e.g. readthedocs [3] for manual-type documents, which also supports MathJax [4] and re-renders at every repository push). At any rate, the conversion requires copying blocks of text and code to the web-based version, i.e. removing all the LaTeX markup, therefore effectively committing to maintaining 2 versions of the manual up to date and in sync with each other. Before committing to any approach, I would like your input on this: 1) Do you have any preference for web rendering/site hosting solution? 2) Are you OK with the idea of essentially forking the manual into PDF output and web output ? It is not huge work (an afternoon of tweaking initially and a couple minutes at every new release) but we should be sure about the approach in the first place. Any and all feedback is welcome; Thank you and kind regards, Marco [1] https://www.ctan.org/tex-archive/support/latex2html/ [2] http://pandoc.org/ [3] https://readthedocs.org/ [4] http://mathjax.readthedocs.io/en/latest/tex.html
From wgropp at illinois.edu Sat Jul 23 08:42:53 2016 From: wgropp at illinois.edu (William Gropp) Date: Sat, 23 Jul 2016 08:42:53 -0500 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: Message-ID: Another option is to try tohtml, which is what I use for the MPI Standard.
It has a way to specify how to handle some TeX commands (it isn?t a full implementation of TeX, so some more sophisticated uses of TeX are beyond it). Bill William Gropp Director, Parallel Computing Institute Thomas M. Siebel Chair in Computer Science Chief Scientist, NCSA University of Illinois Urbana-Champaign On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: > Dear all, > > following the discussion at PETSc'16, I have tried to render the > TeX-based manual into HTML with latex2html [1] and pandoc [2] . > > Neither attempt was successful, because of the presence of certain > external TeX packages used for rendering various custom aspects of the > manual. > > There is no 1:1 way of converting such a document. However there are a > number of templates for rendering static websites that use LaTeX math > and verbatim source code (e.g. readthedocs [3] for manual-type > documents, which also supports MathJax [4] and re-renders at every > repository push). > > At any rate, the conversion requires copying blocks of text and code > to the web-based version, i.e. removing all the LaTeX markup, > therefore effectively committing to maintaining 2 versions of the > manual up to date and in sync with each other. > > > Before committing to any approach, I would like your input on this: > > 1) Do you have any preference for web rendering/site hosting solution? > > 2) Are you OK with the idea of essentially forking the manual into PDF > output and web output ? It is not huge work (an afternoon of tweaking > initially and a couple minutes at every new release) but we should be > sure about the approach in the first place. > > Any and all feedback is welcome; > > Thank you and kind regards, > Marco > > > [1] https://www.ctan.org/tex-archive/support/latex2html/ > [2] http://pandoc.org/ > [3] https://readthedocs.org/ > [4] http://mathjax.readthedocs.io/en/latest/tex.html From bhatiamanav at gmail.com Sat Jul 23 11:50:12 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Sat, 23 Jul 2016 11:50:12 -0500 Subject: [petsc-users] using DM constructs Message-ID: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> Hi, I am new to the DM constructs. I am curious if there is a compelling reason to move from handling IS sets to DM data structures. My applications are built on top of libMesh. They used IS sets for a long time, and in recent years I have seen DM constructs in the library. However, I do not know why this is beneficial or necessary. The Petsc manual discusses DMDA for structured mesh, and I see reference to DMForest in the code (for unstructured mesh?) which is not discussed in the manual. Is there a document that might provide the necessary background for DM and how best to derive from it, like in the libMesh source? Any guidance would be appreciated. Regards, Manav From bsmith at mcs.anl.gov Sat Jul 23 12:44:21 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 12:44:21 -0500 Subject: [petsc-users] using DM constructs In-Reply-To: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> References: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> Message-ID: <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> Manav, Each DM classes has two distinct interfaces: One interface that is common to all DM which "speaks linear algebra (algebraic solvers)", for example DMCreateGlobalVector() One interface that is specific to a particular DM (for example DMDA, or DMPlex or DMNetwork) it speaks in the language of the mesh/discretization model of the DM. 
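As a small illustration of the two interfaces, a sketch using a 2d DMDA (any DM implementation could stand in; the grid sizes are arbitrary placeholders) is:

DM       da;
Vec      x;
Mat      A;
PetscInt xs, ys, xm, ym;

DMDACreate2d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE, DMDA_STENCIL_STAR,
             64, 64, PETSC_DECIDE, PETSC_DECIDE, 1, 1, NULL, NULL, &da);

/* the common DM interface, the part the algebraic solvers talk to */
DMCreateGlobalVector(da, &x);
DMCreateMatrix(da, &A);

/* the DMDA-specific interface, which speaks the language of structured grids */
DMDAGetCorners(da, &xs, &ys, NULL, &xm, &ym, NULL);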
So for example DMDA routines which are for structured grids speak in the language of structured grids and so you have things like DMDAGetCorners() which tells you the corners of the "box" of the structured grid you own. DMPlex speaks in a particular language of unstructured grids, DMNetwork speaks in the language of computations on networks (graphs) such as power grids where you have vertices and edges connecting vertices). DMForest speaks the languages of quad-tree and oct-tree grids. The DM is PETSc's approach for communicating between mesh/discretization data and algebraic solvers. It is suppose to handle all the busywork of coordinating the interactions of the mesh/discretization data and algebraic solvers for the application developer so they don't need to do it themselves. For example with geometric multigrid the DMXXX object can fill up all the vectors and matrices that are needed for each level without requiring the user to loop over the levels and put the vectors and matrices themselves into the PCMG data structures. IS are lower level basic data structures, often used by DMs. So one does not replace the use of IS with DM but one collects all the mesh/discretization interactions into a DMXXX and implements the DM operations (for example DMCreateGlobalVector()) using the data from DMXXX object. In some sense libMesh is a DM for unstructured meshes with finite elements but it was written before we came up with the concept of DMs and so naturally doesn't use the DM interfaces. So one would not write libMesh using DMDA or DMPlex or something rather you would write DMlibMesh or write a new DMlibMesh2 by refactoring the libMesh interfaces to match the DM paradigm. So if you are using libMesh and it satisfies your needs you should definitely not just switch to some DMXXX unless you have a good reason. Each DMXXX is for a particular class of problems/algorithms and you pick the DMXXX to use based on what you are doing. So use DMForest if you wish to use oct-trees, etc. Barry > On Jul 23, 2016, at 11:50 AM, Manav Bhatia wrote: > > Hi, > > I am new to the DM constructs. I am curious if there is a compelling reason to move from handling IS sets to DM data structures. > > My applications are built on top of libMesh. They used IS sets for a long time, and in recent years I have seen DM constructs in the library. However, I do not know why this is beneficial or necessary. The Petsc manual discusses DMDA for structured mesh, and I see reference to DMForest in the code (for unstructured mesh?) which is not discussed in the manual. > > Is there a document that might provide the necessary background for DM and how best to derive from it, like in the libMesh source? > > Any guidance would be appreciated. > > Regards, > Manav From patrick.sanan at gmail.com Sat Jul 23 13:16:14 2016 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Sat, 23 Jul 2016 14:16:14 -0400 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: Message-ID: I have slowly been doing some work to clean up the manual a little bit, mainly just fixing the formatting where it needs attention, but also updating the content where it is obviously out of date, so I'm interested in working on resolving this. The latex version is of course nice in that it can look pretty with latex tools, but the advantage of having html documentation which is more friendly to search engines is undeniable. Which latex packages are giving trouble? Maybe we can figure out a way to sufficiently reduce the dependencies. 
On Sat, Jul 23, 2016 at 5:25 AM, Marco Zocca wrote: > Dear all, > > following the discussion at PETSc'16, I have tried to render the > TeX-based manual into HTML with latex2html [1] and pandoc [2] . > > Neither attempt was successful, because of the presence of certain > external TeX packages used for rendering various custom aspects of the > manual. > > There is no 1:1 way of converting such a document. However there are a > number of templates for rendering static websites that use LaTeX math > and verbatim source code (e.g. readthedocs [3] for manual-type > documents, which also supports MathJax [4] and re-renders at every > repository push). > > At any rate, the conversion requires copying blocks of text and code > to the web-based version, i.e. removing all the LaTeX markup, > therefore effectively committing to maintaining 2 versions of the > manual up to date and in sync with each other. > > > Before committing to any approach, I would like your input on this: > > 1) Do you have any preference for web rendering/site hosting solution? > > 2) Are you OK with the idea of essentially forking the manual into PDF > output and web output ? It is not huge work (an afternoon of tweaking > initially and a couple minutes at every new release) but we should be > sure about the approach in the first place. > > Any and all feedback is welcome; > > Thank you and kind regards, > Marco > > > [1] https://www.ctan.org/tex-archive/support/latex2html/ > [2] http://pandoc.org/ > [3] https://readthedocs.org/ > [4] http://mathjax.readthedocs.io/en/latest/tex.html From bsmith at mcs.anl.gov Sat Jul 23 13:23:07 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 13:23:07 -0500 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: Message-ID: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> Marco, Every rending I've seen of nontrivial latex documents to HTML looks dang ugly in HTML and is extra work to maintain (despite the poor quality). We've tried a couple of times with PETSc to keep an HTML version going and gave up both times. I don't like the idea of having two copies of the same thing, we'd never keep them in sync nor do I like the idea of ugly HTML pages. The one drawback of just having a PDF manual IMHO is that we cannot currently link directly to bookmarks inside the users manual from, say, a manual page html file. (Bookmarks inside the manual.pdf to other places inside the manual.pdf do work fine). The solution seems to be to use Adobe #nameddest=destination instead of bookmarks. These can be added in latex with \hypertarget{} for example \hypertarget{ch_performance} and then in the browser http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#nameddest=ch_performance will jump to the correct place. This works with the current chrome but does not work with the current Apple Safari (arg) and if you google nameddest doesn't work you find that often browsers seem to have this broken. If it wasn't so broken I would have (automated) adding all the hypertargets and the manual and augmented all the manual pages to have links to them. Still tempted but it seems they won't work except with Chrome (maybe firefox if properly configured). The problem of badly supported #nameddest goes back 10 years Barry > On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: > > Dear all, > > following the discussion at PETSc'16, I have tried to render the > TeX-based manual into HTML with latex2html [1] and pandoc [2] . 
> > Neither attempt was successful, because of the presence of certain > external TeX packages used for rendering various custom aspects of the > manual. > > There is no 1:1 way of converting such a document. However there are a > number of templates for rendering static websites that use LaTeX math > and verbatim source code (e.g. readthedocs [3] for manual-type > documents, which also supports MathJax [4] and re-renders at every > repository push). > > At any rate, the conversion requires copying blocks of text and code > to the web-based version, i.e. removing all the LaTeX markup, > therefore effectively committing to maintaining 2 versions of the > manual up to date and in sync with each other. > > > Before committing to any approach, I would like your input on this: > > 1) Do you have any preference for web rendering/site hosting solution? > > 2) Are you OK with the idea of essentially forking the manual into PDF > output and web output ? It is not huge work (an afternoon of tweaking > initially and a couple minutes at every new release) but we should be > sure about the approach in the first place. > > Any and all feedback is welcome; > > Thank you and kind regards, > Marco > > > [1] https://www.ctan.org/tex-archive/support/latex2html/ > [2] http://pandoc.org/ > [3] https://readthedocs.org/ > [4] http://mathjax.readthedocs.io/en/latest/tex.html From juan at tf.uni-kiel.de Sat Jul 23 13:40:02 2016 From: juan at tf.uni-kiel.de (Julian Andrej) Date: Sat, 23 Jul 2016 20:40:02 +0200 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> References: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> Message-ID: Small suggestion (it also came up at the Meeting) What is the opinion on a "main" documentation in markdown/restructured text or something like that? The conversion from one of these formats into pdf or any other format like html is handled by a variety of tools pretty well. On Sat, Jul 23, 2016 at 8:23 PM, Barry Smith wrote: > > Marco, > > Every rending I've seen of nontrivial latex documents to HTML looks dang ugly in HTML and is extra work to maintain (despite the poor quality). We've tried a couple of times with PETSc to keep an HTML version going and gave up both times. > > I don't like the idea of having two copies of the same thing, we'd never keep them in sync nor do I like the idea of ugly HTML pages. > > The one drawback of just having a PDF manual IMHO is that we cannot currently link directly to bookmarks inside the users manual from, say, a manual page html file. (Bookmarks inside the manual.pdf to other places inside the manual.pdf do work fine). The solution seems to be to use Adobe #nameddest=destination instead of bookmarks. These can be added in latex with \hypertarget{} for example \hypertarget{ch_performance} and then in the browser http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#nameddest=ch_performance will jump to the correct place. This works with the current chrome but does not work with the current Apple Safari (arg) and if you google nameddest doesn't work you find that often browsers seem to have this broken. If it wasn't so broken I would have (automated) adding all the hypertargets and the manual and augmented all the manual pages to have links to them. Still tempted but it seems they won't work except with Chrome (maybe firefox if properly configured). 
The problem of badly supported #nameddest goes back 10 years > > > Barry > > > > > > >> On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: >> >> Dear all, >> >> following the discussion at PETSc'16, I have tried to render the >> TeX-based manual into HTML with latex2html [1] and pandoc [2] . >> >> Neither attempt was successful, because of the presence of certain >> external TeX packages used for rendering various custom aspects of the >> manual. >> >> There is no 1:1 way of converting such a document. However there are a >> number of templates for rendering static websites that use LaTeX math >> and verbatim source code (e.g. readthedocs [3] for manual-type >> documents, which also supports MathJax [4] and re-renders at every >> repository push). >> >> At any rate, the conversion requires copying blocks of text and code >> to the web-based version, i.e. removing all the LaTeX markup, >> therefore effectively committing to maintaining 2 versions of the >> manual up to date and in sync with each other. >> >> >> Before committing to any approach, I would like your input on this: >> >> 1) Do you have any preference for web rendering/site hosting solution? >> >> 2) Are you OK with the idea of essentially forking the manual into PDF >> output and web output ? It is not huge work (an afternoon of tweaking >> initially and a couple minutes at every new release) but we should be >> sure about the approach in the first place. >> >> Any and all feedback is welcome; >> >> Thank you and kind regards, >> Marco >> >> >> [1] https://www.ctan.org/tex-archive/support/latex2html/ >> [2] http://pandoc.org/ >> [3] https://readthedocs.org/ >> [4] http://mathjax.readthedocs.io/en/latest/tex.html > From knepley at gmail.com Sat Jul 23 14:26:08 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 23 Jul 2016 21:26:08 +0200 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> Message-ID: On Sat, Jul 23, 2016 at 8:40 PM, Julian Andrej wrote: > Small suggestion (it also came up at the Meeting) > > What is the opinion on a "main" documentation in markdown/restructured > text or something like that? The conversion from one of these formats > into pdf or any other format like html is handled by a variety of > tools pretty well. 1) I am really opposed to two copies of the source. This never works out. 2) My reservation concerning Markdown is that it is so constricted. I am used to the freedom of TeX. I agree that this is not a definitive argument. Matt > On Sat, Jul 23, 2016 at 8:23 PM, Barry Smith wrote: > > > > Marco, > > > > Every rending I've seen of nontrivial latex documents to HTML looks > dang ugly in HTML and is extra work to maintain (despite the poor quality). > We've tried a couple of times with PETSc to keep an HTML version going and > gave up both times. > > > > I don't like the idea of having two copies of the same thing, we'd > never keep them in sync nor do I like the idea of ugly HTML pages. > > > > The one drawback of just having a PDF manual IMHO is that we cannot > currently link directly to bookmarks inside the users manual from, say, a > manual page html file. (Bookmarks inside the manual.pdf to other places > inside the manual.pdf do work fine). The solution seems to be to use Adobe > #nameddest=destination instead of bookmarks. 
These can be added in latex > with \hypertarget{} for example \hypertarget{ch_performance} and then in > the browser > http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#nameddest=ch_performance > will jump to the correct place. This works with the current chrome but does > not work with the current Apple Safari (arg) and if you google nameddest > doesn't work you find that often browsers seem to have this broken. If it > wasn't so broken I would have (automated) adding all the hypertargets and > the manual and augmented all the manual pages to have links to them. Still > tempted but it seems they won't work except with Chrome (maybe firefox if > properly configured). The problem of badly supported #nameddest goes back > 10 years > > > > > > Barry > > > > > > > > > > > > > >> On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: > >> > >> Dear all, > >> > >> following the discussion at PETSc'16, I have tried to render the > >> TeX-based manual into HTML with latex2html [1] and pandoc [2] . > >> > >> Neither attempt was successful, because of the presence of certain > >> external TeX packages used for rendering various custom aspects of the > >> manual. > >> > >> There is no 1:1 way of converting such a document. However there are a > >> number of templates for rendering static websites that use LaTeX math > >> and verbatim source code (e.g. readthedocs [3] for manual-type > >> documents, which also supports MathJax [4] and re-renders at every > >> repository push). > >> > >> At any rate, the conversion requires copying blocks of text and code > >> to the web-based version, i.e. removing all the LaTeX markup, > >> therefore effectively committing to maintaining 2 versions of the > >> manual up to date and in sync with each other. > >> > >> > >> Before committing to any approach, I would like your input on this: > >> > >> 1) Do you have any preference for web rendering/site hosting solution? > >> > >> 2) Are you OK with the idea of essentially forking the manual into PDF > >> output and web output ? It is not huge work (an afternoon of tweaking > >> initially and a couple minutes at every new release) but we should be > >> sure about the approach in the first place. > >> > >> Any and all feedback is welcome; > >> > >> Thank you and kind regards, > >> Marco > >> > >> > >> [1] https://www.ctan.org/tex-archive/support/latex2html/ > >> [2] http://pandoc.org/ > >> [3] https://readthedocs.org/ > >> [4] http://mathjax.readthedocs.io/en/latest/tex.html > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bhatiamanav at gmail.com Sat Jul 23 14:29:58 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Sat, 23 Jul 2016 14:29:58 -0500 Subject: [petsc-users] using DM constructs In-Reply-To: <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> References: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> Message-ID: Thanks, Barry. This gives me a good perspective. Are there specific functions that need to be implemented/provided by a DM derived object? What would be a good resource to learn about this? 
Regards, Manav > On Jul 23, 2016, at 12:44 PM, Barry Smith wrote: > > > Manav, > > Each DM classes has two distinct interfaces: > > One interface that is common to all DM which "speaks linear algebra (algebraic solvers)", for example DMCreateGlobalVector() > > One interface that is specific to a particular DM (for example DMDA, or DMPlex or DMNetwork) it speaks in the language of the mesh/discretization model of the DM. So for example DMDA routines which are for structured grids speak in the language of structured grids and so you have things like DMDAGetCorners() which tells you the corners of the "box" of the structured grid you own. DMPlex speaks in a particular language of unstructured grids, DMNetwork speaks in the language of computations on networks (graphs) such as power grids where you have vertices and edges connecting vertices). DMForest speaks the languages of quad-tree and oct-tree grids. > > The DM is PETSc's approach for communicating between mesh/discretization data and algebraic solvers. It is suppose to handle all the busywork of coordinating the interactions of the mesh/discretization data and algebraic solvers for the application developer so they don't need to do it themselves. For example with geometric multigrid the DMXXX object can fill up all the vectors and matrices that are needed for each level without requiring the user to loop over the levels and put the vectors and matrices themselves into the PCMG data structures. > > IS are lower level basic data structures, often used by DMs. So one does not replace the use of IS with DM but one collects all the mesh/discretization interactions into a DMXXX and implements the DM operations (for example DMCreateGlobalVector()) using the data from DMXXX object. > > In some sense libMesh is a DM for unstructured meshes with finite elements but it was written before we came up with the concept of DMs and so naturally doesn't use the DM interfaces. So one would not write libMesh using DMDA or DMPlex or something rather you would write DMlibMesh or write a new DMlibMesh2 by refactoring the libMesh interfaces to match the DM paradigm. > > So if you are using libMesh and it satisfies your needs you should definitely not just switch to some DMXXX unless you have a good reason. Each DMXXX is for a particular class of problems/algorithms and you pick the DMXXX to use based on what you are doing. So use DMForest if you wish to use oct-trees, etc. > > Barry > > > >> On Jul 23, 2016, at 11:50 AM, Manav Bhatia wrote: >> >> Hi, >> >> I am new to the DM constructs. I am curious if there is a compelling reason to move from handling IS sets to DM data structures. >> >> My applications are built on top of libMesh. They used IS sets for a long time, and in recent years I have seen DM constructs in the library. However, I do not know why this is beneficial or necessary. The Petsc manual discusses DMDA for structured mesh, and I see reference to DMForest in the code (for unstructured mesh?) which is not discussed in the manual. >> >> Is there a document that might provide the necessary background for DM and how best to derive from it, like in the libMesh source? >> >> Any guidance would be appreciated. 
>> >> Regards, >> Manav > From knepley at gmail.com Sat Jul 23 14:30:44 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 23 Jul 2016 21:30:44 +0200 Subject: [petsc-users] using DM constructs In-Reply-To: References: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> Message-ID: On Sat, Jul 23, 2016 at 9:29 PM, Manav Bhatia wrote: > Thanks, Barry. > > This gives me a good perspective. > > Are there specific functions that need to be implemented/provided by a DM > derived object? What would be a good resource to learn about this? > We talk a lot about this in the online tutorials. Matt > Regards, > Manav > > > > On Jul 23, 2016, at 12:44 PM, Barry Smith wrote: > > > > > > Manav, > > > > Each DM classes has two distinct interfaces: > > > > One interface that is common to all DM which "speaks linear algebra > (algebraic solvers)", for example DMCreateGlobalVector() > > > > One interface that is specific to a particular DM (for example DMDA, or > DMPlex or DMNetwork) it speaks in the language of the mesh/discretization > model of the DM. So for example DMDA routines which are for structured > grids speak in the language of structured grids and so you have things like > DMDAGetCorners() which tells you the corners of the "box" of the structured > grid you own. DMPlex speaks in a particular language of unstructured grids, > DMNetwork speaks in the language of computations on networks (graphs) such > as power grids where you have vertices and edges connecting vertices). > DMForest speaks the languages of quad-tree and oct-tree grids. > > > > The DM is PETSc's approach for communicating between > mesh/discretization data and algebraic solvers. It is suppose to handle all > the busywork of coordinating the interactions of the mesh/discretization > data and algebraic solvers for the application developer so they don't need > to do it themselves. For example with geometric multigrid the DMXXX object > can fill up all the vectors and matrices that are needed for each level > without requiring the user to loop over the levels and put the vectors and > matrices themselves into the PCMG data structures. > > > > IS are lower level basic data structures, often used by DMs. So one > does not replace the use of IS with DM but one collects all the > mesh/discretization interactions into a DMXXX and implements the DM > operations (for example DMCreateGlobalVector()) using the data from DMXXX > object. > > > > In some sense libMesh is a DM for unstructured meshes with finite > elements but it was written before we came up with the concept of DMs and > so naturally doesn't use the DM interfaces. So one would not write libMesh > using DMDA or DMPlex or something rather you would write DMlibMesh or write > a new DMlibMesh2 by refactoring the libMesh interfaces to match the DM > paradigm. > > > > So if you are using libMesh and it satisfies your needs you should > definitely not just switch to some DMXXX unless you have a good reason. > Each DMXXX is for a particular class of problems/algorithms and you pick > the DMXXX to use based on what you are doing. So use DMForest if you wish > to use oct-trees, etc. > > > > Barry > > > > > > > >> On Jul 23, 2016, at 11:50 AM, Manav Bhatia > wrote: > >> > >> Hi, > >> > >> I am new to the DM constructs. I am curious if there is a compelling > reason to move from handling IS sets to DM data structures. > >> > >> My applications are built on top of libMesh. 
They used IS sets for a > long time, and in recent years I have seen DM constructs in the library. > However, I do not know why this is beneficial or necessary. The Petsc > manual discusses DMDA for structured mesh, and I see reference to DMForest > in the code (for unstructured mesh?) which is not discussed in the manual. > >> > >> Is there a document that might provide the necessary background for DM > and how best to derive from it, like in the libMesh source? > >> > >> Any guidance would be appreciated. > >> > >> Regards, > >> Manav > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jul 23 14:46:19 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 14:46:19 -0500 Subject: [petsc-users] using DM constructs In-Reply-To: References: <3DB67049-F477-4B63-A185-60095FA14F73@gmail.com> <1EAFD472-171F-4273-A09E-525D73E8117B@mcs.anl.gov> Message-ID: Unfortunately they are muddled up in the include files and source. That is functions that you need to implement are mixed in with functions in the base class. I cut and pasted below the basic ones from petscdm.h and removed the ones you do not need to implement. PETSC_EXTERN PetscErrorCode DMView(DM,PetscViewer); PETSC_EXTERN PetscErrorCode DMLoad(DM,PetscViewer); /* very useful but doesn't need to be implemented PETSC_EXTERN PetscErrorCode DMDestroy(DM*); PETSC_EXTERN PetscErrorCode DMCreateGlobalVector(DM,Vec*); PETSC_EXTERN PetscErrorCode DMCreateLocalVector(DM,Vec*); PETSC_EXTERN PetscErrorCode DMGetLocalToGlobalMapping(DM,ISLocalToGlobalMapping*); /* isn't always needed PETSC_EXTERN PetscErrorCode DMGetBlockSize(DM,PetscInt*); /* often doesn't mean anything, like for mixed methods PETSC_EXTERN PetscErrorCode DMCreateColoring(DM,ISColoringType,ISColoring*); /* not needed by very useful for automatically computing Jacobians via differencing PETSC_EXTERN PetscErrorCode DMCreateMatrix(DM,Mat*); PETSC_EXTERN PetscErrorCode DMSetMatrixPreallocateOnly(DM,PetscBool); PETSC_EXTERN PetscErrorCode DMCreateInterpolation(DM,DM,Mat*,Vec*); /* following are needed if you wish to use geometric multigrid; they don't necessarily make sense for all DM implementations. PETSC_EXTERN PetscErrorCode DMCreateRestriction(DM,DM,Mat*); PETSC_EXTERN PetscErrorCode DMRefine(DM,MPI_Comm,DM*); PETSC_EXTERN PetscErrorCode DMCoarsen(DM,MPI_Comm,DM*); PETSC_EXTERN PetscErrorCode DMRefineHierarchy(DM,PetscInt,DM[]); PETSC_EXTERN PetscErrorCode DMCoarsenHierarchy(DM,PetscInt,DM[]); PETSC_EXTERN PetscErrorCode DMSetFromOptions(DM); > On Jul 23, 2016, at 2:29 PM, Manav Bhatia wrote: > > Thanks, Barry. > > This gives me a good perspective. > > Are there specific functions that need to be implemented/provided by a DM derived object? What would be a good resource to learn about this? > > Regards, > Manav > > >> On Jul 23, 2016, at 12:44 PM, Barry Smith wrote: >> >> >> Manav, >> >> Each DM classes has two distinct interfaces: >> >> One interface that is common to all DM which "speaks linear algebra (algebraic solvers)", for example DMCreateGlobalVector() >> >> One interface that is specific to a particular DM (for example DMDA, or DMPlex or DMNetwork) it speaks in the language of the mesh/discretization model of the DM. 
So for example DMDA routines which are for structured grids speak in the language of structured grids and so you have things like DMDAGetCorners() which tells you the corners of the "box" of the structured grid you own. DMPlex speaks in a particular language of unstructured grids, DMNetwork speaks in the language of computations on networks (graphs) such as power grids where you have vertices and edges connecting vertices). DMForest speaks the languages of quad-tree and oct-tree grids. >> >> The DM is PETSc's approach for communicating between mesh/discretization data and algebraic solvers. It is suppose to handle all the busywork of coordinating the interactions of the mesh/discretization data and algebraic solvers for the application developer so they don't need to do it themselves. For example with geometric multigrid the DMXXX object can fill up all the vectors and matrices that are needed for each level without requiring the user to loop over the levels and put the vectors and matrices themselves into the PCMG data structures. >> >> IS are lower level basic data structures, often used by DMs. So one does not replace the use of IS with DM but one collects all the mesh/discretization interactions into a DMXXX and implements the DM operations (for example DMCreateGlobalVector()) using the data from DMXXX object. >> >> In some sense libMesh is a DM for unstructured meshes with finite elements but it was written before we came up with the concept of DMs and so naturally doesn't use the DM interfaces. So one would not write libMesh using DMDA or DMPlex or something rather you would write DMlibMesh or write a new DMlibMesh2 by refactoring the libMesh interfaces to match the DM paradigm. >> >> So if you are using libMesh and it satisfies your needs you should definitely not just switch to some DMXXX unless you have a good reason. Each DMXXX is for a particular class of problems/algorithms and you pick the DMXXX to use based on what you are doing. So use DMForest if you wish to use oct-trees, etc. >> >> Barry >> >> >> >>> On Jul 23, 2016, at 11:50 AM, Manav Bhatia wrote: >>> >>> Hi, >>> >>> I am new to the DM constructs. I am curious if there is a compelling reason to move from handling IS sets to DM data structures. >>> >>> My applications are built on top of libMesh. They used IS sets for a long time, and in recent years I have seen DM constructs in the library. However, I do not know why this is beneficial or necessary. The Petsc manual discusses DMDA for structured mesh, and I see reference to DMForest in the code (for unstructured mesh?) which is not discussed in the manual. >>> >>> Is there a document that might provide the necessary background for DM and how best to derive from it, like in the libMesh source? >>> >>> Any guidance would be appreciated. >>> >>> Regards, >>> Manav >> > From bsmith at mcs.anl.gov Sat Jul 23 15:06:36 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 15:06:36 -0500 Subject: [petsc-users] [RFC] Docs: TeX -> HTML In-Reply-To: References: <28E6C2EC-545C-4811-9873-02CD791D0719@mcs.anl.gov> Message-ID: <7746A6E1-FE80-4091-9F0B-29AAE4CDECDA@mcs.anl.gov> > On Jul 23, 2016, at 1:40 PM, Julian Andrej wrote: > > Small suggestion (it also came up at the Meeting) > > What is the opinion on a "main" documentation in markdown/restructured > text or something like that? The conversion from one of these formats > into pdf or any other format like html is handled by a variety of > tools pretty well. This might be possible. 
The drawback to that is markdown and friends are really limited in the types of formatting one can do. I like better the idea of generating nice html from latex if that is possible. Can you list what "certain external TeX packages used for rendering various custom aspects of the manual" chock pandoc? Maybe they can be redefined or seded out of the .tex file before passing to pandoc? Also if you can process part of the manual can you point to how it looks with pandoc so we can evaluate if it is too "ugly"? Thanks Barry > > On Sat, Jul 23, 2016 at 8:23 PM, Barry Smith wrote: >> >> Marco, >> >> Every rending I've seen of nontrivial latex documents to HTML looks dang ugly in HTML and is extra work to maintain (despite the poor quality). We've tried a couple of times with PETSc to keep an HTML version going and gave up both times. >> >> I don't like the idea of having two copies of the same thing, we'd never keep them in sync nor do I like the idea of ugly HTML pages. >> >> The one drawback of just having a PDF manual IMHO is that we cannot currently link directly to bookmarks inside the users manual from, say, a manual page html file. (Bookmarks inside the manual.pdf to other places inside the manual.pdf do work fine). The solution seems to be to use Adobe #nameddest=destination instead of bookmarks. These can be added in latex with \hypertarget{} for example \hypertarget{ch_performance} and then in the browser http://www.mcs.anl.gov/petsc/petsc-current/docs/manual.pdf#nameddest=ch_performance will jump to the correct place. This works with the current chrome but does not work with the current Apple Safari (arg) and if you google nameddest doesn't work you find that often browsers seem to have this broken. If it wasn't so broken I would have (automated) adding all the hypertargets and the manual and augmented all the manual pages to have links to them. Still tempted but it seems they won't work except with Chrome (maybe firefox if properly configured). The problem of badly supported #nameddest goes back 10 years >> >> >> Barry >> >> >> >> >> >> >>> On Jul 23, 2016, at 4:25 AM, Marco Zocca wrote: >>> >>> Dear all, >>> >>> following the discussion at PETSc'16, I have tried to render the >>> TeX-based manual into HTML with latex2html [1] and pandoc [2] . >>> >>> Neither attempt was successful, because of the presence of certain >>> external TeX packages used for rendering various custom aspects of the >>> manual. >>> >>> There is no 1:1 way of converting such a document. However there are a >>> number of templates for rendering static websites that use LaTeX math >>> and verbatim source code (e.g. readthedocs [3] for manual-type >>> documents, which also supports MathJax [4] and re-renders at every >>> repository push). >>> >>> At any rate, the conversion requires copying blocks of text and code >>> to the web-based version, i.e. removing all the LaTeX markup, >>> therefore effectively committing to maintaining 2 versions of the >>> manual up to date and in sync with each other. >>> >>> >>> Before committing to any approach, I would like your input on this: >>> >>> 1) Do you have any preference for web rendering/site hosting solution? >>> >>> 2) Are you OK with the idea of essentially forking the manual into PDF >>> output and web output ? It is not huge work (an afternoon of tweaking >>> initially and a couple minutes at every new release) but we should be >>> sure about the approach in the first place. 
>>> >>> Any and all feedback is welcome; >>> >>> Thank you and kind regards, >>> Marco >>> >>> >>> [1] https://www.ctan.org/tex-archive/support/latex2html/ >>> [2] http://pandoc.org/ >>> [3] https://readthedocs.org/ >>> [4] http://mathjax.readthedocs.io/en/latest/tex.html >> From aks084000 at utdallas.edu Sat Jul 23 18:21:57 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Sat, 23 Jul 2016 23:21:57 +0000 Subject: [petsc-users] Multigrid with PML In-Reply-To: References: <2d1003a65bf24fdf9b30adea866d2067@utdallas.edu> <02E40C8A-322D-4784-8418-22EE5F0999C7@mcs.anl.gov> <6B852635-27EC-45D7-8C09-8F3306DA6DEE@utdallas.edu> Message-ID: <37055B11-8B43-4C7F-9E65-47B8C1CB31D7@utdallas.edu> Matt, Barry, Thank you for your help! Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jul 23 19:52:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 23 Jul 2016 19:52:14 -0500 Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> Message-ID: <8CB9F29A-77CA-46D2-9C3C-4E7CD494D2D0@mcs.anl.gov> Took a little more time than I expected but the branch barry/extend-pcmg-galerkin now supports PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool This allows computing either mat, or pmat or both via the Galerkin process so you should be able to provide A and M with KSPSetOperators() and then run with -pc_mg_galerkin both to get both generated on the coarse meshes via the Galekin process. Note that if you use the additional option -pc_use_amat false it will use only the M for both mat and pmat in the multigrid process (while A is only used for the outer Krylov solver definition of the operator.) For some problems this is actually a better approach. Please let me know if you have any difficulties with it. Barry > On Jul 22, 2016, at 3:42 AM, domenico lahaye wrote: > > Dear Barry, > > Thank you for your suggestion. > > I will be happy to test drive the new code when available. > > Kind wishes, Domenico. > > > > From: Barry Smith > To: Lawrence Mitchell > Cc: domenico lahaye ; PETSc Users List > Sent: Friday, July 22, 2016 1:41 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > > I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult. 
I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where > > typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE > } PCMGGalerkinType; > > Barry > > > > > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > > > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: > >> > >> Apologies for being not sufficient clear in my previous message. > >> > >> I would like to be able to Galerkin coarsen A^h to obtain A^H > >> and to separately Galerkin coarsen M^h to obtain M^H. > >> > >> So, yes, the way in which I currently (partially) understand your > >> description of the new DMCreateMatrices would do the job. > > > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels. Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. > > > > Cheers, > > > > Lawrence > > > > From mhassan at miners.utep.edu Sun Jul 24 12:50:30 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Sun, 24 Jul 2016 17:50:30 +0000 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working Message-ID: Hi, I am solving a generalized eigenvalue problem using spectrum slicing. I am using this example (http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html) as it is. Part of the code is: ierr =EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); But I am getting the following: [0]PETSC ERROR: Mismatch between number of values found and information from inertia, consider using EPSKrylovSchurSetDetectZeros() [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. It seems the input PETSC_TRUE is not working for EPSKrylovSchurSetDetectZeros(). Any idea? M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jul 24 14:27:04 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 24 Jul 2016 21:27:04 +0200 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: References: Message-ID: > El 24 jul 2016, a las 19:50, Hassan Md Mahmudulla escribi?: > > Hi, > I am solving a generalized eigenvalue problem using spectrum slicing. I am using this example (http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html) as it is. Part of the code is: > ierr =EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); > > But I am getting the following: > > [0]PETSC ERROR: Mismatch between number of values found and information from inertia, consider using EPSKrylovSchurSetDetectZeros() > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > It seems the input PETSC_TRUE is not working for EPSKrylovSchurSetDetectZeros(). Any idea? > > M Hassan It seems that you are not using MUMPS. Spectrum slicing can be used with PETSc's Cholesky, but for guaranteed robustness it is necessary to use MUMPS. 
Jose From mhassan at miners.utep.edu Sun Jul 24 14:31:36 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Sun, 24 Jul 2016 19:31:36 +0000 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: References: , Message-ID: Hi Jose, Here is the part of the code: ierr = STSetType(st,STSINVERT);CHKERRQ(ierr); ierr = STGetKSP(st,&ksp);CHKERRQ(ierr); ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr); ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); #if defined(PETSC_HAVE_MUMPS) #if defined(PETSC_USE_COMPLEX) SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_SUP,"Spectrum slicing with MUMPS is not available for complex scalars"); #endif ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); ierr = EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); /* enforce zero detection */ ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); /* Add several MUMPS options (currently there is no better way of setting this in program): '-mat_mumps_icntl_13 1': turn off ScaLAPACK for matrix inertia '-mat_mumps_icntl_24 1': detect null pivots in factorization (for the case that a shift is equal to an eigenvalue) '-mat_mumps_cntl_3 ': a tolerance used for null pivot detection (must be larger than machine epsilon) Note: depending on the interval, it may be necessary also to increase the workspace: '-mat_mumps_icntl_14 ': increase workspace with a percentage (50, 100 or more) */ ierr = PetscOptionsInsertString(NULL,"-mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12");CHKERRQ(ierr); #endif /* Set solver parameters at runtime */ ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); I am using MUMPS. Actually it's the example I said before. I didn't modify it that much. M Hassan ________________________________ From: Jose E. Roman Sent: Sunday, July 24, 2016 1:27:04 PM To: Hassan Md Mahmudulla Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] EPSKrylovSchurSetDetectZeros() not working > El 24 jul 2016, a las 19:50, Hassan Md Mahmudulla escribi?: > > Hi, > I am solving a generalized eigenvalue problem using spectrum slicing. I am using this example (http://slepc.upv.es/documentation/current/src/eps/examples/tutorials/ex25.c.html) as it is. Part of the code is: > ierr =EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); > > But I am getting the following: > > [0]PETSC ERROR: Mismatch between number of values found and information from inertia, consider using EPSKrylovSchurSetDetectZeros() > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > > It seems the input PETSC_TRUE is not working for EPSKrylovSchurSetDetectZeros(). Any idea? > > M Hassan It seems that you are not using MUMPS. Spectrum slicing can be used with PETSc's Cholesky, but for guaranteed robustness it is necessary to use MUMPS. Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jul 24 14:46:28 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 24 Jul 2016 21:46:28 +0200 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: References: Message-ID: <6CCDCF64-35E4-4272-86D0-2590FF81478C@dsic.upv.es> The PETSc configuration you are using (PETSC_ARCH) does not have MUMPS. You have to add the appropiate options to PETSc's configure script. 
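For reference, the options in question are the --download flags on PETSc's configure line; a sketch only (the arch name is a placeholder and the compiler/MPI settings are site-specific):

   ./configure PETSC_ARCH=arch-with-mumps \
       --download-mumps --download-scalapack \
       --download-metis --download-parmetis \
       <usual compiler and MPI options for the machine>

The application then has to be built and run against that same PETSC_ARCH, otherwise the binary keeps using a build without MUMPS.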
> El 24 jul 2016, a las 21:31, Hassan Md Mahmudulla escribi?: > > Hi Jose, > Here is the part of the code: > > ierr = STSetType(st,STSINVERT);CHKERRQ(ierr); > > ierr = STGetKSP(st,&ksp);CHKERRQ(ierr); > ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); > > #if defined(PETSC_HAVE_MUMPS) > #if defined(PETSC_USE_COMPLEX) > SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_SUP,"Spectrum slicing with MUMPS is not available for complex scalars"); > #endif > ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); > ierr = EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); /* enforce zero detection */ > ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); > /* > Add several MUMPS options (currently there is no better way of setting this in program): > '-mat_mumps_icntl_13 1': turn off ScaLAPACK for matrix inertia > '-mat_mumps_icntl_24 1': detect null pivots in factorization (for the case that a shift is equal to an eigenvalue) > '-mat_mumps_cntl_3 ': a tolerance used for null pivot detection (must be larger than machine epsilon) > > Note: depending on the interval, it may be necessary also to increase the workspace: > '-mat_mumps_icntl_14 ': increase workspace with a percentage (50, 100 or more) > */ > ierr = PetscOptionsInsertString(NULL,"-mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12");CHKERRQ(ierr); > #endif > > /* > Set solver parameters at runtime > */ > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > I am using MUMPS. Actually it's the example I said before. I didn't modify it that much. > > > M Hassan From mhassan at miners.utep.edu Sun Jul 24 14:56:30 2016 From: mhassan at miners.utep.edu (Hassan Md Mahmudulla) Date: Sun, 24 Jul 2016 19:56:30 +0000 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: <6CCDCF64-35E4-4272-86D0-2590FF81478C@dsic.upv.es> References: , <6CCDCF64-35E4-4272-86D0-2590FF81478C@dsic.upv.es> Message-ID: Hi Jose, I don't think that my PETSc configuration doesn't have MUMPS. I configured that myself. I also got the output from this if-else code section ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); which works from inside of the if-else section. Please take a look at the configuration info from the error output also: [0]PETSC ERROR: Configure options --COPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --CXXOPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --FOPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --with-mpiexec=srun --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-shared-libraries=0 --with-x=0 LIBS=-lstdc++ PETSC_ARCH=arch-edison-opt64-intel --download-mumps --download-ptscotch --download-scalapack --download-metis --download-parmetis Thanks, M Hassan ________________________________ From: Jose E. Roman Sent: Sunday, July 24, 2016 1:46:28 PM To: Hassan Md Mahmudulla Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] EPSKrylovSchurSetDetectZeros() not working The PETSc configuration you are using (PETSC_ARCH) does not have MUMPS. You have to add the appropiate options to PETSc's configure script. 
> El 24 jul 2016, a las 21:31, Hassan Md Mahmudulla escribi?: > > Hi Jose, > Here is the part of the code: > > ierr = STSetType(st,STSINVERT);CHKERRQ(ierr); > > ierr = STGetKSP(st,&ksp);CHKERRQ(ierr); > ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr); > > #if defined(PETSC_HAVE_MUMPS) > #if defined(PETSC_USE_COMPLEX) > SETERRQ(PETSC_COMM_WORLD,PETSC_ERR_SUP,"Spectrum slicing with MUMPS is not available for complex scalars"); > #endif > ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); > ierr = EPSKrylovSchurSetDetectZeros(eps,PETSC_TRUE);CHKERRQ(ierr); /* enforce zero detection */ > ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);CHKERRQ(ierr); > /* > Add several MUMPS options (currently there is no better way of setting this in program): > '-mat_mumps_icntl_13 1': turn off ScaLAPACK for matrix inertia > '-mat_mumps_icntl_24 1': detect null pivots in factorization (for the case that a shift is equal to an eigenvalue) > '-mat_mumps_cntl_3 ': a tolerance used for null pivot detection (must be larger than machine epsilon) > > Note: depending on the interval, it may be necessary also to increase the workspace: > '-mat_mumps_icntl_14 ': increase workspace with a percentage (50, 100 or more) > */ > ierr = PetscOptionsInsertString(NULL,"-mat_mumps_icntl_13 1 -mat_mumps_icntl_24 1 -mat_mumps_cntl_3 1e-12");CHKERRQ(ierr); > #endif > > /* > Set solver parameters at runtime > */ > ierr = EPSSetFromOptions(eps);CHKERRQ(ierr); > > I am using MUMPS. Actually it's the example I said before. I didn't modify it that much. > > > M Hassan -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sun Jul 24 15:34:39 2016 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sun, 24 Jul 2016 22:34:39 +0200 Subject: [petsc-users] EPSKrylovSchurSetDetectZeros() not working In-Reply-To: References: <6CCDCF64-35E4-4272-86D0-2590FF81478C@dsic.upv.es> Message-ID: <4DA115EF-0A4B-4094-9F52-998DDF8D1E07@dsic.upv.es> > El 24 jul 2016, a las 21:56, Hassan Md Mahmudulla escribi?: > > Hi Jose, > I don't think that my PETSc configuration doesn't have MUMPS. I configured that myself. I also got the output from this if-else code section > ierr = PetscPrintf(PETSC_COMM_WORLD, "PETSC_HAVE_MUMPS\n");CHKERRQ(ierr); > > which works from inside of the if-else section. > > Please take a look at the configuration info from the error output also: > > [0]PETSC ERROR: Configure options --COPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --CXXOPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --FOPTFLAGS=-O2 -no-ipo -g -qopt-report=5 -dynamic --with-mpiexec=srun --with-cc=cc --with-clib-autodetect=0 --with-cxx=CC --with-cxxlib-autodetect=0 --with-debugging=0 --with-fc=ftn --with-fortranlib-autodetect=0 --with-mpiexec=srun --with-shared-libraries=0 --with-x=0 LIBS=-lstdc++ PETSC_ARCH=arch-edison-opt64-intel --download-mumps --download-ptscotch --download-scalapack --download-metis --download-parmetis > > > Thanks, > M Hassan Then I don't know what is happening. 
Jose From domenico_lahaye at yahoo.com Mon Jul 25 01:42:44 2016 From: domenico_lahaye at yahoo.com (domenico lahaye) Date: Mon, 25 Jul 2016 06:42:44 +0000 (UTC) Subject: [petsc-users] Regarding ksp ex42 - Citations In-Reply-To: <8CB9F29A-77CA-46D2-9C3C-4E7CD494D2D0@mcs.anl.gov> References: <1413749702.3789628.1468516892902.JavaMail.yahoo.ref@mail.yahoo.com> <1413749702.3789628.1468516892902.JavaMail.yahoo@mail.yahoo.com> <5A491912-5FFB-46AB-8B2E-CBC0C5C443C2@mcs.anl.gov> <461808588.655361.1468821570462.JavaMail.yahoo@mail.yahoo.com> <877772657.653258.1468824084856.JavaMail.yahoo@mail.yahoo.com> <1627436644.2376666.1469090464432.JavaMail.yahoo@mail.yahoo.com> <1239841092.2483369.1469092691767.JavaMail.yahoo@mail.yahoo.com> <579094FF.9020708@imperial.ac.uk> <2137146867.2442005.1469094945841.JavaMail.yahoo@mail.yahoo.com> <9C370EDC-0F99-45FE-B650-B0F24091CA63@imperial.ac.uk> <430090064.2880208.1469176920170.JavaMail.yahoo@mail.yahoo.com> <8CB9F29A-77CA-46D2-9C3C-4 E7CD494D2D0@mcs.anl.gov> Message-ID: <1278608375.3800440.1469428964428.JavaMail.yahoo@mail.yahoo.com> Thanks Barry.? I will give it a look. If not before my holidays, than in the second half of August.? Best wishes. Domenico.? ?From: Barry Smith To: domenico lahaye Cc: "petsc-users at mcs.anl.gov" Sent: Sunday, July 24, 2016 2:52 AM Subject: Re: [petsc-users] Regarding ksp ex42 - Citations ? Took a little more time than I expected but the branch barry/extend-pcmg-galerkin now supports PCMGSetGalerkin() and -pc_mg_galerkin now take PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE as arguments instead of PetscBool This allows computing either mat, or pmat or both via the Galerkin process so you should be able to provide A and M with KSPSetOperators() and then run with -pc_mg_galerkin both to get both generated on the coarse meshes via the Galekin process.? Note that if you use the additional option -pc_use_amat false it will use only the M for both mat and pmat in the multigrid process (while A is only used for the outer Krylov solver definition of the operator.) For some problems this is actually a better approach. ? Please let me know if you have any difficulties with it. Barry > On Jul 22, 2016, at 3:42 AM, domenico lahaye wrote: > > Dear Barry, >? >? Thank you for your suggestion. >? >? I will be happy to test drive the new code when available. > >? Kind wishes, Domenico. > > > > From: Barry Smith > To: Lawrence Mitchell > Cc: domenico lahaye ; PETSc Users List > Sent: Friday, July 22, 2016 1:41 AM > Subject: Re: [petsc-users] Regarding ksp ex42 - Citations > > >? I'll add support for handling both A and M via Galerkin. It is easy to write the code, picking a good simple API that doesn't break anything is more difficult.? I'm leaning to change PCMGSetGalerkin(PC,PetscBool) to PCMGSetGalerkin(PC, PCMGGalerkinType) where > > typedef enum { PC_MG_GALERKIN_BOTH,PC_MG_GALERKIN_PMAT,PC_MG_GALERKIN_MAT, PC_MG_GALERKIN_NONE > } PCMGGalerkinType; > > Barry > > > > > On Jul 21, 2016, at 6:09 AM, Lawrence Mitchell wrote: > > > > > >> On 21 Jul 2016, at 10:55, domenico lahaye wrote: > >> > >> Apologies for being not sufficient clear in my previous message. > >> > >> I would like to be able to Galerkin coarsen A^h to obtain A^H > >> and to separately Galerkin coarsen M^h to obtain M^H. > >> > >> So, yes, the way in which I currently (partially) understand your > >> description of the new DMCreateMatrices would do the job. 
> > > > If you want to separately coarsen A and M via Galerkin, I think it will be easier to just change the code in PCSetUp_MG to handle the case where A and M are different on the coarse levels.? Effectively you just need to replicate the code that computes the coarse grid "B" matrix to separately compute coarse grid A and B matrices and pass them in to KSPSetOperators. > > > > Cheers, > > > > Lawrence > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Mon Jul 25 11:17:05 2016 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 25 Jul 2016 11:17:05 -0500 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode Message-ID: Hi all, I am trying to solve my problem with a direct solver superLU_dist. But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" version and wanted to see what error info I can get from the PETSc. Surprisingly, it passed the solve and didn't output any errors in the "dbg" version. Does anyone have the similar experience? and what type of potential bugs it may have? --->test in StokesSolver::solve(): Start the KSP solve... [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run [0]PETSC ERROR: to get more information on the crash. [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Signal received [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --download-mpich --download-fblaslapack --download-scalapack --download-mumps --download-superlu_dist --download-hypre --download-ml --download-metis --download-parmetis --download-triangle --download-chaco --with-debugging=0 [0]PETSC ERROR: #1 User provided function() line 0 in unknown file application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From hzhang at mcs.anl.gov Mon Jul 25 11:33:19 2016 From: hzhang at mcs.anl.gov (Hong) Date: Mon, 25 Jul 2016 11:33:19 -0500 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode In-Reply-To: References: Message-ID: Xujun: Test your code with valgrind to see if it is valgrind clean. Hong Hi all, > > I am trying to solve my problem with a direct solver superLU_dist. > But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" version > and wanted to see what error info I can get from the PETSc. Surprisingly, > it passed the solve and didn't output any errors in the "dbg" version. Does > anyone have the similar experience? and what type of potential bugs it may > have? > > > --->test in StokesSolver::solve(): Start the KSP solve... 
> > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > > [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > > [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named > mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-mpich --download-fblaslapack > --download-scalapack --download-mumps --download-superlu_dist > --download-hypre --download-ml --download-metis --download-parmetis > --download-triangle --download-chaco --with-debugging=0 > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 25 11:50:05 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 09:50:05 -0700 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode In-Reply-To: References: Message-ID: On Mon, Jul 25, 2016 at 9:17 AM, Xujun Zhao wrote: > Hi all, > > I am trying to solve my problem with a direct solver superLU_dist. > But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" version > and wanted to see what error info I can get from the PETSc. Surprisingly, > it passed the solve and didn't output any errors in the "dbg" version. Does > anyone have the similar experience? and what type of potential bugs it may > have? > Debugging mode initializes all variables, but as Hong says, valgrind will warn you of uninitialized variables. Matt > > --->test in StokesSolver::solve(): Start the KSP solve... > > [0]PETSC ERROR: > ------------------------------------------------------------------------ > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, > probably memory access out of range > > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > > [0]PETSC ERROR: or see > http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS > X to find memory corruption errors > > [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and > run > > [0]PETSC ERROR: to get more information on the crash. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Signal received > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > > [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named > mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 > > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --download-mpich --download-fblaslapack > --download-scalapack --download-mumps --download-superlu_dist > --download-hypre --download-ml --download-metis --download-parmetis > --download-triangle --download-chaco --with-debugging=0 > > [0]PETSC ERROR: #1 User provided function() line 0 in unknown file > > application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Mon Jul 25 13:33:45 2016 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Mon, 25 Jul 2016 14:33:45 -0400 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 Message-ID: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> Hi, has someone tried OpenMPI 2.0 with Petsc 3.7.2? I am having some errors with petsc, maybe someone have them too? Here are the configure logs for PETSc: http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log And for OpenMPI: http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log (in fact, I am testing the ompi-release branch, a sort of petsc-master branch, since I need the commit 9ba6678156). For a set of parallel tests, I have 104 that works on 124 total tests. 
And the typical error: *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': free(): invalid pointer: ======= Backtrace: ========= /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] a similar one: *** Error in `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': free(): invalid pointer: 0x00007f382a7c5bc0 *** ======= Backtrace: ========= /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] /lib64/libc.so.6(+0x78026)[0x7f3829f22026] /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] another one: *** Error in 
`/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': free(): invalid pointer: 0x00007f67b6d37bc0 *** ======= Backtrace: ========= /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] /lib64/libc.so.6(+0x78026)[0x7f67b6494026] /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] I feel like I should wait until someone else from Petsc have tested it too... Thanks, Eric From knepley at gmail.com Mon Jul 25 13:57:18 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 11:57:18 -0700 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 In-Reply-To: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> References: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> Message-ID: On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > Hi, > > has someone tried OpenMPI 2.0 with Petsc 3.7.2? > > I am having some errors with petsc, maybe someone have them too? > > Here are the configure logs for PETSc: > > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log > > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log > > And for OpenMPI: > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log > > (in fact, I am testing the ompi-release branch, a sort of petsc-master > branch, since I need the commit 9ba6678156). > > For a set of parallel tests, I have 104 that works on 124 total tests. > It appears that the fault happens when freeing the VecScatter we build for MatMult, which contains Request structures for the ISends and IRecvs. These looks like internal OpenMPI errors to me since the Request should be opaque. I would try at least two things: 1) Run under valgrind. 2) Switch the VecScatter implementation. All the options are here, http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate but maybe use alltoall. 
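In command-line terms both suggestions are runtime-only changes; with the binary named in the report above (its usual arguments elided), something along these lines:

   mpiexec -n 2 valgrind --leak-check=yes --track-origins=yes ./Test.ProblemeGD.dev <usual options>
   mpiexec -n 2 ./Test.ProblemeGD.dev <usual options> -vecscatter_alltoall

The second form swaps the default send/receive-based scatter for an all-to-all based one without touching the code.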
Thanks, Matt > And the typical error: > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': > free(): invalid pointer: > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] > /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] > /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] > > a similar one: > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': > free(): invalid pointer: 0x00007f382a7c5bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] > /lib64/libc.so.6(+0x78026)[0x7f3829f22026] > /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] > > 
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] > > another one: > > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': > free(): invalid pointer: 0x00007f67b6d37bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] > /lib64/libc.so.6(+0x78026)[0x7f67b6494026] > /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] > > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] > > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] > > I feel like I should wait until someone else from Petsc have tested it > too... > > Thanks, > > Eric > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Mon Jul 25 14:44:57 2016 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Mon, 25 Jul 2016 15:44:57 -0400 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 In-Reply-To: References: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> Message-ID: <33b3cb0d-78f8-fb84-2ad5-a447f5cdce9e@giref.ulaval.ca> Ok, here is the 2 points answered: #1) got valgrind output... 
here is the fatal free operation: ==107156== Invalid free() / delete / delete[] / realloc() ==107156== at 0x4C2A37C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) ==107156== by 0x1E63CD5F: opal_free (malloc.c:184) ==107156== by 0x27622627: mca_pml_ob1_recv_request_fini (pml_ob1_recvreq.h:133) ==107156== by 0x27622C4F: mca_pml_ob1_recv_request_free (pml_ob1_recvreq.c:90) ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) ==107156== by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219) ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) ==107156== by 0x14A33809: VecDestroy (vector.c:432) ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115) ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292) ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287) ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281) ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216) ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) ==107156== Address 0x1d6acbc0 is 0 bytes inside data symbol "ompi_mpi_double" --107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to 0x4c2f330 (__GI_stpcpy) ==107156== ==107156== Process terminating with default action of signal 6 (SIGABRT): dumping core ==107156== at 0x1DD520C7: raise (in /lib64/libc-2.19.so) ==107156== by 0x1DD53534: abort (in /lib64/libc-2.19.so) ==107156== by 0x1DD4B145: __assert_fail_base (in /lib64/libc-2.19.so) ==107156== by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so) ==107156== by 0x27626D12: mca_pml_ob1_send_request_fini (pml_ob1_sendreq.h:221) ==107156== by 0x276274C9: mca_pml_ob1_send_request_free (pml_ob1_sendreq.c:117) ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) ==107156== by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225) ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) ==107156== by 0x14A33809: VecDestroy (vector.c:432) ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) (girefConfigurationPETSc.h:115) ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() (VecteurPETSc.cc:2292) ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:287) ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() (VecteurPETSc.cc:281) ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() (PPReactionsAppuiEL3D.cc:216) ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) #2) For the run with -vecscatter_alltoall it works...! As an "end user", should I ever modify these VecScatterCreate options? How do they change the performances of the code on large problems? Thanks, Eric On 25/07/16 02:57 PM, Matthew Knepley wrote: > On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland > > wrote: > > Hi, > > has someone tried OpenMPI 2.0 with Petsc 3.7.2? > > I am having some errors with petsc, maybe someone have them too? 
> > Here are the configure logs for PETSc: > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log > > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log > > And for OpenMPI: > http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log > > (in fact, I am testing the ompi-release branch, a sort of > petsc-master branch, since I need the commit 9ba6678156). > > For a set of parallel tests, I have 104 that works on 124 total tests. > > > It appears that the fault happens when freeing the VecScatter we build > for MatMult, which contains Request structures > for the ISends and IRecvs. These looks like internal OpenMPI errors to > me since the Request should be opaque. > I would try at least two things: > > 1) Run under valgrind. > > 2) Switch the VecScatter implementation. All the options are here, > > http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate > > but maybe use alltoall. > > Thanks, > > Matt > > > And the typical error: > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': > free(): invalid pointer: > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] > /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] > /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] > > a similar one: > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': > free(): invalid pointer: 0x00007f382a7c5bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] > /lib64/libc.so.6(+0x78026)[0x7f3829f22026] > 
/lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] > > another one: > > *** Error in > `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': > free(): invalid pointer: 0x00007f67b6d37bc0 *** > ======= Backtrace: ========= > /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] > /lib64/libc.so.6(+0x78026)[0x7f67b6494026] > /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] > /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] > /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] > /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] > /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] > > I feel like I should wait until someone else from Petsc have tested > it too... > > Thanks, > > Eric > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener From knepley at gmail.com Mon Jul 25 14:53:32 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 12:53:32 -0700 Subject: [petsc-users] OpenMPI 2.0 and Petsc 3.7.2 In-Reply-To: <33b3cb0d-78f8-fb84-2ad5-a447f5cdce9e@giref.ulaval.ca> References: <99090192-103a-b58c-8bbb-273b938fb748@giref.ulaval.ca> <33b3cb0d-78f8-fb84-2ad5-a447f5cdce9e@giref.ulaval.ca> Message-ID: On Mon, Jul 25, 2016 at 12:44 PM, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > Ok, > > here is the 2 points answered: > > #1) got valgrind output... here is the fatal free operation: > Okay, this is not the MatMult scatter, this is for local representations of ghosted vectors. However, to me it looks like OpenMPI mistakenly frees its built-in type for MPI_DOUBLE. 
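If it helps to take PETSc out of the loop: the failing call in both traces is MPI_Request_free on the persistent requests set up for the scatter, so a stand-alone test along the following lines (an assumption about where the fault lies, not a confirmed reproducer) would show whether a bare MPI_Send_init/MPI_Recv_init/MPI_Request_free cycle already misbehaves with this OpenMPI build:

/* Hypothetical minimal check: build persistent send/recv requests with
   MPI_DOUBLE, run them once, then free them with MPI_Request_free --
   the call that aborts in the traces above. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
  double      sbuf[4] = {0., 1., 2., 3.}, rbuf[4];
  MPI_Request req[2];
  int         rank, size, next, prev;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  next = (rank + 1) % size;               /* simple ring exchange */
  prev = (rank + size - 1) % size;

  MPI_Send_init(sbuf, 4, MPI_DOUBLE, next, 0, MPI_COMM_WORLD, &req[0]);
  MPI_Recv_init(rbuf, 4, MPI_DOUBLE, prev, 0, MPI_COMM_WORLD, &req[1]);

  MPI_Startall(2, req);                   /* use the requests once */
  MPI_Waitall(2, req, MPI_STATUSES_IGNORE);

  MPI_Request_free(&req[0]);              /* then free them */
  MPI_Request_free(&req[1]);

  if (!rank) printf("persistent requests freed cleanly\n");
  MPI_Finalize();
  return 0;
}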
> ==107156== Invalid free() / delete / delete[] / realloc() > ==107156== at 0x4C2A37C: free (in > /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so) > ==107156== by 0x1E63CD5F: opal_free (malloc.c:184) > ==107156== by 0x27622627: mca_pml_ob1_recv_request_fini > (pml_ob1_recvreq.h:133) > ==107156== by 0x27622C4F: mca_pml_ob1_recv_request_free > (pml_ob1_recvreq.c:90) > ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) > ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) > ==107156== by 0x14AE3B9C: VecScatterDestroy_PtoP (vpscat.c:219) > ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) > ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) > ==107156== by 0x14A33809: VecDestroy (vector.c:432) > ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) > (girefConfigurationPETSc.h:115) > ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() > (VecteurPETSc.cc:2292) > ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:287) > ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:281) > ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() > (PPReactionsAppuiEL3D.cc:216) > ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in > /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) > ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) > ==107156== Address 0x1d6acbc0 is 0 bytes inside data symbol > "ompi_mpi_double" > --107156-- REDIR: 0x1dda2680 (libc.so.6:__GI_stpcpy) redirected to > 0x4c2f330 (__GI_stpcpy) > ==107156== > ==107156== Process terminating with default action of signal 6 (SIGABRT): > dumping core > ==107156== at 0x1DD520C7: raise (in /lib64/libc-2.19.so) > ==107156== by 0x1DD53534: abort (in /lib64/libc-2.19.so) > ==107156== by 0x1DD4B145: __assert_fail_base (in /lib64/libc-2.19.so) > ==107156== by 0x1DD4B1F1: __assert_fail (in /lib64/libc-2.19.so) > ==107156== by 0x27626D12: mca_pml_ob1_send_request_fini > (pml_ob1_sendreq.h:221) > ==107156== by 0x276274C9: mca_pml_ob1_send_request_free > (pml_ob1_sendreq.c:117) > ==107156== by 0x1D3EF9DC: ompi_request_free (request.h:362) > ==107156== by 0x1D3EFAD5: PMPI_Request_free (prequest_free.c:59) > ==107156== by 0x14AE3C3C: VecScatterDestroy_PtoP (vpscat.c:225) > ==107156== by 0x14ADEB74: VecScatterDestroy (vscat.c:1860) > ==107156== by 0x14A8D426: VecDestroy_MPI (pdvec.c:25) > ==107156== by 0x14A33809: VecDestroy (vector.c:432) > ==107156== by 0x10A2A5AB: GIREFVecDestroy(_p_Vec*&) > (girefConfigurationPETSc.h:115) > ==107156== by 0x10BA9F14: VecteurPETSc::detruitObjetPETSc() > (VecteurPETSc.cc:2292) > ==107156== by 0x10BA9D0D: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:287) > ==107156== by 0x10BA9F48: VecteurPETSc::~VecteurPETSc() > (VecteurPETSc.cc:281) > ==107156== by 0x1135A57B: PPReactionsAppuiEL3D::~PPReactionsAppuiEL3D() > (PPReactionsAppuiEL3D.cc:216) > ==107156== by 0xCD9A1EA: ProblemeGD::~ProblemeGD() (in > /home/mefpp_ericc/depots_prepush/GIREF/lib/libgiref_dev_Formulation.so) > ==107156== by 0x435702: main (Test.ProblemeGD.icc:381) > > > #2) For the run with -vecscatter_alltoall it works...! > > As an "end user", should I ever modify these VecScatterCreate options? How > do they change the performances of the code on large problems? > Yep, those options are there because the different variants are better on different architectures, and you can't know which one to pick until runtime, (and without experimentation). 
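As for measuring the effect, the same binary can simply be timed with and without the flag and the scatter cost read off the logging summary (the process count here is a placeholder):

   mpiexec -n 128 ./Test.ProblemeGD.dev <usual options> -log_view
   mpiexec -n 128 ./Test.ProblemeGD.dev <usual options> -vecscatter_alltoall -log_view

and the VecScatterBegin/VecScatterEnd lines of the two summaries compared.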
Thanks, Matt > Thanks, > > Eric > > On 25/07/16 02:57 PM, Matthew Knepley wrote: > >> On Mon, Jul 25, 2016 at 11:33 AM, Eric Chamberland >> > > wrote: >> >> Hi, >> >> has someone tried OpenMPI 2.0 with Petsc 3.7.2? >> >> I am having some errors with petsc, maybe someone have them too? >> >> Here are the configure logs for PETSc: >> >> >> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_configure.log >> >> >> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_RDict.log >> >> And for OpenMPI: >> >> http://www.giref.ulaval.ca/~cmpgiref/dernier_ompi/2016.07.25.01h16m02s_config.log >> >> (in fact, I am testing the ompi-release branch, a sort of >> petsc-master branch, since I need the commit 9ba6678156). >> >> For a set of parallel tests, I have 104 that works on 124 total tests. >> >> >> It appears that the fault happens when freeing the VecScatter we build >> for MatMult, which contains Request structures >> for the ISends and IRecvs. These looks like internal OpenMPI errors to >> me since the Request should be opaque. >> I would try at least two things: >> >> 1) Run under valgrind. >> >> 2) Switch the VecScatter implementation. All the options are here, >> >> >> http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Vec/VecScatterCreate.html#VecScatterCreate >> >> but maybe use alltoall. >> >> Thanks, >> >> Matt >> >> >> And the typical error: >> *** Error in >> >> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProblemeGD.dev': >> free(): invalid pointer: >> ======= Backtrace: ========= >> /lib64/libc.so.6(+0x7277f)[0x7f80eb11677f] >> /lib64/libc.so.6(+0x78026)[0x7f80eb11c026] >> /lib64/libc.so.6(+0x78d53)[0x7f80eb11cd53] >> >> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f80ea8f9d60] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f80df0ea628] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f80df0eac50] >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f80eb7029dd] >> >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f80eb702ad6] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f80f2fa6c6d] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f80f2fa1c45] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0xa9d0f5)[0x7f80f35960f5] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f80f35c2588] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x10bf0f4)[0x7f80f3bb80f4] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPReset+0x502)[0x7f80f3d19779] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x11707f7)[0x7f80f3c697f7] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x346)[0x7f80f3a796de] >> >> 
/opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f80f3a79fd9] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f80f3d1a334] >> >> a similar one: >> *** Error in >> >> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.ProbFluideIncompressible.dev': >> free(): invalid pointer: 0x00007f382a7c5bc0 *** >> ======= Backtrace: ========= >> /lib64/libc.so.6(+0x7277f)[0x7f3829f1c77f] >> /lib64/libc.so.6(+0x78026)[0x7f3829f22026] >> /lib64/libc.so.6(+0x78d53)[0x7f3829f22d53] >> >> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f38296ffd60] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16628)[0x7f381deab628] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x16c50)[0x7f381deabc50] >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f382a5089dd] >> >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f382a508ad6] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adc6d)[0x7f3831dacc6d] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f3831da7c45] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x9f4755)[0x7f38322f3755] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(MatDestroy+0x648)[0x7f38323c8588] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCReset+0x4e2)[0x7f383287f87a] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(PCDestroy+0x5d1)[0x7f383287ffd9] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(KSPDestroy+0x7b6)[0x7f3832b20334] >> >> another one: >> >> *** Error in >> >> `/pmi/cmpbib/compilation_BIB_dernier_ompi/COMPILE_AUTO/GIREF/bin/Test.MortierDiffusion.dev': >> free(): invalid pointer: 0x00007f67b6d37bc0 *** >> ======= Backtrace: ========= >> /lib64/libc.so.6(+0x7277f)[0x7f67b648e77f] >> /lib64/libc.so.6(+0x78026)[0x7f67b6494026] >> /lib64/libc.so.6(+0x78d53)[0x7f67b6494d53] >> >> /opt/openmpi-2.x_opt/lib/libopen-pal.so.20(opal_free+0x1f)[0x7f67b5c71d60] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1adae)[0x7f67aa4cddae] >> >> /opt/openmpi-2.x_opt/lib/openmpi/mca_pml_ob1.so(+0x1b4ca)[0x7f67aa4ce4ca] >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(+0x9f9dd)[0x7f67b6a7a9dd] >> >> /opt/openmpi-2.x_opt/lib/libmpi.so.20(MPI_Request_free+0xf7)[0x7f67b6a7aad6] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4adb09)[0x7f67be31eb09] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecScatterDestroy+0x68d)[0x7f67be319c45] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(+0x4574f7)[0x7f67be2c84f7] >> >> /opt/petsc-3.7.2_debug_openmpi_2.x/lib/libpetsc.so.3.7(VecDestroy+0x648)[0x7f67be26e8da] >> >> I feel like I should wait until someone else from Petsc have tested >> it too... >> >> Thanks, >> >> Eric >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bhatiamanav at gmail.com Mon Jul 25 15:13:10 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Mon, 25 Jul 2016 15:13:10 -0500 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm Message-ID: Hi, I have a multi physics application with discipline1 defined on comm1 and discipline2 on comm2. My intent is to use the nested matrix for the KSP solver where each diagonal block is provided by the disciplines, and the off-diagonal blocks are defined as shell-matrices with matrix vector products. I am a bit unclear about how to deal with the case of different set of processors on comm1 and comm2. I have the following questions and would appreciate some guidance: ? Would it make sense to define a comm_global as a union of comm1 and comm2 for the MatCreateNest? ? The diagonal blocks are available on comm1 and comm2 only. Should MatAssemblyBegin/End for these diagonal blocks be called on comm1 and comm2 separately? ? What comm should be used for the off-diagonal shell matrices? ? Likewise, when calling VecGetSubVector and VecRestoreSubVector to get sub-vectors corresponding to discipline1 (or 2), what comm should these function calls be made? Thanks, Manav From knepley at gmail.com Mon Jul 25 15:21:24 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 13:21:24 -0700 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: References: Message-ID: On Mon, Jul 25, 2016 at 1:13 PM, Manav Bhatia wrote: > Hi, > > I have a multi physics application with discipline1 defined on comm1 > and discipline2 on comm2. > > My intent is to use the nested matrix for the KSP solver where each > diagonal block is provided by the disciplines, and the off-diagonal blocks > are defined as shell-matrices with matrix vector products. > > I am a bit unclear about how to deal with the case of different set of > processors on comm1 and comm2. I have the following questions and would > appreciate some guidance: > > ? Would it make sense to define a comm_global as a union of comm1 and > comm2 for the MatCreateNest? > > ? The diagonal blocks are available on comm1 and comm2 only. Should > MatAssemblyBegin/End for these diagonal blocks be called on comm1 and comm2 > separately? > > ? What comm should be used for the off-diagonal shell matrices? > > ? Likewise, when calling VecGetSubVector and VecRestoreSubVector to get > sub-vectors corresponding to discipline1 (or 2), what comm should these > function calls be made? > I would first ask if you have a convincing reason for doing this, because it sounds like the genesis of a million programming errors. All the linear algebra objects would have to be in a global comm that contained any subcomms you want to use. I don't think it would make sense to define submatrices on subcomms. You can have your assembly code run on a subcomm certainly, but again this is a tricky business and I find it hard to understand the gain. Matt > Thanks, > Manav > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... 
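A minimal sketch of the layout Matt describes, with every block and the nest itself created on the same global communicator. The block sizes, the preallocation numbers and the choice to leave the off-diagonal blocks NULL are all hypothetical; the point is only that everything lives on PETSC_COMM_WORLD.

#include <petsc.h>

int main(int argc, char **argv)
{
  Mat            Aff, Ass, Anest, blocks[4];
  PetscInt       nf = 100, ns = 10;      /* local rows of each block on this rank (made up) */
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

  /* both diagonal blocks are created on the *global* communicator */
  ierr = MatCreateAIJ(PETSC_COMM_WORLD, nf, nf, PETSC_DETERMINE, PETSC_DETERMINE,
                      5, NULL, 2, NULL, &Aff); CHKERRQ(ierr);
  ierr = MatCreateAIJ(PETSC_COMM_WORLD, ns, ns, PETSC_DETERMINE, PETSC_DETERMINE,
                      5, NULL, 2, NULL, &Ass); CHKERRQ(ierr);

  /* MatSetValues() on each block would go here */
  ierr = MatAssemblyBegin(Aff, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Aff, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyBegin(Ass, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
  ierr = MatAssemblyEnd(Ass, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

  /* off-diagonal coupling blocks left NULL (zero) here; shell matrices created
     with MatCreateShell(PETSC_COMM_WORLD, ...) could be plugged in instead */
  blocks[0] = Aff; blocks[1] = NULL;
  blocks[2] = NULL; blocks[3] = Ass;
  ierr = MatCreateNest(PETSC_COMM_WORLD, 2, NULL, 2, NULL, blocks, &Anest); CHKERRQ(ierr);

  ierr = MatDestroy(&Anest); CHKERRQ(ierr);
  ierr = MatDestroy(&Aff); CHKERRQ(ierr);
  ierr = MatDestroy(&Ass); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}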
URL: From bhatiamanav at gmail.com Mon Jul 25 15:34:08 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Mon, 25 Jul 2016 15:34:08 -0500 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: References: Message-ID: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> Thanks for your comments, Matt. I have a fluid-structural application with a really large fluid discretization and a really small structural discretization. Due to the relative difference in size, I have defined the structural system on only a single node, and the fluid system on (say) N nodes. So far, I have hand-coded a Schur-Complement for a frequency-domain analysis that is able to handle the difference in comms. I am attempting to migrate to the nested matrix constructs for some future work, and was looking at the possibility of reusing the same distribution of comms. Additionally, I am looking to add additional disciplines and was considering the possibility of defining the systems on different comms. I wasn?t sure if I was creating more problems with this approach than what I was trying to solve. Would you recommend that all objects exist on a global_comm so that there is no confusion about these operations? Thanks, Manav > On Jul 25, 2016, at 3:21 PM, Matthew Knepley wrote: > > On Mon, Jul 25, 2016 at 1:13 PM, Manav Bhatia > wrote: > Hi, > > I have a multi physics application with discipline1 defined on comm1 and discipline2 on comm2. > > My intent is to use the nested matrix for the KSP solver where each diagonal block is provided by the disciplines, and the off-diagonal blocks are defined as shell-matrices with matrix vector products. > > I am a bit unclear about how to deal with the case of different set of processors on comm1 and comm2. I have the following questions and would appreciate some guidance: > > ? Would it make sense to define a comm_global as a union of comm1 and comm2 for the MatCreateNest? > > ? The diagonal blocks are available on comm1 and comm2 only. Should MatAssemblyBegin/End for these diagonal blocks be called on comm1 and comm2 separately? > > ? What comm should be used for the off-diagonal shell matrices? > > ? Likewise, when calling VecGetSubVector and VecRestoreSubVector to get sub-vectors corresponding to discipline1 (or 2), what comm should these function calls be made? > > I would first ask if you have a convincing reason for doing this, because it sounds like the genesis of a million programming errors. > > All the linear algebra objects would have to be in a global comm that contained any subcomms you want to use. I don't > think it would make sense to define submatrices on subcomms. You can have your assembly code run on a subcomm certainly, > but again this is a tricky business and I find it hard to understand the gain. > > Matt > > Thanks, > Manav > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 25 15:43:14 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 13:43:14 -0700 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> References: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> Message-ID: On Mon, Jul 25, 2016 at 1:34 PM, Manav Bhatia wrote: > Thanks for your comments, Matt. 
> > I have a fluid-structural application with a really large fluid > discretization and a really small structural discretization. Due to the > relative difference in size, I have defined the structural system on only a > single node, and the fluid system on (say) N nodes. > > So far, I have hand-coded a Schur-Complement for a frequency-domain > analysis that is able to handle the difference in comms. > > I am attempting to migrate to the nested matrix constructs for some future > work, and was looking at the possibility of reusing the same distribution > of comms. Additionally, I am looking to add additional disciplines and was > considering the possibility of defining the systems on different comms. > > I wasn?t sure if I was creating more problems with this approach than what > I was trying to solve. > > Would you recommend that all objects exist on a global_comm so that there > is no confusion about these operations? > Yes. I think the confusion here is between the problem you are trying to solve, and the tool for doing it. Disparate size of subsystems seems to me to be a _load balancing_ problem. Here you can use data layout to alleviate this. On the global comm, you can put all the fluid unknowns on ranks 0..N-2, and the structural unknowns on N-1. You can have more general splits than that. IF for some reason in the structural assembly you used a large number of collective operations (like say did artificial timestepping to get to some steady state property), then it might make sense to pull out a subcomm of only the occupied ranks, but only above 1000 procs, and only on a non-BlueGene machine. This is also easily measure before you do this work. Matt > Thanks, > Manav > > > > On Jul 25, 2016, at 3:21 PM, Matthew Knepley wrote: > > On Mon, Jul 25, 2016 at 1:13 PM, Manav Bhatia > wrote: > >> Hi, >> >> I have a multi physics application with discipline1 defined on comm1 >> and discipline2 on comm2. >> >> My intent is to use the nested matrix for the KSP solver where each >> diagonal block is provided by the disciplines, and the off-diagonal blocks >> are defined as shell-matrices with matrix vector products. >> >> I am a bit unclear about how to deal with the case of different set >> of processors on comm1 and comm2. I have the following questions and would >> appreciate some guidance: >> >> ? Would it make sense to define a comm_global as a union of comm1 and >> comm2 for the MatCreateNest? >> >> ? The diagonal blocks are available on comm1 and comm2 only. Should >> MatAssemblyBegin/End for these diagonal blocks be called on comm1 and comm2 >> separately? >> >> ? What comm should be used for the off-diagonal shell matrices? >> >> ? Likewise, when calling VecGetSubVector and VecRestoreSubVector to get >> sub-vectors corresponding to discipline1 (or 2), what comm should these >> function calls be made? >> > > I would first ask if you have a convincing reason for doing this, because > it sounds like the genesis of a million programming errors. > > All the linear algebra objects would have to be in a global comm that > contained any subcomms you want to use. I don't > think it would make sense to define submatrices on subcomms. You can have > your assembly code run on a subcomm certainly, > but again this is a tricky business and I find it hard to understand the > gain. > > Matt > > >> Thanks, >> Manav >> > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bhatiamanav at gmail.com Mon Jul 25 16:30:20 2016 From: bhatiamanav at gmail.com (Manav Bhatia) Date: Mon, 25 Jul 2016 16:30:20 -0500 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: References: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> Message-ID: > On Jul 25, 2016, at 3:43 PM, Matthew Knepley wrote: > > Yes. I think the confusion here is between the problem you are trying to solve, and the tool for doing it. > > Disparate size of subsystems seems to me to be a _load balancing_ problem. Here you can use data layout to alleviate this. > On the global comm, you can put all the fluid unknowns on ranks 0..N-2, and the structural unknowns on N-1. You can have > more general splits than that. > Ok. So, if I do that, then there would still be one comm? If yes, then the distribution would be by specifying the number of local fluid dofs on N-1 to be zero? Sorry that this such is a basic question. > IF for some reason in the structural assembly you used a large number of collective operations (like say did artificial timestepping > to get to some steady state property), then it might make sense to pull out a subcomm of only the occupied ranks, but only above > 1000 procs, and only on a non-BlueGene machine. This is also easily measure before you do this work. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From xzhao99 at gmail.com Mon Jul 25 16:39:35 2016 From: xzhao99 at gmail.com (Xujun Zhao) Date: Mon, 25 Jul 2016 16:39:35 -0500 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode In-Reply-To: References: Message-ID: Another interesting phenomenon is that it works for an iterative solver, but only failed for direct solvers(both superLU_dist and mumps). If something is not initialized correctly, why doesn't the iterative solver, for example, GMRES, throw any errors? On Mon, Jul 25, 2016 at 11:50 AM, Matthew Knepley wrote: > On Mon, Jul 25, 2016 at 9:17 AM, Xujun Zhao wrote: > >> Hi all, >> >> I am trying to solve my problem with a direct solver superLU_dist. >> But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" version >> and wanted to see what error info I can get from the PETSc. Surprisingly, >> it passed the solve and didn't output any errors in the "dbg" version. Does >> anyone have the similar experience? and what type of potential bugs it may >> have? >> > > Debugging mode initializes all variables, but as Hong says, valgrind will > warn you of uninitialized variables. > > Matt > > >> >> --->test in StokesSolver::solve(): Start the KSP solve... 
>> >> [0]PETSC ERROR: >> ------------------------------------------------------------------------ >> >> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >> probably memory access out of range >> >> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >> >> [0]PETSC ERROR: or see >> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >> >> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS >> X to find memory corruption errors >> >> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >> and run >> >> [0]PETSC ERROR: to get more information on the crash. >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Signal received >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for trouble shooting. >> >> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown >> >> [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named >> mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 >> >> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >> --with-fc=gfortran --download-mpich --download-fblaslapack >> --download-scalapack --download-mumps --download-superlu_dist >> --download-hypre --download-ml --download-metis --download-parmetis >> --download-triangle --download-chaco --with-debugging=0 >> >> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >> >> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >> >> >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 25 16:55:58 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 14:55:58 -0700 Subject: [petsc-users] handling multi physics applications on multiple MPI_Comm In-Reply-To: References: <7C6E1551-B9EF-46C0-B0D5-6FE1B52BC297@gmail.com> Message-ID: On Mon, Jul 25, 2016 at 2:30 PM, Manav Bhatia wrote: > > On Jul 25, 2016, at 3:43 PM, Matthew Knepley wrote: > > Yes. I think the confusion here is between the problem you are trying to > solve, and the tool for doing it. > > Disparate size of subsystems seems to me to be a _load balancing_ problem. > Here you can use data layout to alleviate this. > On the global comm, you can put all the fluid unknowns on ranks 0..N-2, > and the structural unknowns on N-1. You can have > more general splits than that. > > > Ok. So, if I do that, then there would still be one comm? If yes, then the > distribution would be by specifying the number of local fluid dofs on N-1 > to be zero? > Yes. If all you want is good load balance, I think this is the best way. Thanks, Matt > Sorry that this such is a basic question. > > > IF for some reason in the structural assembly you used a large number of > collective operations (like say did artificial timestepping > to get to some steady state property), then it might make sense to pull > out a subcomm of only the occupied ranks, but only above > 1000 procs, and only on a non-BlueGene machine. This is also easily > measure before you do this work. > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
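To make the zero-local-size idea concrete, a small sketch in which only the last rank owns structural unknowns while every object still lives on PETSC_COMM_WORLD (the global size of 500 is made up):

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec            xs;
  PetscMPIInt    rank, size;
  PetscInt       Ns = 500, ns_local;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;
  ierr = MPI_Comm_rank(PETSC_COMM_WORLD, &rank); CHKERRQ(ierr);
  ierr = MPI_Comm_size(PETSC_COMM_WORLD, &size); CHKERRQ(ierr);

  /* all structural rows on the last rank, zero local rows everywhere else */
  ns_local = (rank == size - 1) ? Ns : 0;

  ierr = VecCreate(PETSC_COMM_WORLD, &xs); CHKERRQ(ierr);
  ierr = VecSetSizes(xs, ns_local, Ns); CHKERRQ(ierr);
  ierr = VecSetFromOptions(xs); CHKERRQ(ierr);

  /* a matrix block is handled the same way:
     MatSetSizes(Ass, ns_local, ns_local, Ns, Ns) before setup/preallocation */

  ierr = VecDestroy(&xs); CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}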
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Jul 25 16:56:51 2016 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 25 Jul 2016 14:56:51 -0700 Subject: [petsc-users] KSPSolve() passes in the dbg mode, but failed in opt mode In-Reply-To: References: Message-ID: On Mon, Jul 25, 2016 at 2:39 PM, Xujun Zhao wrote: > Another interesting phenomenon is that it works for an iterative solver, > but only failed for direct solvers(both superLU_dist and mumps). If > something is not initialized correctly, why doesn't the iterative solver, > for example, GMRES, throw any errors? > It would of course depend on what you have not initialized, and what value was sitting in that place to begin with. Use valgrind to clear all this up. Matt > On Mon, Jul 25, 2016 at 11:50 AM, Matthew Knepley > wrote: > >> On Mon, Jul 25, 2016 at 9:17 AM, Xujun Zhao wrote: >> >>> Hi all, >>> >>> I am trying to solve my problem with a direct solver superLU_dist. >>> But the KSPSolve failed in the "opt" mode. I shifted to the "dbg" >>> version and wanted to see what error info I can get from the PETSc. >>> Surprisingly, it passed the solve and didn't output any errors in the "dbg" >>> version. Does anyone have the similar experience? and what type of >>> potential bugs it may have? >>> >> >> Debugging mode initializes all variables, but as Hong says, valgrind will >> warn you of uninitialized variables. >> >> Matt >> >> >>> >>> --->test in StokesSolver::solve(): Start the KSP solve... >>> >>> [0]PETSC ERROR: >>> ------------------------------------------------------------------------ >>> >>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, >>> probably memory access out of range >>> >>> [0]PETSC ERROR: Try option -start_in_debugger or >>> -on_error_attach_debugger >>> >>> [0]PETSC ERROR: or see >>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind >>> >>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac >>> OS X to find memory corruption errors >>> >>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, >>> and run >>> >>> [0]PETSC ERROR: to get more information on the crash. >>> >>> [0]PETSC ERROR: --------------------- Error Message >>> -------------------------------------------------------------- >>> >>> [0]PETSC ERROR: Signal received >>> >>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >>> for trouble shooting. >>> >>> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown >>> >>> [0]PETSC ERROR: ./example-dbg on a arch-darwin-c-opt named >>> mcswl091.mcs.anl.gov by xzhao Mon Jul 25 11:10:12 2016 >>> >>> [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ >>> --with-fc=gfortran --download-mpich --download-fblaslapack >>> --download-scalapack --download-mumps --download-superlu_dist >>> --download-hypre --download-ml --download-metis --download-parmetis >>> --download-triangle --download-chaco --with-debugging=0 >>> >>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file >>> >>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0 >>> >>> >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From loiseau.jc at gmail.com Tue Jul 26 10:27:02 2016 From: loiseau.jc at gmail.com (JC) Date: Tue, 26 Jul 2016 17:27:02 +0200 Subject: [petsc-users] Scheduled Relaxation Jacobi method Message-ID: <40FF0CE8-7589-4631-AB5B-0F4AF5205C99@gmail.com> Hej, I have been using a very simple Scheduled Relaxation Jacobi (SRJ) method for tests purpose in one of my code, and would now like to implement into the big version that uses PETSc. So far, I have figured that I can use the weighted Jacobi method since it is nothing but a Jacobi-preconditioned Richardson. While in the weighted Jacobi, the relaxation weight omega is fixed, in the SRJ method, the value of the relaxation factor depends on the grid size and the iteration. Is there any simple way to implement such a iteration-varying relaxation weight given that I have text files with their appropriate values? Thanks a lot. JC From aks084000 at utdallas.edu Tue Jul 26 20:00:13 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Wed, 27 Jul 2016 01:00:13 +0000 Subject: [petsc-users] Nested Fieldsplit for custom index sets Message-ID: Hello all, I would like to work out how to get nested fieldsplit to work correctly. I have a submatrix (labeled fieldsplit_P) that I would like to block precondition with sub-blocks A & B. To do this, I access the PC object within fieldsplit_P, and pass index sets corresponding to these sub-blocks (P_A_IS, P_B_IS) that tell how the matrix should be split. This is what I have: -------------------------------------------------------------------------------- KSP *ksp_all, ksp_P; PCFieldSplitGetSubKSP(pc, &i, &ksp_all); ksp_P = ksp_all[0]; PC pc_P; KSPGetPC(ksp_P, &pc_P); // This should extract the preconditioner for fieldsplit P PCFieldSplitSetIS(pc_P, "A", P_A_IS); PCFieldSplitSetIS(pc_P, "B", P_B_IS); -------------------------------------------------------------------------------- And these are the run-time arguments: -------------------------------------------------------------------------------- -pc_type fieldsplit -pc_fieldsplit_type multiplicative -fieldsplit_P_ksp_type gmres -fieldsplit_P_pc_type fieldsplit -fieldsplit_P_pc_fieldsplit_type multiplicative -fieldsplit_P_fieldsplit_A_ksp_type gmres -fieldsplit_P_fieldsplit_B_pc_type lu -fieldsplit_P_fieldsplit_B_ksp_type preonly -------------------------------------------------------------------------------- But that does not work: -------------------------------------------------------------------------------- [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Cannot locate function PCFieldSplitSetIS_C in object [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./main_2D on a x86_64 named artur-ubuntu by artur Tue Jul 26 18:55:29 2016 [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack [0]PETSC ERROR: #1 PCFieldSplitSetIS() line 1756 in /home/artur/Rorsrach/Packages/petsc-3.7.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: No support for this operation for this object type [0]PETSC ERROR: Cannot locate function PCFieldSplitSetIS_C in object [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 [0]PETSC ERROR: ./main_2D on a x86_64 named artur-ubuntu by artur Tue Jul 26 18:55:29 2016 [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack [0]PETSC ERROR: #2 PCFieldSplitSetIS() line 1756 in /home/artur/Rorsrach/Packages/petsc-3.7.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c -------------------------------------------------------------------------------- It seems that the preconditioner object for fieldsplit_P does not know that it is also of fieldsplit type. Does anyone have any idea of how I can specify the proper fieldsplit? Best, Artur -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Tue Jul 26 20:40:33 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 26 Jul 2016 21:40:33 -0400 Subject: [petsc-users] Scheduled Relaxation Jacobi method In-Reply-To: <40FF0CE8-7589-4631-AB5B-0F4AF5205C99@gmail.com> References: <40FF0CE8-7589-4631-AB5B-0F4AF5205C99@gmail.com> Message-ID: <3EFDE37C-513C-4894-9859-BA7DCA70A760@mcs.anl.gov> > On Jul 26, 2016, at 11:27 AM, JC wrote: > > Hej, > > I have been using a very simple Scheduled Relaxation Jacobi (SRJ) method for tests purpose in one of my code, and would now like to implement into the big version that uses PETSc. So far, I have figured that I can use the weighted Jacobi method since it is nothing but a Jacobi-preconditioned Richardson. While in the weighted Jacobi, the relaxation weight omega is fixed, in the SRJ method, the value of the relaxation factor depends on the grid size and the iteration. Is there any simple way to implement such a iteration-varying relaxation weight given that I have text files with their appropriate values? The dependence on grid size is easy. You just call KSPRichardsonSetScale(). By depends on the iteration do you mean the linear iteration, as in the first iteration you use .1 then in the second you use .2 etc? To do this use KSPSetMonitor() and have your monitor routine call KSPRichardsonSetScale() with the value you like which can depend on the iteration. Barry > > Thanks a lot. > JC From bsmith at mcs.anl.gov Tue Jul 26 20:54:44 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Tue, 26 Jul 2016 21:54:44 -0400 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: Message-ID: Do you have a call to KSPSetFromOptions() before the call PCFieldSplitGetSubKSP()? I am guessing not which means that the PC does not yet know that it is of type fieldplit. 
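Coming back to the SRJ question earlier in the thread, a rough sketch of the monitor approach Barry outlines (the registration routine is spelled KSPMonitorSet in recent releases; the omega table and its length stand in for the values read from JC's text files, and a new factor takes effect on the following sweep):

#include <petscksp.h>

typedef struct {
  PetscReal *omega;   /* relaxation schedule, filled from the text file */
  PetscInt   len;
} SRJCtx;

/* called once per Richardson iteration; updates the scale used afterwards */
static PetscErrorCode SRJMonitor(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
{
  SRJCtx        *srj = (SRJCtx*)ctx;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = KSPRichardsonSetScale(ksp, srj->omega[it % srj->len]); CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

In the setup code, after KSPSetType(ksp, KSPRICHARDSON) and PCSetType(pc, PCJACOBI), the monitor is attached with

  ierr = KSPMonitorSet(ksp, SRJMonitor, &srjctx, NULL); CHKERRQ(ierr);

where srjctx is an SRJCtx instance owned by the caller.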
Barry > On Jul 26, 2016, at 9:00 PM, Safin, Artur wrote: > > Hello all, > > I would like to work out how to get nested fieldsplit to work correctly. I have a submatrix (labeled fieldsplit_P) that I would like to block precondition with sub-blocks A & B. To do this, I access the PC object within fieldsplit_P, and pass index sets corresponding to these sub-blocks (P_A_IS, P_B_IS) that tell how the matrix should be split. This is what I have: > > -------------------------------------------------------------------------------- > KSP *ksp_all, ksp_P; > PCFieldSplitGetSubKSP(pc, &i, &ksp_all); > > ksp_P = ksp_all[0]; > PC pc_P; > KSPGetPC(ksp_P, &pc_P); // This should extract the preconditioner for fieldsplit P > PCFieldSplitSetIS(pc_P, "A", P_A_IS); > PCFieldSplitSetIS(pc_P, "B", P_B_IS); > -------------------------------------------------------------------------------- > > And these are the run-time arguments: > > -------------------------------------------------------------------------------- > -pc_type fieldsplit > -pc_fieldsplit_type multiplicative > > -fieldsplit_P_ksp_type gmres > -fieldsplit_P_pc_type fieldsplit > -fieldsplit_P_pc_fieldsplit_type multiplicative > > -fieldsplit_P_fieldsplit_A_ksp_type gmres > -fieldsplit_P_fieldsplit_B_pc_type lu > -fieldsplit_P_fieldsplit_B_ksp_type preonly > -------------------------------------------------------------------------------- > > But that does not work: > > -------------------------------------------------------------------------------- > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Cannot locate function PCFieldSplitSetIS_C in object > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./main_2D on a x86_64 named artur-ubuntu by artur Tue Jul 26 18:55:29 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #1 PCFieldSplitSetIS() line 1756 in /home/artur/Rorsrach/Packages/petsc-3.7.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Cannot locate function PCFieldSplitSetIS_C in object > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, Jun, 05, 2016 > [0]PETSC ERROR: ./main_2D on a x86_64 named artur-ubuntu by artur Tue Jul 26 18:55:29 2016 > [0]PETSC ERROR: Configure options --with-scalar-type=complex --with-mpi=1 --with-clanguage=c++ --with-cc=mpicc --with-fc=gfortran --with-cxx=mpic++ --with-fc=mpif90 --download-mumps --download-scalapack > [0]PETSC ERROR: #2 PCFieldSplitSetIS() line 1756 in /home/artur/Rorsrach/Packages/petsc-3.7.2/src/ksp/pc/impls/fieldsplit/fieldsplit.c > -------------------------------------------------------------------------------- > > It seems that the preconditioner object for fieldsplit_P does not know that it is also of fieldsplit type. Does anyone have any idea of how I can specify the proper fieldsplit? 
> > Best, > > Artur From aks084000 at utdallas.edu Tue Jul 26 21:54:22 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Wed, 27 Jul 2016 02:54:22 +0000 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: , Message-ID: Barry, > Do you have a call to KSPSetFromOptions() before the call PCFieldSplitGetSubKSP()? I am guessing not which means that the PC does not yet know that it is of type fieldplit. Yes, I call KSPSetFromOptions() for the global matrix at the beginning of the code. Should I also do it for the ksp I obtain from PCFieldSplitGetSubKSP()? The program has no problem doing fieldsplit for the global matrix; my issue is that I cannot get it to recognize a fieldsplit within a fieldsplit. This is the whole code for the solver: -------------------------------------------------------------------------------------------------------------------------------- KSP ksp; KSPCreate(mpi_communicator, &ksp); KSPSetType(ksp, KSPGMRES); KSPSetOperators(ksp, A_petsc, A_petsc); KSPSetFromOptions(ksp); PC pc; KSPGetPC(ksp, &pc); // Define the fieldsplit for the global matrix PCFieldSplitSetIS(pc, "P", P_IS); PCFieldSplitSetIS(pc, "T", T_IS); // fieldsplit for submatrix P: KSP *ksp_all, ksp_P; PCFieldSplitGetSubKSP(pc, &i, &ksp_all); ksp_P = ksp_all[0]; PC pc_P; KSPGetPC(ksp_P, &pc_P); // This should be the preconditioner for fieldsplit P PCFieldSplitSetIS(pc_P, "A", P_A_IS); PCFieldSplitSetIS(pc_P, "B", P_B_IS); KSPSolve(ksp, b_petsc, u_petsc); -------------------------------------------------------------------------------------------------------------------------------- Thanks, Artur From lawrence.mitchell at imperial.ac.uk Wed Jul 27 03:00:32 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Wed, 27 Jul 2016 09:00:32 +0100 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: Message-ID: > On 27 Jul 2016, at 03:54, Safin, Artur wrote: > > Barry, > >> Do you have a call to KSPSetFromOptions() before the call PCFieldSplitGetSubKSP()? I am guessing not which means that the PC does not yet know that it is of type fieldplit. > > Yes, I call KSPSetFromOptions() for the global matrix at the beginning of the code. Should I also do it for the ksp I obtain from PCFieldSplitGetSubKSP()? > > The program has no problem doing fieldsplit for the global matrix; my issue is that I cannot get it to recognize a fieldsplit within a fieldsplit. I think the SubKSPs (and therefore SubPCs) are not set up until you call KSPSetUp(ksp) which your code does not do explicitly and is therefore done in KSPSolve. Cheers, Lawrence -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: Message signed with OpenPGP using GPGMail URL: From kandanovian at gmail.com Wed Jul 27 06:59:05 2016 From: kandanovian at gmail.com (Tim Steinhoff) Date: Wed, 27 Jul 2016 13:59:05 +0200 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc Message-ID: Hi all, we coupled PETSc with our fortran code. Is there any way to let PETSc (PetscInitialize) ignore all arguments passed by the command line? Since our code is controlled by command line arguements as well, it leads to a mess, when those arguments are read twice. 
Thanks and kind regards, Volker From knepley at gmail.com Wed Jul 27 09:04:52 2016 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 27 Jul 2016 07:04:52 -0700 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: Message-ID: On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff wrote: > Hi all, > > we coupled PETSc with our fortran code. Is there any way to let PETSc > (PetscInitialize) ignore all arguments passed by the command line? > Since our code is controlled by command line arguements as well, it > leads to a mess, when those arguments are read twice. > 1) You can use PetscInitializeNoArguments() 2) What goes wrong? PETSc should just ignore any options it does not recognize. Thanks, Matt > Thanks and kind regards, > > Volker > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From kandanovian at gmail.com Wed Jul 27 09:55:42 2016 From: kandanovian at gmail.com (Tim Steinhoff) Date: Wed, 27 Jul 2016 16:55:42 +0200 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: Message-ID: 2016-07-27 16:04 GMT+02:00 Matthew Knepley : > On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff > wrote: >> >> Hi all, >> >> we coupled PETSc with our fortran code. Is there any way to let PETSc >> (PetscInitialize) ignore all arguments passed by the command line? >> Since our code is controlled by command line arguements as well, it >> leads to a mess, when those arguments are read twice. > > > 1) You can use PetscInitializeNoArguments() Thanks! I thought that function was for C/C++ only. > > 2) What goes wrong? PETSc should just ignore any options it does not > recognize. The problem is that our code uses the same or similar argument names as PETSc does and our end user should not have access to all petsc options. > > Thanks, > > Matt > >> >> Thanks and kind regards, >> >> Volker > > > > > -- > What most experimenters take for granted before they begin their experiments > is infinitely more interesting than any results to which their experiments > lead. > -- Norbert Wiener From bsmith at mcs.anl.gov Wed Jul 27 11:09:14 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Jul 2016 12:09:14 -0400 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: Message-ID: Please send a complete code that you think should work that doesn't so we can understand the exact issue. > On Jul 26, 2016, at 10:54 PM, Safin, Artur wrote: > > Barry, > >> Do you have a call to KSPSetFromOptions() before the call PCFieldSplitGetSubKSP()? I am guessing not which means that the PC does not yet know that it is of type fieldplit. > > Yes, I call KSPSetFromOptions() for the global matrix at the beginning of the code. Should I also do it for the ksp I obtain from PCFieldSplitGetSubKSP()? > > The program has no problem doing fieldsplit for the global matrix; my issue is that I cannot get it to recognize a fieldsplit within a fieldsplit. 
> > This is the whole code for the solver: > > -------------------------------------------------------------------------------------------------------------------------------- > KSP ksp; > KSPCreate(mpi_communicator, &ksp); > KSPSetType(ksp, KSPGMRES); > KSPSetOperators(ksp, A_petsc, A_petsc); > KSPSetFromOptions(ksp); > > PC pc; > KSPGetPC(ksp, &pc); > > // Define the fieldsplit for the global matrix > PCFieldSplitSetIS(pc, "P", P_IS); > PCFieldSplitSetIS(pc, "T", T_IS); > > // fieldsplit for submatrix P: > KSP *ksp_all, ksp_P; > PCFieldSplitGetSubKSP(pc, &i, &ksp_all); > > ksp_P = ksp_all[0]; > PC pc_P; > KSPGetPC(ksp_P, &pc_P); // This should be the preconditioner for fieldsplit P > PCFieldSplitSetIS(pc_P, "A", P_A_IS); > PCFieldSplitSetIS(pc_P, "B", P_B_IS); > > KSPSolve(ksp, b_petsc, u_petsc); > -------------------------------------------------------------------------------------------------------------------------------- > > Thanks, > > Artur > From bsmith at mcs.anl.gov Wed Jul 27 14:42:21 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Wed, 27 Jul 2016 15:42:21 -0400 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: Message-ID: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> Actually there is currently no way to PetscInitialize from Fortran without adding the command line options to the database. In the middle of petscinitialize_() is the code fragment PETScParseFortranArgs_Private(&PetscGlobalArgc,&PetscGlobalArgs); FIXCHAR(filename,len,t1); *ierr = PetscOptionsInsert(NULL,&PetscGlobalArgc,&PetscGlobalArgs,t1); We'll need to do a bit of code refactoring to provide a Fortran petscinitializenoarguments_(). The simplest way to refactor would be to change the name of petscinitialize_ to say PetscInitializeFortran_Internal() and add a bool argument whether to process the arguments and then write two trivial routines petscinitialize_ that calls the new routine with PETSC_TRUE and petscinitializenoarguments_() that calls it with PETSC_FALSE. Barry Of course you can have a C/C++ main routine that calls PetscInitializeNoArguments(); followed by PetscInitializeFortran() and then have the bulk of your code in Fortran. > On Jul 27, 2016, at 10:55 AM, Tim Steinhoff wrote: > > 2016-07-27 16:04 GMT+02:00 Matthew Knepley : >> On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff >> wrote: >>> >>> Hi all, >>> >>> we coupled PETSc with our fortran code. Is there any way to let PETSc >>> (PetscInitialize) ignore all arguments passed by the command line? >>> Since our code is controlled by command line arguements as well, it >>> leads to a mess, when those arguments are read twice. >> >> >> 1) You can use PetscInitializeNoArguments() > > Thanks! I thought that function was for C/C++ only. > >> >> 2) What goes wrong? PETSc should just ignore any options it does not >> recognize. > > > The problem is that our code uses the same or similar argument names > as PETSc does and our end user should not have access to all petsc > options. > > >> >> Thanks, >> >> Matt >> >>> >>> Thanks and kind regards, >>> >>> Volker >> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments >> is infinitely more interesting than any results to which their experiments >> lead. 
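For reference, a minimal sketch of the C driver Barry mentions just above as the current workaround; the name of the Fortran entry point (and its trailing-underscore mangling) is only a guess and would be whatever the application actually provides:

#include <petscsys.h>

/* hypothetical Fortran entry point, e.g. "subroutine mymain()" in the existing
   code, minus its own call to PetscInitialize */
extern void mymain_(void);

int main(int argc, char **argv)
{
  PetscErrorCode ierr;

  /* argc/argv are never handed to PETSc, so the options database stays empty */
  ierr = PetscInitializeNoArguments(); if (ierr) return ierr;
  /* sets up the default communicators, null objects etc. for the Fortran side */
  ierr = PetscInitializeFortran(); if (ierr) return ierr;

  mymain_();

  ierr = PetscFinalize();
  return ierr;
}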
>> -- Norbert Wiener From epscodes at gmail.com Wed Jul 27 16:42:20 2016 From: epscodes at gmail.com (Xiangdong) Date: Wed, 27 Jul 2016 17:42:20 -0400 Subject: [petsc-users] vec norm for local portion of a vector Message-ID: Hello everyone, I have a global dmda vector vg. On each processor, if I want to know the norm of local portion of vg, which function should I call? So far I am thinking of using DMDAVecGetArray and then write a loop to compute the norm of this local array. Is there a simple function available to call? like *vg->ops->norm_local(vg,NORM_2, &normlocal)? Thanks. Best, Xiangdong -------------- next part -------------- An HTML attachment was scrubbed... URL: From aks084000 at utdallas.edu Wed Jul 27 20:20:41 2016 From: aks084000 at utdallas.edu (Safin, Artur) Date: Thu, 28 Jul 2016 01:20:41 +0000 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: References: , Message-ID: <0B3B3C93-5C07-4E07-A37E-DEBA9577D3EE@utdallas.edu> Barry, Lawrence, > I think the SubKSPs (and therefore SubPCs) are not set up until you call KSPSetUp(ksp) which your code does not do explicitly and is therefore done in KSPSolve. I added KSPSetUp(), but unfortunately the issue did not go away. I have created a MWE that replicates the issue. The program tries to solve a tridiagonal system, where the first fieldsplit partitions the global matrix [ P x ] [ x T ], and the nested fieldsplit partitions P into [ A x ] [ x B ]. Thanks for your help, Artur -------------- next part -------------- A non-text attachment was scrubbed... Name: ex.c Type: text/x-csrc Size: 2395 bytes Desc: ex.c URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: run.sh Type: application/x-shellscript Size: 351 bytes Desc: run.sh URL: From kandanovian at gmail.com Thu Jul 28 02:35:00 2016 From: kandanovian at gmail.com (Tim Steinhoff) Date: Thu, 28 Jul 2016 09:35:00 +0200 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> References: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> Message-ID: 2016-07-27 21:42 GMT+02:00 Barry Smith : > > Actually there is currently no way to PetscInitialize from Fortran without adding the command line options to the database. In the middle > of petscinitialize_() is the code fragment > > PETScParseFortranArgs_Private(&PetscGlobalArgc,&PetscGlobalArgs); > FIXCHAR(filename,len,t1); > *ierr = PetscOptionsInsert(NULL,&PetscGlobalArgc,&PetscGlobalArgs,t1); > > We'll need to do a bit of code refactoring to provide a Fortran petscinitializenoarguments_(). The simplest way to refactor would be to change the name of petscinitialize_ to say PetscInitializeFortran_Internal() and add a bool argument whether to process the arguments and then write two trivial routines petscinitialize_ that calls the new routine with PETSC_TRUE and petscinitializenoarguments_() that calls it with PETSC_FALSE. Thanks Barry. It would be really nice if PETSc comes with that feature in future, because I would prefer not to make any changes to the PETSc code that disappear with every new PETSc release. > > Barry > > Of course you can have a C/C++ main routine that calls PetscInitializeNoArguments(); followed by PetscInitializeFortran() and then have the bulk of your code in Fortran. That would work, but we have a rather large fortran code without any C. So, for now we will probably stick to your first approach and keep our code fotran only. 
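On Xiangdong's question above about the norm of only the locally owned part of a vector: I am not aware of a public single call for it, but a short loop over the local array is enough. A sketch (PetscAbsScalar keeps it valid for both real and complex builds; for a DMDA global vector this covers exactly the owned entries, no ghost points):

#include <petscvec.h>

/* 2-norm of the locally owned entries of a (possibly parallel) vector */
PetscErrorCode VecLocalNorm2(Vec vg, PetscReal *normlocal)
{
  const PetscScalar *a;
  PetscInt           i, nloc;
  PetscReal          sum = 0.0;
  PetscErrorCode     ierr;

  PetscFunctionBegin;
  ierr = VecGetLocalSize(vg, &nloc); CHKERRQ(ierr);
  ierr = VecGetArrayRead(vg, &a); CHKERRQ(ierr);
  for (i = 0; i < nloc; i++) sum += PetscAbsScalar(a[i]) * PetscAbsScalar(a[i]);
  ierr = VecRestoreArrayRead(vg, &a); CHKERRQ(ierr);
  *normlocal = PetscSqrtReal(sum);
  PetscFunctionReturn(0);
}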
Thanks again, Volker > > >> On Jul 27, 2016, at 10:55 AM, Tim Steinhoff wrote: >> >> 2016-07-27 16:04 GMT+02:00 Matthew Knepley : >>> On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff >>> wrote: >>>> >>>> Hi all, >>>> >>>> we coupled PETSc with our fortran code. Is there any way to let PETSc >>>> (PetscInitialize) ignore all arguments passed by the command line? >>>> Since our code is controlled by command line arguements as well, it >>>> leads to a mess, when those arguments are read twice. >>> >>> >>> 1) You can use PetscInitializeNoArguments() >> >> Thanks! I thought that function was for C/C++ only. >> >>> >>> 2) What goes wrong? PETSc should just ignore any options it does not >>> recognize. >> >> >> The problem is that our code uses the same or similar argument names >> as PETSc does and our end user should not have access to all petsc >> options. >> >> >>> >>> Thanks, >>> >>> Matt >>> >>>> >>>> Thanks and kind regards, >>>> >>>> Volker >>> >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their experiments >>> is infinitely more interesting than any results to which their experiments >>> lead. >>> -- Norbert Wiener > From lawrence.mitchell at imperial.ac.uk Thu Jul 28 03:35:30 2016 From: lawrence.mitchell at imperial.ac.uk (Lawrence Mitchell) Date: Thu, 28 Jul 2016 09:35:30 +0100 Subject: [petsc-users] Nested Fieldsplit for custom index sets In-Reply-To: <0B3B3C93-5C07-4E07-A37E-DEBA9577D3EE@utdallas.edu> References: <0B3B3C93-5C07-4E07-A37E-DEBA9577D3EE@utdallas.edu> Message-ID: <5799C3D2.8000407@imperial.ac.uk> Dear Artur, On 28/07/16 02:20, Safin, Artur wrote: > Barry, Lawrence, > >> I think the SubKSPs (and therefore SubPCs) are not set up until you call KSPSetUp(ksp) which your code does not do explicitly and is therefore done in KSPSolve. > > I added KSPSetUp(), but unfortunately the issue did not go away. > > > > I have created a MWE that replicates the issue. The program tries to solve a tridiagonal system, where the first fieldsplit partitions the global matrix > > [ P x ] > [ x T ], > > and the nested fieldsplit partitions P into > > [ A x ] > [ x B ]. Two things: 1. Always check the return value from all PETSc calls. This will normally give you a very useful backtrace when something goes wrong. That is, annotate all your calls with: PetscErrorCode ierr; ierr = SomePetscFunction(...); CHKERRQ(ierr); If I do this, I see that the call to KSPSetUp fails: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Petsc has generated inconsistent data [0]PETSC ERROR: Unhandled case, must have at least two fields, not 1 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [0]PETSC ERROR: Petsc Development GIT revision: v3.7.2-931-g1e46b98 GIT Date: 2016-07-06 16:57:50 -0500 ... [0]PETSC ERROR: #1 PCFieldSplitSetDefaults() line 470 in /data/lmitche1/src/deps/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: #2 PCSetUp_FieldSplit() line 487 in /data/lmitche1/src/deps/petsc/src/ksp/pc/impls/fieldsplit/fieldsplit.c [0]PETSC ERROR: #3 PCSetUp() line 968 in /data/lmitche1/src/deps/petsc/src/ksp/pc/interface/precon.c [0]PETSC ERROR: #4 KSPSetUp() line 393 in /data/lmitche1/src/deps/petsc/src/ksp/ksp/interface/itfunc.c [0]PETSC ERROR: #5 main() line 65 in /homes/lmitche1/tmp/ex.c The reason is you need to call KSPSetUp *after* setting the outermost fieldsplit ISes. If I move the call to KSPSetUp, then things seem to work. 
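Condensing Lawrence's two points into one fragment, with the variable names from Artur's MWE: check every return code, and call KSPSetUp() after the outer index sets are set but before touching the sub-KSPs. A sketch assuming A, b, x and the four index sets already exist:

KSP            ksp, *subksp;
PC             pc, pc_P;
PetscInt       nsplits;
PetscErrorCode ierr;

ierr = KSPCreate(PETSC_COMM_WORLD, &ksp); CHKERRQ(ierr);
ierr = KSPSetOperators(ksp, A, A); CHKERRQ(ierr);
ierr = KSPSetFromOptions(ksp); CHKERRQ(ierr);           /* -pc_type fieldsplit read here */

ierr = KSPGetPC(ksp, &pc); CHKERRQ(ierr);
ierr = PCFieldSplitSetIS(pc, "P", P_IS); CHKERRQ(ierr); /* outer split first */
ierr = PCFieldSplitSetIS(pc, "T", T_IS); CHKERRQ(ierr);

ierr = KSPSetUp(ksp); CHKERRQ(ierr);                    /* after the outer ISes: creates the sub-KSPs */

ierr = PCFieldSplitGetSubKSP(pc, &nsplits, &subksp); CHKERRQ(ierr);
ierr = KSPGetPC(subksp[0], &pc_P); CHKERRQ(ierr);       /* PC of the "P" split */
ierr = PCFieldSplitSetIS(pc_P, "A", P_A_IS); CHKERRQ(ierr);
ierr = PCFieldSplitSetIS(pc_P, "B", P_B_IS); CHKERRQ(ierr);

ierr = KSPSolve(ksp, b, x); CHKERRQ(ierr);
ierr = PetscFree(subksp); CHKERRQ(ierr);                /* array returned by PCFieldSplitGetSubKSP */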
I've attached the working code. Cheers, Lawrence $ cat options.txt -pc_type fieldsplit -pc_fieldsplit_type multiplicative -fieldsplit_T_ksp_type bcgs -fieldsplit_P_ksp_type gmres -fieldsplit_P_pc_type fieldsplit -fieldsplit_P_pc_fieldsplit_type multiplicative -fieldsplit_P_fieldsplit_A_ksp_type gmres -fieldsplit_P_fieldsplit_B_pc_type lu -fieldsplit_P_fieldsplit_B_ksp_type preonly -ksp_converged_reason -ksp_monitor_true_residual -ksp_view $ ./ex -options_file options.txt 0 KSP preconditioned resid norm 5.774607007892e+00 true resid norm 1.414213562373e+00 ||r(i)||/||b|| 1.000000000000e+00 1 KSP preconditioned resid norm 1.921795888956e-01 true resid norm 4.802975385197e-02 ||r(i)||/||b|| 3.396216464745e-02 2 KSP preconditioned resid norm 1.436304589027e-12 true resid norm 2.435255920058e-13 ||r(i)||/||b|| 1.721985974998e-13 Linear solve converged due to CONVERGED_RTOL iterations 2 KSP Object: 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (fieldsplit_P_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_P_) 1 MPI processes type: fieldsplit FieldSplit with MULTIPLICATIVE composition: total splits = 2 Solver info for each split is in the following KSP objects: Split number 0 Defined by IS KSP Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: gmres GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement GMRES: happy breakdown tolerance 1e-30 maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=25, cols=25 package used to perform factorization: petsc total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_fieldsplit_A_) 1 MPI processes type: seqaij rows=25, cols=25 total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Defined by IS KSP Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: lu LU: out-of-place factorization tolerance for zero pivot 2.22045e-14 matrix ordering: nd factor fill ratio given 5., needed 1.43836 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=25, cols=25 package used to perform factorization: petsc total: nonzeros=105, allocated nonzeros=105 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_fieldsplit_B_) 1 MPI processes type: seqaij rows=25, cols=25 total: nonzeros=73, allocated nonzeros=73 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_P_) 1 MPI processes type: seqaij rows=50, cols=50 total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines Split number 1 Defined by IS KSP Object: (fieldsplit_T_) 1 MPI processes type: bcgs maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (fieldsplit_T_) 1 MPI processes type: ilu ILU: out-of-place factorization 0 levels of fill tolerance for zero pivot 2.22045e-14 matrix ordering: natural factor fill ratio given 1., needed 1. Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=50, cols=50 package used to perform factorization: petsc total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: (fieldsplit_T_) 1 MPI processes type: seqaij rows=50, cols=50 total: nonzeros=148, allocated nonzeros=148 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=100, cols=100 total: nonzeros=298, allocated nonzeros=500 total number of mallocs used during MatSetValues calls =0 not using I-node routines -------------- next part -------------- A non-text attachment was scrubbed... Name: ex.c Type: text/x-csrc Size: 2885 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 490 bytes Desc: OpenPGP digital signature URL: From C.Klaij at marin.nl Thu Jul 28 03:38:54 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 28 Jul 2016 08:38:54 +0000 Subject: [petsc-users] block matrix without MatCreateNest Message-ID: <1469695134232.97712@marin.nl> I'm trying to understand how to assemble a block matrix in a format-independent manner, so that I can switch between types mpiaij and matnest. The manual states that the key to format-independent assembly is to use MatGetLocalSubMatrix. So, in the code below, I'm using this to assemble a 3-by-3 block matrix A and setting the diagonal of block A02. This seems to work for type mpiaij, but not for type matnest. What am I missing? Chris $ cat mattry.F90 program mattry use petscksp implicit none #include PetscInt :: n=4 ! 
setting 4 cells per process PetscErrorCode :: ierr PetscInt :: size,rank,i Mat :: A,A02 IS :: isg0,isg1,isg2 IS :: isl0,isl1,isl2 ISLocalToGlobalMapping :: map integer, allocatable, dimension(:) :: idx call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) ! local index sets for 3 fields allocate(idx(n)) idx=(/ (i-1, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! global index sets for 3 fields allocate(idx(n)) idx=(/ (i-1+rank*3*n, i=1,n) /) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); CHKERRQ(ierr) call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); CHKERRQ(ierr) ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! local-to-global mapping allocate(idx(3*n)) idx=(/ (i-1+rank*3*n, i=1,3*n) /) call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) deallocate(idx) ! create the 3-by-3 block matrix call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) ! call MatSetType(A,MATNEST,ierr); CHKERRQ(ierr) call MatSetUp(A,ierr); CHKERRQ(ierr) call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) call MatSetFromOptions(A,ierr); CHKERRQ(ierr) ! set diagonal of block A02 to 0.65 call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) do i=1,n call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) end do call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) ! verify call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) call PetscFinalize(ierr) end program mattry $ mpiexec -n 2 ./mattry -A_mat_type mpiaij Mat Object: 2 MPI processes type: mpiaij row 0: (0, 0.65) row 1: (1, 0.65) row 2: (2, 0.65) row 3: (3, 0.65) row 4: (4, 0.65) row 5: (5, 0.65) row 6: (6, 0.65) row 7: (7, 0.65) $ mpiexec -n 2 ./mattry -A_mat_type nest [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Null argument, when expecting valid pointer [0]PETSC ERROR: Null Pointer: Parameter # 3 [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [0]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [1]PETSC ERROR: Null argument, when expecting valid pointer [1]PETSC ERROR: Null Pointer: Parameter # 3 [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 [1]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [1]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c [0]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c -------------------------------------------------------------------------- MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD with errorcode 85. NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them. -------------------------------------------------------------------------- [lin0322.marin.local:11985] 1 more process has sent help message help-mpi-api.txt / mpi-abort [lin0322.marin.local:11985] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages $ dr. ir. Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm From loiseau.jc at gmail.com Thu Jul 28 11:07:43 2016 From: loiseau.jc at gmail.com (JC) Date: Thu, 28 Jul 2016 18:07:43 +0200 Subject: [petsc-users] Comprehensive example for KSPRegister() Message-ID: Hey everyone, I was wondering if any of you had a comprehensive example of KSPRegister() to create our own KSP solver in fortran? I have tried to look online but have not been able to find it. 
Thanks a lot, JC From knepley at gmail.com Thu Jul 28 12:22:41 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Jul 2016 10:22:41 -0700 Subject: [petsc-users] Comprehensive example for KSPRegister() In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 9:07 AM, JC wrote: > Hey everyone, > > I was wondering if any of you had a comprehensive example of KSPRegister() > to create our own KSP solver in fortran? > I have tried to look online but have not been able to find it. > We do not currently have one in Fortran, although we do in C. Thanks, Matt > Thanks a lot, > JC -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From valeria.mele at unina.it Thu Jul 28 12:48:47 2016 From: valeria.mele at unina.it (Valeria Mele) Date: Thu, 28 Jul 2016 12:48:47 -0500 Subject: [petsc-users] PCMG with matrix-free operators accessing DMDA Message-ID: Hi everyone, this time I am using PETSc to do something that is more complicated than my usual and I want to do it at the highest possible abstraction level. To put it in a nutshell, my intent is to build a parallel multigrid to solve a linear system via DM, KSP and PCMG (I would like to use DMMG but probably I should have the same problems or more). I created the distributed object, *da*, with DMDACreate3d, even if it is distributed (as yet) only in the x-dimension and has 3 dof. Then I create the KSP (type KSPRICHARDSON) and set the nonzero initial guess and PCMG as preconditioner. Here I start to tune the MG. The point is that I need to define all the operators as matrix-free, since they will do several operations on *x* to obtain *y*, and I am not familiar with the way to access all the elements or informations in the two levels involved and/or among the processors with a so-high level interface. So please (please please please please) tell me if I correctly understand the mechanism or I am on the wrong way and clear my doubts. That is, let's say that my operation for the shell are: - *A_mult(Mat mat,Vec x, Vec y) *//coefficients matrix in this case the level is only one but should I write it taking into account only the local data (I think so) and accessing them via the informations in *da*? For example, if I use DMDAVecGetArray, DMDAVecGetCorners (or DMDAGetGhostCorners) and DMDAVecRestoreArray, will they retrieve informations from the right level each time (I am pretty sure that in some official examples it is done in this way)? Or should I handle just *Vec*s as local structures with their usual indices (through VecGetArray and VecRestoreArray)? - *P_mult(Mat mat,Vec x, Vec y) *//interpolation matrix that is NOT conceptually the traspose of Restriction in this case *x* and *y* will be from two different levels (respectively L and L+1), so, if I retrieve informations from the *da*... how can I access the two at different levels? I am sorry if it seems that they are trivial questions, and I will be grateful to anyone will help me. Thanks a lot, Valeria --------------------------------------------------------------------------------------------- PhD Valeria Mele University of Naples Federico II Department of Mathematics and Applications "R. Caccioppoli" Complesso Universitario M.S. 
Angelo, Via Cinthia 80126 Naples --------------------------------------------------------------------------------------------- -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Thu Jul 28 13:31:07 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 28 Jul 2016 11:31:07 -0700 Subject: [petsc-users] Implementing discontinuous Galerkin FEM? Message-ID: I am trying to implement a discontinuous Galerkin discretization using the PETSc DM features to handle most of the topology/geometry specific functions. However, I'm not really sure which direction to approach this from since DG is kind of a middle ground between finite volume and traditional continuous Galerkin finite element methods. It appears to me that if I want to implement a nodal DG method, then it would be more practical to extend the PetscFE interface, but for a modal DG method perhaps the PetscFV interface is better? There are still a few questions that I don't know the answers to, though. Questions about implementing nodal DG: 1. Does PetscFE support sub/super parametric element types? If so, how do I express the internal node structure for a nodal DG method (say, for example located at the abscissa of a Gauss-Lobatto quadrature scheme)? 2. How would I go about making the dataset stored discontinuous between neighboring elements (specifically at shared nodes for a nodal DG method)? 3. Similar to 2, how would I handle boundary conditions? Specifically, I need a layer of data space of just the boundary nodes (not a complete "ghost" element), and these are the actual constrained points. Questions about implementing modal DG: A. What does specifying the quadrature object for a PetscFV object actually do? Is it purely a surface flux integration quadrature? How does the quadrature object handle simplex-type elements in 2D/3D? B. How would I go about modifying the limiters to take into account these multiple modes? -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Thu Jul 28 17:43:24 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 28 Jul 2016 15:43:24 -0700 Subject: [petsc-users] How to create a quadrature object? Message-ID: I am trying to create a very simple quadrature object, but for some reason PETSc keeps giving me an "invalid argument" error. Relevant code: #include > int main(int argc, char** argv) > { > CHKERRQ(PetscInitialize(&argc, &argv, nullptr, "quadrature testing")); > PetscQuadrature quad; > CHKERRQ(PetscQuadratureCreate(PETSC_COMM_SELF, &quad)); > CHKERRQ(PetscQuadratureDestroy(&quad)); CHKERRQ(PetscFinalize()); > } Error message: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------- > ------------------ > > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Invalid object classid 0 > This could happen if you compile with PETSC_HAVE_DYNAMIC_LIBRARIES, but > link with static libraries. > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for troubleshooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3 > -march=native" CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 > -march=native" --prefix=/usr/local > [0]PETSC ERROR: #1 PetscClassRegLogGetClass() line 290 in > /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c > [0]PETSC ERROR: #2 PetscLogObjCreateDefault() line 317 in > /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c > [0]PETSC ERROR: #3 PetscQuadratureCreate() line 54 in > /home/andrew/tools/petsc/petsc/src/dm/dt/interface/dt.c > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Invalid argument > [0]PETSC ERROR: Invalid object classid 0 > This could happen if you compile with PETSC_HAVE_DYNAMIC_LIBRARIES, but > link with static libraries. > > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for troubleshooting. > [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown > [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3 > -march=native" CXXOPTFLAGS=" > -O3 -march=native" FOPTFLAGS="-O3 -march=native" --prefix=/usr/local > [0]PETSC ERROR: #4 PetscClassRegLogGetClass() line 290 in > /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c > > [0]PETSC ERROR: #5 PetscLogObjCreateDefault() line 317 in > /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c > > [0]PETSC ERROR: #6 PetscQuadratureCreate() line 54 in > /home/andrew/tools/petsc/petsc/src/dm/dt/interface/dt.c > > [0]PETSC ERROR: #7 main() line 5 in /home/user/tests/pquad.cpp > [0]PETSC ERROR: No PETSc Option Table entries > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- I've had no problems using other parts of PETSc with the exact same build options/install (SNES, linear solvers), so I suspect this is likely just user error. I have tried building PETSc both as a shared and static library, and both methods fail in the same way. As a related question, what is the "Order" of quadrature object suppose to be? The documentation for "PetscQuadratureGetOrder" and "PetscQuadratureSetOrder" says this is the highest degree polynomial that is exactly integrated. However, when I looked at the source code for "PetscDTGaussTensorQuadrature", this appears to set the order to the number of quadrature points, which for Gaussian quadrature means it can integrate a 2*npoints-1 polynomial exactly. If I wanted to implement my own quadrature scheme (Gauss-Lobatto) which is exact for polynomials up to 2*npoints-3, what should I set the quadrature order to? -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Jul 28 18:35:33 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Jul 2016 16:35:33 -0700 Subject: [petsc-users] How to create a quadrature object? In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 3:43 PM, Andrew Ho wrote: > I am trying to create a very simple quadrature object, but for some reason > PETSc keeps giving me an "invalid argument" error. 
> > Relevant code: > > #include >> int main(int argc, char** argv) >> { >> CHKERRQ(PetscInitialize(&argc, &argv, nullptr, "quadrature testing")); >> PetscQuadrature quad; >> CHKERRQ(PetscQuadratureCreate(PETSC_COMM_SELF, &quad)); >> > CHKERRQ(PetscQuadratureDestroy(&quad)); > > CHKERRQ(PetscFinalize()); >> } > > > Error message: > >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------- >> ------------------ >> >> [0]PETSC ERROR: Invalid argument >> [0]PETSC ERROR: Invalid object classid 0 >> This could happen if you compile with PETSC_HAVE_DYNAMIC_LIBRARIES, but >> link with static libraries. >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for troubleshooting. >> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown >> [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3 >> -march=native" CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 >> -march=native" --prefix=/usr/local >> [0]PETSC ERROR: #1 PetscClassRegLogGetClass() line 290 in >> /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c >> [0]PETSC ERROR: #2 PetscLogObjCreateDefault() line 317 in >> /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c >> [0]PETSC ERROR: #3 PetscQuadratureCreate() line 54 in >> /home/andrew/tools/petsc/petsc/src/dm/dt/interface/dt.c >> >> [0]PETSC ERROR: --------------------- Error Message >> -------------------------------------------------------------- >> >> [0]PETSC ERROR: Invalid argument >> [0]PETSC ERROR: Invalid object classid 0 >> This could happen if you compile with PETSC_HAVE_DYNAMIC_LIBRARIES, but >> link with static libraries. >> >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html >> for troubleshooting. >> [0]PETSC ERROR: Petsc Release Version 3.7.2, unknown >> [0]PETSC ERROR: Configure options --with-debugging=0 COPTFLAGS="-O3 >> -march=native" CXXOPTFLAGS=" >> -O3 -march=native" FOPTFLAGS="-O3 -march=native" --prefix=/usr/local >> [0]PETSC ERROR: #4 PetscClassRegLogGetClass() line 290 in >> /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c >> >> [0]PETSC ERROR: #5 PetscLogObjCreateDefault() line 317 in >> /home/andrew/tools/petsc/petsc/src/sys/logging/utils/classlog.c >> >> [0]PETSC ERROR: #6 PetscQuadratureCreate() line 54 in >> /home/andrew/tools/petsc/petsc/src/dm/dt/interface/dt.c >> >> [0]PETSC ERROR: #7 main() line 5 in /home/user/tests/pquad.cpp >> [0]PETSC ERROR: No PETSc Option Table entries >> [0]PETSC ERROR: ----------------End of Error Message -------send entire >> error message to petsc-maint at mcs.anl.gov---------- > > > I've had no problems using other parts of PETSc with the exact same build > options/install (SNES, linear solvers), so I suspect this is likely just > user error. I have tried building PETSc both as a shared and static > library, and both methods fail in the same way. > This is strange. First, you should only have uninitialized classids if you build without dynamics libraries. Did you? If you include the entire error output, it would show us. If so, then PetscInitialize should set the value of this classid, unless you built this with PETSC_HAVE_DYNAMIC_LIBRARIES, but linked against static libraries, as the error message says. Do you have multiple versions of PETSc on your machine? > As a related question, what is the "Order" of quadrature object suppose to > be? 
The documentation for "PetscQuadratureGetOrder" and > "PetscQuadratureSetOrder" says this is the highest degree polynomial that > is exactly integrated. > That is what it is supposed to be, but its not. We are changing this now. However, at the moment, order just means the number of points/dim used to define the quadrature. > However, when I looked at the source code for > "PetscDTGaussTensorQuadrature", this appears to set the order to the number > of quadrature points, which for Gaussian quadrature means it can integrate > a 2*npoints-1 polynomial exactly. > Agree completely. > If I wanted to implement my own quadrature scheme (Gauss-Lobatto) which is > exact for polynomials up to 2*npoints-3, what should I set the quadrature > order to? > I would set it to be the number of points now, since that is what should be fed to the low level routines. Then we will go through and complete a higher layer that translates an order request into a number of points for each type. Matt > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Thu Jul 28 21:08:49 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Thu, 28 Jul 2016 19:08:49 -0700 Subject: [petsc-users] How to create a quadrature object? In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 4:35 PM, Matthew Knepley wrote: > > This is strange. First, you should only have uninitialized classids if you > build without dynamics libraries. Did you? If you > include the entire error output, it would show us. > > This is pretty much the entire error output; I trimmed off the ending "MPI Abort called" message (it's the standard message printed by OpenMPI when MPIAbort is called). The code snippet I had is a complete working example which is able to give me this error. > If so, then PetscInitialize should set the value of this classid, unless > you built this with PETSC_HAVE_DYNAMIC_LIBRARIES, > but linked against static libraries, as the error message says. Do you > have multiple versions of PETSc on your machine? > I have a single install of PETSc at any one time. I've tried builds with --with-shared-libraries on and off with the same results (I made sure to purge any previous install first). Here's a diff if you want to test the same configs I tried running. Just run *make ex4* in the src/dm/dt/examples/tests folder Configure command: ./configure --with-debugging=0 COPTFLAGS="-O3 -march=native" CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native" -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: quad_test.patch Type: text/x-patch Size: 1134 bytes Desc: not available URL: From knepley at gmail.com Thu Jul 28 22:58:45 2016 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 28 Jul 2016 20:58:45 -0700 Subject: [petsc-users] How to create a quadrature object? In-Reply-To: References: Message-ID: On Thu, Jul 28, 2016 at 7:08 PM, Andrew Ho wrote: > > On Thu, Jul 28, 2016 at 4:35 PM, Matthew Knepley > wrote: > >> >> This is strange. First, you should only have uninitialized classids if >> you build without dynamics libraries. Did you? If you >> include the entire error output, it would show us. 
>> >> > This is pretty much the entire error output; I trimmed off the ending "MPI > Abort called" message (it's the standard message printed by OpenMPI when > MPIAbort is called). The code snippet I had is a complete working example > which is able to give me this error. > > >> If so, then PetscInitialize should set the value of this classid, unless >> you built this with PETSC_HAVE_DYNAMIC_LIBRARIES, >> but linked against static libraries, as the error message says. Do you >> have multiple versions of PETSc on your machine? >> > > I have a single install of PETSc at any one time. I've tried builds with > --with-shared-libraries on and off with the same results (I made sure to > purge any previous install first). Here's a diff if you want to test the > same configs I tried running. > Crap. We changed the way initialization works, but this was left behind. You can make this change diff --git a/src/dm/dt/interface/dt.c b/src/dm/dt/interface/dt.c index 5d12959..d6f3454 100644 --- a/src/dm/dt/interface/dt.c +++ b/src/dm/dt/interface/dt.c @@ -50,7 +50,7 @@ PetscErrorCode PetscQuadratureCreate(MPI_Comm comm, PetscQuadrature *q) PetscFunctionBegin; PetscValidPointer(q, 2); - ierr = DMInitializePackage();CHKERRQ(ierr); + ierr = PetscSysInitializePackage();CHKERRQ(ierr); ierr = PetscHeaderCreate(*q,PETSC_OBJECT_CLASSID,"PetscQuadrature","Quadrature","DT",comm,PetscQuadratureDestroy,PetscQuadratureView);CHKERRQ(ierr); (*q)->dim = -1; (*q)->order = -1; and then rebuild cd $PETSC_DIR make -f ./gmakefile and try running your example again. I will get this change in. Thanks, Matt > Just run *make ex4* in the src/dm/dt/examples/tests folder > > Configure command: > > ./configure --with-debugging=0 COPTFLAGS="-O3 -march=native" > CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native" > > -- > Andrew Ho > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 29 09:41:19 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 09:41:19 -0500 Subject: [petsc-users] vec norm for local portion of a vector In-Reply-To: References: Message-ID: <79191996-1FC0-4DFE-B1EB-33A29A5D8CA2@mcs.anl.gov> > On Jul 27, 2016, at 4:42 PM, Xiangdong wrote: > > Hello everyone, > > I have a global dmda vector vg. On each processor, if I want to know the norm of local portion of vg, which function should I call? > > So far I am thinking of using DMDAVecGetArray and then write a loop to compute the norm of this local array. > > Is there a simple function available to call? like *vg->ops->norm_local(vg,NORM_2, &normlocal)? There isn't a public interface to this call because it really isn't a mathematically well defined object; the subdomains in the decomposition of the array are arbitrary based on the number of processes used. Anyways if you want it and it is the NON-overlapping portion then yes, you can write a little routine (basically just cut and paste VecNorm()) call it say VecNormLocal() and have it call the function pointer you indicated above. Note for the 2 norm the norm_local() returns the square of the norm so you need to take the square root. If you want the overlapping portion of the vector then you should just do the DMDAVecGetArray() as you already do. Barry > > Thanks. 
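(For reference, a minimal sketch of the little routine described above. It assumes the PETSc 3.7 private header layout, where the per-type local-norm kernel is reachable as x->ops->norm_local through petsc/private/vecimpl.h; the name VecNormLocal and the square-root adjustment for NORM_2 are taken from the explanation above, so treat this as an illustration of the cut-and-paste approach, not a supported PETSc interface.)

#include <petsc/private/vecimpl.h>   /* private header: exposes x->ops->norm_local */

/* Norm of the process-local (non-overlapping) portion of a parallel Vec,
   modeled on VecNorm(). Assumes a vector type that implements the
   norm_local kernel (e.g. the MPI vectors created by a DMDA).          */
PetscErrorCode VecNormLocal(Vec x,NormType type,PetscReal *val)
{
  PetscErrorCode ierr;

  PetscFunctionBegin;
  PetscValidHeaderSpecific(x,VEC_CLASSID,1);
  ierr = (*x->ops->norm_local)(x,type,val);CHKERRQ(ierr);
  if (type == NORM_2) *val = PetscSqrtReal(*val);  /* local kernel returns the square for NORM_2 */
  PetscFunctionReturn(0);
}

It is called the same way as VecNorm(), e.g. VecNormLocal(vg,NORM_2,&nrm) on each process after vg has been assembled.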
> > Best, > Xiangdong > From bsmith at mcs.anl.gov Fri Jul 29 09:49:29 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 09:49:29 -0500 Subject: [petsc-users] Ignore command line arguments with fortran code using PETSc In-Reply-To: References: <8F9DC370-1CD1-48AA-8009-42731293566A@mcs.anl.gov> Message-ID: > On Jul 28, 2016, at 2:35 AM, Tim Steinhoff wrote: > > 2016-07-27 21:42 GMT+02:00 Barry Smith : >> >> Actually there is currently no way to PetscInitialize from Fortran without adding the command line options to the database. In the middle >> of petscinitialize_() is the code fragment >> >> PETScParseFortranArgs_Private(&PetscGlobalArgc,&PetscGlobalArgs); >> FIXCHAR(filename,len,t1); >> *ierr = PetscOptionsInsert(NULL,&PetscGlobalArgc,&PetscGlobalArgs,t1); >> >> We'll need to do a bit of code refactoring to provide a Fortran petscinitializenoarguments_(). The simplest way to refactor would be to change the name of petscinitialize_ to say PetscInitializeFortran_Internal() and add a bool argument whether to process the arguments and then write two trivial routines petscinitialize_ that calls the new routine with PETSC_TRUE and petscinitializenoarguments_() that calls it with PETSC_FALSE. > > Thanks Barry. It would be really nice if PETSc comes with that feature > in future, because I would prefer not to make any changes to the PETSc > code that disappear with every new PETSc release. Understood. You could make a pull request with your changes https://bitbucket.org/petsc/petsc/wiki/pull-request-instructions-git otherwise I will add it but it will take a few days since I am backlogged. Barry > >> >> Barry >> >> Of course you can have a C/C++ main routine that calls PetscInitializeNoArguments(); followed by PetscInitializeFortran() and then have the bulk of your code in Fortran. > That would work, but we have a rather large fortran code without any > C. So, for now we will probably stick to your first approach and keep > our code fotran only. > > Thanks again, > Volker > > >> >> >>> On Jul 27, 2016, at 10:55 AM, Tim Steinhoff wrote: >>> >>> 2016-07-27 16:04 GMT+02:00 Matthew Knepley : >>>> On Wed, Jul 27, 2016 at 4:59 AM, Tim Steinhoff >>>> wrote: >>>>> >>>>> Hi all, >>>>> >>>>> we coupled PETSc with our fortran code. Is there any way to let PETSc >>>>> (PetscInitialize) ignore all arguments passed by the command line? >>>>> Since our code is controlled by command line arguements as well, it >>>>> leads to a mess, when those arguments are read twice. >>>> >>>> >>>> 1) You can use PetscInitializeNoArguments() >>> >>> Thanks! I thought that function was for C/C++ only. >>> >>>> >>>> 2) What goes wrong? PETSc should just ignore any options it does not >>>> recognize. >>> >>> >>> The problem is that our code uses the same or similar argument names >>> as PETSc does and our end user should not have access to all petsc >>> options. >>> >>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>>> >>>>> Thanks and kind regards, >>>>> >>>>> Volker >>>> >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their experiments >>>> is infinitely more interesting than any results to which their experiments >>>> lead. 
>>>> -- Norbert Wiener >> From bsmith at mcs.anl.gov Fri Jul 29 10:29:34 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 10:29:34 -0500 Subject: [petsc-users] Scheduled Relaxation Jacobi method In-Reply-To: References: <40FF0CE8-7589-4631-AB5B-0F4AF5205C99@gmail.com> <3EFDE37C-513C-4894-9859-BA7DCA70A760@mcs.anl.gov> <8DFDAF89-70EA-4245-967B-3C0D6A7E136F@gmail.com> Message-ID: Make sure you always respond to petsc-users so the email doesn't just get sent to me. Someone else would have already helped you. Ahh, the problem is we copy the scale value out of the KSP object at the beginning of the routine into a local variable so it remains the same value even though you correctly change the value in the KSP object. I have changed the Richardson code in the maint and master branch so if you use it there the scaling will work as you desire. (Just follow the git instructions at http://www.mcs.anl.gov/petsc/download/index.html for obtaining PETSc.) Barry > On Jul 28, 2016, at 9:24 AM, JC wrote: > > Hej, > > I have tried to use kspMonitorSet to change the value of the scale in the Richardson iteration, however it does not seem to take it into account when actually solving the problem. Here is my monitoring routine. relaxation is an array declared in my module. If I do print *, relaxation(ind), then the correct value is printed. It is not passed to the KSP framework however despite the call to ksprichardsonsetscale. Any idea why? > > Thanks a lot, > JC > > > subroutine MyKSPMonitor(solver, iter, dummy_1, dummy_2, ierr) > > !----- Inputs -----! > > KSP, intent(inout) :: solver > PetscInt, intent(in) :: iter > PetscReal, intent(in) :: dummy_1 > PetscInt, intent(in) :: dummy_2 > > !----- Output -----! > > PetscErrorCode, intent(out) :: ierr > > !----- Miscellaneous -----! > > PetscInt :: ind > PetscReal :: weight > > ind = mod(iter, max_srj) > weight = relaxation(ind) > call ksprichardsonsetscale(solver, weight, ierr) > if (ierr/=0) call abort(ierr, 'Failed to set the relaxation weights.', nrank) > ierr = 0 > > return > end subroutine MyKSPMonitor > > >> On 27 Jul 2016, at 18:11, Barry Smith wrote: >> >> >>> On Jul 27, 2016, at 6:27 AM, JC wrote: >>> >>>> The dependence on grid size is easy. >>> >>> Knowing the grid size, I have a list of files I can read to load the correct relaxation weights. This is indeed the easy part. >>> >>>> By depends on the iteration do you mean the linear iteration, as in the first iteration you use .1 then in the second you use .2 etc? >>> >>> That is exactly what I meant. At the first iteration of the solver (i.e. the first matrix-vector product), the relaxation is say omega_1. At the second iteration, it is omega_2, so on so forth. >>> >>>> To do this use KSPSetMonitor() and have your monitor routine call KSPRichardsonSetScale() with the value you like which can depend on the iteration. >>> >>> So basically, KSPSetMonitor() allows me to define a callback procedure that will be executed at the end of each iteration of the KSP solver, right? 
>> >> Yes >> >>> >>> Thanks a lot, >>> JC >> > From bsmith at mcs.anl.gov Fri Jul 29 10:39:02 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 10:39:02 -0500 Subject: [petsc-users] Comprehensive example for KSPRegister() In-Reply-To: References: Message-ID: <3D74EA20-9D78-4730-A088-CC2EBEDF32F2@mcs.anl.gov> > On Jul 28, 2016, at 12:22 PM, Matthew Knepley wrote: > > On Thu, Jul 28, 2016 at 9:07 AM, JC wrote: > Hey everyone, > > I was wondering if any of you had a comprehensive example of KSPRegister() to create our own KSP solver in fortran? > I have tried to look online but have not been able to find it. > > We do not currently have one in Fortran, although we do in C. Basically you would copy the file src/ksp/ksp/impls/cg/cg.c and replace the bodies of each of the methods (KSPSolve_CG etc) with calls to your Fortran routines that implement each method. Given the large number of KSP methods we already have implemented if what you want to implement is a variation of what we already have it would likely require much less new code if you worked in C and simple derived off a subclass of an already implemented class where you made the the changes. Barry > > Thanks, > > Matt > > Thanks a lot, > JC > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener From jshen25 at jhu.edu Fri Jul 29 11:46:54 2016 From: jshen25 at jhu.edu (Jinlei Shen) Date: Fri, 29 Jul 2016 12:46:54 -0400 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver Message-ID: Dear PETSC developers, Thank you for developing such a powerful tool for scientific computations. I'm currently trying to run a simple cantilever beam FEM to test the scalability of PETSC on multi-processors. I also want to verify whether iterative solver or direct solver is more efficient for parallel large FEM problem. Problem description, An Euler elementary cantilever beam with point load at the end along -y direction. Each node has 2 DOF (deflection and rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based on the connectivity. Loop with elements in each processor to assemble the global matrix with same element stiffness matrix. The boundary condition is set using call MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); Based on what I have done, I find the computations work well, i.e the results are correct compared with theoretical solution, for small mesh size (small than 5000 elements) using both solvers with different numbers of processes. However, there are several confusing issues when I increase the mesh size to 10000 and more elements with iterative solve(CG + PCBJACOBI) 1. For 10k elements, I can get accurate solution using iterative solver with uni-processor(i.e. only one process). However, when I use 2-8 processes, it tells the linear solver converged with different iterations, but, the results are all different for different processes and erroneous. The wired thing is when I use >9 processes, the results are correct again. I am really confused by this. Could you explain me why? If my parallelization is not correct, why it works for small cases? And I check the global matrix and RHS vector and didn't see any mallocs during the process. 2. For 30k elements, if I use one process, it says: Linear solve did not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for large sparse matrix? 
If so, is there any stable solver or pc for large problem? For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can only get accuracy when the number of elements are below 5000. There must be something wrong. The way I use the superlu_dist solver is first convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the sequential version of the same problem. The results shows that iterative solver works well for <50k elements, while SUPER_LU only gets right solution below 5k elements. Can I say iterative solver is better than SUPER_LU for large problem? How can I improve the solver to copy with very large problem, such as million by million? Another thing is it's still doubtable of performance of SUPER_LU. For the inaccuracy issue, do you think it may be due to the memory? However, there is no memory error showing during the execution. I really appreciate someone could resolve those puzzles above for me. My goal is to replace the current SUPER_LU solver in my parallel CPFEM main program with the iterative solver using PETSC. Please let me if you would like to see my code in detail. Thank you very much. Bests, Jinlei -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Fri Jul 29 11:54:02 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Fri, 29 Jul 2016 09:54:02 -0700 Subject: [petsc-users] How to create a quadrature object? In-Reply-To: References: Message-ID: Thanks, this fix works. On Thu, Jul 28, 2016 at 8:58 PM, Matthew Knepley wrote: > On Thu, Jul 28, 2016 at 7:08 PM, Andrew Ho wrote: > >> >> On Thu, Jul 28, 2016 at 4:35 PM, Matthew Knepley >> wrote: >> >>> >>> This is strange. First, you should only have uninitialized classids if >>> you build without dynamics libraries. Did you? If you >>> include the entire error output, it would show us. >>> >>> >> This is pretty much the entire error output; I trimmed off the ending >> "MPI Abort called" message (it's the standard message printed by OpenMPI >> when MPIAbort is called). The code snippet I had is a complete working >> example which is able to give me this error. >> >> >>> If so, then PetscInitialize should set the value of this classid, unless >>> you built this with PETSC_HAVE_DYNAMIC_LIBRARIES, >>> but linked against static libraries, as the error message says. Do you >>> have multiple versions of PETSc on your machine? >>> >> >> I have a single install of PETSc at any one time. I've tried builds with >> --with-shared-libraries on and off with the same results (I made sure to >> purge any previous install first). Here's a diff if you want to test the >> same configs I tried running. >> > > Crap. We changed the way initialization works, but this was left behind. 
> You can make this change > > diff --git a/src/dm/dt/interface/dt.c b/src/dm/dt/interface/dt.c > index 5d12959..d6f3454 100644 > --- a/src/dm/dt/interface/dt.c > +++ b/src/dm/dt/interface/dt.c > @@ -50,7 +50,7 @@ PetscErrorCode PetscQuadratureCreate(MPI_Comm comm, > PetscQuadrature *q) > > PetscFunctionBegin; > PetscValidPointer(q, 2); > - ierr = DMInitializePackage();CHKERRQ(ierr); > + ierr = PetscSysInitializePackage();CHKERRQ(ierr); > ierr = > PetscHeaderCreate(*q,PETSC_OBJECT_CLASSID,"PetscQuadrature","Quadrature","DT",comm,PetscQuadratureDestroy,PetscQuadratureView);CHKERRQ(ierr); > (*q)->dim = -1; > (*q)->order = -1; > > and then rebuild > > cd $PETSC_DIR > make -f ./gmakefile > > and try running your example again. > > I will get this change in. > > Thanks, > > Matt > > >> Just run *make ex4* in the src/dm/dt/examples/tests folder >> >> Configure command: >> >> ./configure --with-debugging=0 COPTFLAGS="-O3 -march=native" >> CXXOPTFLAGS="-O3 -march=native" FOPTFLAGS="-O3 -march=native" >> >> -- >> Andrew Ho >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 29 12:19:11 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 12:19:11 -0500 Subject: [petsc-users] PCMG with matrix-free operators accessing DMDA In-Reply-To: References: Message-ID: <4706BBEA-9216-4973-8CAE-E7DCCD9F0F89@mcs.anl.gov> > On Jul 28, 2016, at 12:48 PM, Valeria Mele wrote: > > Hi everyone, > this time I am using PETSc to do something that is more complicated than my usual and I want to do it at the highest possible abstraction level. > > To put it in a nutshell, my intent is to build a parallel multigrid to solve a linear system via DM, KSP and PCMG (I would like to use DMMG but probably I should have the same problems or more). DMMG doesn't exist anymore. It was refactored away many years ago, its functionality is handled by PCMG and DM. > > I created the distributed object, da, with DMDACreate3d, even if it is distributed (as yet) only in the x-dimension and has 3 dof. > Then I create the KSP (type KSPRICHARDSON) and set the nonzero initial guess and PCMG as preconditioner. Here I start to tune the MG. > > The point is that I need to define all the operators as matrix-free, since they will do several operations on x to obtain y, and I am not familiar with the way to access all the elements or informations in the two levels involved and/or among the processors with a so-high level interface. > > So please (please please please please) tell me if I correctly understand the mechanism or I am on the wrong way and clear my doubts. > > That is, let's say that my operation for the shell are: > ? A_mult(Mat mat,Vec x, Vec y) //coefficients matrix > in this case the level is only one but should I write it taking into account only the local data (I think so) and accessing them via the informations in da? Yes. You can use VecGetDM(x,&da) to get the DMDA object > > For example, if I use DMDAVecGetArray, DMDAVecGetCorners (or DMDAGetGhostCorners) and DMDAVecRestoreArray, will they retrieve informations from the right level each time (I am pretty sure that in some official examples it is done in this way)? > > Or should I handle just Vecs as local structures with their usual indices (through VecGetArray and VecRestoreArray)? 
No, no, no because then you would need to mange all the structured grid information yourself, since the DMDA manages it for you you should use it. > ? P_mult(Mat mat,Vec x, Vec y) //interpolation matrix that is NOT conceptually the traspose of Restriction > in this case x and y will be from two different levels (respectively L and L+1), so, if I retrieve informations from the da... how can I access the two at different levels? Use VecGetDM(x, and VecGetDM(y to get access to both DMDA. > > I am sorry if it seems that they are trivial questions, and I will be grateful to anyone will help me. Additional information. Since the PCMG will be requesting the matrices and the interpolation/restriction operations (rather than you setting them into each level of multigrid) you will need to use DMShellSetCreateMatrix() and DMShellSetCreateInterpolation() and DMShellSetCreateRestriction() to provide the routines that will create the Shell matrices you need to represent the operators on the levels and the restriction and interpolation (Even though you are using a DMDA you can still call these routines). Barry > > Thanks a lot, > Valeria > > > > > > > --------------------------------------------------------------------------------------------- > PhD Valeria Mele > > University of Naples Federico II > Department of Mathematics and Applications "R. Caccioppoli" > Complesso Universitario M.S. Angelo, Via Cinthia > 80126 Naples > --------------------------------------------------------------------------------------------- From bsmith at mcs.anl.gov Fri Jul 29 13:09:46 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 13:09:46 -0500 Subject: [petsc-users] Petsc mesh scalability issue with iterative solver and direct solver In-Reply-To: References: Message-ID: First run under valgrind all the cases to make sure there is not some use of uninitialized data or overwriting of data. Go to http://www.mcs.anl.gov/petsc follow the link to FAQ and search for valgrind (the web server seems to be broken at the moment). Second it is possible that your code the assembles the matrices and vectors is not correctly assembling it for either the sequential or parallel case. Hence a different number of processes could be generating a different linear system hence inconsistent results. How are you handling the parallelism? How do you know the matrix generated in parallel is identically to that sequentially? Simple preconditioners such as pbjacobi will converge slower and slower with more elements. Note that you should run with -ksp_monitor_true_residual and -ksp_converged_reason to make sure that the iterative solver is even converging. By default PETSc KSP solvers do not stop with a big error message if they do not converge so you need make sure they are always converging. Barry > On Jul 29, 2016, at 11:46 AM, Jinlei Shen wrote: > > Dear PETSC developers, > > Thank you for developing such a powerful tool for scientific computations. > > I'm currently trying to run a simple cantilever beam FEM to test the scalability of PETSC on multi-processors. I also want to verify whether iterative solver or direct solver is more efficient for parallel large FEM problem. > > Problem description, An Euler elementary cantilever beam with point load at the end along -y direction. Each node has 2 DOF (deflection and rotation)). MPIBAIJ is used with bs = 2, dnnz and onnz are determined based on the connectivity. Loop with elements in each processor to assemble the global matrix with same element stiffness matrix. 
The boundary condition is set using call MatZeroRowsColumns(SG,2,g_BC,one,PETSC_NULL_OBJECT,PETSC_NULL_OBJECT,ierr); > > Based on what I have done, I find the computations work well, i.e the results are correct compared with theoretical solution, for small mesh size (small than 5000 elements) using both solvers with different numbers of processes. > > However, there are several confusing issues when I increase the mesh size to 10000 and more elements with iterative solve(CG + PCBJACOBI) > > 1. For 10k elements, I can get accurate solution using iterative solver with uni-processor(i.e. only one process). However, when I use 2-8 processes, it tells the linear solver converged with different iterations, but, the results are all different for different processes and erroneous. The wired thing is when I use >9 processes, the results are correct again. I am really confused by this. Could you explain me why? If my parallelization is not correct, why it works for small cases? And I check the global matrix and RHS vector and didn't see any mallocs during the process. > > 2. For 30k elements, if I use one process, it says: Linear solve did not converge due to DIVERGED_INDEFINITE_PC. Does this commonly happen for large sparse matrix? If so, is there any stable solver or pc for large problem? > > > For parallel computing using direct solver(SUPERLU_DIST + PCLU), I can only get accuracy when the number of elements are below 5000. There must be something wrong. The way I use the superlu_dist solver is first convert MatType to AIJ, then call PCFactorSetMatSolverPackage, and change the PC to PCLU. Do I miss anything else to run SUPER_LU correctly? > > > I also use SUPER_LU and iterative solver(CG+PCBJACOBI) to solve the sequential version of the same problem. The results shows that iterative solver works well for <50k elements, while SUPER_LU only gets right solution below 5k elements. Can I say iterative solver is better than SUPER_LU for large problem? How can I improve the solver to copy with very large problem, such as million by million? Another thing is it's still doubtable of performance of SUPER_LU. > > For the inaccuracy issue, do you think it may be due to the memory? However, there is no memory error showing during the execution. > > I really appreciate someone could resolve those puzzles above for me. My goal is to replace the current SUPER_LU solver in my parallel CPFEM main program with the iterative solver using PETSC. > > > Please let me if you would like to see my code in detail. > > Thank you very much. > > Bests, > Jinlei > > > > > > > From valeria.mele at unina.it Fri Jul 29 13:58:53 2016 From: valeria.mele at unina.it (Valeria Mele) Date: Fri, 29 Jul 2016 13:58:53 -0500 Subject: [petsc-users] PCMG with matrix-free operators accessing DMDA In-Reply-To: <4706BBEA-9216-4973-8CAE-E7DCCD9F0F89@mcs.anl.gov> References: <4706BBEA-9216-4973-8CAE-E7DCCD9F0F89@mcs.anl.gov> Message-ID: Thank you very much Barry. Apparently I missed many things about "DMShell..." that I didn't find in the current users manual, and I was trying to create the operators through matCreateShell() and MatShellSetOperation(). If I use "DMShellSetCreate..." to define the matrices I shouldn't have any doubt about the da to refer to. Now I can go on. Thank you again. Best, Valeria --------------------------------------------------------------------------------------------- PhD Valeria Mele University of Naples Federico II Department of Mathematics and Applications "R. Caccioppoli" Complesso Universitario M.S. 
Angelo, Via Cinthia 80126 Naples --------------------------------------------------------------------------------------------- 2016-07-29 12:19 GMT-05:00 Barry Smith : > > > On Jul 28, 2016, at 12:48 PM, Valeria Mele > wrote: > > > > Hi everyone, > > this time I am using PETSc to do something that is more complicated than > my usual and I want to do it at the highest possible abstraction level. > > > > To put it in a nutshell, my intent is to build a parallel multigrid to > solve a linear system via DM, KSP and PCMG (I would like to use DMMG but > probably I should have the same problems or more). > > DMMG doesn't exist anymore. It was refactored away many years ago, its > functionality is handled by PCMG and DM. > > > > > > I created the distributed object, da, with DMDACreate3d, even if it is > distributed (as yet) only in the x-dimension and has 3 dof. > > Then I create the KSP (type KSPRICHARDSON) and set the nonzero initial > guess and PCMG as preconditioner. Here I start to tune the MG. > > > > The point is that I need to define all the operators as matrix-free, > since they will do several operations on x to obtain y, and I am not > familiar with the way to access all the elements or informations in the two > levels involved and/or among the processors with a so-high level interface. > > > > So please (please please please please) tell me if I correctly > understand the mechanism or I am on the wrong way and clear my doubts. > > > > That is, let's say that my operation for the shell are: > > ? A_mult(Mat mat,Vec x, Vec y) //coefficients matrix > > in this case the level is only one but should I write it taking into > account only the local data (I think so) and accessing them via the > informations in da? > > Yes. You can use VecGetDM(x,&da) to get the DMDA object > > > > > For example, if I use DMDAVecGetArray, DMDAVecGetCorners (or > DMDAGetGhostCorners) and DMDAVecRestoreArray, will they retrieve > informations from the right level each time (I am pretty sure that in some > official examples it is done in this way)? > > > > Or should I handle just Vecs as local structures with their usual > indices (through VecGetArray and VecRestoreArray)? > > No, no, no because then you would need to mange all the structured grid > information yourself, since the DMDA manages it for you you should use it. > > > ? P_mult(Mat mat,Vec x, Vec y) //interpolation matrix that is NOT > conceptually the traspose of Restriction > > in this case x and y will be from two different levels (respectively L > and L+1), so, if I retrieve informations from the da... how can I access > the two at different levels? > > Use VecGetDM(x, and VecGetDM(y to get access to both DMDA. > > > > > I am sorry if it seems that they are trivial questions, and I will be > grateful to anyone will help me. > > Additional information. Since the PCMG will be requesting the matrices > and the interpolation/restriction operations (rather than you setting them > into each level of multigrid) you will need to use DMShellSetCreateMatrix() > and DMShellSetCreateInterpolation() and DMShellSetCreateRestriction() to > provide the routines that will create the Shell matrices you need to > represent the operators on the levels and the restriction and interpolation > (Even though you are using a DMDA you can still call these routines). 
> > Barry > > > > > Thanks a lot, > > Valeria > > > > > > > > > > > > > > > --------------------------------------------------------------------------------------------- > > PhD Valeria Mele > > > > University of Naples Federico II > > Department of Mathematics and Applications "R. Caccioppoli" > > Complesso Universitario M.S. Angelo, Via Cinthia > > 80126 Naples > > > --------------------------------------------------------------------------------------------- > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Fri Jul 29 16:22:51 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Fri, 29 Jul 2016 16:22:51 -0500 Subject: [petsc-users] PCMG with matrix-free operators accessing DMDA In-Reply-To: References: <4706BBEA-9216-4973-8CAE-E7DCCD9F0F89@mcs.anl.gov> Message-ID: > On Jul 29, 2016, at 1:58 PM, Valeria Mele wrote: > > Thank you very much Barry. > Apparently I missed many things about "DMShell..." that I didn't find in the current users manual, and I was trying to create the operators through matCreateShell() and MatShellSetOperation(). You do need to use MatCreateShell and MatShellSetOperation()! But you need to call these from within the DMShellSetCreateMatrix() and interpolation/restriction routines that you provide. Barry > If I use "DMShellSetCreate..." to define the matrices I shouldn't have any doubt about the da to refer to. > > Now I can go on. > Thank you again. > > Best, > Valeria > > > --------------------------------------------------------------------------------------------- > PhD Valeria Mele > > University of Naples Federico II > Department of Mathematics and Applications "R. Caccioppoli" > Complesso Universitario M.S. Angelo, Via Cinthia > 80126 Naples > --------------------------------------------------------------------------------------------- > > 2016-07-29 12:19 GMT-05:00 Barry Smith : > > > On Jul 28, 2016, at 12:48 PM, Valeria Mele wrote: > > > > Hi everyone, > > this time I am using PETSc to do something that is more complicated than my usual and I want to do it at the highest possible abstraction level. > > > > To put it in a nutshell, my intent is to build a parallel multigrid to solve a linear system via DM, KSP and PCMG (I would like to use DMMG but probably I should have the same problems or more). > > DMMG doesn't exist anymore. It was refactored away many years ago, its functionality is handled by PCMG and DM. > > > > > > I created the distributed object, da, with DMDACreate3d, even if it is distributed (as yet) only in the x-dimension and has 3 dof. > > Then I create the KSP (type KSPRICHARDSON) and set the nonzero initial guess and PCMG as preconditioner. Here I start to tune the MG. > > > > The point is that I need to define all the operators as matrix-free, since they will do several operations on x to obtain y, and I am not familiar with the way to access all the elements or informations in the two levels involved and/or among the processors with a so-high level interface. > > > > So please (please please please please) tell me if I correctly understand the mechanism or I am on the wrong way and clear my doubts. > > > > That is, let's say that my operation for the shell are: > > ? A_mult(Mat mat,Vec x, Vec y) //coefficients matrix > > in this case the level is only one but should I write it taking into account only the local data (I think so) and accessing them via the informations in da? > > Yes. 
You can use VecGetDM(x,&da) to get the DMDA object > > > > > For example, if I use DMDAVecGetArray, DMDAVecGetCorners (or DMDAGetGhostCorners) and DMDAVecRestoreArray, will they retrieve informations from the right level each time (I am pretty sure that in some official examples it is done in this way)? > > > > Or should I handle just Vecs as local structures with their usual indices (through VecGetArray and VecRestoreArray)? > > No, no, no because then you would need to mange all the structured grid information yourself, since the DMDA manages it for you you should use it. > > > ? P_mult(Mat mat,Vec x, Vec y) //interpolation matrix that is NOT conceptually the traspose of Restriction > > in this case x and y will be from two different levels (respectively L and L+1), so, if I retrieve informations from the da... how can I access the two at different levels? > > Use VecGetDM(x, and VecGetDM(y to get access to both DMDA. > > > > > I am sorry if it seems that they are trivial questions, and I will be grateful to anyone will help me. > > Additional information. Since the PCMG will be requesting the matrices and the interpolation/restriction operations (rather than you setting them into each level of multigrid) you will need to use DMShellSetCreateMatrix() and DMShellSetCreateInterpolation() and DMShellSetCreateRestriction() to provide the routines that will create the Shell matrices you need to represent the operators on the levels and the restriction and interpolation (Even though you are using a DMDA you can still call these routines). > > Barry > > > > > Thanks a lot, > > Valeria > > > > > > > > > > > > > > --------------------------------------------------------------------------------------------- > > PhD Valeria Mele > > > > University of Naples Federico II > > Department of Mathematics and Applications "R. Caccioppoli" > > Complesso Universitario M.S. Angelo, Via Cinthia > > 80126 Naples > > --------------------------------------------------------------------------------------------- > > > From C.Klaij at marin.nl Sat Jul 30 09:41:58 2016 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Sat, 30 Jul 2016 14:41:58 +0000 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: References: Message-ID: <1469889718285.98025@marin.nl> Anyone? (my guess is an if-statement, something like "if type nest then setup nest"...) > Date: Thu, 28 Jul 2016 08:38:54 +0000 > From: "Klaij, Christiaan" > To: "petsc-users at mcs.anl.gov" > Subject: [petsc-users] block matrix without MatCreateNest > Message-ID: <1469695134232.97712 at marin.nl> > Content-Type: text/plain; charset="utf-8" > > I'm trying to understand how to assemble a block matrix in a > format-independent manner, so that I can switch between types > mpiaij and matnest. > > The manual states that the key to format-independent assembly is > to use MatGetLocalSubMatrix. So, in the code below, I'm using > this to assemble a 3-by-3 block matrix A and setting the diagonal > of block A02. This seems to work for type mpiaij, but not for > type matnest. What am I missing? > > Chris > > > $ cat mattry.F90 > program mattry > > use petscksp > implicit none > #include > > PetscInt :: n=4 ! 
setting 4 cells per process > > PetscErrorCode :: ierr > PetscInt :: size,rank,i > Mat :: A,A02 > IS :: isg0,isg1,isg2 > IS :: isl0,isl1,isl2 > ISLocalToGlobalMapping :: map > > integer, allocatable, dimension(:) :: idx > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) > > ! local index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1, i=1,n) /) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) > ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! global index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1+rank*3*n, i=1,n) /) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); CHKERRQ(ierr) > call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); CHKERRQ(ierr) > ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! local-to-global mapping > allocate(idx(3*n)) > idx=(/ (i-1+rank*3*n, i=1,3*n) /) > call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) > ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! create the 3-by-3 block matrix > call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) > call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) > ! call MatSetType(A,MATNEST,ierr); CHKERRQ(ierr) > call MatSetUp(A,ierr); CHKERRQ(ierr) > call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) > call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) > call MatSetFromOptions(A,ierr); CHKERRQ(ierr) > > ! set diagonal of block A02 to 0.65 > call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > do i=1,n > call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) > end do > call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > > ! verify > call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) > call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) > > call PetscFinalize(ierr) > > end program mattry > > $ mpiexec -n 2 ./mattry -A_mat_type mpiaij > Mat Object: 2 MPI processes > type: mpiaij > row 0: (0, 0.65) > row 1: (1, 0.65) > row 2: (2, 0.65) > row 3: (3, 0.65) > row 4: (4, 0.65) > row 5: (5, 0.65) > row 6: (6, 0.65) > row 7: (7, 0.65) > > $ mpiexec -n 2 ./mattry -A_mat_type nest > [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Pointer: Parameter # 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 > [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 > [0]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Pointer: Parameter # 3 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. > [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 > [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 > [1]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c > #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD > with errorcode 85. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [lin0322.marin.local:11985] 1 more process has sent help message help-mpi-api.txt / mpi-abort > [lin0322.marin.local:11985] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages > $ > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm > dr. ir. 
Christiaan Klaij | CFD Researcher | Research & Development MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl MARIN news: http://www.marin.nl/web/News/News-items/Joint-Industry-Project-LifeLine-kicks-off.htm From knepley at gmail.com Sat Jul 30 10:02:21 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 30 Jul 2016 10:02:21 -0500 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1469695134232.97712@marin.nl> References: <1469695134232.97712@marin.nl> Message-ID: On Thu, Jul 28, 2016 at 3:38 AM, Klaij, Christiaan wrote: > I'm trying to understand how to assemble a block matrix in a > format-independent manner, so that I can switch between types > mpiaij and matnest. > > The manual states that the key to format-independent assembly is > to use MatGetLocalSubMatrix. So, in the code below, I'm using > this to assemble a 3-by-3 block matrix A and setting the diagonal > of block A02. This seems to work for type mpiaij, but not for > type matnest. What am I missing? > > Chris > > > $ cat mattry.F90 > program mattry > > use petscksp > implicit none > #include > > PetscInt :: n=4 ! setting 4 cells per process > > PetscErrorCode :: ierr > PetscInt :: size,rank,i > Mat :: A,A02 > IS :: isg0,isg1,isg2 > IS :: isl0,isl1,isl2 > ISLocalToGlobalMapping :: map > > integer, allocatable, dimension(:) :: idx > > call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) > call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) > call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) > > ! local index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1, i=1,n) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) > ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! global index sets for 3 fields > allocate(idx(n)) > idx=(/ (i-1+rank*3*n, i=1,n) /) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); > CHKERRQ(ierr) > call > ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); > CHKERRQ(ierr) > ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) > deallocate(idx) > > ! local-to-global mapping > allocate(idx(3*n)) > idx=(/ (i-1+rank*3*n, i=1,3*n) /) > call > ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); > CHKERRQ(ierr) > ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); > CHKERRQ(ierr) > deallocate(idx) > > ! create the 3-by-3 block matrix > call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) > call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) > ! call MatSetType(A,MATNEST,ierr); CHKERRQ(ierr) > call MatSetUp(A,ierr); CHKERRQ(ierr) > I am sorry I have not had time to run this, but I believe you need to insert a call here: MatNestSetSubMats(A, 3, [isg0, isg1, isg2], 3, [isg0, isg1, isg2], NULL); coming from http://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/Mat/MatNestSetSubMats.html#MatNestSetSubMats Thanks, Matt call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) > call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) > call MatSetFromOptions(A,ierr); CHKERRQ(ierr) > > ! 
set diagonal of block A02 to 0.65 > call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > do i=1,n > call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); > CHKERRQ(ierr) > end do > call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) > call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) > > ! verify > call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); > CHKERRQ(ierr) > call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) > > call PetscFinalize(ierr) > > end program mattry > > $ mpiexec -n 2 ./mattry -A_mat_type mpiaij > Mat Object: 2 MPI processes > type: mpiaij > row 0: (0, 0.65) > row 1: (1, 0.65) > row 2: (2, 0.65) > row 3: (3, 0.65) > row 4: (4, 0.65) > row 5: (5, 0.65) > row 6: (6, 0.65) > row 7: (7, 0.65) > > $ mpiexec -n 2 ./mattry -A_mat_type nest > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Null argument, when expecting valid pointer > [0]PETSC ERROR: Null Pointer: Parameter # 3 > [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. > [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [0]PETSC ERROR: ./mattry > > > on a linux_64bit_debug named > lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 > [0]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [0]PETSC ERROR: #1 MatNestFindIS() line 298 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [1]PETSC ERROR: Null argument, when expecting valid pointer > [1]PETSC ERROR: Null Pointer: Parameter # 3 > [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html > for trouble shooting. 
> [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 > [1]PETSC ERROR: ./mattry > > > on a linux_64bit_debug named > lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 > [1]PETSC ERROR: Configure options > --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 > --with-clanguage=c++ --with-x=1 --with-debugging=1 > --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl > --with-shared-libraries=0 > [1]PETSC ERROR: #1 MatNestFindIS() line 298 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #2 MatNestFindSubMat() line 371 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [1]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c > #2 MatNestFindSubMat() line 371 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c > [0]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in > /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c > -------------------------------------------------------------------------- > MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD > with errorcode 85. > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > You may or may not see output from other processes, depending on > exactly when Open MPI kills them. > -------------------------------------------------------------------------- > [lin0322.marin.local:11985] 1 more process has sent help message > help-mpi-api.txt / mpi-abort > [lin0322.marin.local:11985] Set MCA parameter "orte_base_help_aggregate" > to 0 to see all help / error messages > $ > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: > http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at mcs.anl.gov Sat Jul 30 11:04:43 2016 From: bsmith at mcs.anl.gov (Barry Smith) Date: Sat, 30 Jul 2016 11:04:43 -0500 Subject: [petsc-users] block matrix without MatCreateNest In-Reply-To: <1469889718285.98025@marin.nl> References: <1469889718285.98025@marin.nl> Message-ID: <47333B7B-15AA-4089-88CA-8BEB471F87A1@mcs.anl.gov> You need to call MatNestSetSubMats() after you set the mattype. Yes the manual pages are missing needed cross links. > On Jul 30, 2016, at 9:41 AM, Klaij, Christiaan wrote: > > Anyone? > (my guess is an if-statement, something like "if type nest then > setup nest"...) 
> >> Date: Thu, 28 Jul 2016 08:38:54 +0000 >> From: "Klaij, Christiaan" >> To: "petsc-users at mcs.anl.gov" >> Subject: [petsc-users] block matrix without MatCreateNest >> Message-ID: <1469695134232.97712 at marin.nl> >> Content-Type: text/plain; charset="utf-8" >> >> I'm trying to understand how to assemble a block matrix in a >> format-independent manner, so that I can switch between types >> mpiaij and matnest. >> >> The manual states that the key to format-independent assembly is >> to use MatGetLocalSubMatrix. So, in the code below, I'm using >> this to assemble a 3-by-3 block matrix A and setting the diagonal >> of block A02. This seems to work for type mpiaij, but not for >> type matnest. What am I missing? >> >> Chris >> >> >> $ cat mattry.F90 >> program mattry >> >> use petscksp >> implicit none >> #include >> >> PetscInt :: n=4 ! setting 4 cells per process >> >> PetscErrorCode :: ierr >> PetscInt :: size,rank,i >> Mat :: A,A02 >> IS :: isg0,isg1,isg2 >> IS :: isl0,isl1,isl2 >> ISLocalToGlobalMapping :: map >> >> integer, allocatable, dimension(:) :: idx >> >> call PetscInitialize(PETSC_NULL_CHARACTER,ierr); CHKERRQ(ierr) >> call MPI_Comm_size(PETSC_COMM_WORLD,size,ierr); CHKERRQ(ierr) >> call MPI_Comm_rank(PETSC_COMM_WORLD,rank,ierr);CHKERRQ(ierr) >> >> ! local index sets for 3 fields >> allocate(idx(n)) >> idx=(/ (i-1, i=1,n) /) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isl0,ierr);CHKERRQ(ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isl1,ierr);CHKERRQ(ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isl2,ierr);CHKERRQ(ierr) >> ! call ISView(isl3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) >> deallocate(idx) >> >> ! global index sets for 3 fields >> allocate(idx(n)) >> idx=(/ (i-1+rank*3*n, i=1,n) /) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx,PETSC_COPY_VALUES,isg0,ierr);CHKERRQ(ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+n,PETSC_COPY_VALUES,isg1,ierr); CHKERRQ(ierr) >> call ISCreateGeneral(PETSC_COMM_WORLD,n,idx+2*n,PETSC_COPY_VALUES,isg2,ierr); CHKERRQ(ierr) >> ! call ISView(isg3,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) >> deallocate(idx) >> >> ! local-to-global mapping >> allocate(idx(3*n)) >> idx=(/ (i-1+rank*3*n, i=1,3*n) /) >> call ISLocalToGlobalMappingCreate(PETSC_COMM_WORLD,1,3*n,idx,PETSC_COPY_VALUES,map,ierr); CHKERRQ(ierr) >> ! call ISLocalToGlobalMappingView(map,PETSC_VIEWER_STDOUT_WORLD,ierr); CHKERRQ(ierr) >> deallocate(idx) >> >> ! create the 3-by-3 block matrix >> call MatCreate(PETSC_COMM_WORLD,A,ierr); CHKERRQ(ierr) >> call MatSetSizes(A,3*n,3*n,PETSC_DECIDE,PETSC_DECIDE,ierr); CHKERRQ(ierr) >> ! call MatSetType(A,MATNEST,ierr); CHKERRQ(ierr) >> call MatSetUp(A,ierr); CHKERRQ(ierr) >> call MatSetOptionsPrefix(A,"A_",ierr); CHKERRQ(ierr) >> call MatSetLocalToGlobalMapping(A,map,map,ierr); CHKERRQ(ierr) >> call MatSetFromOptions(A,ierr); CHKERRQ(ierr) >> >> ! set diagonal of block A02 to 0.65 >> call MatGetLocalSubmatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) >> do i=1,n >> call MatSetValuesLocal(A02,1,i-1,1,i-1,0.65d0,INSERT_VALUES,ierr); CHKERRQ(ierr) >> end do >> call MatRestoreLocalSubMatrix(A,isl0,isl2,A02,ierr); CHKERRQ(ierr) >> call MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) >> call MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY,ierr); CHKERRQ(ierr) >> >> ! 
verify >> call MatGetSubmatrix(A,isg0,isg2,MAT_INITIAL_MATRIX,A02,ierr); CHKERRQ(ierr) >> call MatView(A02,PETSC_VIEWER_STDOUT_WORLD,ierr);CHKERRQ(ierr) >> >> call PetscFinalize(ierr) >> >> end program mattry >> >> $ mpiexec -n 2 ./mattry -A_mat_type mpiaij >> Mat Object: 2 MPI processes >> type: mpiaij >> row 0: (0, 0.65) >> row 1: (1, 0.65) >> row 2: (2, 0.65) >> row 3: (3, 0.65) >> row 4: (4, 0.65) >> row 5: (5, 0.65) >> row 6: (6, 0.65) >> row 7: (7, 0.65) >> >> $ mpiexec -n 2 ./mattry -A_mat_type nest >> [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [0]PETSC ERROR: Null argument, when expecting valid pointer >> [0]PETSC ERROR: Null Pointer: Parameter # 3 >> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [0]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [0]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 >> [0]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 >> [0]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- >> [1]PETSC ERROR: Null argument, when expecting valid pointer >> [1]PETSC ERROR: Null Pointer: Parameter # 3 >> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting. >> [1]PETSC ERROR: Petsc Release Version 3.7.3, Jul, 24, 2016 >> [1]PETSC ERROR: ./mattry on a linux_64bit_debug named lin0322.marin.local by cklaij Thu Jul 28 10:31:04 2016 >> [1]PETSC ERROR: Configure options --with-mpi-dir=/home/cklaij/ReFRESCO/Dev/trunk/Libs/install/openmpi/1.8.7 --with-clanguage=c++ --with-x=1 --with-debugging=1 --with-blas-lapack-dir=/opt/intel/composer_xe_2015.1.133/mkl --with-shared-libraries=0 >> [1]PETSC ERROR: #1 MatNestFindIS() line 298 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [1]PETSC ERROR: #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [1]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [1]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c >> #2 MatNestFindSubMat() line 371 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [0]PETSC ERROR: #3 MatGetLocalSubMatrix_Nest() line 414 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/impls/nest/matnest.c >> [0]PETSC ERROR: #4 MatGetLocalSubMatrix() line 10099 in /home/cklaij/ReFRESCO/Dev/trunk/Libs/build/petsc/3.7.3-dbg/src/mat/interface/matrix.c >> -------------------------------------------------------------------------- >> MPI_ABORT was invoked on rank 1 in communicator MPI_COMM_WORLD >> with errorcode 85. >> >> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >> You may or may not see output from other processes, depending on >> exactly when Open MPI kills them. 
>> -------------------------------------------------------------------------- >> [lin0322.marin.local:11985] 1 more process has sent help message help-mpi-api.txt / mpi-abort >> [lin0322.marin.local:11985] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages >> $ >> >> >> dr. ir. Christiaan Klaij | CFD Researcher | Research & Development >> MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl >> >> MARIN news: http://www.marin.nl/web/News/News-items/Ship-design-in-EU-project-Holiship.htm >> > > > dr. ir. Christiaan Klaij | CFD Researcher | Research & Development > MARIN | T +31 317 49 33 44 | mailto:C.Klaij at marin.nl | http://www.marin.nl > > MARIN news: http://www.marin.nl/web/News/News-items/Joint-Industry-Project-LifeLine-kicks-off.htm > From andrewh0 at uw.edu Sat Jul 30 12:19:45 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Sat, 30 Jul 2016 10:19:45 -0700 Subject: [petsc-users] Multi-physics meshes with PETSc DM? Message-ID: I am trying to solve a multi-physics problem consisting of some physics on a rectangular domain which is split in half such that one set of physics is solved on the left, and the other set of physics is solved on the right. Each set has their own set of variable components, and I would like to not allocate both variable sets across the entire domain because the physics in one subdomain happens to have lots of components per mesh element, which the other subdomain doesn't need except to compute boundary interactions. For testing right now, I am using the attached gmsh file to generate a mesh with 2 physical groups to represent each subdomain (called "left" and "right"). It has periodic boundaries on all sides. However, when I try to load the generated mesh into PETSc using the *DMPlexCreateFromFile* function, PETSc complains that the mesh is not a valid Gmsh file. I've attached the sample mesh, as well as the error message PETSc spits out. Here's the relevant code (should be a complete working example) which re-creates what I'm doing: #include > int main(int argc, char** argv) > { > PetscInitialize(&argc, &argv, NULL, "multi physics testing"); > DM dm; > CHKERRQ(DMPlexCreateFromFile(PETSC_COMM_WORLD, "periodic_square.msh", > PETSC_TRUE, &dm)); > PetscFinalize(); > } What is the correct procedure for creating a multi-physics mesh using PETSc DM objects for mesh management? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: periodic_square.geo Type: application/octet-stream Size: 521 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: periodic_square.msh Type: model/mesh Size: 433 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc_error.log Type: text/x-log Size: 3518 bytes Desc: not available URL: From knepley at gmail.com Sat Jul 30 12:49:25 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 30 Jul 2016 12:49:25 -0500 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: On Sat, Jul 30, 2016 at 12:19 PM, Andrew Ho wrote: > I am trying to solve a multi-physics problem consisting of some physics on > a rectangular domain which is split in half such that one set of physics is > solved on the left, and the other set of physics is solved on the right. 
> > Each set has their own set of variable components, and I would like to not > allocate both variable sets across the entire domain because the physics in > one subdomain happens to have lots of components per mesh element, which > the other subdomain doesn't need except to compute boundary interactions. > > For testing right now, I am using the attached gmsh file to generate a > mesh with 2 physical groups to represent each subdomain (called "left" and > "right"). It has periodic boundaries on all sides. > > However, when I try to load the generated mesh into PETSc using the > *DMPlexCreateFromFile* function, PETSc complains that the mesh is not a > valid Gmsh file. I've attached the sample mesh, as well as the error > message PETSc spits out. > > Here's the relevant code (should be a complete working example) which > re-creates what I'm doing: > > #include > > >> int main(int argc, char** argv) >> { >> PetscInitialize(&argc, &argv, NULL, "multi physics testing"); >> DM dm; >> CHKERRQ(DMPlexCreateFromFile(PETSC_COMM_WORLD, "periodic_square.msh", >> PETSC_TRUE, &dm)); >> PetscFinalize(); >> } > > > What is the correct procedure for creating a multi-physics mesh using > PETSc DM objects for mesh management? > 1) I don't use Physical Groups from GMsh since its unclear how this would be reflected in the discretization 2) You should make a PetscSection representing your data layout, which is discussed in the manual and in the tutorials. The number of dofs on different cells/edges/vertices will be different across the mesh (it sounds like from your description). 3) Obviously this means the closures of different cells will be different sizes. I am not sure how your assembly is setup to handle this. Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Sat Jul 30 13:06:26 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Sat, 30 Jul 2016 11:06:26 -0700 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: > > 1) I don't use Physical Groups from GMsh since its unclear how this would > be reflected in the discretization If I'm not using physical groups in GMsh, how do I easily denote what part of the domain should be handled with which physics? I would like to be able to use the same code with similar but not identical meshes (for example to do a convergence study), so manually iterating through a list of vertices at the element height stratum in a chart doesn't provide any hints on which subdomain an element is suppose to belong in. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Jul 30 13:11:15 2016 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 30 Jul 2016 13:11:15 -0500 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: On Sat, Jul 30, 2016 at 1:06 PM, Andrew Ho wrote: > 1) I don't use Physical Groups from GMsh since its unclear how this would >> be reflected in the discretization > > > If I'm not using physical groups in GMsh, how do I easily denote what part > of the domain should be handled with which physics? 
I would like to be able > to use the same code with similar but not identical meshes (for example to > do a convergence study), so manually iterating through a list of vertices > at the element height stratum in a chart doesn't provide any hints on which > subdomain an element is suppose to belong in. > I think the right way to handle all this is to just mark pieces of the mesh. Mesh formats should just have a generic marking ability which does not differentiate between vertices, edges, faces, and cells. Some formats come close (ExodusII) and some are just crazy (GMsh). If you can point me toward the documentation for the GMsh format, I will put in code to translate whatever part marks cells to a cell label, as we do for ExodusII. Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrewh0 at uw.edu Sat Jul 30 13:35:41 2016 From: andrewh0 at uw.edu (Andrew Ho) Date: Sat, 30 Jul 2016 11:35:41 -0700 Subject: [petsc-users] Multi-physics meshes with PETSc DM? In-Reply-To: References: Message-ID: Is there a reason the physical groups aren't sufficient for handling this? As far as I can tell, this is the only way in GMsh to have any kind of grouping of elements. The Gmsh file format can be found here (happens to be the ASCII version, but binary version is below that): http://gmsh.info/doc/texinfo/gmsh.html#MSH-ASCII-file-format All tags are attributed to elements; there may be multiple element types (points, lines, triangles, etc.), but at the end of the day each element just has a list of indices indicating which physical group(s) each element belongs to. >From the documentation for ASCII formatted mesh files: number-of-tags gives the number of integer tags that follow for the n-th element. By > default, the first tag is the number of the physical entity to which the > element belongs; the second is the number of the elementary geometrical > entity to which the element belongs; the third is the number of mesh > partitions to which the element belongs, followed by the partition ids > (negative partition ids indicate ghost cells). A zero tag is equivalent to > no tag. Gmsh and most codes using the MSH 2 format require at least the > first two tags (physical and elementary tags). My understanding is to support markers you only need to add a 4th stratum level which has one node per physical group. It would be helpful (though not necessary) if this subdomain marker stratum level had the physical tag name labels properly associated with the corresponding nodes on the graph, but this is not necessary since it's just as easy to refer to them by node number as long as the node numbering matches or is a simple transform of the numbering scheme in the original physical group id's. On Sat, Jul 30, 2016 at 11:11 AM, Matthew Knepley wrote: > On Sat, Jul 30, 2016 at 1:06 PM, Andrew Ho wrote: > >> 1) I don't use Physical Groups from GMsh since its unclear how this would >>> be reflected in the discretization >> >> >> If I'm not using physical groups in GMsh, how do I easily denote what >> part of the domain should be handled with which physics? 
I would like to be >> able to use the same code with similar but not identical meshes (for >> example to do a convergence study), so manually iterating through a list of >> vertices at the element height stratum in a chart doesn't provide any hints >> on which subdomain an element is suppose to belong in. >> > > I think the right way to handle all this is to just mark pieces of the > mesh. Mesh formats should just have a generic marking > ability which does not differentiate between vertices, edges, faces, and > cells. Some formats come close (ExodusII) and some > are just crazy (GMsh). If you can point me toward the documentation for > the GMsh format, I will put in code to translate whatever > part marks cells to a cell label, as we do for ExodusII. > > Thanks, > > Matt > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > -- Andrew Ho -------------- next part -------------- An HTML attachment was scrubbed... URL:
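Tying together the two suggestions in this last thread, namely a cell label for the subdomain marking and a PetscSection for the uneven data layout, a rough C sketch follows. The label name "Cell Sets" is borrowed from PETSc's ExodusII reader and is an assumption here (the thread indicates the Gmsh reader did not yet create such a label), and the marker values 1/2 and the dof counts 8/2 are made up for illustration only.

#include <petscdmplex.h>

PetscErrorCode BuildTwoPhysicsSection(DM dm, PetscSection *section)
{
  PetscErrorCode ierr;
  PetscInt       pStart, pEnd, cStart, cEnd, c, marker;

  PetscFunctionBeginUser;
  ierr = DMPlexGetChart(dm, &pStart, &pEnd);CHKERRQ(ierr);
  ierr = DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);CHKERRQ(ierr);   /* height 0 = cells */
  ierr = PetscSectionCreate(PetscObjectComm((PetscObject)dm), section);CHKERRQ(ierr);
  ierr = PetscSectionSetChart(*section, pStart, pEnd);CHKERRQ(ierr);
  for (c = cStart; c < cEnd; ++c) {
    /* assumed label translating the Gmsh physical groups into per-cell markers */
    ierr = DMGetLabelValue(dm, "Cell Sets", c, &marker);CHKERRQ(ierr);
    /* many unknowns for the "left" physics, few for the "right" (illustrative counts) */
    ierr = PetscSectionSetDof(*section, c, (marker == 1) ? 8 : 2);CHKERRQ(ierr);
  }
  ierr = PetscSectionSetUp(*section);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

The section would then be attached with DMSetDefaultSection() (the name in the 3.7 API), so vectors created from the DM carry only the dofs each subdomain needs; as noted in Matt's third point, cell closures will then have different sizes across the interface and the assembly loop has to allow for that.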