From mafunk at nmsu.edu  Tue Aug  1 16:45:22 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 1 Aug 2006 15:45:22 -0600
Subject: profiling PETSc code
Message-ID: <200608011545.26224.mafunk@nmsu.edu>

Hi,

i need to profile my code. Specifically the KSPSolve(...) call. Basically i am 
(just for testing) setting up the identity matrix and pass in the solution 
and RHS vectors. I solve the system 4000 times or so (400 times steps that 
is). Anyway, when i run on 1 processor it takes essentially no time (ca 7 
secs). When i run on 4 procs it takes 96 secs. I use an external timer to 
profile just the call to KSPSolve which is where the times come from.

So i read through chap 12. Unfortunately i cannot find ex21.c in 
src/ksp/ksp/examples/tutorials so i couldn't look at the code.

So what i do is the following. At the end of my program, right before calling 
PetscFinalize() i call PetscLogPrintSummary(...). I registered stage 0 and 
then right before the KSPSolve() call i do :   m_ierr = PetscLogStagePush(0); 
and right after i do: m_ierr = PetscLogStagePop(). There is also barriers 
right before the push call and right after the pop call.

The output i get (for the single proc run) is:

...

---------------------------------------------- PETSc Performance Summary: 
----------------------------------------------

/home2/users/mafunk/AMR/Chombo.2.0/example/node/maskExec/testNodePoisson2d.Linux.g++.g77.MPI32.ex 
on a linux-gnu named .1 with 1 processor, by mafunk Tue Aug  1 14:47:43 2006
Using Petsc Release Version 2.3.1, Patch 12, Wed Apr  5 17:55:50 CDT 2006
BK revision: balay at asterix.mcs.anl.gov|ChangeSet|20060405225457|13540

                         Max       Max/Min        Avg      Total 
Time (sec):           4.167e+01      1.00000   4.167e+01
Objects:              1.500e+01      1.00000   1.500e+01
Flops:                2.094e+08      1.00000   2.094e+08  2.094e+08
Flops/sec:            5.025e+06      1.00000   5.025e+06  5.025e+06
Memory:               4.700e+05      1.00000              4.700e+05
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       2.402e+04      1.00000

Flop counting convention: 1 flop = 1 real number operation of type 
(multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 
2N flops
                            and VecAXPY() for complex vectors of length N --> 
8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  
-- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     
Avg         %Total   counts   %Total 
 0:      Main Stage: 4.1647e+01  99.9%  2.0937e+08 100.0%  0.000e+00   0.0%  
0.000e+00        0.0%  2.402e+04 100.0% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting 
output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops/sec: Max - maximum over all processors
                       Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and 
PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in 
this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over 
all processors)
------------------------------------------------------------------------------------------------------------------------


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was compiled with a debugging option,      #
      #   To get timing results run config/configure.py        #
      #   using --with-debugging=no, the performance will      #
      #   be generally two or three times faster.              #
      #                                                        #
      ##########################################################


      ##########################################################
      #                                                        #
      #                          WARNING!!!                    #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################


Event                Count      Time (sec)     Flops/sec                         
--- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len 
Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

           Index Set     3              3      35976     0
                 Vec     8              8     169056     0
              Matrix     2              2      23304     0
       Krylov Solver     1              1      17216     0
      Preconditioner     1              1        168     0
========================================================================================================================
Average time to get PetscTime(): 2.14577e-07
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 
sizeof(PetscScalar) 8
Configure run at: Tue Aug  1 11:28:46 2006
Configure options: --with-debugging=1 --with-blas-lapack-dir=/usr/local 
--with-mpi-dir=/usr --with-log=1 --with-shared=0


....


So i want to find out why KSPSolve() takes so long in parallel. However, there 
i no summary for stage 0. Does someone know why this is? Am i using it in a 
wrong way? I compiled the libraries with -with_log=1 -with-debugging=1

thanks
mat


From knepley at gmail.com  Tue Aug  1 16:57:22 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 1 Aug 2006 16:57:22 -0500
Subject: profiling PETSc code
In-Reply-To: <200608011545.26224.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu>
Message-ID: <a9f269830608011457mab0efd2i9290f9b5d967b2d9@mail.gmail.com>

Take out your stage push/pop for the moment, and the log_summary
call. Just run with -log_summary and send the output as a test.

  Thanks,

     Matt

On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> Hi,
>
> i need to profile my code. Specifically the KSPSolve(...) call. Basically i am
> (just for testing) setting up the identity matrix and pass in the solution
> and RHS vectors. I solve the system 4000 times or so (400 times steps that
> is). Anyway, when i run on 1 processor it takes essentially no time (ca 7
> secs). When i run on 4 procs it takes 96 secs. I use an external timer to
> profile just the call to KSPSolve which is where the times come from.
>
> So i read through chap 12. Unfortunately i cannot find ex21.c in
> src/ksp/ksp/examples/tutorials so i couldn't look at the code.
>
> So what i do is the following. At the end of my program, right before calling
> PetscFinalize() i call PetscLogPrintSummary(...). I registered stage 0 and
> then right before the KSPSolve() call i do :   m_ierr = PetscLogStagePush(0);
> and right after i do: m_ierr = PetscLogStagePop(). There is also barriers
> right before the push call and right after the pop call.
>
> The output i get (for the single proc run) is:
>
> ...
>
> ---------------------------------------------- PETSc Performance Summary:
> ----------------------------------------------
>
> /home2/users/mafunk/AMR/Chombo.2.0/example/node/maskExec/testNodePoisson2d.Linux.g++.g77.MPI32.ex
> on a linux-gnu named .1 with 1 processor, by mafunk Tue Aug  1 14:47:43 2006
> Using Petsc Release Version 2.3.1, Patch 12, Wed Apr  5 17:55:50 CDT 2006
> BK revision: balay at asterix.mcs.anl.gov|ChangeSet|20060405225457|13540
>
>                          Max       Max/Min        Avg      Total
> Time (sec):           4.167e+01      1.00000   4.167e+01
> Objects:              1.500e+01      1.00000   1.500e+01
> Flops:                2.094e+08      1.00000   2.094e+08  2.094e+08
> Flops/sec:            5.025e+06      1.00000   5.025e+06  5.025e+06
> Memory:               4.700e+05      1.00000              4.700e+05
> MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
> MPI Reductions:       2.402e+04      1.00000
>
> Flop counting convention: 1 flop = 1 real number operation of type
> (multiply/divide/add/subtract)
>                             e.g., VecAXPY() for real vectors of length N -->
> 2N flops
>                             and VecAXPY() for complex vectors of length N -->
> 8N flops
>
> Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---
> -- Message Lengths --  -- Reductions --
>                         Avg     %Total     Avg     %Total   counts   %Total
> Avg         %Total   counts   %Total
>  0:      Main Stage: 4.1647e+01  99.9%  2.0937e+08 100.0%  0.000e+00   0.0%
> 0.000e+00        0.0%  2.402e+04 100.0%
>
> ------------------------------------------------------------------------------------------------------------------------
> See the 'Profiling' chapter of the users' manual for details on interpreting
> output.
> Phase summary info:
>    Count: number of times phase was executed
>    Time and Flops/sec: Max - maximum over all processors
>                        Ratio - ratio of maximum to minimum over all processors
>    Mess: number of messages sent
>    Avg. len: average message length
>    Reduct: number of global reductions
>    Global: entire computation
>    Stage: stages of a computation. Set stages with PetscLogStagePush() and
> PetscLogStagePop().
>       %T - percent time in this phase         %F - percent flops in this phase
>       %M - percent messages in this phase     %L - percent message lengths in
> this phase
>       %R - percent reductions in this phase
>    Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over
> all processors)
> ------------------------------------------------------------------------------------------------------------------------
>
>
>       ##########################################################
>       #                                                        #
>       #                          WARNING!!!                    #
>       #                                                        #
>       #   This code was compiled with a debugging option,      #
>       #   To get timing results run config/configure.py        #
>       #   using --with-debugging=no, the performance will      #
>       #   be generally two or three times faster.              #
>       #                                                        #
>       ##########################################################
>
>
>
>
>       ##########################################################
>       #                                                        #
>       #                          WARNING!!!                    #
>       #                                                        #
>       #   This code was run without the PreLoadBegin()         #
>       #   macros. To get timing results we always recommend    #
>       #   preloading. otherwise timing numbers may be          #
>       #   meaningless.                                         #
>       ##########################################################
>
>
> Event                Count      Time (sec)     Flops/sec
> --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions   Memory  Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
>            Index Set     3              3      35976     0
>                  Vec     8              8     169056     0
>               Matrix     2              2      23304     0
>        Krylov Solver     1              1      17216     0
>       Preconditioner     1              1        168     0
> ========================================================================================================================
> Average time to get PetscTime(): 2.14577e-07
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
> sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8
> sizeof(PetscScalar) 8
> Configure run at: Tue Aug  1 11:28:46 2006
> Configure options: --with-debugging=1 --with-blas-lapack-dir=/usr/local
> --with-mpi-dir=/usr --with-log=1 --with-shared=0
>
>
> ....
>
>
> So i want to find out why KSPSolve() takes so long in parallel. However, there
> i no summary for stage 0. Does someone know why this is? Am i using it in a
> wrong way? I compiled the libraries with -with_log=1 -with-debugging=1
>
> thanks
> mat
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From mafunk at nmsu.edu  Tue Aug  1 17:30:04 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 1 Aug 2006 16:30:04 -0600
Subject: profiling PETSc code
In-Reply-To: <a9f269830608011457mab0efd2i9290f9b5d967b2d9@mail.gmail.com>
References: <200608011545.26224.mafunk@nmsu.edu> <a9f269830608011457mab0efd2i9290f9b5d967b2d9@mail.gmail.com>
Message-ID: <200608011630.05454.mafunk@nmsu.edu>

Hi,

well, now i do get  summary:
...
     #                          WARNING!!!                    #
      #                                                        #
      #   This code was run without the PreLoadBegin()         #
      #   macros. To get timing results we always recommend    #
      #   preloading. otherwise timing numbers may be          #
      #   meaningless.                                         #
      ##########################################################


Event                Count      Time (sec)     Flops/sec                         
--- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len 
Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecNorm              200 1.0 5.6217e-03 1.0 2.07e+08 1.0 0.0e+00 0.0e+00 
1.0e+02  0 36  0  0  7   0 36  0  0 31   207
VecCopy              200 1.0 4.2303e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet                 1 1.0 8.1062e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAYPX              100 1.0 3.2036e-03 1.0 1.82e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  0 18  0  0  0   0 18  0  0  0   182
MatMult              100 1.0 8.3530e-03 1.0 3.49e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  1  9  0  0  0   1  9  0  0  0    35
MatSolve             200 1.0 2.5591e-02 1.0 2.28e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  2 18  0  0  0   2 18  0  0  0    23
MatSolveTranspos     100 1.0 2.1357e-02 1.0 1.36e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  2  9  0  0  0   2  9  0  0  0    14
MatLUFactorNum       100 1.0 4.6215e-02 1.0 6.30e+06 1.0 0.0e+00 0.0e+00 
0.0e+00  3  9  0  0  0   3  9  0  0  0     6
MatILUFactorSym        1 1.0 4.4894e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
2.0e+00  0  0  0  0  0   0  0  0  0  1     0
MatAssemblyBegin       1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd       100 1.0 1.1220e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatGetOrdering         1 1.0 2.5296e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
2.0e+00  0  0  0  0  0   0  0  0  0  1     0
KSPSetup             100 1.0 5.3692e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
1.4e+01  0  0  0  0  1   0  0  0  0  4     0
KSPSolve             100 1.0 9.0056e-02 1.0 3.23e+07 1.0 0.0e+00 0.0e+00 
3.0e+02  7 91  0  0 21   7 91  0  0 93    32
PCSetUp              100 1.0 4.9087e-02 1.0 5.93e+06 1.0 0.0e+00 0.0e+00 
4.0e+00  4  9  0  0  0   4  9  0  0  1     6
PCApply              300 1.0 4.9106e-02 1.0 1.78e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  4 27  0  0  0   4 27  0  0  0    18
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions   Memory  Descendants' Mem.

--- Event Stage 0: Main Stage

           Index Set     3              3      35976     0
                 Vec   109            109    2458360     0
              Matrix     2              2      23304     0
       Krylov Solver     1              1          0     0
      Preconditioner     1              1        168     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)

...

am i using the push and pop calls in an manner they are not intended to be 
used?

plus, how can i see what's going on with respect to why it takes so much 
longer to solve the system in parallel than in serial without being able to 
specify the stages (i.e single out the KSPSolve call)?


mat


On Tuesday 01 August 2006 15:57, Matthew Knepley wrote:
> ke out your stage push/pop for the moment, and the log_summary
> call. Just run with -log_summary and send the output as a test.
>
> ? Thanks,
>
> ? ? ?Matt


From knepley at gmail.com  Tue Aug  1 17:36:55 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 1 Aug 2006 17:36:55 -0500
Subject: profiling PETSc code
In-Reply-To: <200608011630.05454.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu>
	 <a9f269830608011457mab0efd2i9290f9b5d967b2d9@mail.gmail.com>
	 <200608011630.05454.mafunk@nmsu.edu>
Message-ID: <a9f269830608011536v12fb4457l31e3d363a666a893@mail.gmail.com>

On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> Hi,
>
> well, now i do get  summary:
> ...
>      #                          WARNING!!!                    #
>       #                                                        #
>       #   This code was run without the PreLoadBegin()         #
>       #   macros. To get timing results we always recommend    #
>       #   preloading. otherwise timing numbers may be          #
>       #   meaningless.                                         #
>       ##########################################################
>
>
> Event                Count      Time (sec)     Flops/sec
> --- Global ---  --- Stage ---   Total
>                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> ------------------------------------------------------------------------------------------------------------------------
>
> --- Event Stage 0: Main Stage
>
> VecNorm              200 1.0 5.6217e-03 1.0 2.07e+08 1.0 0.0e+00 0.0e+00
> 1.0e+02  0 36  0  0  7   0 36  0  0 31   207
> VecCopy              200 1.0 4.2303e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecSet                 1 1.0 8.1062e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAYPX              100 1.0 3.2036e-03 1.0 1.82e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0 18  0  0  0   0 18  0  0  0   182
> MatMult              100 1.0 8.3530e-03 1.0 3.49e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  1  9  0  0  0   1  9  0  0  0    35
> MatSolve             200 1.0 2.5591e-02 1.0 2.28e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  2 18  0  0  0   2 18  0  0  0    23
> MatSolveTranspos     100 1.0 2.1357e-02 1.0 1.36e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  2  9  0  0  0   2  9  0  0  0    14
> MatLUFactorNum       100 1.0 4.6215e-02 1.0 6.30e+06 1.0 0.0e+00 0.0e+00
> 0.0e+00  3  9  0  0  0   3  9  0  0  0     6
> MatILUFactorSym        1 1.0 4.4894e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
> MatAssemblyBegin       1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatAssemblyEnd       100 1.0 1.1220e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> MatGetOrdering         1 1.0 2.5296e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
> KSPSetup             100 1.0 5.3692e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.4e+01  0  0  0  0  1   0  0  0  0  4     0
> KSPSolve             100 1.0 9.0056e-02 1.0 3.23e+07 1.0 0.0e+00 0.0e+00
> 3.0e+02  7 91  0  0 21   7 91  0  0 93    32
> PCSetUp              100 1.0 4.9087e-02 1.0 5.93e+06 1.0 0.0e+00 0.0e+00
> 4.0e+00  4  9  0  0  0   4  9  0  0  1     6
> PCApply              300 1.0 4.9106e-02 1.0 1.78e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  4 27  0  0  0   4 27  0  0  0    18
> ------------------------------------------------------------------------------------------------------------------------
>
> Memory usage is given in bytes:
>
> Object Type          Creations   Destructions   Memory  Descendants' Mem.
>
> --- Event Stage 0: Main Stage
>
>            Index Set     3              3      35976     0
>                  Vec   109            109    2458360     0
>               Matrix     2              2      23304     0
>        Krylov Solver     1              1          0     0
>       Preconditioner     1              1        168     0
> ========================================================================================================================
> Average time to get PetscTime(): 9.53674e-08
> Compiled without FORTRAN kernels
> Compiled with full precision matrices (default)
>
> ...
>
> am i using the push and pop calls in an manner they are not intended to be
> used?

Not exactly. You need to register a stage first before pushing it.

  http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Profiling/PetscLogStageRegister.html

> plus, how can i see what's going on with respect to why it takes so much
> longer to solve the system in parallel than in serial without being able to
> specify the stages (i.e single out the KSPSolve call)?

There are 100 calls to KSPSolve() which collectively take .1s. Your
problem is most
likely in matrix setup. I would bet that you have not preallocated the
space correctly.
Therefore, a malloc() is called every time you insert a value. You can
check the number of mallocs using -info.

   Matt


> mat
>
>
>
>
>
> On Tuesday 01 August 2006 15:57, Matthew Knepley wrote:
> > ke out your stage push/pop for the moment, and the log_summary
> > call. Just run with -log_summary and send the output as a test.
> >
> > Thanks,
> >
> > Matt
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From mafunk at nmsu.edu  Tue Aug  1 18:49:46 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 1 Aug 2006 17:49:46 -0600
Subject: profiling PETSc code
In-Reply-To: <a9f269830608011536v12fb4457l31e3d363a666a893@mail.gmail.com>
References: <200608011545.26224.mafunk@nmsu.edu> <200608011630.05454.mafunk@nmsu.edu> <a9f269830608011536v12fb4457l31e3d363a666a893@mail.gmail.com>
Message-ID: <200608011749.48651.mafunk@nmsu.edu>

Hi,

i don't think it is the mallocs since it says things like:
[0] MatAssemblyEnd_SeqAIJMatrix size: 2912 X 2912; storage space: 0 
unneeded,2912 used
[0] MatAssemblyEnd_SeqAIJNumber of mallocs during MatSetValues() is 0

However, i do get errors. They look like:
[0]PETSC ERROR: StageLogRegister() line 95 in src/sys/plog/stageLog.c
[0]PETSC ERROR: Invalid pointer!
[0]PETSC ERROR: Null Pointer: Parameter # 3!
[0]PETSC ERROR: PetscLogStageRegister() line 375 in src/sys/plog/plog.c

which happens during the call PETSCInitialize(...);


After that i get an error like:
[0] PetscCommDuplicateDuplicating a communicator 91 164 max tags = 1073741823
[0] PetscCommDuplicateUsing internal PETSc communicator 91 164
[0] PetscCommDuplicateUsing internal PETSc communicator 91 164
[0] PetscCommDuplicateUsing internal PETSc communicator 91 164
[0]PETSC ERROR: MatGetVecs() line 6283 in src/mat/interface/matrix.c
[0]PETSC ERROR: Null argument, when expecting valid pointer!
[0]PETSC ERROR: Null Object: Parameter # 1!
[0]PETSC ERROR: KSPGetVecs() line 555 in src/ksp/ksp/interface/iterativ.c
[0]PETSC ERROR: KSPDefaultGetWork() line 597 in 
src/ksp/ksp/interface/iterativ.c
[0]PETSC ERROR: KSPSetUp_CG() line 75 in src/ksp/ksp/impls/cg/cg.c
[0]PETSC ERROR: KSPSetUp() line 198 in src/ksp/ksp/interface/itfunc.c

so i suppose that is a problem. I am just not sure what it means.
any ideas?

mat


On Tuesday 01 August 2006 16:36, Matthew Knepley wrote:
> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > Hi,
> >
> > well, now i do get  summary:
> > ...
> >      #                          WARNING!!!                    #
> >       #                                                        #
> >       #   This code was run without the PreLoadBegin()         #
> >       #   macros. To get timing results we always recommend    #
> >       #   preloading. otherwise timing numbers may be          #
> >       #   meaningless.                                         #
> >       ##########################################################
> >
> >
> > Event                Count      Time (sec)     Flops/sec
> > --- Global ---  --- Stage ---   Total
> >                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len
> > Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> > -------------------------------------------------------------------------
> >-----------------------------------------------
> >
> > --- Event Stage 0: Main Stage
> >
> > VecNorm              200 1.0 5.6217e-03 1.0 2.07e+08 1.0 0.0e+00 0.0e+00
> > 1.0e+02  0 36  0  0  7   0 36  0  0 31   207
> > VecCopy              200 1.0 4.2303e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > VecSet                 1 1.0 8.1062e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > VecAYPX              100 1.0 3.2036e-03 1.0 1.82e+08 1.0 0.0e+00 0.0e+00
> > 0.0e+00  0 18  0  0  0   0 18  0  0  0   182
> > MatMult              100 1.0 8.3530e-03 1.0 3.49e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  1  9  0  0  0   1  9  0  0  0    35
> > MatSolve             200 1.0 2.5591e-02 1.0 2.28e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  2 18  0  0  0   2 18  0  0  0    23
> > MatSolveTranspos     100 1.0 2.1357e-02 1.0 1.36e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  2  9  0  0  0   2  9  0  0  0    14
> > MatLUFactorNum       100 1.0 4.6215e-02 1.0 6.30e+06 1.0 0.0e+00 0.0e+00
> > 0.0e+00  3  9  0  0  0   3  9  0  0  0     6
> > MatILUFactorSym        1 1.0 4.4894e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
> > MatAssemblyBegin       1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > MatAssemblyEnd       100 1.0 1.1220e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> > MatGetOrdering         1 1.0 2.5296e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
> > KSPSetup             100 1.0 5.3692e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00
> > 1.4e+01  0  0  0  0  1   0  0  0  0  4     0
> > KSPSolve             100 1.0 9.0056e-02 1.0 3.23e+07 1.0 0.0e+00 0.0e+00
> > 3.0e+02  7 91  0  0 21   7 91  0  0 93    32
> > PCSetUp              100 1.0 4.9087e-02 1.0 5.93e+06 1.0 0.0e+00 0.0e+00
> > 4.0e+00  4  9  0  0  0   4  9  0  0  1     6
> > PCApply              300 1.0 4.9106e-02 1.0 1.78e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  4 27  0  0  0   4 27  0  0  0    18
> > -------------------------------------------------------------------------
> >-----------------------------------------------
> >
> > Memory usage is given in bytes:
> >
> > Object Type          Creations   Destructions   Memory  Descendants' Mem.
> >
> > --- Event Stage 0: Main Stage
> >
> >            Index Set     3              3      35976     0
> >                  Vec   109            109    2458360     0
> >               Matrix     2              2      23304     0
> >        Krylov Solver     1              1          0     0
> >       Preconditioner     1              1        168     0
> > =========================================================================
> >=============================================== Average time to get
> > PetscTime(): 9.53674e-08
> > Compiled without FORTRAN kernels
> > Compiled with full precision matrices (default)
> >
> > ...
> >
> > am i using the push and pop calls in an manner they are not intended to
> > be used?
>
> Not exactly. You need to register a stage first before pushing it.
>
>  
> http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/man
>ualpages/Profiling/PetscLogStageRegister.html
>
> > plus, how can i see what's going on with respect to why it takes so much
> > longer to solve the system in parallel than in serial without being able
> > to specify the stages (i.e single out the KSPSolve call)?
>
> There are 100 calls to KSPSolve() which collectively take .1s. Your
> problem is most
> likely in matrix setup. I would bet that you have not preallocated the
> space correctly.
> Therefore, a malloc() is called every time you insert a value. You can
> check the number of mallocs using -info.
>
>    Matt
>
> > mat
> >
> > On Tuesday 01 August 2006 15:57, Matthew Knepley wrote:
> > > ke out your stage push/pop for the moment, and the log_summary
> > > call. Just run with -log_summary and send the output as a test.
> > >
> > > Thanks,
> > >
> > > Matt


From knepley at gmail.com  Tue Aug  1 19:07:26 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 1 Aug 2006 19:07:26 -0500
Subject: profiling PETSc code
In-Reply-To: <200608011749.48651.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu>
	 <200608011630.05454.mafunk@nmsu.edu>
	 <a9f269830608011536v12fb4457l31e3d363a666a893@mail.gmail.com>
	 <200608011749.48651.mafunk@nmsu.edu>
Message-ID: <a9f269830608011707m12212e5ey2079acc68371672c@mail.gmail.com>

On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> Hi,
>
> i don't think it is the mallocs since it says things like:
> [0] MatAssemblyEnd_SeqAIJMatrix size: 2912 X 2912; storage space: 0
> unneeded,2912 used
> [0] MatAssemblyEnd_SeqAIJNumber of mallocs during MatSetValues() is 0

This is only on one processor.

> However, i do get errors. They look like:
> [0]PETSC ERROR: StageLogRegister() line 95 in src/sys/plog/stageLog.c
> [0]PETSC ERROR: Invalid pointer!
> [0]PETSC ERROR: Null Pointer: Parameter # 3!
> [0]PETSC ERROR: PetscLogStageRegister() line 375 in src/sys/plog/plog.c

You gave an invalid pointer to the call. You should have

  int stage;
  PetscLogStageRegister(&stage, "MyStage");

> which happens during the call PETSCInitialize(...);
>
> After that i get an error like:
> [0] PetscCommDuplicateDuplicating a communicator 91 164 max tags = 1073741823
> [0] PetscCommDuplicateUsing internal PETSc communicator 91 164
> [0] PetscCommDuplicateUsing internal PETSc communicator 91 164
> [0] PetscCommDuplicateUsing internal PETSc communicator 91 164
> [0]PETSC ERROR: MatGetVecs() line 6283 in src/mat/interface/matrix.c
> [0]PETSC ERROR: Null argument, when expecting valid pointer!
> [0]PETSC ERROR: Null Object: Parameter # 1!
> [0]PETSC ERROR: KSPGetVecs() line 555 in src/ksp/ksp/interface/iterativ.c
> [0]PETSC ERROR: KSPDefaultGetWork() line 597 in
> src/ksp/ksp/interface/iterativ.c
> [0]PETSC ERROR: KSPSetUp_CG() line 75 in src/ksp/ksp/impls/cg/cg.c
> [0]PETSC ERROR: KSPSetUp() line 198 in src/ksp/ksp/interface/itfunc.c
>
> so i suppose that is a problem. I am just not sure what it means.
> any ideas?

It looks like you have not called KSPSetOperators().

   Matt

-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From mafunk at nmsu.edu  Tue Aug  1 19:07:28 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 1 Aug 2006 18:07:28 -0600
Subject: profiling PETSc code
In-Reply-To: <200608011749.48651.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu> <a9f269830608011536v12fb4457l31e3d363a666a893@mail.gmail.com> <200608011749.48651.mafunk@nmsu.edu>
Message-ID: <200608011807.29007.mafunk@nmsu.edu>

Actually the errors occur on my calls to a PETSc functions after calling 
PETSCInitialize.

mat

On Tuesday 01 August 2006 17:49, Matt Funk wrote:
> Hi,
>
> i don't think it is the mallocs since it says things like:
> [0] MatAssemblyEnd_SeqAIJMatrix size: 2912 X 2912; storage space: 0
> unneeded,2912 used
> [0] MatAssemblyEnd_SeqAIJNumber of mallocs during MatSetValues() is 0
>
> However, i do get errors. They look like:
> [0]PETSC ERROR: StageLogRegister() line 95 in src/sys/plog/stageLog.c
> [0]PETSC ERROR: Invalid pointer!
> [0]PETSC ERROR: Null Pointer: Parameter # 3!
> [0]PETSC ERROR: PetscLogStageRegister() line 375 in src/sys/plog/plog.c
>
> which happens during the call PETSCInitialize(...);
>
>
> After that i get an error like:
> [0] PetscCommDuplicateDuplicating a communicator 91 164 max tags =
> 1073741823 [0] PetscCommDuplicateUsing internal PETSc communicator 91 164
> [0] PetscCommDuplicateUsing internal PETSc communicator 91 164
> [0] PetscCommDuplicateUsing internal PETSc communicator 91 164
> [0]PETSC ERROR: MatGetVecs() line 6283 in src/mat/interface/matrix.c
> [0]PETSC ERROR: Null argument, when expecting valid pointer!
> [0]PETSC ERROR: Null Object: Parameter # 1!
> [0]PETSC ERROR: KSPGetVecs() line 555 in src/ksp/ksp/interface/iterativ.c
> [0]PETSC ERROR: KSPDefaultGetWork() line 597 in
> src/ksp/ksp/interface/iterativ.c
> [0]PETSC ERROR: KSPSetUp_CG() line 75 in src/ksp/ksp/impls/cg/cg.c
> [0]PETSC ERROR: KSPSetUp() line 198 in src/ksp/ksp/interface/itfunc.c
>
> so i suppose that is a problem. I am just not sure what it means.
> any ideas?
>
> mat
>
> On Tuesday 01 August 2006 16:36, Matthew Knepley wrote:
> > On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > Hi,
> > >
> > > well, now i do get  summary:
> > > ...
> > >      #                          WARNING!!!                    #
> > >       #                                                        #
> > >       #   This code was run without the PreLoadBegin()         #
> > >       #   macros. To get timing results we always recommend    #
> > >       #   preloading. otherwise timing numbers may be          #
> > >       #   meaningless.                                         #
> > >       ##########################################################
> > >
> > >
> > > Event                Count      Time (sec)     Flops/sec
> > > --- Global ---  --- Stage ---   Total
> > >                    Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg
> > > len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
> > > -----------------------------------------------------------------------
> > >-- -----------------------------------------------
> > >
> > > --- Event Stage 0: Main Stage
> > >
> > > VecNorm              200 1.0 5.6217e-03 1.0 2.07e+08 1.0 0.0e+00
> > > 0.0e+00 1.0e+02  0 36  0  0  7   0 36  0  0 31   207
> > > VecCopy              200 1.0 4.2303e-03 1.0 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > VecSet                 1 1.0 8.1062e-06 1.0 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > VecAYPX              100 1.0 3.2036e-03 1.0 1.82e+08 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   182
> > > MatMult              100 1.0 8.3530e-03 1.0 3.49e+07 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  1  9  0  0  0   1  9  0  0  0    35
> > > MatSolve             200 1.0 2.5591e-02 1.0 2.28e+07 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  2 18  0  0  0   2 18  0  0  0    23
> > > MatSolveTranspos     100 1.0 2.1357e-02 1.0 1.36e+07 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  2  9  0  0  0   2  9  0  0  0    14
> > > MatLUFactorNum       100 1.0 4.6215e-02 1.0 6.30e+06 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  3  9  0  0  0   3  9  0  0  0     6
> > > MatILUFactorSym        1 1.0 4.4894e-04 1.0 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
> > > MatAssemblyBegin       1 1.0 2.1458e-06 1.0 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > MatAssemblyEnd       100 1.0 1.1220e-02 1.0 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
> > > MatGetOrdering         1 1.0 2.5296e-04 1.0 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  1     0
> > > KSPSetup             100 1.0 5.3692e-04 1.0 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 1.4e+01  0  0  0  0  1   0  0  0  0  4     0
> > > KSPSolve             100 1.0 9.0056e-02 1.0 3.23e+07 1.0 0.0e+00
> > > 0.0e+00 3.0e+02  7 91  0  0 21   7 91  0  0 93    32
> > > PCSetUp              100 1.0 4.9087e-02 1.0 5.93e+06 1.0 0.0e+00
> > > 0.0e+00 4.0e+00  4  9  0  0  0   4  9  0  0  1     6
> > > PCApply              300 1.0 4.9106e-02 1.0 1.78e+07 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  4 27  0  0  0   4 27  0  0  0    18
> > > -----------------------------------------------------------------------
> > >-- -----------------------------------------------
> > >
> > > Memory usage is given in bytes:
> > >
> > > Object Type          Creations   Destructions   Memory  Descendants'
> > > Mem.
> > >
> > > --- Event Stage 0: Main Stage
> > >
> > >            Index Set     3              3      35976     0
> > >                  Vec   109            109    2458360     0
> > >               Matrix     2              2      23304     0
> > >        Krylov Solver     1              1          0     0
> > >       Preconditioner     1              1        168     0
> > > =======================================================================
> > >== =============================================== Average time to get
> > > PetscTime(): 9.53674e-08
> > > Compiled without FORTRAN kernels
> > > Compiled with full precision matrices (default)
> > >
> > > ...
> > >
> > > am i using the push and pop calls in an manner they are not intended to
> > > be used?
> >
> > Not exactly. You need to register a stage first before pushing it.
> >
> >
> > http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/m
> >an ualpages/Profiling/PetscLogStageRegister.html
> >
> > > plus, how can i see what's going on with respect to why it takes so
> > > much longer to solve the system in parallel than in serial without
> > > being able to specify the stages (i.e single out the KSPSolve call)?
> >
> > There are 100 calls to KSPSolve() which collectively take .1s. Your
> > problem is most
> > likely in matrix setup. I would bet that you have not preallocated the
> > space correctly.
> > Therefore, a malloc() is called every time you insert a value. You can
> > check the number of mallocs using -info.
> >
> >    Matt
> >
> > > mat
> > >
> > > On Tuesday 01 August 2006 15:57, Matthew Knepley wrote:
> > > > ke out your stage push/pop for the moment, and the log_summary
> > > > call. Just run with -log_summary and send the output as a test.
> > > >
> > > > Thanks,
> > > >
> > > > Matt


From knepley at gmail.com  Tue Aug  1 19:28:08 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 1 Aug 2006 19:28:08 -0500
Subject: profiling PETSc code
In-Reply-To: <200608011807.29007.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu>
	 <a9f269830608011536v12fb4457l31e3d363a666a893@mail.gmail.com>
	 <200608011749.48651.mafunk@nmsu.edu>
	 <200608011807.29007.mafunk@nmsu.edu>
Message-ID: <a9f269830608011728y344677c0w2adeed4ce0dddd6d@mail.gmail.com>

On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> Actually the errors occur on my calls to a PETSc functions after calling
> PETSCInitialize.

Yes, it is the error I pointed out in the last message.

   Matt

> mat
-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From jiaxun_hou at yahoo.com.cn  Wed Aug  2 06:42:35 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Wed, 2 Aug 2006 19:42:35 +0800 (CST)
Subject: some problems in using PETSC with FFTW3 package
Message-ID: <20060802114235.73141.qmail@web15802.mail.cnb.yahoo.com>

Hi all,
  I am trying to using the package FFTW3 in PETSC.
  How can I change type from PetscScalar to complex or double[2]? 
  The documentation seems a bit sketchy.
   
  Regards
Mason


---------------------------------
????????-3.5G???20M??? 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060802/e1291c1a/attachment.htm>

From balay at mcs.anl.gov  Wed Aug  2 09:19:26 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 2 Aug 2006 09:19:26 -0500 (CDT)
Subject: some problems in using PETSC with FFTW3 package
In-Reply-To: <20060802114235.73141.qmail@web15802.mail.cnb.yahoo.com>
References: <20060802114235.73141.qmail@web15802.mail.cnb.yahoo.com>
Message-ID: <Pine.LNX.4.64.0608020919050.25417@asterix>

use the configure option

--with-scalar-type=complex

Satish

On Wed, 2 Aug 2006, jiaxun hou wrote:

> Hi all,
>   I am trying to using the package FFTW3 in PETSC.
>   How can I change type from PetscScalar to complex or double[2]? 
>   The documentation seems a bit sketchy.
>    
>   Regards
> Mason
> 
> 
>  		
> ---------------------------------
> ????????????????-3.5G??????20M?????? 

From hzhang at mcs.anl.gov  Wed Aug  2 10:27:04 2006
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Wed, 2 Aug 2006 10:27:04 -0500 (CDT)
Subject: some problems in using PETSC with FFTW3 package
In-Reply-To: <20060802114235.73141.qmail@web15802.mail.cnb.yahoo.com>
References: <20060802114235.73141.qmail@web15802.mail.cnb.yahoo.com>
Message-ID: <Pine.LNX.4.58.0608021022471.20044@shakey.mcs.anl.gov>


Manson,

We don't have support for FFTW3 yet(we are currently developing
an interface between petsc and FFTW3). How do you use FFTW3 in PETSC?

To build petsc with complex, you need configure petsc with
'--with-scalar-type=complex'

Hong

On Wed, 2 Aug 2006, jiaxun hou wrote:

> Hi all,
>   I am trying to using the package FFTW3 in PETSC.
>   How can I change type from PetscScalar to complex or double[2]?
>   The documentation seems a bit sketchy.
>
>   Regards
> Mason
>
>
>
> ---------------------------------
> ????????????????-3.5G??????20M??????


From jiaxun_hou at yahoo.com.cn  Wed Aug  2 10:44:04 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Wed, 2 Aug 2006 23:44:04 +0800 (CST)
Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20some=20problems=20in=20using=20PETS?=
 =?gb2312?q?C=20with=20FFTW3=20package?=
In-Reply-To: <Pine.LNX.4.64.0608020919050.25417@asterix>
Message-ID: <20060802154404.90585.qmail@web15806.mail.cnb.yahoo.com>

Satish,
  Thanks for your response .
  I am sorry for my confusing description. In fact , I did use the configure option
  --with-scalar-type=complex when I configuated the system. So, I wonder if it is possible to change the type PetscScalar to some kinds like double[2] which can be handled in FFTW package? Or, is there any functions can get the real (imaginary) parts of a Petsc's Vector?
   
  Regards,
  Mason
  
Satish Balay <balay at mcs.anl.gov> ???
  use the configure option

--with-scalar-type=complex

Satish

On Wed, 2 Aug 2006, jiaxun hou wrote:

> Hi all,
> I am trying to using the package FFTW3 in PETSC.
> How can I change type from PetscScalar to complex or double[2]? 
> The documentation seems a bit sketchy.
> 
> Regards
> Mason
> 
> 
> 
> ---------------------------------
> ????????-3.5G???20M??? 

 		
---------------------------------
????????-3.5G???20M??? 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060802/fb850328/attachment.htm>

From jiaxun_hou at yahoo.com.cn  Wed Aug  2 10:57:24 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Wed, 2 Aug 2006 23:57:24 +0800 (CST)
Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20some=20problems=20in=20using=20PETS?=
 =?gb2312?q?C=20with=20FFTW3=20package?=
In-Reply-To: <Pine.LNX.4.58.0608021022471.20044@shakey.mcs.anl.gov>
Message-ID: <20060802155724.448.qmail@web15801.mail.cnb.yahoo.com>

Hong Zhang,
  Thanks for your respones.
   
    In FFTW3, complex type is set by double[2], and it is very easy to handle.
  But in Petsc, I don't konw exactly how the complex type be set. And when I want to do the fast fourier transform on a Petsc's complex vector by using FFTW3, I get the trouble of the translation between Petsc and FFTW3.
   
  Regards,
  Mason

Hong Zhang <hzhang at mcs.anl.gov> ???
  
Manson,

We don't have support for FFTW3 yet(we are currently developing
an interface between petsc and FFTW3). How do you use FFTW3 in PETSC?

To build petsc with complex, you need configure petsc with
'--with-scalar-type=complex'

Hong

On Wed, 2 Aug 2006, jiaxun hou wrote:

> Hi all,
> I am trying to using the package FFTW3 in PETSC.
> How can I change type from PetscScalar to complex or double[2]?
> The documentation seems a bit sketchy.
>
> Regards
> Mason
>
>
>
> ---------------------------------
> ????????-3.5G???20M???


---------------------------------
 ??????-3.5G???20M??
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060802/60e0c5a5/attachment.htm>

From hzhang at mcs.anl.gov  Wed Aug  2 11:11:53 2006
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Wed, 2 Aug 2006 11:11:53 -0500 (CDT)
Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20some=20problems=20in=20using=20PETS?=
 =?gb2312?q?C=20with=20FFTW3=20package?=
In-Reply-To: <20060802155724.448.qmail@web15801.mail.cnb.yahoo.com>
References: <20060802155724.448.qmail@web15801.mail.cnb.yahoo.com>
Message-ID: <Pine.LNX.4.58.0608021109570.20044@shakey.mcs.anl.gov>

You can retrieve real and imaginary part of a petsc scalar from
PetscRealPart()/PetscImaginaryPart()

See an example at
~petsc/src/ksp/ksp/examples/tutorials/ex11.c

Hong

On Wed, 2 Aug 2006, jiaxun hou wrote:

> Hong Zhang,
>   Thanks for your respones.
>
>     In FFTW3, complex type is set by double[2], and it is very easy to handle.
>   But in Petsc, I don't konw exactly how the complex type be set. And when I want to do the fast fourier transform on a Petsc's complex vector by using FFTW3, I get the trouble of the translation between Petsc and FFTW3.
>
>   Regards,
>   Mason
>
> Hong Zhang <hzhang at mcs.anl.gov> ??????
>
> Manson,
>
> We don't have support for FFTW3 yet(we are currently developing
> an interface between petsc and FFTW3). How do you use FFTW3 in PETSC?
>
> To build petsc with complex, you need configure petsc with
> '--with-scalar-type=complex'
>
> Hong
>
> On Wed, 2 Aug 2006, jiaxun hou wrote:
>
> > Hi all,
> > I am trying to using the package FFTW3 in PETSC.
> > How can I change type from PetscScalar to complex or double[2]?
> > The documentation seems a bit sketchy.
> >
> > Regards
> > Mason
> >
> >
> >
> > ---------------------------------
> > ????????????????-3.5G??????20M??????
>
>
>
>
> ---------------------------------
>  ????????????-3.5G??????20M????


From bsmith at mcs.anl.gov  Wed Aug  2 12:12:07 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Wed, 2 Aug 2006 12:12:07 -0500 (CDT)
Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20some=20problems=20in=20using=20PETS?=
 =?gb2312?q?C=20with=20FFTW3=20package?=
In-Reply-To: <20060802154404.90585.qmail@web15806.mail.cnb.yahoo.com>
References: <20060802154404.90585.qmail@web15806.mail.cnb.yahoo.com>
Message-ID: <Pine.OSX.4.64.0608021210180.366@barrys-computer.local>


   Mason,

    A complex number (PetscScalar) is simply a double [2]. So you can either
1) use complex PETSc and caste the arrays when you pass to fftw or
2) user PETScScalar of simply double and pass those beasts to fftw.

   Unless YOUR code is using complex numbers then you should simply use
2 and all is easy.

    Barry


On Wed, 2 Aug 2006, jiaxun hou wrote:

> Satish,
>  Thanks for your response .
>  I am sorry for my confusing description. In fact , I did use the configure option
>  --with-scalar-type=complex when I configuated the system. So, I wonder if it is possible to change the type PetscScalar to some kinds like double[2] which can be handled in FFTW package? Or, is there any functions can get the real (imaginary) parts of a Petsc's Vector?
>
>  Regards,
>  Mason
>
> Satish Balay <balay at mcs.anl.gov> ??????
>  use the configure option
>
> --with-scalar-type=complex
>
> Satish
>
> On Wed, 2 Aug 2006, jiaxun hou wrote:
>
>> Hi all,
>> I am trying to using the package FFTW3 in PETSC.
>> How can I change type from PetscScalar to complex or double[2]?
>> The documentation seems a bit sketchy.
>>
>> Regards
>> Mason
>>
>>
>>
>> ---------------------------------
>> ????????????????-3.5G??????20M??????
>
>
> ---------------------------------
> ????????????????-3.5G??????20M??????

From mafunk at nmsu.edu  Wed Aug  2 16:32:20 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Wed, 2 Aug 2006 15:32:20 -0600
Subject: profiling PETSc code
In-Reply-To: <a9f269830608011728y344677c0w2adeed4ce0dddd6d@mail.gmail.com>
References: <200608011545.26224.mafunk@nmsu.edu> <200608011807.29007.mafunk@nmsu.edu> <a9f269830608011728y344677c0w2adeed4ce0dddd6d@mail.gmail.com>
Message-ID: <200608021532.21971.mafunk@nmsu.edu>

Hi Matt,

thanks for all the help so far. The -info option is really very helpful. So i 
think i straightened the actual errors out. However, now i am back to the 
original question i had. That is why it takes so much longer on 4 procs than 
on 1 proc.

I profiled the KSPSolve(...) as stage 2:

For 1 proc i have:
--- Event Stage 2: Stage 2 of ChomboPetscInterface

VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  0 18  0  0  0   2 18  0  0  0   474
VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00 0.0e+00 
4.0e+03  1 36  0  0 28   7 36  0  0 33   214
VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00 0.0e+00 
0.0e+00  0 18  0  0  0   5 18  0  0  0   173
MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  1  9  0  0  0  12  9  0  0  0    32
MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  3 18  0  0  0  36 18  0  0  0    22
KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00 0.0e+00 
1.2e+04  7100  0  0 84  97100  0  0100    45
PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  3 18  0  0  0  38 18  0  0  0    21


for 4 procs i have :
--- Event Stage 2: Stage 2 of ChomboPetscInterface

VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00 0.0e+00 
4.0e+03  8 18  0  0  5   9 18  0  0 14     1
VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00 0.0e+00 
8.0e+03  0 36  0  0 10   0 36  0  0 29   133
VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00 0.0e+00 
0.0e+00  0 18  0  0  0   0 18  0  0  0   410
VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00 38  0  0  0  0  45  0  0  0  0     0
VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00 37  0  0  0  0  44  0  0  0  0     0
MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00 0.0e+00 
0.0e+00 75  9  0  0  0  89  9  0  0  0     0
MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00 0.0e+00 
0.0e+00  0 18  0  0  0   0 18  0  0  0    83
MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0    21
MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 
2.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 
2.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00 0.0e+00 
2.8e+04 84100  0  0 34 100100  0  0100     1
PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00 0.0e+00 
4.0e+00  0  0  0  0  0   0  0  0  0  0     6
PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00 0.0e+00 
4.0e+00  0  0  0  0  0   0  0  0  0  0     0
PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00 0.0e+00 
8.0e+03  1 18  0  0 10   1 18  0  0 29    28
------------------------------------------------------------------------------------------------------------------------

Now if i understand it right, all these calls summarize all calls between the 
pop and push commands. That would mean that the majority of the time is spend 
in the MatMult and in within that the VecScatterBegin and VecScatterEnd 
commands (if i understand it right).

My problem size is really small. So i was wondering if the problem lies in 
that (namely that the major time is simply spend communicating between 
processors, or whether there is still something wrong with how i wrote the 
code?)


thanks
mat


On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > Actually the errors occur on my calls to a PETSc functions after calling
> > PETSCInitialize.
>
> Yes, it is the error I pointed out in the last message.
>
>    Matt
>
> > mat


From knepley at gmail.com  Wed Aug  2 16:50:41 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 2 Aug 2006 16:50:41 -0500
Subject: profiling PETSc code
In-Reply-To: <200608021532.21971.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu>
	 <200608011807.29007.mafunk@nmsu.edu>
	 <a9f269830608011728y344677c0w2adeed4ce0dddd6d@mail.gmail.com>
	 <200608021532.21971.mafunk@nmsu.edu>
Message-ID: <a9f269830608021450p455cd8rac31cad95e4777a5@mail.gmail.com>

On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> Hi Matt,
>
> thanks for all the help so far. The -info option is really very helpful. So i
> think i straightened the actual errors out. However, now i am back to the
> original question i had. That is why it takes so much longer on 4 procs than
> on 1 proc.

So you have a 1.5 load imbalance for MatMult(), which probably cascades to
give the 133! load imbalance for VecDot(). You probably have either:

  1) VERY bad laod imbalance

  2) a screwed up network

  3) bad contention on the network (loaded cluster)

Can you help us narrow this down?


   Matt

> I profiled the KSPSolve(...) as stage 2:
>
> For 1 proc i have:
> --- Event Stage 2: Stage 2 of ChomboPetscInterface
>
> VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00 0.0e+00
> 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00 0.0e+00
> 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00 0.0e+00
> 1.2e+04  7100  0  0 84  97100  0  0100    45
> PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
>
>
> for 4 procs i have :
> --- Event Stage 2: Stage 2 of ChomboPetscInterface
>
> VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00 0.0e+00
> 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00 0.0e+00
> 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00 0.0e+00
> 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00 0.0e+00
> 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00 0.0e+00
> 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00 0.0e+00
> 2.8e+04 84100  0  0 34 100100  0  0100     1
> PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00 0.0e+00
> 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00 0.0e+00
> 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00 0.0e+00
> 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> ------------------------------------------------------------------------------------------------------------------------
>
> Now if i understand it right, all these calls summarize all calls between the
> pop and push commands. That would mean that the majority of the time is spend
> in the MatMult and in within that the VecScatterBegin and VecScatterEnd
> commands (if i understand it right).
>
> My problem size is really small. So i was wondering if the problem lies in
> that (namely that the major time is simply spend communicating between
> processors, or whether there is still something wrong with how i wrote the
> code?)
>
>
> thanks
> mat
>
>
>
> On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> > On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > Actually the errors occur on my calls to a PETSc functions after calling
> > > PETSCInitialize.
> >
> > Yes, it is the error I pointed out in the last message.
> >
> >    Matt
> >
> > > mat
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From mafunk at nmsu.edu  Wed Aug  2 17:21:43 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Wed, 2 Aug 2006 16:21:43 -0600
Subject: profiling PETSc code
In-Reply-To: <a9f269830608021450p455cd8rac31cad95e4777a5@mail.gmail.com>
References: <200608011545.26224.mafunk@nmsu.edu> <200608021532.21971.mafunk@nmsu.edu> <a9f269830608021450p455cd8rac31cad95e4777a5@mail.gmail.com>
Message-ID: <200608021621.44171.mafunk@nmsu.edu>

Hi Matt,

It could be a bad load imbalance because i don't let PETSc decide. I need to 
fix that anyway, so i think i'll try that first and then let you know.
Thanks though for the quick response and helping me to interpret those 
numbers ... 


mat

On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > Hi Matt,
> >
> > thanks for all the help so far. The -info option is really very helpful.
> > So i think i straightened the actual errors out. However, now i am back
> > to the original question i had. That is why it takes so much longer on 4
> > procs than on 1 proc.
>
> So you have a 1.5 load imbalance for MatMult(), which probably cascades to
> give the 133! load imbalance for VecDot(). You probably have either:
>
>   1) VERY bad laod imbalance
>
>   2) a screwed up network
>
>   3) bad contention on the network (loaded cluster)
>
> Can you help us narrow this down?
>
>
>    Matt
>
> > I profiled the KSPSolve(...) as stage 2:
> >
> > For 1 proc i have:
> > --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >
> > VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00 0.0e+00
> > 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> > VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00 0.0e+00
> > 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> > VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00 0.0e+00
> > 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> > MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> > MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> > KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00 0.0e+00
> > 1.2e+04  7100  0  0 84  97100  0  0100    45
> > PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
> >
> >
> > for 4 procs i have :
> > --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >
> > VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00
> > 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> > VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00 0.0e+00
> > 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> > VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00 0.0e+00
> > 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> > VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> > VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> > MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00 0.0e+00
> > 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> > MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00 0.0e+00
> > 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> > MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> > MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> > 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> > 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00 0.0e+00
> > 2.8e+04 84100  0  0 34 100100  0  0100     1
> > PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00 0.0e+00
> > 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> > PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00 0.0e+00
> > 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00 0.0e+00
> > 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> > -------------------------------------------------------------------------
> >-----------------------------------------------
> >
> > Now if i understand it right, all these calls summarize all calls between
> > the pop and push commands. That would mean that the majority of the time
> > is spend in the MatMult and in within that the VecScatterBegin and
> > VecScatterEnd commands (if i understand it right).
> >
> > My problem size is really small. So i was wondering if the problem lies
> > in that (namely that the major time is simply spend communicating between
> > processors, or whether there is still something wrong with how i wrote
> > the code?)
> >
> >
> > thanks
> > mat
> >
> > On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> > > On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > > Actually the errors occur on my calls to a PETSc functions after
> > > > calling PETSCInitialize.
> > >
> > > Yes, it is the error I pointed out in the last message.
> > >
> > >    Matt
> > >
> > > > mat


From jiaxun_hou at yahoo.com.cn  Thu Aug  3 05:13:52 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Thu, 3 Aug 2006 18:13:52 +0800 (CST)
Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=BB=D8=B8=B4=A3=BA=20Re:=20some=20p?=
 =?gb2312?q?roblems=20in=20using=20PETSC=20with=20FFTW3=20package?=
In-Reply-To: <Pine.OSX.4.64.0608021210180.366@barrys-computer.local>
Message-ID: <20060803101352.75243.qmail@web15810.mail.cnb.yahoo.com>

Barry,
Thank you very much. 
As long as a complex number (PetscScalar) is simply a double [2],  I can  use the operator "reinterpret_cast" to caste them. And it seems to be working fine now.

Regards,
Mason

Barry Smith <bsmith at mcs.anl.gov> ??? 
   Mason,

    A complex number (PetscScalar) is simply a double [2]. So you can either
1) use complex PETSc and caste the arrays when you pass to fftw or
2) user PETScScalar of simply double and pass those beasts to fftw.

   Unless YOUR code is using complex numbers then you should simply use
2 and all is easy.

    Barry


On Wed, 2 Aug 2006, jiaxun hou wrote:

> Satish,
>  Thanks for your response .
>  I am sorry for my confusing description. In fact , I did use the configure option
>  --with-scalar-type=complex when I configuated the system. So, I wonder if it is possible to change the type PetscScalar to some kinds like double[2] which can be handled in FFTW package? Or, is there any functions can get the real (imaginary) parts of a Petsc's Vector?
>
>  Regards,
>  Mason
>
> Satish Balay  ???
>  use the configure option
>
> --with-scalar-type=complex
>
> Satish
>
> On Wed, 2 Aug 2006, jiaxun hou wrote:
>
>> Hi all,
>> I am trying to using the package FFTW3 in PETSC.
>> How can I change type from PetscScalar to complex or double[2]?
>> The documentation seems a bit sketchy.
>>
>> Regards
>> Mason
>>
>>
>>
>> ---------------------------------
>> ????????-3.5G???20M???
>
>
> ---------------------------------
> ????????-3.5G???20M???

 		
---------------------------------
 Mp3???-???????   
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060803/741eb2ec/attachment.htm>

From jiaxun_hou at yahoo.com.cn  Thu Aug  3 05:20:54 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Thu, 3 Aug 2006 18:20:54 +0800 (CST)
Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20=BB=D8=B8=B4=A3=BA=20Re:=20some=20p?=
 =?gb2312?q?roblems=20in=20using=20PETSC=20with=20FFTW3=20package?=
In-Reply-To: <Pine.LNX.4.58.0608021109570.20044@shakey.mcs.anl.gov>
Message-ID: <20060803102054.44952.qmail@web15807.mail.cnb.yahoo.com>

Hong,
Thank you for your help.
I have successfully converted PetscScalar* into fftw_complex* by using opertator "reinterpret_cast".

Regards,
Mason

Hong Zhang <hzhang at mcs.anl.gov> ??? You can retrieve real and imaginary part of a petsc scalar from
PetscRealPart()/PetscImaginaryPart()

See an example at
~petsc/src/ksp/ksp/examples/tutorials/ex11.c

Hong

On Wed, 2 Aug 2006, jiaxun hou wrote:

> Hong Zhang,
>   Thanks for your respones.
>
>     In FFTW3, complex type is set by double[2], and it is very easy to handle.
>   But in Petsc, I don't konw exactly how the complex type be set. And when I want to do the fast fourier transform on a Petsc's complex vector by using FFTW3, I get the trouble of the translation between Petsc and FFTW3.
>
>   Regards,
>   Mason
>
> Hong Zhang  ???
>
> Manson,
>
> We don't have support for FFTW3 yet(we are currently developing
> an interface between petsc and FFTW3). How do you use FFTW3 in PETSC?
>
> To build petsc with complex, you need configure petsc with
> '--with-scalar-type=complex'
>
> Hong
>
> On Wed, 2 Aug 2006, jiaxun hou wrote:
>
> > Hi all,
> > I am trying to using the package FFTW3 in PETSC.
> > How can I change type from PetscScalar to complex or double[2]?
> > The documentation seems a bit sketchy.
> >
> > Regards
> > Mason
> >
> >
> >
> > ---------------------------------
> > ????????-3.5G???20M???
>
>
>
>
> ---------------------------------
>  ??????-3.5G???20M??


 __________________________________________________
???????????????
http://cn.mail.yahoo.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060803/f5fd2736/attachment.htm>

From diosady at MIT.EDU  Thu Aug  3 09:05:54 2006
From: diosady at MIT.EDU (Laslo Tibor Diosady)
Date: Thu, 3 Aug 2006 10:05:54 -0400 (EDT)
Subject: In place ILU(0) factorization
Message-ID: <Pine.LNX.4.62L.0608030949300.19213@splinter.mit.edu>

Hi,

I wanted to perform an in place ILU factorization with fill level of 0 (ie 
ILU(0)) for a SeqBAIJ matrix. This works fine when I use a natural 
ordering however when I try to use a different matrix reordering I get the 
following error.

[0]PETSC ERROR: MatILUFactor_SeqBAIJ() line 1768 in 
src/mat/impls/baij/seq/baij.c
[0]PETSC ERROR: Invalid argument!
[0]PETSC ERROR: Row and column permutations must be identity for in-place 
ILU!
[0]PETSC ERROR: MatILUFactor() line 2107 in src/mat/interface/matrix.c

In otherwords, Petsc only supports in place ILU(0) without reordering.

The idea behind doing an in place factorization is so that I don't use 
twice as much memory to store my matrix (ie the original matrix and the 
ILU factored matrix).


Is in place ILU factorization with reordering going to be supported by 
Petsc anytime in the near future or is there an easy work around so I can 
get this to work?

Thanks,

Laslo


From hzhang at mcs.anl.gov  Thu Aug  3 09:54:54 2006
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Thu, 3 Aug 2006 09:54:54 -0500 (CDT)
Subject: In place ILU(0) factorization
In-Reply-To: <Pine.LNX.4.62L.0608030949300.19213@splinter.mit.edu>
References: <Pine.LNX.4.62L.0608030949300.19213@splinter.mit.edu>
Message-ID: <Pine.LNX.4.58.0608030941130.20611@shakey.mcs.anl.gov>


Laslo,

An reordering of matrix changes matrix sparse pattern,
then the factored matrix cannot be stored in the original matrix.
Here is the notes from petsc MatILUFactor():

Notes:
   Probably really in-place only when level of fill is zero, otherwise
allocates
   new space to store factored matrix and deletes previous memory.

i.e., except ilu(0) without reordering, petsc inplace ilu()
virtually computes a new factor, and deletes the previous memory.
You may use petsc out-place ilu, and call MatDestroy()
to delete your original matrix.

> In otherwords, Petsc only supports in place ILU(0) without reordering.
>
> The idea behind doing an in place factorization is so that I don't use
> twice as much memory to store my matrix (ie the original matrix and the
> ILU factored matrix).
>
>
> Is in place ILU factorization with reordering going to be supported by
> Petsc anytime in the near future or is there an easy work around so I can
> get this to work?

We can add this support. As mentioned above, the factored matrix
will be newly allocated with the original memory deleted.

Hong


From diosady at MIT.EDU  Thu Aug  3 10:27:40 2006
From: diosady at MIT.EDU (Laslo Tibor Diosady)
Date: Thu, 3 Aug 2006 11:27:40 -0400 (EDT)
Subject: In place ILU(0) factorization
In-Reply-To: <Pine.LNX.4.58.0608030941130.20611@shakey.mcs.anl.gov>
References: <Pine.LNX.4.62L.0608030949300.19213@splinter.mit.edu>
 <Pine.LNX.4.58.0608030941130.20611@shakey.mcs.anl.gov>
Message-ID: <Pine.LNX.4.62L.0608031113350.19214@splinter.mit.edu>

Hong,

I know that I can use MatILUFactorSymbolic and then MatLUFactorNumeric and 
then destroy the original matrix if I want to. Though support for this in 
one step with MatILUFactor would be nice, it is not really important.

The point I was trying to make is that performing and ILU(0) does not 
change the sparsity pattern, no matter what reordering is used, since by 
definition there is no fill and hence no change in sparsity pattern from 
the original matrix.

The reordering in this case simply changes the order of operations to 
perform the ILU(0) but not the memory requirements. In this case 
reordering is performed not to reduce fill but to achieve a "better" ILU 
factorization.

Hence it should be possible to perform and ILU(0) in place with 
different reorderings. This is what I was hoping to get support for.


Thanks,

Laslo


On Thu, 3 Aug 2006, Hong Zhang wrote:

>
> Laslo,
>
> An reordering of matrix changes matrix sparse pattern,
> then the factored matrix cannot be stored in the original matrix.
> Here is the notes from petsc MatILUFactor():
>
> Notes:
>   Probably really in-place only when level of fill is zero, otherwise
> allocates
>   new space to store factored matrix and deletes previous memory.
>
> i.e., except ilu(0) without reordering, petsc inplace ilu()
> virtually computes a new factor, and deletes the previous memory.
> You may use petsc out-place ilu, and call MatDestroy()
> to delete your original matrix.
>
>> In otherwords, Petsc only supports in place ILU(0) without reordering.
>>
>> The idea behind doing an in place factorization is so that I don't use
>> twice as much memory to store my matrix (ie the original matrix and the
>> ILU factored matrix).
>>
>>
>> Is in place ILU factorization with reordering going to be supported by
>> Petsc anytime in the near future or is there an easy work around so I can
>> get this to work?
>
> We can add this support. As mentioned above, the factored matrix
> will be newly allocated with the original memory deleted.
>
> Hong
>
>


From hzhang at mcs.anl.gov  Thu Aug  3 10:59:07 2006
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Thu, 3 Aug 2006 10:59:07 -0500 (CDT)
Subject: In place ILU(0) factorization
In-Reply-To: <Pine.LNX.4.62L.0608031113350.19214@splinter.mit.edu>
References: <Pine.LNX.4.62L.0608030949300.19213@splinter.mit.edu>
 <Pine.LNX.4.58.0608030941130.20611@shakey.mcs.anl.gov>
 <Pine.LNX.4.62L.0608031113350.19214@splinter.mit.edu>
Message-ID: <Pine.LNX.4.58.0608031046120.20611@shakey.mcs.anl.gov>


Laslo,
> The point I was trying to make is that performing and ILU(0) does not
> change the sparsity pattern, no matter what reordering is used, since by
> definition there is no fill and hence no change in sparsity pattern from
> the original matrix.

The space required remains the same, but
the row-compressed matrix format
for the factor will be changed with the
reordering.
To store the new format over the
existing memory, temp space has to be allocated during
implementation. Thus replacing the original memory with
newly allocated space would make implementation easier.

> The reordering in this case simply changes the order of operations to
> perform the ILU(0) but not the memory requirements. In this case
> reordering is performed not to reduce fill but to achieve a "better" ILU
> factorization.

Yes.
>
> Hence it should be possible to perform and ILU(0) in place with
> different reorderings. This is what I was hoping to get support for.
>
We'll try to provide this support.

Hong
>
>
> On Thu, 3 Aug 2006, Hong Zhang wrote:
>
> >
> > Laslo,
> >
> > An reordering of matrix changes matrix sparse pattern,
> > then the factored matrix cannot be stored in the original matrix.
> > Here is the notes from petsc MatILUFactor():
> >
> > Notes:
> >   Probably really in-place only when level of fill is zero, otherwise
> > allocates
> >   new space to store factored matrix and deletes previous memory.
> >
> > i.e., except ilu(0) without reordering, petsc inplace ilu()
> > virtually computes a new factor, and deletes the previous memory.
> > You may use petsc out-place ilu, and call MatDestroy()
> > to delete your original matrix.
> >
> >> In otherwords, Petsc only supports in place ILU(0) without reordering.
> >>
> >> The idea behind doing an in place factorization is so that I don't use
> >> twice as much memory to store my matrix (ie the original matrix and the
> >> ILU factored matrix).
> >>
> >>
> >> Is in place ILU factorization with reordering going to be supported by
> >> Petsc anytime in the near future or is there an easy work around so I can
> >> get this to work?
> >
> > We can add this support. As mentioned above, the factored matrix
> > will be newly allocated with the original memory deleted.
> >
> > Hong
> >
> >
>
>


From diosady at MIT.EDU  Fri Aug  4 08:52:24 2006
From: diosady at MIT.EDU (Laslo Tibor Diosady)
Date: Fri, 4 Aug 2006 09:52:24 -0400 (EDT)
Subject: In place ILU(0) factorization
In-Reply-To: <Pine.LNX.4.58.0608031046120.20611@shakey.mcs.anl.gov>
References: <Pine.LNX.4.62L.0608030949300.19213@splinter.mit.edu>
 <Pine.LNX.4.58.0608030941130.20611@shakey.mcs.anl.gov>
 <Pine.LNX.4.62L.0608031113350.19214@splinter.mit.edu>
 <Pine.LNX.4.58.0608031046120.20611@shakey.mcs.anl.gov>
Message-ID: <Pine.LNX.4.62L.0608040943400.15507@splinter.mit.edu>

Hong,


> The space required remains the same, but
> the row-compressed matrix format
> for the factor will be changed with the
> reordering.
> To store the new format over the
> existing memory, temp space has to be allocated during
> implementation. Thus replacing the original memory with
> newly allocated space would make implementation easier.
>

I guess this depends upon the implementation of the ILU(0) factorization 
for the AIJ or BAIJ formats, which  I didn't (nor  do I ever really want 
to) look into.

Thanks for the help,

Laslo


From diosady at MIT.EDU  Mon Aug  7 08:48:14 2006
From: diosady at MIT.EDU (Laslo Tibor Diosady)
Date: Mon, 7 Aug 2006 09:48:14 -0400 (EDT)
Subject: MatSolveTranspose
Message-ID: <Pine.LNX.4.62L.0608070935270.16181@splinter.mit.edu>

Hi,

I am trying to use MatSolveTranspose for a sequential BAIJ format, however 
whenever I call MatSolveTranspose I get a segmentation fault.

The sequence of calls which I make is:
MatILUFactorSymbolic
MatLUFactorNumeric

And I use the resulting matrix in calls to:
MatSolve and MatSolveTranspose

When I use MatSolve this works great. However, with MatSolveTranspose it 
appears petsc tries to allocate memory in a never ending loop until my 
machine runs out of memory and I get a seg fault.

If I do a call to MatHasOperation with this matrix the result is 
Petsc_True, so in theory I think the call to MatSolveTranspose should 
work.

Am I doing something wrong or is this a problem with MatSolveTranspose for 
seqBAIJ matrices?

Any help would be greatly appreciated.

Thanks,

Laslo


From hzhang at mcs.anl.gov  Mon Aug  7 15:24:58 2006
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Mon, 7 Aug 2006 15:24:58 -0500 (CDT)
Subject: MatSolveTranspose
In-Reply-To: <Pine.LNX.4.62L.0608070935270.16181@splinter.mit.edu>
References: <Pine.LNX.4.62L.0608070935270.16181@splinter.mit.edu>
Message-ID: <Pine.LNX.4.58.0608071519420.28027@shakey.mcs.anl.gov>


Laslo,

MatSolveTranspose() should work for sequential BAIJ format.
The example ~petsc/src/mat/examples/tests/ex48.c tests it.
Would you please run this example and see if it works.

You may simplify your code and send it to us.
Then I'll test it to see where is the problem.

Hong

On Mon, 7 Aug 2006, Laslo Tibor Diosady wrote:

> Hi,
>
> I am trying to use MatSolveTranspose for a sequential BAIJ format, however
> whenever I call MatSolveTranspose I get a segmentation fault.
>
> The sequence of calls which I make is:
> MatILUFactorSymbolic
> MatLUFactorNumeric
>
> And I use the resulting matrix in calls to:
> MatSolve and MatSolveTranspose
>
> When I use MatSolve this works great. However, with MatSolveTranspose it
> appears petsc tries to allocate memory in a never ending loop until my
> machine runs out of memory and I get a seg fault.
>
> If I do a call to MatHasOperation with this matrix the result is
> Petsc_True, so in theory I think the call to MatSolveTranspose should
> work.
>
> Am I doing something wrong or is this a problem with MatSolveTranspose for
> seqBAIJ matrices?
>
> Any help would be greatly appreciated.
>
> Thanks,
>
> Laslo
>
>


From jiaxun_hou at yahoo.com.cn  Tue Aug  8 04:48:48 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Tue, 8 Aug 2006 17:48:48 +0800 (CST)
Subject: About user defined PC
In-Reply-To: <Pine.LNX.4.64.0608020919050.25417@asterix>
Message-ID: <20060808094848.2415.qmail@web15803.mail.cnb.yahoo.com>

Hi,
   
  I met a problem ,when I constructed a user-defined PC. That is I need to defind the process of  P^(-1)Mx in each iteration of GMRES by myself for efficient reason, not only define P^(-1)x . But the Petsc seems to  separate  this process into two parts: the first is y=Mx which is defined in Petsc framework, and the second is P^(-1)y which is defined  by the user. So, is there any way to do it without change the code of Petsc framework?
   
  Regards,
  Jiaxun

 		
---------------------------------
????????-3.5G???20M??? 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060808/f193adf9/attachment.htm>

From hzhang at mcs.anl.gov  Tue Aug  8 08:13:15 2006
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Tue, 8 Aug 2006 08:13:15 -0500 (CDT)
Subject: In place ILU(0) factorization
In-Reply-To: <Pine.LNX.4.62L.0608040943400.15507@splinter.mit.edu>
References: <Pine.LNX.4.62L.0608030949300.19213@splinter.mit.edu>
 <Pine.LNX.4.58.0608030941130.20611@shakey.mcs.anl.gov>
 <Pine.LNX.4.62L.0608031113350.19214@splinter.mit.edu>
 <Pine.LNX.4.58.0608031046120.20611@shakey.mcs.anl.gov>
 <Pine.LNX.4.62L.0608040943400.15507@splinter.mit.edu>
Message-ID: <Pine.LNX.4.58.0608080810020.18269@shakey.mcs.anl.gov>


Laslo,

We figured out a way to implement ILU(0) with reordering
without allocating workspace.
We'll add this support later. I'll let you know when
it is done.

Thanks for your request that help us to make petsc
better.

Hong

On Fri, 4 Aug 2006, Laslo Tibor Diosady wrote:

> Hong,
>
>
> > The space required remains the same, but
> > the row-compressed matrix format
> > for the factor will be changed with the
> > reordering.
> > To store the new format over the
> > existing memory, temp space has to be allocated during
> > implementation. Thus replacing the original memory with
> > newly allocated space would make implementation easier.
> >
>
> I guess this depends upon the implementation of the ILU(0) factorization
> for the AIJ or BAIJ formats, which  I didn't (nor  do I ever really want
> to) look into.
>
> Thanks for the help,
>
> Laslo
>
>


From bsmith at mcs.anl.gov  Tue Aug  8 07:18:56 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 8 Aug 2006 07:18:56 -0500 (CDT)
Subject: About user defined PC
In-Reply-To: <20060808094848.2415.qmail@web15803.mail.cnb.yahoo.com>
References: <20060808094848.2415.qmail@web15803.mail.cnb.yahoo.com>
Message-ID: <Pine.OSX.4.64.0608080708360.366@barrys-computer.local>


   Jiaxun,

     I am assuming you are using the PCSHELL? I have added support for this for 
you;

* if you are using petsc-dev
    (http://www-unix.mcs.anl.gov/petsc/petsc-as/developers/index.html) you need
    only do an hg pull to get my additions then run "make" in
    src/ksp/pc/impls/shell.

* if you are not using petsc-dev (or use the nightly tar ball) then I attach the
    three files that were
    changed. include/private/pcimpl.h, include/petscpc.h and
    src/ksp/pc/impls/shell/shell.c (again run make in src/ksp/pc/impls/shell)

    If you are actually writing a complete PC and not using PCSHELL
http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/ksp/pc/impls/jacobi/jacobi.c.html
then you just need to provide a routine PCApplyBA_XXXX() with calling sequence:
PC,PCSide,Vec b,Vec x,Vec work

   Good luck,

     Barry


On Tue, 8 Aug 2006, jiaxun hou wrote:

> Hi,
>
>  I met a problem ,when I constructed a user-defined PC. That is I need to defind the process of  P^(-1)Mx in each iteration of GMRES by myself for efficient reason, not only define P^(-1)x . But the Petsc seems to  separate  this process into two parts: the first is y=Mx which is defined in Petsc framework, and the second is P^(-1)y which is defined  by the user. So, is there any way to do it without change the code of Petsc framework?
>
>  Regards,
>  Jiaxun
>
>
> ---------------------------------
> ????????????????-3.5G??????20M??????
-------------- next part --------------
/*
      Preconditioner module. 
*/
#if !defined(__PETSCPC_H)
#define __PETSCPC_H
#include "petscmat.h"
PETSC_EXTERN_CXX_BEGIN

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT  PCInitializePackage(const char[]);

/*
    PCList contains the list of preconditioners currently registered
   These are added with the PCRegisterDynamic() macro
*/
extern PetscFList PCList;
#define PCType const char*

/*S
     PC - Abstract PETSc object that manages all preconditioners

   Level: beginner

  Concepts: preconditioners

.seealso:  PCCreate(), PCSetType(), PCType (for list of available types)
S*/
typedef struct _p_PC* PC;

/*E
    PCType - String with the name of a PETSc preconditioner method or the creation function
       with an optional dynamic library name, for example
       http://www.mcs.anl.gov/petsc/lib.a:mypccreate()

   Level: beginner

   Notes: Click on the links below to see details on a particular solver

.seealso: PCSetType(), PC, PCCreate()
E*/
#define PCNONE            "none"
#define PCJACOBI          "jacobi"
#define PCSOR             "sor"
#define PCLU              "lu"
#define PCSHELL           "shell"
#define PCBJACOBI         "bjacobi"
#define PCMG              "mg"
#define PCEISENSTAT       "eisenstat"
#define PCILU             "ilu"
#define PCICC             "icc"
#define PCASM             "asm"
#define PCKSP             "ksp"
#define PCCOMPOSITE       "composite"
#define PCREDUNDANT       "redundant"
#define PCSPAI            "spai"
#define PCNN              "nn"
#define PCCHOLESKY        "cholesky"
#define PCSAMG            "samg"
#define PCPBJACOBI        "pbjacobi"
#define PCMAT             "mat"
#define PCHYPRE           "hypre"
#define PCFIELDSPLIT      "fieldsplit"
#define PCTFS             "tfs"
#define PCML              "ml"
#define PCPROMETHEUS      "prometheus"
#define PCGALERKIN        "galerkin"

/* Logging support */
extern PetscCookie PETSCKSP_DLLEXPORT PC_COOKIE;

/*E
    PCSide - If the preconditioner is to be applied to the left, right
     or symmetrically around the operator.

   Level: beginner

.seealso: 
E*/
typedef enum { PC_LEFT,PC_RIGHT,PC_SYMMETRIC } PCSide;
extern const char *PCSides[];

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCreate(MPI_Comm,PC*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetType(PC,PCType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetUp(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetUpOnBlocks(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApply(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplySymmetricLeft(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplySymmetricRight(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyBAorAB(PC,PCSide,Vec,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyTranspose(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCHasApplyTranspose(PC,PetscTruth*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyBAorABTranspose(PC,PCSide,Vec,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyRichardson(PC,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyRichardsonExists(PC,PetscTruth*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRegisterDestroy(void);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRegisterAll(const char[]);
extern PetscTruth PCRegisterAllCalled;

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRegister(const char[],const char[],const char[],PetscErrorCode(*)(PC));

/*MC
   PCRegisterDynamic - Adds a method to the preconditioner package.

   Synopsis:
   PetscErrorCode PCRegisterDynamic(char *name_solver,char *path,char *name_create,PetscErrorCode (*routine_create)(PC))

   Not collective

   Input Parameters:
+  name_solver - name of a new user-defined solver
.  path - path (either absolute or relative) the library containing this solver
.  name_create - name of routine to create method context
-  routine_create - routine to create method context

   Notes:
   PCRegisterDynamic() may be called multiple times to add several user-defined preconditioners.

   If dynamic libraries are used, then the fourth input argument (routine_create)
   is ignored.

   Sample usage:
.vb
   PCRegisterDynamic("my_solver","/home/username/my_lib/lib/libO/solaris/mylib",
              "MySolverCreate",MySolverCreate);
.ve

   Then, your solver can be chosen with the procedural interface via
$     PCSetType(pc,"my_solver")
   or at runtime via the option
$     -pc_type my_solver

   Level: advanced

   Notes: ${PETSC_ARCH}, ${PETSC_DIR}, ${PETSC_LIB_DIR},  or ${any environmental variable}
           occuring in pathname will be replaced with appropriate values.
         If your function is not being put into a shared library then use PCRegister() instead

.keywords: PC, register

.seealso: PCRegisterAll(), PCRegisterDestroy()
M*/
#if defined(PETSC_USE_DYNAMIC_LIBRARIES)
#define PCRegisterDynamic(a,b,c,d) PCRegister(a,b,c,0)
#else
#define PCRegisterDynamic(a,b,c,d) PCRegister(a,b,c,d)
#endif

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDestroy(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetFromOptions(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetType(PC,PCType*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetFactoredMatrix(PC,Mat*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetModifySubMatrices(PC,PetscErrorCode(*)(PC,PetscInt,const IS[],const IS[],Mat[],void*),void*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCModifySubMatrices(PC,PetscInt,const IS[],const IS[],Mat[],void*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetOperators(PC,Mat,Mat,MatStructure);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetOperators(PC,Mat*,Mat*,MatStructure*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetOperatorsSet(PC,PetscTruth*,PetscTruth*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCView(PC,PetscViewer);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetOptionsPrefix(PC,const char[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCAppendOptionsPrefix(PC,const char[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetOptionsPrefix(PC,const char*[]);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCComputeExplicitOperator(PC,Mat*);

/*
      These are used to provide extra scaling of preconditioned 
   operator for time-stepping schemes like in SUNDIALS 
*/
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDiagonalScale(PC,PetscTruth*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDiagonalScaleLeft(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDiagonalScaleRight(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDiagonalScaleSet(PC,Vec);

/* ------------- options specific to particular preconditioners --------- */

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCJacobiSetUseRowMax(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCJacobiSetUseAbs(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSORSetSymmetric(PC,MatSORType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSORSetOmega(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSORSetIterations(PC,PetscInt,PetscInt);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCEisenstatSetOmega(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCEisenstatNoDiagonalScaling(PC);

#define USE_PRECONDITIONER_MATRIX 0
#define USE_TRUE_MATRIX           1
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiSetUseTrueLocal(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiSetTotalBlocks(PC,PetscInt,const PetscInt[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiSetLocalBlocks(PC,PetscInt,const PetscInt[]);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCKSPSetUseTrue(PC);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApply(PC,PetscErrorCode (*)(void*,Vec,Vec)); 
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyBA(PC,PetscErrorCode (*)(void*,PCSide,Vec,Vec,Vec)); 
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyTranspose(PC,PetscErrorCode (*)(void*,Vec,Vec));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetSetUp(PC,PetscErrorCode (*)(void*));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyRichardson(PC,PetscErrorCode (*)(void*,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetView(PC,PetscErrorCode (*)(void*,PetscViewer));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetDestroy(PC,PetscErrorCode (*)(void*));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetContext(PC,void**);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetContext(PC,void*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetName(PC,const char[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetName(PC,char*[]);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetZeroPivot(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetShiftNonzero(PC,PetscReal); 
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetShiftPd(PC,PetscTruth); 


EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetFill(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetPivoting(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorReorderForNonzeroDiagonal(PC,PetscReal);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetMatOrdering(PC,MatOrderingType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetReuseOrdering(PC,PetscTruth);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetReuseFill(PC,PetscTruth);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetUseInPlace(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetAllowDiagonalFill(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetPivotInBlocks(PC,PetscTruth);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetLevels(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetUseDropTolerance(PC,PetscReal,PetscReal,PetscInt);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetLocalSubdomains(PC,PetscInt,IS[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetTotalSubdomains(PC,PetscInt,IS[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetOverlap(PC,PetscInt);
/*E
    PCASMType - Type of additive Schwarz method to use

$  PC_ASM_BASIC - symmetric version where residuals from the ghost points are used
$                 and computed values in ghost regions are added together. Classical
$                 standard additive Schwarz
$  PC_ASM_RESTRICT - residuals from ghost points are used but computed values in ghost
$                    region are discarded. Default
$  PC_ASM_INTERPOLATE - residuals from ghost points are not used, computed values in ghost
$                       region are added back in
$  PC_ASM_NONE - ghost point residuals are not used, computed ghost values are discarded
$                not very good.                

   Level: beginner

.seealso: PCASMSetType()
E*/
typedef enum {PC_ASM_BASIC = 3,PC_ASM_RESTRICT = 1,PC_ASM_INTERPOLATE = 2,PC_ASM_NONE = 0} PCASMType;
extern const char *PCASMTypes[];

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetType(PC,PCASMType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMCreateSubdomains2D(PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt *,IS **);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetUseInPlace(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMGetLocalSubdomains(PC,PetscInt*,IS*[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMGetLocalSubmatrices(PC,PetscInt*,Mat*[]);

/*E
    PCCompositeType - Determines how two or more preconditioner are composed

$  PC_COMPOSITE_ADDITIVE - results from application of all preconditioners are added together
$  PC_COMPOSITE_MULTIPLICATIVE - preconditioners are applied sequentially to the residual freshly
$                                computed after the previous preconditioner application
$  PC_COMPOSITE_SPECIAL - This is very special for a matrix of the form alpha I + R + S
$                         where first preconditioner is built from alpha I + S and second from
$                         alpha I + R

   Level: beginner

.seealso: PCCompositeSetType()
E*/
typedef enum {PC_COMPOSITE_ADDITIVE,PC_COMPOSITE_MULTIPLICATIVE,PC_COMPOSITE_SPECIAL} PCCompositeType;
extern const char *PCCompositeTypes[];

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeSetUseTrue(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeSetType(PC,PCCompositeType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeAddPC(PC,PCType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeGetPC(PC pc,PetscInt n,PC *);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeSpecialSetAlpha(PC,PetscScalar);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRedundantSetScatter(PC,VecScatter,VecScatter);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRedundantGetOperators(PC,Mat*,Mat*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRedundantGetPC(PC,PC*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetEpsilon(PC,double);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetNBSteps(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetMax(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetMaxNew(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetBlockSize(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetCacheSize(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetVerbose(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetSp(PC,PetscInt);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCHYPRESetType(PC,const char[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiGetLocalBlocks(PC,PetscInt*,const PetscInt*[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiGetTotalBlocks(PC,PetscInt*,const PetscInt*[]);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFieldSplitSetFields(PC,PetscInt,PetscInt*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFieldSplitSetType(PC,PCCompositeType);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGalerkinSetRestriction(PC,Mat);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGalerkinSetInterpolation(PC,Mat);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetCoordinates(PC,PetscInt,PetscReal*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSASetVectors(PC,PetscInt,PetscReal *);


PETSC_EXTERN_CXX_END
#endif /* __PETSCPC_H */
-------------- next part --------------

#ifndef _PCIMPL
#define _PCIMPL

#include "petscksp.h"
#include "petscpc.h"

typedef struct _PCOps *PCOps;
struct _PCOps {
  PetscErrorCode (*setup)(PC);
  PetscErrorCode (*apply)(PC,Vec,Vec);
  PetscErrorCode (*applyrichardson)(PC,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt);
  PetscErrorCode (*applyBA)(PC,PCSide,Vec,Vec,Vec);
  PetscErrorCode (*applytranspose)(PC,Vec,Vec);
  PetscErrorCode (*applyBAtranspose)(PC,PetscInt,Vec,Vec,Vec);
  PetscErrorCode (*setfromoptions)(PC);
  PetscErrorCode (*presolve)(PC,KSP,Vec,Vec);
  PetscErrorCode (*postsolve)(PC,KSP,Vec,Vec);  
  PetscErrorCode (*getfactoredmatrix)(PC,Mat*);
  PetscErrorCode (*applysymmetricleft)(PC,Vec,Vec);
  PetscErrorCode (*applysymmetricright)(PC,Vec,Vec);
  PetscErrorCode (*setuponblocks)(PC);
  PetscErrorCode (*destroy)(PC);
  PetscErrorCode (*view)(PC,PetscViewer);
};

/*
   Preconditioner context
*/
struct _p_PC {
  PETSCHEADER(struct _PCOps);
  PetscInt       setupcalled;
  MatStructure   flag;
  Mat            mat,pmat;
  Vec            diagonalscaleright,diagonalscaleleft; /* used for time integration scaling */
  PetscTruth     diagonalscale;
  PetscErrorCode (*modifysubmatrices)(PC,PetscInt,const IS[],const IS[],Mat[],void*); /* user provided routine */
  void           *modifysubmatricesP; /* context for user routine */
  void           *data;
};

extern PetscEvent  PC_SetUp, PC_SetUpOnBlocks, PC_Apply, PC_ApplyCoarse, PC_ApplyMultiple, PC_ApplySymmetricLeft;
extern PetscEvent  PC_ApplySymmetricRight, PC_ModifySubMatrices;

#endif
-------------- next part --------------
#define PETSCKSP_DLL

/*
   This provides a simple shell for Fortran (and C programmers) to 
  create their own preconditioner without writing much interface code.
*/

#include "private/pcimpl.h"        /*I "petscpc.h" I*/
#include "private/vecimpl.h"  

EXTERN_C_BEGIN 
typedef struct {
  void           *ctx;                     /* user provided contexts for preconditioner */
  PetscErrorCode (*destroy)(void*);
  PetscErrorCode (*setup)(void*);
  PetscErrorCode (*apply)(void*,Vec,Vec);
  PetscErrorCode (*applyBA)(void*,PCSide,Vec,Vec,Vec);
  PetscErrorCode (*presolve)(void*,KSP,Vec,Vec);
  PetscErrorCode (*postsolve)(void*,KSP,Vec,Vec);
  PetscErrorCode (*view)(void*,PetscViewer);
  PetscErrorCode (*applytranspose)(void*,Vec,Vec);
  PetscErrorCode (*applyrich)(void*,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt);
  char           *name;
} PC_Shell;
EXTERN_C_END

#undef __FUNCT__  
#define __FUNCT__ "PCShellGetContext"
/*@
    PCShellGetContext - Returns the user-provided context associated with a shell PC

    Not Collective

    Input Parameter:
.   pc - should have been created with PCCreateShell()

    Output Parameter:
.   ctx - the user provided context

    Level: advanced

    Notes:
    This routine is intended for use within various shell routines
    
.keywords: PC, shell, get, context

.seealso: PCCreateShell(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetContext(PC pc,void **ctx)
{
  PetscErrorCode ierr;
  PetscTruth     flg;

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  PetscValidPointer(ctx,2); 
  ierr = PetscTypeCompare((PetscObject)pc,PCSHELL,&flg);CHKERRQ(ierr);
  if (!flg) *ctx = 0; 
  else      *ctx = ((PC_Shell*)(pc->data))->ctx; 
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetContext"
/*@C
    PCShellSetContext - sets the context for a shell PC

   Collective on PC

    Input Parameters:
+   pc - the shell PC
-   ctx - the context

   Level: advanced

   Fortran Notes: The context can only be an integer or a PetscObject
      unfortunately it cannot be a Fortran array or derived type.

.seealso: PCCreateShell(), PCShellGetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetContext(PC pc,void *ctx)
{
  PC_Shell      *shell = (PC_Shell*)pc->data;
  PetscErrorCode ierr;
  PetscTruth     flg;

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscTypeCompare((PetscObject)pc,PCSHELL,&flg);CHKERRQ(ierr);
  if (flg) {
    shell->ctx = ctx;
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCSetUp_Shell"
static PetscErrorCode PCSetUp_Shell(PC pc)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (shell->setup) {
    CHKMEMQ;
    ierr  = (*shell->setup)(shell->ctx);CHKERRQ(ierr);
    CHKMEMQ;
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCApply_Shell"
static PetscErrorCode PCApply_Shell(PC pc,Vec x,Vec y)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->apply) SETERRQ(PETSC_ERR_USER,"No apply() routine provided to Shell PC");
  PetscStackPush("PCSHELL user function");
  CHKMEMQ;
  ierr  = (*shell->apply)(shell->ctx,x,y);CHKERRQ(ierr);
  CHKMEMQ;
  PetscStackPop;
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCApplyBA_Shell"
static PetscErrorCode PCApplyBA_Shell(PC pc,PCSide side,Vec x,Vec y,Vec w)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->applyBA) SETERRQ(PETSC_ERR_USER,"No applyBA() routine provided to Shell PC");
  PetscStackPush("PCSHELL user function BA");
  CHKMEMQ;
  ierr  = (*shell->applyBA)(shell->ctx,side,x,y,w);CHKERRQ(ierr);
  CHKMEMQ;
  PetscStackPop;
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCPreSolve_Shell"
static PetscErrorCode PCPreSolve_Shell(PC pc,KSP ksp,Vec b,Vec x)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->presolve) SETERRQ(PETSC_ERR_USER,"No presolve() routine provided to Shell PC");
  ierr  = (*shell->presolve)(shell->ctx,ksp,b,x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCPostSolve_Shell"
static PetscErrorCode PCPostSolve_Shell(PC pc,KSP ksp,Vec b,Vec x)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->postsolve) SETERRQ(PETSC_ERR_USER,"No postsolve() routine provided to Shell PC");
  ierr  = (*shell->postsolve)(shell->ctx,ksp,b,x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCApplyTranspose_Shell"
static PetscErrorCode PCApplyTranspose_Shell(PC pc,Vec x,Vec y)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->applytranspose) SETERRQ(PETSC_ERR_USER,"No applytranspose() routine provided to Shell PC");
  ierr  = (*shell->applytranspose)(shell->ctx,x,y);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCApplyRichardson_Shell"
static PetscErrorCode PCApplyRichardson_Shell(PC pc,Vec x,Vec y,Vec w,PetscReal rtol,PetscReal abstol, PetscReal dtol,PetscInt it)
{
  PetscErrorCode ierr;
  PC_Shell       *shell;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  ierr  = (*shell->applyrich)(shell->ctx,x,y,w,rtol,abstol,dtol,it);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCDestroy_Shell"
static PetscErrorCode PCDestroy_Shell(PC pc)
{
  PC_Shell       *shell = (PC_Shell*)pc->data;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscStrfree(shell->name);CHKERRQ(ierr);
  if (shell->destroy) {
    ierr  = (*shell->destroy)(shell->ctx);CHKERRQ(ierr);
  }
  ierr = PetscFree(shell);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCView_Shell"
static PetscErrorCode PCView_Shell(PC pc,PetscViewer viewer)
{
  PC_Shell       *shell = (PC_Shell*)pc->data;
  PetscErrorCode ierr;
  PetscTruth     iascii;

  PetscFunctionBegin;
  ierr = PetscTypeCompare((PetscObject)viewer,PETSC_VIEWER_ASCII,&iascii);CHKERRQ(ierr);
  if (iascii) {
    if (shell->name) {ierr = PetscViewerASCIIPrintf(viewer,"  Shell: %s\n",shell->name);CHKERRQ(ierr);}
    else             {ierr = PetscViewerASCIIPrintf(viewer,"  Shell: no name\n");CHKERRQ(ierr);}
  }
  if (shell->view) {
    ierr = PetscViewerASCIIPushTab(viewer);CHKERRQ(ierr);
    ierr  = (*shell->view)(shell->ctx,viewer);CHKERRQ(ierr);
    ierr = PetscViewerASCIIPopTab(viewer);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

/* ------------------------------------------------------------------------------*/
EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetDestroy_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetDestroy_Shell(PC pc, PetscErrorCode (*destroy)(void*))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell          = (PC_Shell*)pc->data;
  shell->destroy = destroy;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetSetUp_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetSetUp_Shell(PC pc, PetscErrorCode (*setup)(void*))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell        = (PC_Shell*)pc->data;
  shell->setup = setup;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApply_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApply_Shell(PC pc,PetscErrorCode (*apply)(void*,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell        = (PC_Shell*)pc->data;
  shell->apply = apply;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyBA_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyBA_Shell(PC pc,PetscErrorCode (*apply)(void*,PCSide,Vec,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell          = (PC_Shell*)pc->data;
  shell->applyBA = apply;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetPreSolve_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetPreSolve_Shell(PC pc,PetscErrorCode (*presolve)(void*,KSP,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell             = (PC_Shell*)pc->data;
  shell->presolve   = presolve;
  if (presolve) {
    pc->ops->presolve = PCPreSolve_Shell;
  } else {
    pc->ops->presolve = 0;
  }
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetPostSolve_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetPostSolve_Shell(PC pc,PetscErrorCode (*postsolve)(void*,KSP,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell           = (PC_Shell*)pc->data;
  shell->postsolve = postsolve;
  if (postsolve) {
    pc->ops->postsolve = PCPostSolve_Shell;
  } else {
    pc->ops->postsolve = 0;
  }
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetView_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetView_Shell(PC pc,PetscErrorCode (*view)(void*,PetscViewer))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell        = (PC_Shell*)pc->data;
  shell->view = view;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyTranspose_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyTranspose_Shell(PC pc,PetscErrorCode (*applytranspose)(void*,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell                 = (PC_Shell*)pc->data;
  shell->applytranspose = applytranspose;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetName_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetName_Shell(PC pc,const char name[])
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  ierr = PetscStrfree(shell->name);CHKERRQ(ierr);    
  ierr = PetscStrallocpy(name,&shell->name);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellGetName_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetName_Shell(PC pc,char *name[])
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell  = (PC_Shell*)pc->data;
  *name  = shell->name;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyRichardson_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyRichardson_Shell(PC pc,PetscErrorCode (*apply)(void*,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell                     = (PC_Shell*)pc->data;
  pc->ops->applyrichardson  = PCApplyRichardson_Shell;
  shell->applyrich          = apply;
  PetscFunctionReturn(0);
}
EXTERN_C_END

/* -------------------------------------------------------------------------------*/

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetDestroy"
/*@C
   PCShellSetDestroy - Sets routine to use to destroy the user-provided 
   application context.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
.  destroy - the application-provided destroy routine

   Calling sequence of destroy:
.vb
   PetscErrorCode destroy (void *ptr)
.ve

.  ptr - the application context

   Level: developer

.keywords: PC, shell, set, destroy, user-provided

.seealso: PCShellSetApply(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetDestroy(PC pc,PetscErrorCode (*destroy)(void*))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetDestroy_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,destroy);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}


#undef __FUNCT__  
#define __FUNCT__ "PCShellSetSetUp"
/*@C
   PCShellSetSetUp - Sets routine to use to "setup" the preconditioner whenever the 
   matrix operator is changed.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
.  setup - the application-provided setup routine

   Calling sequence of setup:
.vb
   PetscErrorCode setup (void *ptr)
.ve

.  ptr - the application context

   Level: developer

.keywords: PC, shell, set, setup, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetApply(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetSetUp(PC pc,PetscErrorCode (*setup)(void*))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetSetUp_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,setup);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}


#undef __FUNCT__  
#define __FUNCT__ "PCShellSetView"
/*@C
   PCShellSetView - Sets routine to use as viewer of shell preconditioner

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  view - the application-provided view routine

   Calling sequence of apply:
.vb
   PetscErrorCode view(void *ptr,PetscViewer v)
.ve

+  ptr - the application context
-  v   - viewer

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetView(PC pc,PetscErrorCode (*view)(void*,PetscViewer))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,PetscViewer));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetView_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,view);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApply"
/*@C
   PCShellSetApply - Sets routine to use as preconditioner.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  apply - the application-provided preconditioning routine

   Calling sequence of apply:
.vb
   PetscErrorCode apply (void *ptr,Vec xin,Vec xout)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose(), PCShellSetContext(), PCShellSetApplyBA()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApply(PC pc,PetscErrorCode (*apply)(void*,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetApply_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,apply);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyBA"
/*@C
   PCShellSetApplyBA - Sets routine to use as preconditioner times operator.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  applyBA - the application-provided BA routine

   Calling sequence of apply:
.vb
   PetscErrorCode applyBA (void *ptr,Vec xin,Vec xout)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose(), PCShellSetContext(), PCShellSetApply()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyBA(PC pc,PetscErrorCode (*applyBA)(void*,PCSide,Vec,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,PCSide,Vec,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetApplyBA_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,applyBA);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyTranspose"
/*@C
   PCShellSetApplyTranspose - Sets routine to use as preconditioner transpose.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  apply - the application-provided preconditioning transpose routine

   Calling sequence of apply:
.vb
   PetscErrorCode applytranspose (void *ptr,Vec xin,Vec xout)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

   Notes: 
   Uses the same context variable as PCShellSetApply().

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApply(), PCSetContext(), PCShellSetApplyBA()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyTranspose(PC pc,PetscErrorCode (*applytranspose)(void*,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetApplyTranspose_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,applytranspose);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetPreSolve"
/*@C
   PCShellSetPreSolve - Sets routine to apply to the operators/vectors before a KSPSolve() is
      applied. This usually does something like scale the linear system in some application 
      specific way.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  presolve - the application-provided presolve routine

   Calling sequence of presolve:
.vb
   PetscErrorCode presolve (void *ptr,KSP ksp,Vec b,Vec x)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose(), PCShellSetPostSolve(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetPreSolve(PC pc,PetscErrorCode (*presolve)(void*,KSP,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,KSP,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetPreSolve_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,presolve);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetPostSolve"
/*@C
   PCShellSetPostSolve - Sets routine to apply to the operators/vectors before a KSPSolve() is
      applied. This usually does something like scale the linear system in some application 
      specific way.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  postsolve - the application-provided presolve routine

   Calling sequence of postsolve:
.vb
   PetscErrorCode postsolve(void *ptr,KSP ksp,Vec b,Vec x)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose(), PCShellSetPreSolve(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetPostSolve(PC pc,PetscErrorCode (*postsolve)(void*,KSP,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,KSP,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetPostSolve_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,postsolve);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetName"
/*@C
   PCShellSetName - Sets an optional name to associate with a shell
   preconditioner.

   Not Collective

   Input Parameters:
+  pc - the preconditioner context
-  name - character string describing shell preconditioner

   Level: developer

.keywords: PC, shell, set, name, user-provided

.seealso: PCShellGetName()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetName(PC pc,const char name[])
{
  PetscErrorCode ierr,(*f)(PC,const char []);

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetName_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,name);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellGetName"
/*@C
   PCShellGetName - Gets an optional name that the user has set for a shell
   preconditioner.

   Not Collective

   Input Parameter:
.  pc - the preconditioner context

   Output Parameter:
.  name - character string describing shell preconditioner (you should not free this)

   Level: developer

.keywords: PC, shell, get, name, user-provided

.seealso: PCShellSetName()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetName(PC pc,char *name[])
{
  PetscErrorCode ierr,(*f)(PC,char *[]);

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  PetscValidPointer(name,2);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellGetName_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,name);CHKERRQ(ierr);
  } else {
    SETERRQ(PETSC_ERR_ARG_WRONG,"Not shell preconditioner, cannot get name");
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyRichardson"
/*@C
   PCShellSetApplyRichardson - Sets routine to use as preconditioner
   in Richardson iteration.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  apply - the application-provided preconditioning routine

   Calling sequence of apply:
.vb
   PetscErrorCode apply (void *ptr,Vec b,Vec x,Vec r,PetscReal rtol,PetscReal abstol,PetscReal dtol,PetscInt maxits)
.ve

+  ptr - the application context
.  b - right-hand-side
.  x - current iterate
.  r - work space
.  rtol - relative tolerance of residual norm to stop at
.  abstol - absolute tolerance of residual norm to stop at
.  dtol - if residual norm increases by this factor than return
-  maxits - number of iterations to run

   Level: developer

.keywords: PC, shell, set, apply, Richardson, user-provided

.seealso: PCShellSetApply(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyRichardson(PC pc,PetscErrorCode (*apply)(void*,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetApplyRichardson_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,apply);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

/*MC
   PCSHELL - Creates a new preconditioner class for use with your 
              own private data storage format.

   Level: advanced

   Concepts: providing your own preconditioner

  Usage:
$             PetscErrorCode (*mult)(void*,Vec,Vec);
$             PetscErrorCode (*setup)(void*);
$             PCCreate(comm,&pc);
$             PCSetType(pc,PCSHELL);
$             PCShellSetApply(pc,mult);
$             PCShellSetApplyBA(pc,mult);      (optional)
$             PCShellSetApplyTranspose(pc,mult); (optional)
$             PCShellSetContext(pc,ctx)
$             PCShellSetSetUp(pc,setup);       (optional)

.seealso:  PCCreate(), PCSetType(), PCType (for list of available types), PC,
           MATSHELL, PCShellSetSetUp(), PCShellSetApply(), PCShellSetView(), 
           PCShellSetApplyTranspose(), PCShellSetName(), PCShellSetApplyRichardson(), 
           PCShellGetName(), PCShellSetContext(), PCShellGetContext(), PCShellSetApplyBA()
M*/

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCCreate_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCCreate_Shell(PC pc)
{
  PetscErrorCode ierr;
  PC_Shell       *shell;

  PetscFunctionBegin;
  pc->ops->destroy    = PCDestroy_Shell;
  ierr                = PetscNew(PC_Shell,&shell);CHKERRQ(ierr);
  ierr = PetscLogObjectMemory(pc,sizeof(PC_Shell));CHKERRQ(ierr);
  pc->data         = (void*)shell;
  pc->name         = 0;

  pc->ops->apply           = PCApply_Shell;
  pc->ops->applyBA         = PCApplyBA_Shell;
  pc->ops->view            = PCView_Shell;
  pc->ops->applytranspose  = PCApplyTranspose_Shell;
  pc->ops->applyrichardson = 0;
  pc->ops->setup           = PCSetUp_Shell;
  pc->ops->presolve        = 0;
  pc->ops->postsolve       = 0;
  pc->ops->view            = PCView_Shell;

  shell->apply          = 0;
  shell->applytranspose = 0;
  shell->name           = 0;
  shell->applyrich      = 0;
  shell->presolve       = 0;
  shell->postsolve      = 0;
  shell->ctx            = 0;
  shell->setup          = 0;
  shell->view           = 0;
  shell->destroy        = 0;

  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetDestroy_C","PCShellSetDestroy_Shell",
                    PCShellSetDestroy_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetSetUp_C","PCShellSetSetUp_Shell",
                    PCShellSetSetUp_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetApply_C","PCShellSetApply_Shell",
                    PCShellSetApply_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetApplyBA_C","PCShellSetApplyBA_Shell",
                    PCShellSetApplyBA_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetPreSolve_C","PCShellSetPreSolve_Shell",
                    PCShellSetPreSolve_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetPostSolve_C","PCShellSetPostSolve_Shell",
                    PCShellSetPostSolve_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetView_C","PCShellSetView_Shell",
                    PCShellSetView_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetApplyTranspose_C","PCShellSetApplyTranspose_Shell",
                    PCShellSetApplyTranspose_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetName_C","PCShellSetName_Shell",
                    PCShellSetName_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellGetName_C","PCShellGetName_Shell",
                    PCShellGetName_Shell);CHKERRQ(ierr);
  ierr = PetscObjectComposeFunctionDynamic((PetscObject)pc,"PCShellSetApplyRichardson_C","PCShellSetApplyRichardson_Shell",
                    PCShellSetApplyRichardson_Shell);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
EXTERN_C_END


From diosady at MIT.EDU  Tue Aug  8 08:40:49 2006
From: diosady at MIT.EDU (Laslo Tibor Diosady)
Date: Tue, 8 Aug 2006 09:40:49 -0400 (EDT)
Subject: In place ILU(0) factorization
In-Reply-To: <Pine.LNX.4.58.0608080810020.18269@shakey.mcs.anl.gov>
References: <Pine.LNX.4.62L.0608030949300.19213@splinter.mit.edu>
 <Pine.LNX.4.58.0608030941130.20611@shakey.mcs.anl.gov>
 <Pine.LNX.4.62L.0608031113350.19214@splinter.mit.edu>
 <Pine.LNX.4.58.0608031046120.20611@shakey.mcs.anl.gov>
 <Pine.LNX.4.62L.0608040943400.15507@splinter.mit.edu>
 <Pine.LNX.4.58.0608080810020.18269@shakey.mcs.anl.gov>
Message-ID: <Pine.LNX.4.62L.0608080940400.28096@splinter.mit.edu>


Thanks,

Laslo


On Tue, 8 Aug 2006, Hong Zhang wrote:

>
> Laslo,
>
> We figured out a way to implement ILU(0) with reordering
> without allocating workspace.
> We'll add this support later. I'll let you know when
> it is done.
>
> Thanks for your request that help us to make petsc
> better.
>
> Hong
>
> On Fri, 4 Aug 2006, Laslo Tibor Diosady wrote:
>
>> Hong,
>>
>>
>>> The space required remains the same, but
>>> the row-compressed matrix format
>>> for the factor will be changed with the
>>> reordering.
>>> To store the new format over the
>>> existing memory, temp space has to be allocated during
>>> implementation. Thus replacing the original memory with
>>> newly allocated space would make implementation easier.
>>>
>>
>> I guess this depends upon the implementation of the ILU(0) factorization
>> for the AIJ or BAIJ formats, which  I didn't (nor  do I ever really want
>> to) look into.
>>
>> Thanks for the help,
>>
>> Laslo
>>
>>
>
>


From jiaxun_hou at yahoo.com.cn  Wed Aug  9 05:42:35 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Wed, 9 Aug 2006 18:42:35 +0800 (CST)
Subject: About user defined PC
In-Reply-To: <Pine.OSX.4.64.0608080708360.366@barrys-computer.local>
Message-ID: <20060809104235.32229.qmail@web15801.mail.cnb.yahoo.com>

Barry,
Thank you very much. Your codes are very useful !

Regards,
Jiaxun

Barry Smith <bsmith at mcs.anl.gov> ??? 
   Jiaxun,

     I am assuming you are using the PCSHELL? I have added support for this for 
you;

* if you are using petsc-dev
    (http://www-unix.mcs.anl.gov/petsc/petsc-as/developers/index.html) you need
    only do an hg pull to get my additions then run "make" in
    src/ksp/pc/impls/shell.

* if you are not using petsc-dev (or use the nightly tar ball) then I attach the
    three files that were
    changed. include/private/pcimpl.h, include/petscpc.h and
    src/ksp/pc/impls/shell/shell.c (again run make in src/ksp/pc/impls/shell)

    If you are actually writing a complete PC and not using PCSHELL
http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/src/ksp/pc/impls/jacobi/jacobi.c.html
then you just need to provide a routine PCApplyBA_XXXX() with calling sequence:
PC,PCSide,Vec b,Vec x,Vec work

   Good luck,

     Barry


On Tue, 8 Aug 2006, jiaxun hou wrote:

> Hi,
>
>  I met a problem ,when I constructed a user-defined PC. That is I need to defind the process of  P^(-1)Mx in each iteration of GMRES by myself for efficient reason, not only define P^(-1)x . But the Petsc seems to  separate  this process into two parts: the first is y=Mx which is defined in Petsc framework, and the second is P^(-1)y which is defined  by the user. So, is there any way to do it without change the code of Petsc framework?
>
>  Regards,
>  Jiaxun
>
>
> ---------------------------------
> ????????-3.5G???20M???/*
      Preconditioner module. 
*/
#if !defined(__PETSCPC_H)
#define __PETSCPC_H
#include "petscmat.h"
PETSC_EXTERN_CXX_BEGIN

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT  PCInitializePackage(const char[]);

/*
    PCList contains the list of preconditioners currently registered
   These are added with the PCRegisterDynamic() macro
*/
extern PetscFList PCList;
#define PCType const char*

/*S
     PC - Abstract PETSc object that manages all preconditioners

   Level: beginner

  Concepts: preconditioners

.seealso:  PCCreate(), PCSetType(), PCType (for list of available types)
S*/
typedef struct _p_PC* PC;

/*E
    PCType - String with the name of a PETSc preconditioner method or the creation function
       with an optional dynamic library name, for example
       http://www.mcs.anl.gov/petsc/lib.a:mypccreate()

   Level: beginner

   Notes: Click on the links below to see details on a particular solver

.seealso: PCSetType(), PC, PCCreate()
E*/
#define PCNONE            "none"
#define PCJACOBI          "jacobi"
#define PCSOR             "sor"
#define PCLU              "lu"
#define PCSHELL           "shell"
#define PCBJACOBI         "bjacobi"
#define PCMG              "mg"
#define PCEISENSTAT       "eisenstat"
#define PCILU             "ilu"
#define PCICC             "icc"
#define PCASM             "asm"
#define PCKSP             "ksp"
#define PCCOMPOSITE       "composite"
#define PCREDUNDANT       "redundant"
#define PCSPAI            "spai"
#define PCNN              "nn"
#define PCCHOLESKY        "cholesky"
#define PCSAMG            "samg"
#define PCPBJACOBI        "pbjacobi"
#define PCMAT             "mat"
#define PCHYPRE           "hypre"
#define PCFIELDSPLIT      "fieldsplit"
#define PCTFS             "tfs"
#define PCML              "ml"
#define PCPROMETHEUS      "prometheus"
#define PCGALERKIN        "galerkin"

/* Logging support */
extern PetscCookie PETSCKSP_DLLEXPORT PC_COOKIE;

/*E
    PCSide - If the preconditioner is to be applied to the left, right
     or symmetrically around the operator.

   Level: beginner

.seealso: 
E*/
typedef enum { PC_LEFT,PC_RIGHT,PC_SYMMETRIC } PCSide;
extern const char *PCSides[];

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCreate(MPI_Comm,PC*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetType(PC,PCType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetUp(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetUpOnBlocks(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApply(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplySymmetricLeft(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplySymmetricRight(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyBAorAB(PC,PCSide,Vec,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyTranspose(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCHasApplyTranspose(PC,PetscTruth*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyBAorABTranspose(PC,PCSide,Vec,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyRichardson(PC,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCApplyRichardsonExists(PC,PetscTruth*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRegisterDestroy(void);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRegisterAll(const char[]);
extern PetscTruth PCRegisterAllCalled;

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRegister(const char[],const char[],const char[],PetscErrorCode(*)(PC));

/*MC
   PCRegisterDynamic - Adds a method to the preconditioner package.

   Synopsis:
   PetscErrorCode PCRegisterDynamic(char *name_solver,char *path,char *name_create,PetscErrorCode (*routine_create)(PC))

   Not collective

   Input Parameters:
+  name_solver - name of a new user-defined solver
.  path - path (either absolute or relative) the library containing this solver
.  name_create - name of routine to create method context
-  routine_create - routine to create method context

   Notes:
   PCRegisterDynamic() may be called multiple times to add several user-defined preconditioners.

   If dynamic libraries are used, then the fourth input argument (routine_create)
   is ignored.

   Sample usage:
.vb
   PCRegisterDynamic("my_solver","/home/username/my_lib/lib/libO/solaris/mylib",
              "MySolverCreate",MySolverCreate);
.ve

   Then, your solver can be chosen with the procedural interface via
$     PCSetType(pc,"my_solver")
   or at runtime via the option
$     -pc_type my_solver

   Level: advanced

   Notes: ${PETSC_ARCH}, ${PETSC_DIR}, ${PETSC_LIB_DIR},  or ${any environmental variable}
           occuring in pathname will be replaced with appropriate values.
         If your function is not being put into a shared library then use PCRegister() instead

.keywords: PC, register

.seealso: PCRegisterAll(), PCRegisterDestroy()
M*/
#if defined(PETSC_USE_DYNAMIC_LIBRARIES)
#define PCRegisterDynamic(a,b,c,d) PCRegister(a,b,c,0)
#else
#define PCRegisterDynamic(a,b,c,d) PCRegister(a,b,c,d)
#endif

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDestroy(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetFromOptions(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetType(PC,PCType*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetFactoredMatrix(PC,Mat*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetModifySubMatrices(PC,PetscErrorCode(*)(PC,PetscInt,const IS[],const IS[],Mat[],void*),void*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCModifySubMatrices(PC,PetscInt,const IS[],const IS[],Mat[],void*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetOperators(PC,Mat,Mat,MatStructure);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetOperators(PC,Mat*,Mat*,MatStructure*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetOperatorsSet(PC,PetscTruth*,PetscTruth*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCView(PC,PetscViewer);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetOptionsPrefix(PC,const char[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCAppendOptionsPrefix(PC,const char[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGetOptionsPrefix(PC,const char*[]);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCComputeExplicitOperator(PC,Mat*);

/*
      These are used to provide extra scaling of preconditioned 
   operator for time-stepping schemes like in SUNDIALS 
*/
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDiagonalScale(PC,PetscTruth*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDiagonalScaleLeft(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDiagonalScaleRight(PC,Vec,Vec);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCDiagonalScaleSet(PC,Vec);

/* ------------- options specific to particular preconditioners --------- */

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCJacobiSetUseRowMax(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCJacobiSetUseAbs(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSORSetSymmetric(PC,MatSORType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSORSetOmega(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSORSetIterations(PC,PetscInt,PetscInt);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCEisenstatSetOmega(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCEisenstatNoDiagonalScaling(PC);

#define USE_PRECONDITIONER_MATRIX 0
#define USE_TRUE_MATRIX           1
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiSetUseTrueLocal(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiSetTotalBlocks(PC,PetscInt,const PetscInt[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiSetLocalBlocks(PC,PetscInt,const PetscInt[]);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCKSPSetUseTrue(PC);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApply(PC,PetscErrorCode (*)(void*,Vec,Vec)); 
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyBA(PC,PetscErrorCode (*)(void*,PCSide,Vec,Vec,Vec)); 
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyTranspose(PC,PetscErrorCode (*)(void*,Vec,Vec));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetSetUp(PC,PetscErrorCode (*)(void*));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyRichardson(PC,PetscErrorCode (*)(void*,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetView(PC,PetscErrorCode (*)(void*,PetscViewer));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetDestroy(PC,PetscErrorCode (*)(void*));
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetContext(PC,void**);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetContext(PC,void*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetName(PC,const char[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetName(PC,char*[]);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetZeroPivot(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetShiftNonzero(PC,PetscReal); 
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetShiftPd(PC,PetscTruth); 


EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetFill(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetPivoting(PC,PetscReal);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorReorderForNonzeroDiagonal(PC,PetscReal);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetMatOrdering(PC,MatOrderingType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetReuseOrdering(PC,PetscTruth);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetReuseFill(PC,PetscTruth);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetUseInPlace(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetAllowDiagonalFill(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetPivotInBlocks(PC,PetscTruth);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetLevels(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFactorSetUseDropTolerance(PC,PetscReal,PetscReal,PetscInt);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetLocalSubdomains(PC,PetscInt,IS[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetTotalSubdomains(PC,PetscInt,IS[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetOverlap(PC,PetscInt);
/*E
    PCASMType - Type of additive Schwarz method to use

$  PC_ASM_BASIC - symmetric version where residuals from the ghost points are used
$                 and computed values in ghost regions are added together. Classical
$                 standard additive Schwarz
$  PC_ASM_RESTRICT - residuals from ghost points are used but computed values in ghost
$                    region are discarded. Default
$  PC_ASM_INTERPOLATE - residuals from ghost points are not used, computed values in ghost
$                       region are added back in
$  PC_ASM_NONE - ghost point residuals are not used, computed ghost values are discarded
$                not very good.                

   Level: beginner

.seealso: PCASMSetType()
E*/
typedef enum {PC_ASM_BASIC = 3,PC_ASM_RESTRICT = 1,PC_ASM_INTERPOLATE = 2,PC_ASM_NONE = 0} PCASMType;
extern const char *PCASMTypes[];

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetType(PC,PCASMType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMCreateSubdomains2D(PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt,PetscInt *,IS **);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMSetUseInPlace(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMGetLocalSubdomains(PC,PetscInt*,IS*[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCASMGetLocalSubmatrices(PC,PetscInt*,Mat*[]);

/*E
    PCCompositeType - Determines how two or more preconditioner are composed

$  PC_COMPOSITE_ADDITIVE - results from application of all preconditioners are added together
$  PC_COMPOSITE_MULTIPLICATIVE - preconditioners are applied sequentially to the residual freshly
$                                computed after the previous preconditioner application
$  PC_COMPOSITE_SPECIAL - This is very special for a matrix of the form alpha I + R + S
$                         where first preconditioner is built from alpha I + S and second from
$                         alpha I + R

   Level: beginner

.seealso: PCCompositeSetType()
E*/
typedef enum {PC_COMPOSITE_ADDITIVE,PC_COMPOSITE_MULTIPLICATIVE,PC_COMPOSITE_SPECIAL} PCCompositeType;
extern const char *PCCompositeTypes[];

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeSetUseTrue(PC);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeSetType(PC,PCCompositeType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeAddPC(PC,PCType);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeGetPC(PC pc,PetscInt n,PC *);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCCompositeSpecialSetAlpha(PC,PetscScalar);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRedundantSetScatter(PC,VecScatter,VecScatter);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRedundantGetOperators(PC,Mat*,Mat*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCRedundantGetPC(PC,PC*);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetEpsilon(PC,double);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetNBSteps(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetMax(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetMaxNew(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetBlockSize(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetCacheSize(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetVerbose(PC,PetscInt);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSPAISetSp(PC,PetscInt);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCHYPRESetType(PC,const char[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiGetLocalBlocks(PC,PetscInt*,const PetscInt*[]);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCBJacobiGetTotalBlocks(PC,PetscInt*,const PetscInt*[]);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFieldSplitSetFields(PC,PetscInt,PetscInt*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCFieldSplitSetType(PC,PCCompositeType);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGalerkinSetRestriction(PC,Mat);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCGalerkinSetInterpolation(PC,Mat);

EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSetCoordinates(PC,PetscInt,PetscReal*);
EXTERN PetscErrorCode PETSCKSP_DLLEXPORT PCSASetVectors(PC,PetscInt,PetscReal *);


PETSC_EXTERN_CXX_END
#endif /* __PETSCPC_H */

#ifndef _PCIMPL
#define _PCIMPL

#include "petscksp.h"
#include "petscpc.h"

typedef struct _PCOps *PCOps;
struct _PCOps {
  PetscErrorCode (*setup)(PC);
  PetscErrorCode (*apply)(PC,Vec,Vec);
  PetscErrorCode (*applyrichardson)(PC,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt);
  PetscErrorCode (*applyBA)(PC,PCSide,Vec,Vec,Vec);
  PetscErrorCode (*applytranspose)(PC,Vec,Vec);
  PetscErrorCode (*applyBAtranspose)(PC,PetscInt,Vec,Vec,Vec);
  PetscErrorCode (*setfromoptions)(PC);
  PetscErrorCode (*presolve)(PC,KSP,Vec,Vec);
  PetscErrorCode (*postsolve)(PC,KSP,Vec,Vec);  
  PetscErrorCode (*getfactoredmatrix)(PC,Mat*);
  PetscErrorCode (*applysymmetricleft)(PC,Vec,Vec);
  PetscErrorCode (*applysymmetricright)(PC,Vec,Vec);
  PetscErrorCode (*setuponblocks)(PC);
  PetscErrorCode (*destroy)(PC);
  PetscErrorCode (*view)(PC,PetscViewer);
};

/*
   Preconditioner context
*/
struct _p_PC {
  PETSCHEADER(struct _PCOps);
  PetscInt       setupcalled;
  MatStructure   flag;
  Mat            mat,pmat;
  Vec            diagonalscaleright,diagonalscaleleft; /* used for time integration scaling */
  PetscTruth     diagonalscale;
  PetscErrorCode (*modifysubmatrices)(PC,PetscInt,const IS[],const IS[],Mat[],void*); /* user provided routine */
  void           *modifysubmatricesP; /* context for user routine */
  void           *data;
};

extern PetscEvent  PC_SetUp, PC_SetUpOnBlocks, PC_Apply, PC_ApplyCoarse, PC_ApplyMultiple, PC_ApplySymmetricLeft;
extern PetscEvent  PC_ApplySymmetricRight, PC_ModifySubMatrices;

#endif
#define PETSCKSP_DLL

/*
   This provides a simple shell for Fortran (and C programmers) to 
  create their own preconditioner without writing much interface code.
*/

#include "private/pcimpl.h"        /*I "petscpc.h" I*/
#include "private/vecimpl.h"  

EXTERN_C_BEGIN 
typedef struct {
  void           *ctx;                     /* user provided contexts for preconditioner */
  PetscErrorCode (*destroy)(void*);
  PetscErrorCode (*setup)(void*);
  PetscErrorCode (*apply)(void*,Vec,Vec);
  PetscErrorCode (*applyBA)(void*,PCSide,Vec,Vec,Vec);
  PetscErrorCode (*presolve)(void*,KSP,Vec,Vec);
  PetscErrorCode (*postsolve)(void*,KSP,Vec,Vec);
  PetscErrorCode (*view)(void*,PetscViewer);
  PetscErrorCode (*applytranspose)(void*,Vec,Vec);
  PetscErrorCode (*applyrich)(void*,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt);
  char           *name;
} PC_Shell;
EXTERN_C_END

#undef __FUNCT__  
#define __FUNCT__ "PCShellGetContext"
/*@
    PCShellGetContext - Returns the user-provided context associated with a shell PC

    Not Collective

    Input Parameter:
.   pc - should have been created with PCCreateShell()

    Output Parameter:
.   ctx - the user provided context

    Level: advanced

    Notes:
    This routine is intended for use within various shell routines
    
.keywords: PC, shell, get, context

.seealso: PCCreateShell(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetContext(PC pc,void **ctx)
{
  PetscErrorCode ierr;
  PetscTruth     flg;

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  PetscValidPointer(ctx,2); 
  ierr = PetscTypeCompare((PetscObject)pc,PCSHELL,&flg);CHKERRQ(ierr);
  if (!flg) *ctx = 0; 
  else      *ctx = ((PC_Shell*)(pc->data))->ctx; 
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetContext"
/*@C
    PCShellSetContext - sets the context for a shell PC

   Collective on PC

    Input Parameters:
+   pc - the shell PC
-   ctx - the context

   Level: advanced

   Fortran Notes: The context can only be an integer or a PetscObject
      unfortunately it cannot be a Fortran array or derived type.

.seealso: PCCreateShell(), PCShellGetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetContext(PC pc,void *ctx)
{
  PC_Shell      *shell = (PC_Shell*)pc->data;
  PetscErrorCode ierr;
  PetscTruth     flg;

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscTypeCompare((PetscObject)pc,PCSHELL,&flg);CHKERRQ(ierr);
  if (flg) {
    shell->ctx = ctx;
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCSetUp_Shell"
static PetscErrorCode PCSetUp_Shell(PC pc)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (shell->setup) {
    CHKMEMQ;
    ierr  = (*shell->setup)(shell->ctx);CHKERRQ(ierr);
    CHKMEMQ;
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCApply_Shell"
static PetscErrorCode PCApply_Shell(PC pc,Vec x,Vec y)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->apply) SETERRQ(PETSC_ERR_USER,"No apply() routine provided to Shell PC");
  PetscStackPush("PCSHELL user function");
  CHKMEMQ;
  ierr  = (*shell->apply)(shell->ctx,x,y);CHKERRQ(ierr);
  CHKMEMQ;
  PetscStackPop;
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCApplyBA_Shell"
static PetscErrorCode PCApplyBA_Shell(PC pc,PCSide side,Vec x,Vec y,Vec w)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->applyBA) SETERRQ(PETSC_ERR_USER,"No applyBA() routine provided to Shell PC");
  PetscStackPush("PCSHELL user function BA");
  CHKMEMQ;
  ierr  = (*shell->applyBA)(shell->ctx,side,x,y,w);CHKERRQ(ierr);
  CHKMEMQ;
  PetscStackPop;
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCPreSolve_Shell"
static PetscErrorCode PCPreSolve_Shell(PC pc,KSP ksp,Vec b,Vec x)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->presolve) SETERRQ(PETSC_ERR_USER,"No presolve() routine provided to Shell PC");
  ierr  = (*shell->presolve)(shell->ctx,ksp,b,x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCPostSolve_Shell"
static PetscErrorCode PCPostSolve_Shell(PC pc,KSP ksp,Vec b,Vec x)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->postsolve) SETERRQ(PETSC_ERR_USER,"No postsolve() routine provided to Shell PC");
  ierr  = (*shell->postsolve)(shell->ctx,ksp,b,x);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCApplyTranspose_Shell"
static PetscErrorCode PCApplyTranspose_Shell(PC pc,Vec x,Vec y)
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  if (!shell->applytranspose) SETERRQ(PETSC_ERR_USER,"No applytranspose() routine provided to Shell PC");
  ierr  = (*shell->applytranspose)(shell->ctx,x,y);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCApplyRichardson_Shell"
static PetscErrorCode PCApplyRichardson_Shell(PC pc,Vec x,Vec y,Vec w,PetscReal rtol,PetscReal abstol, PetscReal dtol,PetscInt it)
{
  PetscErrorCode ierr;
  PC_Shell       *shell;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  ierr  = (*shell->applyrich)(shell->ctx,x,y,w,rtol,abstol,dtol,it);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCDestroy_Shell"
static PetscErrorCode PCDestroy_Shell(PC pc)
{
  PC_Shell       *shell = (PC_Shell*)pc->data;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  ierr = PetscStrfree(shell->name);CHKERRQ(ierr);
  if (shell->destroy) {
    ierr  = (*shell->destroy)(shell->ctx);CHKERRQ(ierr);
  }
  ierr = PetscFree(shell);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCView_Shell"
static PetscErrorCode PCView_Shell(PC pc,PetscViewer viewer)
{
  PC_Shell       *shell = (PC_Shell*)pc->data;
  PetscErrorCode ierr;
  PetscTruth     iascii;

  PetscFunctionBegin;
  ierr = PetscTypeCompare((PetscObject)viewer,PETSC_VIEWER_ASCII,&iascii);CHKERRQ(ierr);
  if (iascii) {
    if (shell->name) {ierr = PetscViewerASCIIPrintf(viewer,"  Shell: %s\n",shell->name);CHKERRQ(ierr);}
    else             {ierr = PetscViewerASCIIPrintf(viewer,"  Shell: no name\n");CHKERRQ(ierr);}
  }
  if (shell->view) {
    ierr = PetscViewerASCIIPushTab(viewer);CHKERRQ(ierr);
    ierr  = (*shell->view)(shell->ctx,viewer);CHKERRQ(ierr);
    ierr = PetscViewerASCIIPopTab(viewer);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

/* ------------------------------------------------------------------------------*/
EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetDestroy_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetDestroy_Shell(PC pc, PetscErrorCode (*destroy)(void*))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell          = (PC_Shell*)pc->data;
  shell->destroy = destroy;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetSetUp_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetSetUp_Shell(PC pc, PetscErrorCode (*setup)(void*))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell        = (PC_Shell*)pc->data;
  shell->setup = setup;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApply_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApply_Shell(PC pc,PetscErrorCode (*apply)(void*,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell        = (PC_Shell*)pc->data;
  shell->apply = apply;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyBA_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyBA_Shell(PC pc,PetscErrorCode (*apply)(void*,PCSide,Vec,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell          = (PC_Shell*)pc->data;
  shell->applyBA = apply;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetPreSolve_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetPreSolve_Shell(PC pc,PetscErrorCode (*presolve)(void*,KSP,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell             = (PC_Shell*)pc->data;
  shell->presolve   = presolve;
  if (presolve) {
    pc->ops->presolve = PCPreSolve_Shell;
  } else {
    pc->ops->presolve = 0;
  }
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetPostSolve_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetPostSolve_Shell(PC pc,PetscErrorCode (*postsolve)(void*,KSP,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell           = (PC_Shell*)pc->data;
  shell->postsolve = postsolve;
  if (postsolve) {
    pc->ops->postsolve = PCPostSolve_Shell;
  } else {
    pc->ops->postsolve = 0;
  }
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetView_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetView_Shell(PC pc,PetscErrorCode (*view)(void*,PetscViewer))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell        = (PC_Shell*)pc->data;
  shell->view = view;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyTranspose_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyTranspose_Shell(PC pc,PetscErrorCode (*applytranspose)(void*,Vec,Vec))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell                 = (PC_Shell*)pc->data;
  shell->applytranspose = applytranspose;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetName_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetName_Shell(PC pc,const char name[])
{
  PC_Shell       *shell;
  PetscErrorCode ierr;

  PetscFunctionBegin;
  shell = (PC_Shell*)pc->data;
  ierr = PetscStrfree(shell->name);CHKERRQ(ierr);    
  ierr = PetscStrallocpy(name,&shell->name);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellGetName_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellGetName_Shell(PC pc,char *name[])
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell  = (PC_Shell*)pc->data;
  *name  = shell->name;
  PetscFunctionReturn(0);
}
EXTERN_C_END

EXTERN_C_BEGIN
#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyRichardson_Shell"
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyRichardson_Shell(PC pc,PetscErrorCode (*apply)(void*,Vec,Vec,Vec,PetscReal,PetscReal,PetscReal,PetscInt))
{
  PC_Shell *shell;

  PetscFunctionBegin;
  shell                     = (PC_Shell*)pc->data;
  pc->ops->applyrichardson  = PCApplyRichardson_Shell;
  shell->applyrich          = apply;
  PetscFunctionReturn(0);
}
EXTERN_C_END

/* -------------------------------------------------------------------------------*/

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetDestroy"
/*@C
   PCShellSetDestroy - Sets routine to use to destroy the user-provided 
   application context.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
.  destroy - the application-provided destroy routine

   Calling sequence of destroy:
.vb
   PetscErrorCode destroy (void *ptr)
.ve

.  ptr - the application context

   Level: developer

.keywords: PC, shell, set, destroy, user-provided

.seealso: PCShellSetApply(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetDestroy(PC pc,PetscErrorCode (*destroy)(void*))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetDestroy_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,destroy);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}


#undef __FUNCT__  
#define __FUNCT__ "PCShellSetSetUp"
/*@C
   PCShellSetSetUp - Sets routine to use to "setup" the preconditioner whenever the 
   matrix operator is changed.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
.  setup - the application-provided setup routine

   Calling sequence of setup:
.vb
   PetscErrorCode setup (void *ptr)
.ve

.  ptr - the application context

   Level: developer

.keywords: PC, shell, set, setup, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetApply(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetSetUp(PC pc,PetscErrorCode (*setup)(void*))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetSetUp_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,setup);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}


#undef __FUNCT__  
#define __FUNCT__ "PCShellSetView"
/*@C
   PCShellSetView - Sets routine to use as viewer of shell preconditioner

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  view - the application-provided view routine

   Calling sequence of apply:
.vb
   PetscErrorCode view(void *ptr,PetscViewer v)
.ve

+  ptr - the application context
-  v   - viewer

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetView(PC pc,PetscErrorCode (*view)(void*,PetscViewer))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,PetscViewer));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetView_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,view);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApply"
/*@C
   PCShellSetApply - Sets routine to use as preconditioner.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  apply - the application-provided preconditioning routine

   Calling sequence of apply:
.vb
   PetscErrorCode apply (void *ptr,Vec xin,Vec xout)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose(), PCShellSetContext(), PCShellSetApplyBA()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApply(PC pc,PetscErrorCode (*apply)(void*,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetApply_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,apply);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyBA"
/*@C
   PCShellSetApplyBA - Sets routine to use as preconditioner times operator.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  applyBA - the application-provided BA routine

   Calling sequence of apply:
.vb
   PetscErrorCode applyBA (void *ptr,Vec xin,Vec xout)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose(), PCShellSetContext(), PCShellSetApply()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyBA(PC pc,PetscErrorCode (*applyBA)(void*,PCSide,Vec,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,PCSide,Vec,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetApplyBA_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,applyBA);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetApplyTranspose"
/*@C
   PCShellSetApplyTranspose - Sets routine to use as preconditioner transpose.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  apply - the application-provided preconditioning transpose routine

   Calling sequence of apply:
.vb
   PetscErrorCode applytranspose (void *ptr,Vec xin,Vec xout)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

   Notes: 
   Uses the same context variable as PCShellSetApply().

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApply(), PCSetContext(), PCShellSetApplyBA()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetApplyTranspose(PC pc,PetscErrorCode (*applytranspose)(void*,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetApplyTranspose_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,applytranspose);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetPreSolve"
/*@C
   PCShellSetPreSolve - Sets routine to apply to the operators/vectors before a KSPSolve() is
      applied. This usually does something like scale the linear system in some application 
      specific way.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  presolve - the application-provided presolve routine

   Calling sequence of presolve:
.vb
   PetscErrorCode presolve (void *ptr,KSP ksp,Vec b,Vec x)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose(), PCShellSetPostSolve(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetPreSolve(PC pc,PetscErrorCode (*presolve)(void*,KSP,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,KSP,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetPreSolve_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,presolve);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetPostSolve"
/*@C
   PCShellSetPostSolve - Sets routine to apply to the operators/vectors before a KSPSolve() is
      applied. This usually does something like scale the linear system in some application 
      specific way.

   Collective on PC

   Input Parameters:
+  pc - the preconditioner context
-  postsolve - the application-provided presolve routine

   Calling sequence of postsolve:
.vb
   PetscErrorCode postsolve(void *ptr,KSP ksp,Vec b,Vec x)
.ve

+  ptr - the application context
.  xin - input vector
-  xout - output vector

   Level: developer

.keywords: PC, shell, set, apply, user-provided

.seealso: PCShellSetApplyRichardson(), PCShellSetSetUp(), PCShellSetApplyTranspose(), PCShellSetPreSolve(), PCShellSetContext()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetPostSolve(PC pc,PetscErrorCode (*postsolve)(void*,KSP,Vec,Vec))
{
  PetscErrorCode ierr,(*f)(PC,PetscErrorCode (*)(void*,KSP,Vec,Vec));

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetPostSolve_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,postsolve);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellSetName"
/*@C
   PCShellSetName - Sets an optional name to associate with a shell
   preconditioner.

   Not Collective

   Input Parameters:
+  pc - the preconditioner context
-  name - character string describing shell preconditioner

   Level: developer

.keywords: PC, shell, set, name, user-provided

.seealso: PCShellGetName()
@*/
PetscErrorCode PETSCKSP_DLLEXPORT PCShellSetName(PC pc,const char name[])
{
  PetscErrorCode ierr,(*f)(PC,const char []);

  PetscFunctionBegin;
  PetscValidHeaderSpecific(pc,PC_COOKIE,1);
  ierr = PetscObjectQueryFunction((PetscObject)pc,"PCShellSetName_C",(void (**)(void))&f);CHKERRQ(ierr);
  if (f) {
    ierr = (*f)(pc,name);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

#undef __FUNCT__  
#define __FUNCT__ "PCShellGetName"
/*@C
   PCShellGetName - Gets an optional name that the user has set for a shell

=== message truncated ===

 __________________________________________________
???????????????
http://cn.mail.yahoo.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060809/78a20073/attachment.htm>

From mwojc at p.lodz.pl  Thu Aug 10 06:11:42 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Thu, 10 Aug 2006 11:11:42 -0000
Subject: PETSc from python
Message-ID: <op.td2epsp7hfz6u8@localhost>

Hi all!

I'm just curious: is there anybody using PETSc python bindings?

-- 
Marek Wojciechowski


From knepley at gmail.com  Thu Aug 10 07:37:17 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 10 Aug 2006 07:37:17 -0500
Subject: PETSc from python
In-Reply-To: <op.td2epsp7hfz6u8@localhost>
References: <op.td2epsp7hfz6u8@localhost>
Message-ID: <a9f269830608100537g6f7a6ad6n2ee2f76c33068a54@mail.gmail.com>

  Sorry I did not respond earlier. I thought I could fix your problem
quickly. However, it is too hard to maintain my own bindings. There
are nice Python bindings from

  http://lineal.developer.nicta.com.au/

which I think is what most people use.

   Thanks,

      Matt

On 8/10/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
> Hi all!
>
> I'm just curious: is there anybody using PETSc python bindings?
>
> --
> Marek Wojciechowski
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From mwojc at p.lodz.pl  Thu Aug 10 14:48:10 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Thu, 10 Aug 2006 19:48:10 -0000
Subject: PETSc from python
In-Reply-To: <a9f269830608100537g6f7a6ad6n2ee2f76c33068a54@mail.gmail.com>
References: <op.td2epsp7hfz6u8@localhost> <a9f269830608100537g6f7a6ad6n2ee2f76c33068a54@mail.gmail.com>
Message-ID: <op.td22mkv2hfz6u8@localhost>

Regarding the bindings downloadable from  
ftp.mcs.anl.gov/pub/petsc/PETScPython.tar.gz
there are just lacking two files in comparison to version  
PETScPython.tar.gz.bkp (they are
PetscMap.c PetscViewer.c). I removed them also from makefile and I  
installed these bindings.
However there is now some lack of PETSc funtionality, isn't it?


On Thu, 10 Aug 2006 12:37:17 -0000, Matthew Knepley <knepley at gmail.com>  
wrote:

>   Sorry I did not respond earlier. I thought I could fix your problem
> quickly. However, it is too hard to maintain my own bindings. There
> are nice Python bindings from
>
>   http://lineal.developer.nicta.com.au/
>
> which I think is what most people use.
>
>    Thanks,
>
>       Matt
>
> On 8/10/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
>> Hi all!
>>
>> I'm just curious: is there anybody using PETSc python bindings?
>>
>> --
>> Marek Wojciechowski
>>
>>
>
>


-- 
Marek Wojciechowski


From knepley at gmail.com  Thu Aug 10 14:18:02 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 10 Aug 2006 14:18:02 -0500
Subject: PETSc from python
In-Reply-To: <op.td22mkv2hfz6u8@localhost>
References: <op.td2epsp7hfz6u8@localhost>
	 <a9f269830608100537g6f7a6ad6n2ee2f76c33068a54@mail.gmail.com>
	 <op.td22mkv2hfz6u8@localhost>
Message-ID: <a9f269830608101218y6cbadf1m41c005fae0fc4d72@mail.gmail.com>

On 8/10/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
> Regarding the bindings downloadable from
> ftp.mcs.anl.gov/pub/petsc/PETScPython.tar.gz
> there are just lacking two files in comparison to version
> PETScPython.tar.gz.bkp (they are
> PetscMap.c PetscViewer.c). I removed them also from makefile and I
> installed these bindings.
> However there is now some lack of PETSc funtionality, isn't it?

PetscMap is no longer a class in PETSc, so no problem there. And I
think I reorganized so that the Viewer moved into different classes.

  Thanks,

     MAtt

>
> On Thu, 10 Aug 2006 12:37:17 -0000, Matthew Knepley <knepley at gmail.com>
> wrote:
>
> >   Sorry I did not respond earlier. I thought I could fix your problem
> > quickly. However, it is too hard to maintain my own bindings. There
> > are nice Python bindings from
> >
> >   http://lineal.developer.nicta.com.au/
> >
> > which I think is what most people use.
> >
> >    Thanks,
> >
> >       Matt
> >
> > On 8/10/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
> >> Hi all!
> >>
> >> I'm just curious: is there anybody using PETSc python bindings?
> >>
> >> --
> >> Marek Wojciechowski
> >>
> >>
> >
> >
>
>
>
> --
> Marek Wojciechowski
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From julvar at tamu.edu  Thu Aug 10 15:36:19 2006
From: julvar at tamu.edu (Julian)
Date: Thu, 10 Aug 2006 15:36:19 -0500
Subject: How to get the address of an element in the matrix
Message-ID: <20060810203614.6F5443C802@tr-4-int.cis.tamu.edu>

Hi,

Is there any way to get the address/location of a particular element in the
matrix, say mat[i,j]
So I can use that address later on to change the value of that element
rather than using MatSetValues.
I am currently using MatGetValues to get the value as such...
	PetscScalar val;
	MatGetValues(mat, 1, &i, 1, &j, &val);

But I would like something where I can pass a pointer and then the pointer
will be pointing to mat[i,j] such as 
	PetscScalar *val;
	MatGetReference(mat, i, j, val);

Is there something like this already in place ? Or is it not allowed because
of the way the matrix is implemented internally?

Thanks,
Julian.


From knepley at gmail.com  Thu Aug 10 15:50:34 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 10 Aug 2006 15:50:34 -0500
Subject: How to get the address of an element in the matrix
In-Reply-To: <20060810203614.6F5443C802@tr-4-int.cis.tamu.edu>
References: <20060810203614.6F5443C802@tr-4-int.cis.tamu.edu>
Message-ID: <a9f269830608101350l34976403v3a1013cbdb47ef63@mail.gmail.com>

On 8/10/06, Julian <julvar at tamu.edu> wrote:
> Hi,
>
> Is there any way to get the address/location of a particular element in the
> matrix, say mat[i,j]

This would defeat the purpose of the interface. NO sparse matrix format
supports this access. In fact, they all have individual access routines. We
present the interface so that you never hav to rewrite your code if the matrix
storage format changes. Furthermore, this works equally well in parallel. If you
need mat[i,j], your algorithm is probably wrong.

    Matt

> So I can use that address later on to change the value of that element
> rather than using MatSetValues.
> I am currently using MatGetValues to get the value as such...
>         PetscScalar val;
>         MatGetValues(mat, 1, &i, 1, &j, &val);
>
> But I would like something where I can pass a pointer and then the pointer
> will be pointing to mat[i,j] such as
>         PetscScalar *val;
>         MatGetReference(mat, i, j, val);
>
> Is there something like this already in place ? Or is it not allowed because
> of the way the matrix is implemented internally?
>
> Thanks,
> Julian.
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From picard2 at llnl.gov  Thu Aug 10 16:39:24 2006
From: picard2 at llnl.gov (Christophe Picard)
Date: Thu, 10 Aug 2006 14:39:24 -0700
Subject: Performance issue with MatSetValues
Message-ID: <200608101439.24263.picard2@llnl.gov>

Hi,

I have a performance issue while trying to insert values in the a matrix. I am 
using DMMG solver for cell-centered scheme in 3D from the petsc-snapshot to 
solve a Poisson equations. Inserting coefficients in the matrix for dirichlet 
or neumann boundary conditions, the insertion is instantaneous. But is I want 
to insert coefficients for periodic boundary conditions, I can notice a huge 
slow down in the insertion process (not in the resolution though).
The smallest size I can notice the performance drop is 32*32*32.

Is there any way to improve this? 


col.i  = row.i+(mx-1);
col.j  = row.j;
col.k = row.k;

v[0] =  1;

MatSetValuesStencil(*A,1,&row,1,&col,v,ADD_VALUES);


Thans,

Christophe


From knepley at gmail.com  Thu Aug 10 18:53:55 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 10 Aug 2006 18:53:55 -0500
Subject: Performance issue with MatSetValues
In-Reply-To: <200608101439.24263.picard2@llnl.gov>
References: <200608101439.24263.picard2@llnl.gov>
Message-ID: <a9f269830608101653n49e6cd7fw80a7767d11e8eb67@mail.gmail.com>

It sounds like you are inserting values which were not preallocating. To
determine for sure, we would need to know more about the code. However,
if you have a periodic problem, why not use a periodic DA?

   Matt

On 8/10/06, Christophe Picard <picard2 at llnl.gov> wrote:
> Hi,
>
> I have a performance issue while trying to insert values in the a matrix. I am
> using DMMG solver for cell-centered scheme in 3D from the petsc-snapshot to
> solve a Poisson equations. Inserting coefficients in the matrix for dirichlet
> or neumann boundary conditions, the insertion is instantaneous. But is I want
> to insert coefficients for periodic boundary conditions, I can notice a huge
> slow down in the insertion process (not in the resolution though).
> The smallest size I can notice the performance drop is 32*32*32.
>
> Is there any way to improve this?
>
>
> col.i  = row.i+(mx-1);
> col.j  = row.j;
> col.k = row.k;
>
> v[0] =  1;
>
> MatSetValuesStencil(*A,1,&row,1,&col,v,ADD_VALUES);
>
>
>
> Thans,
>
> Christophe
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From picard2 at llnl.gov  Thu Aug 10 19:27:51 2006
From: picard2 at llnl.gov (Christophe Picard)
Date: Thu, 10 Aug 2006 17:27:51 -0700
Subject: Performance issue with MatSetValues
In-Reply-To: <a9f269830608101653n49e6cd7fw80a7767d11e8eb67@mail.gmail.com>
References: <200608101439.24263.picard2@llnl.gov> <a9f269830608101653n49e6cd7fw80a7767d11e8eb67@mail.gmail.com>
Message-ID: <200608101727.51362.picard2@llnl.gov>


I think the memory is indeed not preallocated.
Yes my problem is periodic, but if I try to use a periodic DA, the multigrid 
solver complains about it (see the end of the message). I believe the source 
of that problem is  DAGetInterpolation_3D_Q0(). 

The problem I am trying to solve is a 3D Poisson equation with 
Neuman/Robin/Periodic boundary conditions. The boundary conditions are 
decided at runtime. 

If I use if-else-if statements to choose the DA here is the message

Thanks, 
Christophe

[0]PETSC ERROR: --------------------- Error Message 
------------------------------------
[0]PETSC ERROR: Invalid argument!
[0]PETSC ERROR: Cannot handle periodic grid in x!
[0]PETSC ERROR: 
------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Development Version 2.3.1, Patch 14, Thu Jul  6 00:02:04 
CDT 2006 HG revision: 97334a27165ab031dddd67964dd7a97955e75d20
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: 
------------------------------------------------------------------------
[0]PETSC ERROR: ././oceanus on a linux-gnu named tux194.llnl.gov by picard1 
Thu Aug 10 17:16:31 2006
[0]PETSC ERROR: Libraries linked from 
/home/picard1/Tools/Petsc-Dev/lib/linux-gnu-c-real-debug
[0]PETSC ERROR: Configure run at Thu Jul  6 16:05:13 2006
[0]PETSC ERROR: Configure options --prefix=/home/picard1/Tools/Petsc-Dev 
--with-dynamic --with-shared --with-mpi=0 --with-superlu=1 
--download-superlu=ifneeded
[0]PETSC ERROR: 
------------------------------------------------------------------------
[0]PETSC ERROR: DAGetInterpolation_3D_Q0() line 498 in 
src/dm/da/src/dainterp.c
[0]PETSC ERROR: DAGetInterpolation() line 874 in src/dm/da/src/dainterp.c
[0]PETSC ERROR: DMGetInterpolation() line 117 in src/dm/da/utils/dm.c
[0]PETSC ERROR: DMMGSetUp() line 215 in src/snes/utils/damg.c
[0]PETSC ERROR: DMMGSetDM() line 180 in src/snes/utils/damg.c
[0]PETSC ERROR: --------------------- Error Message 
------------------------------------
[0]PETSC ERROR: Null argument, when expecting valid pointer!
[0]PETSC ERROR: Null Object: Parameter # 1!
[0]PETSC ERROR: 
------------------------------------------------------------------------
[0]PETSC ERROR: Petsc Development Version 2.3.1, Patch 14, Thu Jul  6 00:02:04 
CDT 2006 HG revision: 97334a27165ab031dddd67964dd7a97955e75d20
[0]PETSC ERROR: See docs/changes/index.html for recent updates.
[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
[0]PETSC ERROR: See docs/index.html for manual pages.
[0]PETSC ERROR: 
------------------------------------------------------------------------
[0]PETSC ERROR: ././oceanus on a linux-gnu named tux194.llnl.gov by picard1 
Thu Aug 10 17:16:31 2006
[0]PETSC ERROR: Libraries linked from 
/home/picard1/Tools/Petsc-Dev/lib/linux-gnu-c-real-debug
[0]PETSC ERROR: Configure run at Thu Jul  6 16:05:13 2006
[0]PETSC ERROR: Configure options --prefix=/home/picard1/Tools/Petsc-Dev 
--with-dynamic --with-shared --with-mpi=0 --with-superlu=1 
--download-superlu=ifneeded
[0]PETSC ERROR: 
------------------------------------------------------------------------
[0]PETSC ERROR: PetscObjectReference() line 106 in src/sys/objects/inherit.c
[0]PETSC ERROR: MGSetInterpolate() line 136 in src/ksp/pc/impls/mg/mgfunc.c
[0]PETSC ERROR: DMMGSetUpLevel() line 385 in src/snes/utils/damg.c
[0]PETSC ERROR: DMMGSetKSP() line 452 in src/snes/utils/damg.c


On Thursday 10 August 2006 04:53 pm, Matthew Knepley wrote:
> It sounds like you are inserting values which were not preallocating. To
> determine for sure, we would need to know more about the code. However,
> if you have a periodic problem, why not use a periodic DA?
>
>    Matt
>
> On 8/10/06, Christophe Picard <picard2 at llnl.gov> wrote:
> > Hi,
> >
> > I have a performance issue while trying to insert values in the a matrix.
> > I am using DMMG solver for cell-centered scheme in 3D from the
> > petsc-snapshot to solve a Poisson equations. Inserting coefficients in
> > the matrix for dirichlet or neumann boundary conditions, the insertion is
> > instantaneous. But is I want to insert coefficients for periodic boundary
> > conditions, I can notice a huge slow down in the insertion process (not
> > in the resolution though). The smallest size I can notice the performance
> > drop is 32*32*32.
> >
> > Is there any way to improve this?
> >
> >
> > col.i  = row.i+(mx-1);
> > col.j  = row.j;
> > col.k = row.k;
> >
> > v[0] =  1;
> >
> > MatSetValuesStencil(*A,1,&row,1,&col,v,ADD_VALUES);
> >
> >
> >
> > Thans,
> >
> > Christophe


From bsmith at mcs.anl.gov  Thu Aug 10 20:01:41 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Thu, 10 Aug 2006 20:01:41 -0500 (CDT)
Subject: Performance issue with MatSetValues
In-Reply-To: <200608101727.51362.picard2@llnl.gov>
References: <200608101439.24263.picard2@llnl.gov>
 <a9f269830608101653n49e6cd7fw80a7767d11e8eb67@mail.gmail.com>
 <200608101727.51362.picard2@llnl.gov>
Message-ID: <Pine.OSX.4.64.0608101958000.366@barrys-computer.local>


   This is just do to an incomplete implementation; a user kindly donated
the code to use but did not add support for periodicity because he did not
need it. If you look at src/dm/da/src/dainterp.c you will find the various
routines for setting up the interpolations. If you look at the code
for 3D_Q1 you will see how the periodic case is handled; you may be able
to modify the 3D_Q0 code to also handle the periodic case. This will then
resolve your difficulty.

Good luck,

    Barry


On Thu, 10 Aug 2006, Christophe Picard wrote:

>
> I think the memory is indeed not preallocated.
> Yes my problem is periodic, but if I try to use a periodic DA, the multigrid
> solver complains about it (see the end of the message). I believe the source
> of that problem is  DAGetInterpolation_3D_Q0().
>
> The problem I am trying to solve is a 3D Poisson equation with
> Neuman/Robin/Periodic boundary conditions. The boundary conditions are
> decided at runtime.
>
> If I use if-else-if statements to choose the DA here is the message
>
> Thanks,
> Christophe
>
> [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Invalid argument!
> [0]PETSC ERROR: Cannot handle periodic grid in x!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Development Version 2.3.1, Patch 14, Thu Jul  6 00:02:04
> CDT 2006 HG revision: 97334a27165ab031dddd67964dd7a97955e75d20
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: ././oceanus on a linux-gnu named tux194.llnl.gov by picard1
> Thu Aug 10 17:16:31 2006
> [0]PETSC ERROR: Libraries linked from
> /home/picard1/Tools/Petsc-Dev/lib/linux-gnu-c-real-debug
> [0]PETSC ERROR: Configure run at Thu Jul  6 16:05:13 2006
> [0]PETSC ERROR: Configure options --prefix=/home/picard1/Tools/Petsc-Dev
> --with-dynamic --with-shared --with-mpi=0 --with-superlu=1
> --download-superlu=ifneeded
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: DAGetInterpolation_3D_Q0() line 498 in
> src/dm/da/src/dainterp.c
> [0]PETSC ERROR: DAGetInterpolation() line 874 in src/dm/da/src/dainterp.c
> [0]PETSC ERROR: DMGetInterpolation() line 117 in src/dm/da/utils/dm.c
> [0]PETSC ERROR: DMMGSetUp() line 215 in src/snes/utils/damg.c
> [0]PETSC ERROR: DMMGSetDM() line 180 in src/snes/utils/damg.c
> [0]PETSC ERROR: --------------------- Error Message
> ------------------------------------
> [0]PETSC ERROR: Null argument, when expecting valid pointer!
> [0]PETSC ERROR: Null Object: Parameter # 1!
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Petsc Development Version 2.3.1, Patch 14, Thu Jul  6 00:02:04
> CDT 2006 HG revision: 97334a27165ab031dddd67964dd7a97955e75d20
> [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> [0]PETSC ERROR: See docs/index.html for manual pages.
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: ././oceanus on a linux-gnu named tux194.llnl.gov by picard1
> Thu Aug 10 17:16:31 2006
> [0]PETSC ERROR: Libraries linked from
> /home/picard1/Tools/Petsc-Dev/lib/linux-gnu-c-real-debug
> [0]PETSC ERROR: Configure run at Thu Jul  6 16:05:13 2006
> [0]PETSC ERROR: Configure options --prefix=/home/picard1/Tools/Petsc-Dev
> --with-dynamic --with-shared --with-mpi=0 --with-superlu=1
> --download-superlu=ifneeded
> [0]PETSC ERROR:
> ------------------------------------------------------------------------
> [0]PETSC ERROR: PetscObjectReference() line 106 in src/sys/objects/inherit.c
> [0]PETSC ERROR: MGSetInterpolate() line 136 in src/ksp/pc/impls/mg/mgfunc.c
> [0]PETSC ERROR: DMMGSetUpLevel() line 385 in src/snes/utils/damg.c
> [0]PETSC ERROR: DMMGSetKSP() line 452 in src/snes/utils/damg.c
>
>
>
> On Thursday 10 August 2006 04:53 pm, Matthew Knepley wrote:
>> It sounds like you are inserting values which were not preallocating. To
>> determine for sure, we would need to know more about the code. However,
>> if you have a periodic problem, why not use a periodic DA?
>>
>>    Matt
>>
>> On 8/10/06, Christophe Picard <picard2 at llnl.gov> wrote:
>>> Hi,
>>>
>>> I have a performance issue while trying to insert values in the a matrix.
>>> I am using DMMG solver for cell-centered scheme in 3D from the
>>> petsc-snapshot to solve a Poisson equations. Inserting coefficients in
>>> the matrix for dirichlet or neumann boundary conditions, the insertion is
>>> instantaneous. But is I want to insert coefficients for periodic boundary
>>> conditions, I can notice a huge slow down in the insertion process (not
>>> in the resolution though). The smallest size I can notice the performance
>>> drop is 32*32*32.
>>>
>>> Is there any way to improve this?
>>>
>>>
>>> col.i  = row.i+(mx-1);
>>> col.j  = row.j;
>>> col.k = row.k;
>>>
>>> v[0] =  1;
>>>
>>> MatSetValuesStencil(*A,1,&row,1,&col,v,ADD_VALUES);
>>>
>>>
>>>
>>> Thans,
>>>
>>> Christophe
>
>


From picard2 at llnl.gov  Thu Aug 10 20:22:57 2006
From: picard2 at llnl.gov (Christophe Picard)
Date: Thu, 10 Aug 2006 18:22:57 -0700
Subject: Performance issue with MatSetValues
In-Reply-To: <Pine.OSX.4.64.0608101958000.366@barrys-computer.local>
References: <200608101439.24263.picard2@llnl.gov> <200608101727.51362.picard2@llnl.gov> <Pine.OSX.4.64.0608101958000.366@barrys-computer.local>
Message-ID: <200608101822.57861.picard2@llnl.gov>

Thank you. 
I knew it was kindly donated...and this is also  the reason why I am using the 
dev version of PETSC. I will look at the periodic implementation see if I can 
fix my problem.
Thank you for your precision.
Christophe


On Thursday 10 August 2006 06:01 pm, Barry Smith wrote:
>    This is just do to an incomplete implementation; a user kindly donated
> the code to use but did not add support for periodicity because he did not
> need it. If you look at src/dm/da/src/dainterp.c you will find the various
> routines for setting up the interpolations. If you look at the code
> for 3D_Q1 you will see how the periodic case is handled; you may be able
> to modify the 3D_Q0 code to also handle the periodic case. This will then
> resolve your difficulty.
>
> Good luck,
>
>     Barry
>
> On Thu, 10 Aug 2006, Christophe Picard wrote:
> > I think the memory is indeed not preallocated.
> > Yes my problem is periodic, but if I try to use a periodic DA, the
> > multigrid solver complains about it (see the end of the message). I
> > believe the source of that problem is  DAGetInterpolation_3D_Q0().
> >
> > The problem I am trying to solve is a 3D Poisson equation with
> > Neuman/Robin/Periodic boundary conditions. The boundary conditions are
> > decided at runtime.
> >
> > If I use if-else-if statements to choose the DA here is the message
> >
> > Thanks,
> > Christophe
> >
> > [0]PETSC ERROR: --------------------- Error Message
> > ------------------------------------
> > [0]PETSC ERROR: Invalid argument!
> > [0]PETSC ERROR: Cannot handle periodic grid in x!
> > [0]PETSC ERROR:
> > ------------------------------------------------------------------------
> > [0]PETSC ERROR: Petsc Development Version 2.3.1, Patch 14, Thu Jul  6
> > 00:02:04 CDT 2006 HG revision: 97334a27165ab031dddd67964dd7a97955e75d20
> > [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> > [0]PETSC ERROR: See docs/index.html for manual pages.
> > [0]PETSC ERROR:
> > ------------------------------------------------------------------------
> > [0]PETSC ERROR: ././oceanus on a linux-gnu named tux194.llnl.gov by
> > picard1 Thu Aug 10 17:16:31 2006
> > [0]PETSC ERROR: Libraries linked from
> > /home/picard1/Tools/Petsc-Dev/lib/linux-gnu-c-real-debug
> > [0]PETSC ERROR: Configure run at Thu Jul  6 16:05:13 2006
> > [0]PETSC ERROR: Configure options --prefix=/home/picard1/Tools/Petsc-Dev
> > --with-dynamic --with-shared --with-mpi=0 --with-superlu=1
> > --download-superlu=ifneeded
> > [0]PETSC ERROR:
> > ------------------------------------------------------------------------
> > [0]PETSC ERROR: DAGetInterpolation_3D_Q0() line 498 in
> > src/dm/da/src/dainterp.c
> > [0]PETSC ERROR: DAGetInterpolation() line 874 in src/dm/da/src/dainterp.c
> > [0]PETSC ERROR: DMGetInterpolation() line 117 in src/dm/da/utils/dm.c
> > [0]PETSC ERROR: DMMGSetUp() line 215 in src/snes/utils/damg.c
> > [0]PETSC ERROR: DMMGSetDM() line 180 in src/snes/utils/damg.c
> > [0]PETSC ERROR: --------------------- Error Message
> > ------------------------------------
> > [0]PETSC ERROR: Null argument, when expecting valid pointer!
> > [0]PETSC ERROR: Null Object: Parameter # 1!
> > [0]PETSC ERROR:
> > ------------------------------------------------------------------------
> > [0]PETSC ERROR: Petsc Development Version 2.3.1, Patch 14, Thu Jul  6
> > 00:02:04 CDT 2006 HG revision: 97334a27165ab031dddd67964dd7a97955e75d20
> > [0]PETSC ERROR: See docs/changes/index.html for recent updates.
> > [0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.
> > [0]PETSC ERROR: See docs/index.html for manual pages.
> > [0]PETSC ERROR:
> > ------------------------------------------------------------------------
> > [0]PETSC ERROR: ././oceanus on a linux-gnu named tux194.llnl.gov by
> > picard1 Thu Aug 10 17:16:31 2006
> > [0]PETSC ERROR: Libraries linked from
> > /home/picard1/Tools/Petsc-Dev/lib/linux-gnu-c-real-debug
> > [0]PETSC ERROR: Configure run at Thu Jul  6 16:05:13 2006
> > [0]PETSC ERROR: Configure options --prefix=/home/picard1/Tools/Petsc-Dev
> > --with-dynamic --with-shared --with-mpi=0 --with-superlu=1
> > --download-superlu=ifneeded
> > [0]PETSC ERROR:
> > ------------------------------------------------------------------------
> > [0]PETSC ERROR: PetscObjectReference() line 106 in
> > src/sys/objects/inherit.c [0]PETSC ERROR: MGSetInterpolate() line 136 in
> > src/ksp/pc/impls/mg/mgfunc.c [0]PETSC ERROR: DMMGSetUpLevel() line 385 in
> > src/snes/utils/damg.c [0]PETSC ERROR: DMMGSetKSP() line 452 in
> > src/snes/utils/damg.c
> >
> > On Thursday 10 August 2006 04:53 pm, Matthew Knepley wrote:
> >> It sounds like you are inserting values which were not preallocating. To
> >> determine for sure, we would need to know more about the code. However,
> >> if you have a periodic problem, why not use a periodic DA?
> >>
> >>    Matt
> >>
> >> On 8/10/06, Christophe Picard <picard2 at llnl.gov> wrote:
> >>> Hi,
> >>>
> >>> I have a performance issue while trying to insert values in the a
> >>> matrix. I am using DMMG solver for cell-centered scheme in 3D from the
> >>> petsc-snapshot to solve a Poisson equations. Inserting coefficients in
> >>> the matrix for dirichlet or neumann boundary conditions, the insertion
> >>> is instantaneous. But is I want to insert coefficients for periodic
> >>> boundary conditions, I can notice a huge slow down in the insertion
> >>> process (not in the resolution though). The smallest size I can notice
> >>> the performance drop is 32*32*32.
> >>>
> >>> Is there any way to improve this?
> >>>
> >>>
> >>> col.i  = row.i+(mx-1);
> >>> col.j  = row.j;
> >>> col.k = row.k;
> >>>
> >>> v[0] =  1;
> >>>
> >>> MatSetValuesStencil(*A,1,&row,1,&col,v,ADD_VALUES);
> >>>
> >>>
> >>>
> >>> Thans,
> >>>
> >>> Christophe


From marsum2006 at yahoo.com  Fri Aug 11 10:07:22 2006
From: marsum2006 at yahoo.com (Margot Summer)
Date: Fri, 11 Aug 2006 08:07:22 -0700 (PDT)
Subject: nonzeros of a matrix
Message-ID: <20060811150722.49388.qmail@web57107.mail.re3.yahoo.com>

Hi,

Is there a simple way to get the number of nonzeros of
a matrix? Further, how to find out the information
about a matrix object in KSP/PC, e.g., the number of
nonzeros of the preconditioner, or the number of
nonzeros of the ilu(icc) factor? Thanks,

Margot 

__________________________________________________
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam protection around 
http://mail.yahoo.com 


From balay at mcs.anl.gov  Fri Aug 11 10:36:56 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Fri, 11 Aug 2006 10:36:56 -0500 (CDT)
Subject: nonzeros of a matrix
In-Reply-To: <20060811150722.49388.qmail@web57107.mail.re3.yahoo.com>
References: <20060811150722.49388.qmail@web57107.mail.re3.yahoo.com>
Message-ID: <Pine.LNX.4.64.0608111034300.25685@asterix>

You can run the code with -mat_view_info [to get the matrix info] and
-ksp_view - to get the info about the solvers [which include some
details about the ilu preconditioner]

Satish

On Fri, 11 Aug 2006, Margot Summer wrote:

> Hi,
> 
> Is there a simple way to get the number of nonzeros of
> a matrix? Further, how to find out the information
> about a matrix object in KSP/PC, e.g., the number of
> nonzeros of the preconditioner, or the number of
> nonzeros of the ilu(icc) factor? Thanks,
> 
> Margot 
> 
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around 
> http://mail.yahoo.com 
> 
> 


From marsum2006 at yahoo.com  Fri Aug 11 10:55:33 2006
From: marsum2006 at yahoo.com (Margot Summer)
Date: Fri, 11 Aug 2006 08:55:33 -0700 (PDT)
Subject: nonzeros of a matrix
In-Reply-To: <Pine.LNX.4.64.0608111034300.25685@asterix>
Message-ID: <20060811155533.49169.qmail@web57112.mail.re3.yahoo.com>

Thanks. But can we find out this info inside the code (like the way we get the number of iterations of ksp)?  Also, for many subpc's, e.g. using bjacobi, -ksp_view does not print out every block.  

Margot
 
Satish Balay <balay at mcs.anl.gov> wrote: You can run the code with -mat_view_info [to get the matrix info] and
-ksp_view - to get the info about the solvers [which include some
details about the ilu preconditioner]

Satish

On Fri, 11 Aug 2006, Margot Summer wrote:

> Hi,
> 
> Is there a simple way to get the number of nonzeros of
> a matrix? Further, how to find out the information
> about a matrix object in KSP/PC, e.g., the number of
> nonzeros of the preconditioner, or the number of
> nonzeros of the ilu(icc) factor? Thanks,
> 
> Margot 
> 
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around 
> http://mail.yahoo.com 
> 
> 


---------------------------------
Stay in the know. Pulse on the new Yahoo.com.  Check it out. 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060811/1443cf4a/attachment.htm>

From balay at mcs.anl.gov  Fri Aug 11 11:24:38 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Fri, 11 Aug 2006 11:24:38 -0500 (CDT)
Subject: nonzeros of a matrix
In-Reply-To: <20060811155533.49169.qmail@web57112.mail.re3.yahoo.com>
References: <20060811155533.49169.qmail@web57112.mail.re3.yahoo.com>
Message-ID: <Pine.LNX.4.64.0608111121200.25685@asterix>

There is a MatGetInfo() - which returns MatInfo object.  You might be
able to do PCGetFactoredMatrix() to get the factor and call
MatGetInfo() on it as well..

With bjacobi - you can call PCBJacobiGetSubKSP() to get all the solver
objects for each sub-block [and extract subpcs, and corresponding
factors etc..]

Satish


On Fri, 11 Aug 2006, Margot Summer wrote:

> Thanks. But can we find out this info inside the code (like the way we get the number of iterations of ksp)?  Also, for many subpc's, e.g. using bjacobi, -ksp_view does not print out every block.  
> 
> Margot
>  
> Satish Balay <balay at mcs.anl.gov> wrote: You can run the code with -mat_view_info [to get the matrix info] and
> -ksp_view - to get the info about the solvers [which include some
> details about the ilu preconditioner]
> 
> Satish
> 
> On Fri, 11 Aug 2006, Margot Summer wrote:
> 
> > Hi,
> > 
> > Is there a simple way to get the number of nonzeros of
> > a matrix? Further, how to find out the information
> > about a matrix object in KSP/PC, e.g., the number of
> > nonzeros of the preconditioner, or the number of
> > nonzeros of the ilu(icc) factor? Thanks,
> > 
> > Margot 
> > 
> > __________________________________________________
> > Do You Yahoo!?
> > Tired of spam?  Yahoo! Mail has the best spam protection around 
> > http://mail.yahoo.com 
> > 
> > 
> 
> 
> 
>  		
> ---------------------------------
> Stay in the know. Pulse on the new Yahoo.com.  Check it out. 


From marsum2006 at yahoo.com  Fri Aug 11 11:26:42 2006
From: marsum2006 at yahoo.com (Margot Summer)
Date: Fri, 11 Aug 2006 09:26:42 -0700 (PDT)
Subject: nonzeros of a matrix
In-Reply-To: <Pine.LNX.4.64.0608111121200.25685@asterix>
Message-ID: <20060811162642.34545.qmail@web57114.mail.re3.yahoo.com>

thanks!

Satish Balay <balay at mcs.anl.gov> wrote: There is a MatGetInfo() - which returns MatInfo object.  You might be
able to do PCGetFactoredMatrix() to get the factor and call
MatGetInfo() on it as well..

With bjacobi - you can call PCBJacobiGetSubKSP() to get all the solver
objects for each sub-block [and extract subpcs, and corresponding
factors etc..]

Satish


On Fri, 11 Aug 2006, Margot Summer wrote:

> Thanks. But can we find out this info inside the code (like the way we get the number of iterations of ksp)?  Also, for many subpc's, e.g. using bjacobi, -ksp_view does not print out every block.  
> 
> Margot
>  
> Satish Balay  wrote: You can run the code with -mat_view_info [to get the matrix info] and
> -ksp_view - to get the info about the solvers [which include some
> details about the ilu preconditioner]
> 
> Satish
> 
> On Fri, 11 Aug 2006, Margot Summer wrote:
> 
> > Hi,
> > 
> > Is there a simple way to get the number of nonzeros of
> > a matrix? Further, how to find out the information
> > about a matrix object in KSP/PC, e.g., the number of
> > nonzeros of the preconditioner, or the number of
> > nonzeros of the ilu(icc) factor? Thanks,
> > 
> > Margot 
> > 
> > __________________________________________________
> > Do You Yahoo!?
> > Tired of spam?  Yahoo! Mail has the best spam protection around 
> > http://mail.yahoo.com 
> > 
> > 
> 
> 
> 
>    
> ---------------------------------
> Stay in the know. Pulse on the new Yahoo.com.  Check it out. 


---------------------------------
Get your email and more, right on the  new Yahoo.com 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060811/95af4b05/attachment.htm>

From jiaxun_hou at yahoo.com.cn  Sun Aug 13 22:55:33 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Mon, 14 Aug 2006 11:55:33 +0800 (CST)
Subject: problem of "caused collective abort of all ranks"
Message-ID: <20060814035533.10775.qmail@web15801.mail.cnb.yahoo.com>

 
  Hi,
  I am sorry to bother you. I met this strange trouble yesterday, and I have tried lots of methods to solve it. But fail. My code likes this:
   
  static char help[] = "Solves a tridiagonal linear system with KSP.\n\n";
#include "petscksp.h"
#include "builderh.h"
  #undef __FUNCT__
#define __FUNCT__ "main"
int main(int argc,char **args){
    PetscInitialize(&argc,&args,(char *)0,help);
    Mat A;
    Vec x;
    PetscInt k=3,v=1,n_x=5,s_t=4,row;
    PetscInt i[2];
    PetscReal h_x,h_t;
    PetscScalar temp;
    MainMatPar *mmp;
    PetscErrorCode ierr;
   
    h_x = PETSC_PI/(PetscReal)(n_x+1);
      h_t = PETSC_PI*2/(PetscReal)(s_t);
      s_t++;
     
      ierr = VecCreate(PETSC_COMM_WORLD,&x);CHKERRQ(ierr);
      ierr = PetscObjectSetName((PetscObject) x, "Solution");CHKERRQ(ierr);
      ierr = VecSetSizes(x,PETSC_DECIDE,n_x*s_t);CHKERRQ(ierr); 
      ierr = VecSetFromOptions(x);CHKERRQ(ierr); 
      temp = h_x;
      for (i[0]=0;i[0]<n_x*s_t;i[0]++) {
          row=i[0];
          ierr = VecSetValues(x,1,&row,&temp,INSERT_VALUES);CHKERRQ(ierr);
          temp+=h_x;
      }
    ierr = VecAssemblyBegin(x);CHKERRQ(ierr);
    ierr = VecAssemblyEnd(x);CHKERRQ(ierr);
     
    ierr = VecSet(x,0);CHKERRQ(ierr);    
 
    ierr = VecView(x,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
   
    /**/
    ierr = CreateMainMatPar(RC_1,k,v,n_x,s_t,h_x,h_t,&mmp);CHKERRQ(ierr);
      ierr = BuildMainMatrix(&A,mmp); CHKERRQ(ierr);
      ierr = DestroyMainMatPar(mmp);CHKERRQ(ierr);
   
    //ierr = MatView(A,PETSC_VIEWER_STDOUT_WORLD);CHKERRQ(ierr);
    ierr = MatDestroy(A);CHKERRQ(ierr);
    ierr = VecDestroy(x);CHKERRQ(ierr);
    ierr = PetscFinalize();CHKERRQ(ierr);
     return 0;
}
  When I delete the line "ierr = VecSet(x,0);CHKERRQ(ierr);" , the problem occurs. I don't know why.
I attach the two files that were used in the above code.
  And my run it in one process.
   
  Regards,
Jiaxun


---------------------------------
 ??????-3.5G???20M??
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060814/cdffd105/attachment.htm>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: builderh.h
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060814/cdffd105/attachment.diff>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: builder.c
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060814/cdffd105/attachment-0001.diff>

From yaronkretchmer at gmail.com  Tue Aug 15 11:08:06 2006
From: yaronkretchmer at gmail.com (Yaron Kretchmer)
Date: Tue, 15 Aug 2006 09:08:06 -0700
Subject: programmmatic access to commandline variables
Message-ID: <c71e3b870608150908m64d75443u9f6fe742393a0b5e@mail.gmail.com>

Hi All
Is there a way of determining programatically which commandline options are
used/not used within petsc?
Alternatively, is there a file which contains all legal commandline options?
If so, what is it and what is the format?

Thanks
Yaron
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060815/467598b3/attachment.htm>

From balay at mcs.anl.gov  Tue Aug 15 11:14:08 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 15 Aug 2006 11:14:08 -0500 (CDT)
Subject: programmmatic access to commandline variables
In-Reply-To: <c71e3b870608150908m64d75443u9f6fe742393a0b5e@mail.gmail.com>
References: <c71e3b870608150908m64d75443u9f6fe742393a0b5e@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0608151111470.4381@asterix>

On Tue, 15 Aug 2006, Yaron Kretchmer wrote:

> Hi All
> Is there a way of determining programatically which commandline options are
> used/not used within petsc?

You can run the code with the additional option '-options_left'

or add the following line of code - after PetscInitialize()

  ierr = PetscOptionsSetValue("-options_left",PETSC_NULL);CHKERRQ(ierr);

> Alternatively, is there a file which contains all legal commandline options?
> If so, what is it and what is the format?

The options are distributed all over the code. The best way to get
them is to run the appliation code with '-help' option - and it prints
all the relavent options to that code run.

Satish


From yaronkretchmer at gmail.com  Tue Aug 15 11:32:07 2006
From: yaronkretchmer at gmail.com (Yaron Kretchmer)
Date: Tue, 15 Aug 2006 09:32:07 -0700
Subject: programmmatic access to commandline variables
In-Reply-To: <Pine.LNX.4.64.0608151111470.4381@asterix>
References: <c71e3b870608150908m64d75443u9f6fe742393a0b5e@mail.gmail.com>
	 <Pine.LNX.4.64.0608151111470.4381@asterix>
Message-ID: <c71e3b870608150932h347cba8dh4a9779965664e669@mail.gmail.com>

Hi Satish
This would print out the options that were not used.
What I'm looking for is a way of accessing them inside the program (as in
going through an array of unused options or something similar)

Thanks
Yaron


On 8/15/06, Satish Balay <balay at mcs.anl.gov> wrote:
>
> On Tue, 15 Aug 2006, Yaron Kretchmer wrote:
>
> > Hi All
> > Is there a way of determining programatically which commandline options
> are
> > used/not used within petsc?
>
> You can run the code with the additional option '-options_left'
>
> or add the following line of code - after PetscInitialize()
>
> ierr = PetscOptionsSetValue("-options_left",PETSC_NULL);CHKERRQ(ierr);
>
> > Alternatively, is there a file which contains all legal commandline
> options?
> > If so, what is it and what is the format?
>
> The options are distributed all over the code. The best way to get
> them is to run the appliation code with '-help' option - and it prints
> all the relavent options to that code run.
>
> Satish
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060815/0aed832d/attachment.htm>

From mafunk at nmsu.edu  Tue Aug 15 11:34:00 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 15 Aug 2006 10:34:00 -0600
Subject: profiling PETSc code
In-Reply-To: <200608021621.44171.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu> <a9f269830608021450p455cd8rac31cad95e4777a5@mail.gmail.com> <200608021621.44171.mafunk@nmsu.edu>
Message-ID: <200608151034.03619.mafunk@nmsu.edu>

Hi Matt,

sorry for the delay since the last email, but there were some other things i 
needed to do.

Anyway, I hope that maybe I can get some more help from you guys with respect 
to the loadimbalance problem i have. Here is the situtation:
I run my code on 2 procs. I profile my KSPSolve call and here is what i get:

...

--- Event Stage 2: Stage 2 of ChomboPetscInterface

VecDot             20000 1.0 1.9575e+01 2.0 2.39e+08 2.0 0.0e+00 0.0e+00 
2.0e+04  2  8  0  0 56   7  8  0  0 56   245
VecNorm            16000 1.0 4.0559e+01 3.2 1.53e+08 3.2 0.0e+00 0.0e+00 
1.6e+04  3  7  0  0 44  13  7  0  0 44    95
VecCopy             4000 1.0 1.5148e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecSet             16000 1.0 3.1937e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecAXPY            16000 1.0 8.2395e+00 1.4 3.22e+08 1.4 0.0e+00 0.0e+00 
0.0e+00  1  7  0  0  0   3  7  0  0  0   465
VecAYPX             8000 1.0 3.6370e+00 1.3 3.46e+08 1.3 0.0e+00 0.0e+00 
0.0e+00  0  3  0  0  0   2  3  0  0  0   527
VecScatterBegin    12000 1.0 5.7721e-01 1.1 0.00e+00 0.0 2.4e+04 2.1e+04 
0.0e+00  0  0100100  0   0  0100100  0     0
VecScatterEnd      12000 1.0 1.4059e+01 9.2 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  1  0  0  0  0   4  0  0  0  0     0
MatMult            12000 1.0 5.0567e+01 1.0 1.82e+08 1.0 2.4e+04 2.1e+04 
0.0e+00  5 32100100  0  25 32100100  0   361
MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00 0.0e+00 
0.0e+00 10 43  0  0  0  47 43  0  0  0   214
MatLUFactorNum         1 1.0 1.9693e-02 1.0 5.79e+07 1.1 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0   110
MatILUFactorSym        1 1.0 7.8840e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 
1.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetOrdering         1 1.0 1.2250e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 
2.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSetup               1 1.0 1.1921e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 
0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve            4000 1.0 2.0419e+02 1.0 1.39e+08 1.0 2.4e+04 2.1e+04 
3.6e+04 21100100100100 100100100100100   278
PCSetUp                1 1.0 2.8828e-02 1.1 3.99e+07 1.1 0.0e+00 0.0e+00 
3.0e+00  0  0  0  0  0   0  0  0  0  0    75
PCSetUpOnBlocks     4000 1.0 3.5605e-02 1.1 3.35e+07 1.1 0.0e+00 0.0e+00 
3.0e+00  0  0  0  0  0   0  0  0  0  0    61
PCApply            16000 1.0 1.1661e+02 1.4 1.46e+08 1.4 0.0e+00 0.0e+00 
0.0e+00 10 43  0  0  0  49 43  0  0  0   207
------------------------------------------------------------------------------------------------------------------------

...


Some things to note are the following: 
I allocate my vector as: 
 VecCreateMPI(PETSC_COMM_WORLD, //communicator
	       a_totallocal_numPoints[a_thisproc], //local points on this proc
	       a_totalglobal_numPoints, //total number of global points
	       &m_globalRHSVector); //the vector to be created

where the vector a_totallocal_numPoints is : 
a_totallocal_numPoints: 59904 59904 

The matrix is allocated as: 
  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, //communicator
			   a_totallocal_numPoints[a_thisproc], //total number of local rows (that 
is rows residing on this proc
			   a_totallocal_numPoints[a_thisproc], //total number of columns 
corresponding to local part of parallel vector
			   a_totalglobal_numPoints, //number of global rows
			   a_totalglobal_numPoints, //number of global columns
			   PETSC_NULL,
			   a_NumberOfNZPointsInDiagonalMatrix,
			   PETSC_NULL,
			   a_NumberOfNZPointsInOffDiagonalMatrix,
			   &m_globalMatrix);

With the info option i checked and there is no extra mallocs at all.
My problems setup is symmetric so it seems that everything is set up so that 
it should be essentially perfectly balanced. However, the numbers given above 
certainly do not reflect that.

However, the in all other parts of my code (except the PETSc call), i get the 
expected, almost perfect loadbalance. 

Is there anything that i am overlooking? Any help is greatly appreciated.

thanks
mat


On Wednesday 02 August 2006 16:21, Matt Funk wrote:
> Hi Matt,
>
> It could be a bad load imbalance because i don't let PETSc decide. I need
> to fix that anyway, so i think i'll try that first and then let you know.
> Thanks though for the quick response and helping me to interpret those
> numbers ...
>
>
> mat
>
> On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
> > On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > Hi Matt,
> > >
> > > thanks for all the help so far. The -info option is really very
> > > helpful. So i think i straightened the actual errors out. However, now
> > > i am back to the original question i had. That is why it takes so much
> > > longer on 4 procs than on 1 proc.
> >
> > So you have a 1.5 load imbalance for MatMult(), which probably cascades
> > to give the 133! load imbalance for VecDot(). You probably have either:
> >
> >   1) VERY bad laod imbalance
> >
> >   2) a screwed up network
> >
> >   3) bad contention on the network (loaded cluster)
> >
> > Can you help us narrow this down?
> >
> >
> >    Matt
> >
> > > I profiled the KSPSolve(...) as stage 2:
> > >
> > > For 1 proc i have:
> > > --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > >
> > > VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> > > VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00
> > > 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> > > VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> > > MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> > > MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> > > KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00
> > > 0.0e+00 1.2e+04  7100  0  0 84  97100  0  0100    45
> > > PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
> > >
> > >
> > > for 4 procs i have :
> > > --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > >
> > > VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00
> > > 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> > > VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00
> > > 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> > > VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00
> > > 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> > > VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> > > VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> > > MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00
> > > 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> > > MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00
> > > 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> > > MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00
> > > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> > > MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00
> > > 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00
> > > 0.0e+00 2.8e+04 84100  0  0 34 100100  0  0100     1
> > > PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00
> > > 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> > > PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00
> > > 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00
> > > 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> > > -----------------------------------------------------------------------
> > >-- -----------------------------------------------
> > >
> > > Now if i understand it right, all these calls summarize all calls
> > > between the pop and push commands. That would mean that the majority of
> > > the time is spend in the MatMult and in within that the VecScatterBegin
> > > and VecScatterEnd commands (if i understand it right).
> > >
> > > My problem size is really small. So i was wondering if the problem lies
> > > in that (namely that the major time is simply spend communicating
> > > between processors, or whether there is still something wrong with how
> > > i wrote the code?)
> > >
> > >
> > > thanks
> > > mat
> > >
> > > On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> > > > On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > > > Actually the errors occur on my calls to a PETSc functions after
> > > > > calling PETSCInitialize.
> > > >
> > > > Yes, it is the error I pointed out in the last message.
> > > >
> > > >    Matt
> > > >
> > > > > mat


From balay at mcs.anl.gov  Tue Aug 15 11:40:06 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Tue, 15 Aug 2006 11:40:06 -0500 (CDT)
Subject: programmmatic access to commandline variables
In-Reply-To: <c71e3b870608150932h347cba8dh4a9779965664e669@mail.gmail.com>
References: <c71e3b870608150908m64d75443u9f6fe742393a0b5e@mail.gmail.com> 
 <Pine.LNX.4.64.0608151111470.4381@asterix> <c71e3b870608150932h347cba8dh4a9779965664e669@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0608151136290.4381@asterix>

You might want to check the code of PetscOptionsLeft(),
PetscOptionsAllUsed() and reimplement the desired functionality.

The above routines are in src/sys/objects/options.c

Note - you'll have to do the check only after the options get used -
this means ksp options can be checked only after kspsolve etc.. [so
doing this just before PetscFinalize() would capture all options
usage]

Satish

On Tue, 15 Aug 2006, Yaron Kretchmer wrote:

> Hi Satish
> This would print out the options that were not used.
> What I'm looking for is a way of accessing them inside the program (as in
> going through an array of unused options or something similar)
> 
> Thanks
> Yaron
> 
> 
> 
> On 8/15/06, Satish Balay <balay at mcs.anl.gov> wrote:
> > 
> > On Tue, 15 Aug 2006, Yaron Kretchmer wrote:
> > 
> > > Hi All
> > > Is there a way of determining programatically which commandline options
> > are
> > > used/not used within petsc?
> > 
> > You can run the code with the additional option '-options_left'
> > 
> > or add the following line of code - after PetscInitialize()
> > 
> > ierr = PetscOptionsSetValue("-options_left",PETSC_NULL);CHKERRQ(ierr);
> > 
> > > Alternatively, is there a file which contains all legal commandline
> > options?
> > > If so, what is it and what is the format?
> > 
> > The options are distributed all over the code. The best way to get
> > them is to run the appliation code with '-help' option - and it prints
> > all the relavent options to that code run.
> > 
> > Satish
> > 
> > 
> 


From bsmith at mcs.anl.gov  Tue Aug 15 12:52:02 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 15 Aug 2006 12:52:02 -0500 (CDT)
Subject: profiling PETSc code
In-Reply-To: <200608151034.03619.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu>
 <a9f269830608021450p455cd8rac31cad95e4777a5@mail.gmail.com>
 <200608021621.44171.mafunk@nmsu.edu> <200608151034.03619.mafunk@nmsu.edu>
Message-ID: <Pine.OSX.4.64.0608151245060.401@barrys-computer.local>


> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00 0.0e+00
                                          ^^^^^
                                        balance

   Hmmm, I would guess that the matrix entries are not so well balanced?
One process takes 1.4 times as long for the triangular solves as the other
so either one matrix has many more entries or one processor is slower then
the other.

    Barry

On Tue, 15 Aug 2006, Matt Funk wrote:

> Hi Matt,
>
> sorry for the delay since the last email, but there were some other things i
> needed to do.
>
> Anyway, I hope that maybe I can get some more help from you guys with respect
> to the loadimbalance problem i have. Here is the situtation:
> I run my code on 2 procs. I profile my KSPSolve call and here is what i get:
>
> ...
>
> --- Event Stage 2: Stage 2 of ChomboPetscInterface
>
> VecDot             20000 1.0 1.9575e+01 2.0 2.39e+08 2.0 0.0e+00 0.0e+00
> 2.0e+04  2  8  0  0 56   7  8  0  0 56   245
> VecNorm            16000 1.0 4.0559e+01 3.2 1.53e+08 3.2 0.0e+00 0.0e+00
> 1.6e+04  3  7  0  0 44  13  7  0  0 44    95
> VecCopy             4000 1.0 1.5148e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecSet             16000 1.0 3.1937e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> VecAXPY            16000 1.0 8.2395e+00 1.4 3.22e+08 1.4 0.0e+00 0.0e+00
> 0.0e+00  1  7  0  0  0   3  7  0  0  0   465
> VecAYPX             8000 1.0 3.6370e+00 1.3 3.46e+08 1.3 0.0e+00 0.0e+00
> 0.0e+00  0  3  0  0  0   2  3  0  0  0   527
> VecScatterBegin    12000 1.0 5.7721e-01 1.1 0.00e+00 0.0 2.4e+04 2.1e+04
> 0.0e+00  0  0100100  0   0  0100100  0     0
> VecScatterEnd      12000 1.0 1.4059e+01 9.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  1  0  0  0  0   4  0  0  0  0     0
> MatMult            12000 1.0 5.0567e+01 1.0 1.82e+08 1.0 2.4e+04 2.1e+04
> 0.0e+00  5 32100100  0  25 32100100  0   361
> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00 0.0e+00
> 0.0e+00 10 43  0  0  0  47 43  0  0  0   214
> MatLUFactorNum         1 1.0 1.9693e-02 1.0 5.79e+07 1.1 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0   110
> MatILUFactorSym        1 1.0 7.8840e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> MatGetOrdering         1 1.0 1.2250e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSetup               1 1.0 1.1921e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> KSPSolve            4000 1.0 2.0419e+02 1.0 1.39e+08 1.0 2.4e+04 2.1e+04
> 3.6e+04 21100100100100 100100100100100   278
> PCSetUp                1 1.0 2.8828e-02 1.1 3.99e+07 1.1 0.0e+00 0.0e+00
> 3.0e+00  0  0  0  0  0   0  0  0  0  0    75
> PCSetUpOnBlocks     4000 1.0 3.5605e-02 1.1 3.35e+07 1.1 0.0e+00 0.0e+00
> 3.0e+00  0  0  0  0  0   0  0  0  0  0    61
> PCApply            16000 1.0 1.1661e+02 1.4 1.46e+08 1.4 0.0e+00 0.0e+00
> 0.0e+00 10 43  0  0  0  49 43  0  0  0   207
> ------------------------------------------------------------------------------------------------------------------------
>
> ...
>
>
> Some things to note are the following:
> I allocate my vector as:
> VecCreateMPI(PETSC_COMM_WORLD, //communicator
> 	       a_totallocal_numPoints[a_thisproc], //local points on this proc
> 	       a_totalglobal_numPoints, //total number of global points
> 	       &m_globalRHSVector); //the vector to be created
>
> where the vector a_totallocal_numPoints is :
> a_totallocal_numPoints: 59904 59904
>
> The matrix is allocated as:
>  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, //communicator
> 			   a_totallocal_numPoints[a_thisproc], //total number of local rows (that
> is rows residing on this proc
> 			   a_totallocal_numPoints[a_thisproc], //total number of columns
> corresponding to local part of parallel vector
> 			   a_totalglobal_numPoints, //number of global rows
> 			   a_totalglobal_numPoints, //number of global columns
> 			   PETSC_NULL,
> 			   a_NumberOfNZPointsInDiagonalMatrix,
> 			   PETSC_NULL,
> 			   a_NumberOfNZPointsInOffDiagonalMatrix,
> 			   &m_globalMatrix);
>
> With the info option i checked and there is no extra mallocs at all.
> My problems setup is symmetric so it seems that everything is set up so that
> it should be essentially perfectly balanced. However, the numbers given above
> certainly do not reflect that.
>
> However, the in all other parts of my code (except the PETSc call), i get the
> expected, almost perfect loadbalance.
>
> Is there anything that i am overlooking? Any help is greatly appreciated.
>
> thanks
> mat
>
>
>
> On Wednesday 02 August 2006 16:21, Matt Funk wrote:
>> Hi Matt,
>>
>> It could be a bad load imbalance because i don't let PETSc decide. I need
>> to fix that anyway, so i think i'll try that first and then let you know.
>> Thanks though for the quick response and helping me to interpret those
>> numbers ...
>>
>>
>> mat
>>
>> On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
>>> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
>>>> Hi Matt,
>>>>
>>>> thanks for all the help so far. The -info option is really very
>>>> helpful. So i think i straightened the actual errors out. However, now
>>>> i am back to the original question i had. That is why it takes so much
>>>> longer on 4 procs than on 1 proc.
>>>
>>> So you have a 1.5 load imbalance for MatMult(), which probably cascades
>>> to give the 133! load imbalance for VecDot(). You probably have either:
>>>
>>>   1) VERY bad laod imbalance
>>>
>>>   2) a screwed up network
>>>
>>>   3) bad contention on the network (loaded cluster)
>>>
>>> Can you help us narrow this down?
>>>
>>>
>>>    Matt
>>>
>>>> I profiled the KSPSolve(...) as stage 2:
>>>>
>>>> For 1 proc i have:
>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
>>>>
>>>> VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00
>>>> 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
>>>> VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00
>>>> 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
>>>> VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00
>>>> 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
>>>> MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00
>>>> 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
>>>> MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00
>>>> 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
>>>> KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00
>>>> 0.0e+00 1.2e+04  7100  0  0 84  97100  0  0100    45
>>>> PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00
>>>> 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
>>>>
>>>>
>>>> for 4 procs i have :
>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
>>>>
>>>> VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00
>>>> 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
>>>> VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00
>>>> 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
>>>> VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00
>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>> VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00
>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
>>>> VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00
>>>> 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
>>>> VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00
>>>> 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
>>>> MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00
>>>> 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
>>>> MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00
>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
>>>> MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00
>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
>>>> MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00
>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>> MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00
>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>> KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00
>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>> KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00
>>>> 0.0e+00 2.8e+04 84100  0  0 34 100100  0  0100     1
>>>> PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00
>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
>>>> PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00
>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>> PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00
>>>> 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
>>>> -----------------------------------------------------------------------
>>>> -- -----------------------------------------------
>>>>
>>>> Now if i understand it right, all these calls summarize all calls
>>>> between the pop and push commands. That would mean that the majority of
>>>> the time is spend in the MatMult and in within that the VecScatterBegin
>>>> and VecScatterEnd commands (if i understand it right).
>>>>
>>>> My problem size is really small. So i was wondering if the problem lies
>>>> in that (namely that the major time is simply spend communicating
>>>> between processors, or whether there is still something wrong with how
>>>> i wrote the code?)
>>>>
>>>>
>>>> thanks
>>>> mat
>>>>
>>>> On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
>>>>> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
>>>>>> Actually the errors occur on my calls to a PETSc functions after
>>>>>> calling PETSCInitialize.
>>>>>
>>>>> Yes, it is the error I pointed out in the last message.
>>>>>
>>>>>    Matt
>>>>>
>>>>>> mat
>
>


From mafunk at nmsu.edu  Tue Aug 15 13:39:31 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 15 Aug 2006 12:39:31 -0600
Subject: profiling PETSc code
In-Reply-To: <Pine.OSX.4.64.0608151245060.401@barrys-computer.local>
References: <200608011545.26224.mafunk@nmsu.edu> <200608151034.03619.mafunk@nmsu.edu> <Pine.OSX.4.64.0608151245060.401@barrys-computer.local>
Message-ID: <200608151239.32375.mafunk@nmsu.edu>

On Tuesday 15 August 2006 11:52, Barry Smith wrote:
> > MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00 0.0e+00
>
>                                           ^^^^^
>                                         balance
>
>    Hmmm, I would guess that the matrix entries are not so well balanced?
> One process takes 1.4 times as long for the triangular solves as the other
> so either one matrix has many more entries or one processor is slower then
> the other.
>
>     Barry

Well it would seem that way at first, but i don't know how that could be since 
i allocate an exactly equal amount of points on both processor (see previous 
email).
Further i used the -mat_view_info option. Here is what it gives me:

...
Matrix Object:
  type=mpiaij, rows=119808, cols=119808
[1] PetscCommDuplicateUsing internal PETSc communicator 91 168
[1] PetscCommDuplicateUsing internal PETSc communicator 91 168
...

...
Matrix Object:
  type=seqaij, rows=59904, cols=59904
  total: nonzeros=407400, allocated nonzeros=407400
    not using I-node routines
Matrix Object:
  type=seqaij, rows=59904, cols=59904
  total: nonzeros=407400, allocated nonzeros=407400
    not using I-node routines
...

So to me it look s well split up. Is there anything else that somebody can 
think of. The machine i am running on is all same processors.

By the way, i am not sure if the '[1] PetscCommDuplicateUsing internal PETSc 
communicator 91 168' is something i need to worry about??

mat


>
> On Tue, 15 Aug 2006, Matt Funk wrote:
> > Hi Matt,
> >
> > sorry for the delay since the last email, but there were some other
> > things i needed to do.
> >
> > Anyway, I hope that maybe I can get some more help from you guys with
> > respect to the loadimbalance problem i have. Here is the situtation:
> > I run my code on 2 procs. I profile my KSPSolve call and here is what i
> > get:
> >
> > ...
> >
> > --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >
> > VecDot             20000 1.0 1.9575e+01 2.0 2.39e+08 2.0 0.0e+00 0.0e+00
> > 2.0e+04  2  8  0  0 56   7  8  0  0 56   245
> > VecNorm            16000 1.0 4.0559e+01 3.2 1.53e+08 3.2 0.0e+00 0.0e+00
> > 1.6e+04  3  7  0  0 44  13  7  0  0 44    95
> > VecCopy             4000 1.0 1.5148e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> > VecSet             16000 1.0 3.1937e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> > VecAXPY            16000 1.0 8.2395e+00 1.4 3.22e+08 1.4 0.0e+00 0.0e+00
> > 0.0e+00  1  7  0  0  0   3  7  0  0  0   465
> > VecAYPX             8000 1.0 3.6370e+00 1.3 3.46e+08 1.3 0.0e+00 0.0e+00
> > 0.0e+00  0  3  0  0  0   2  3  0  0  0   527
> > VecScatterBegin    12000 1.0 5.7721e-01 1.1 0.00e+00 0.0 2.4e+04 2.1e+04
> > 0.0e+00  0  0100100  0   0  0100100  0     0
> > VecScatterEnd      12000 1.0 1.4059e+01 9.2 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  1  0  0  0  0   4  0  0  0  0     0
> > MatMult            12000 1.0 5.0567e+01 1.0 1.82e+08 1.0 2.4e+04 2.1e+04
> > 0.0e+00  5 32100100  0  25 32100100  0   361
> > MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00 0.0e+00
> > 0.0e+00 10 43  0  0  0  47 43  0  0  0   214
> > MatLUFactorNum         1 1.0 1.9693e-02 1.0 5.79e+07 1.1 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   0  0  0  0  0   110
> > MatILUFactorSym        1 1.0 7.8840e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> > 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > MatGetOrdering         1 1.0 1.2250e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
> > 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > KSPSetup               1 1.0 1.1921e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
> > 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > KSPSolve            4000 1.0 2.0419e+02 1.0 1.39e+08 1.0 2.4e+04 2.1e+04
> > 3.6e+04 21100100100100 100100100100100   278
> > PCSetUp                1 1.0 2.8828e-02 1.1 3.99e+07 1.1 0.0e+00 0.0e+00
> > 3.0e+00  0  0  0  0  0   0  0  0  0  0    75
> > PCSetUpOnBlocks     4000 1.0 3.5605e-02 1.1 3.35e+07 1.1 0.0e+00 0.0e+00
> > 3.0e+00  0  0  0  0  0   0  0  0  0  0    61
> > PCApply            16000 1.0 1.1661e+02 1.4 1.46e+08 1.4 0.0e+00 0.0e+00
> > 0.0e+00 10 43  0  0  0  49 43  0  0  0   207
> > -------------------------------------------------------------------------
> >-----------------------------------------------
> >
> > ...
> >
> >
> > Some things to note are the following:
> > I allocate my vector as:
> > VecCreateMPI(PETSC_COMM_WORLD, //communicator
> > 	       a_totallocal_numPoints[a_thisproc], //local points on this proc
> > 	       a_totalglobal_numPoints, //total number of global points
> > 	       &m_globalRHSVector); //the vector to be created
> >
> > where the vector a_totallocal_numPoints is :
> > a_totallocal_numPoints: 59904 59904
> >
> > The matrix is allocated as:
> >  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, //communicator
> > 			   a_totallocal_numPoints[a_thisproc], //total number of local rows
> > (that is rows residing on this proc
> > 			   a_totallocal_numPoints[a_thisproc], //total number of columns
> > corresponding to local part of parallel vector
> > 			   a_totalglobal_numPoints, //number of global rows
> > 			   a_totalglobal_numPoints, //number of global columns
> > 			   PETSC_NULL,
> > 			   a_NumberOfNZPointsInDiagonalMatrix,
> > 			   PETSC_NULL,
> > 			   a_NumberOfNZPointsInOffDiagonalMatrix,
> > 			   &m_globalMatrix);
> >
> > With the info option i checked and there is no extra mallocs at all.
> > My problems setup is symmetric so it seems that everything is set up so
> > that it should be essentially perfectly balanced. However, the numbers
> > given above certainly do not reflect that.
> >
> > However, the in all other parts of my code (except the PETSc call), i get
> > the expected, almost perfect loadbalance.
> >
> > Is there anything that i am overlooking? Any help is greatly appreciated.
> >
> > thanks
> > mat
> >
> > On Wednesday 02 August 2006 16:21, Matt Funk wrote:
> >> Hi Matt,
> >>
> >> It could be a bad load imbalance because i don't let PETSc decide. I
> >> need to fix that anyway, so i think i'll try that first and then let you
> >> know. Thanks though for the quick response and helping me to interpret
> >> those numbers ...
> >>
> >>
> >> mat
> >>
> >> On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
> >>> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> >>>> Hi Matt,
> >>>>
> >>>> thanks for all the help so far. The -info option is really very
> >>>> helpful. So i think i straightened the actual errors out. However, now
> >>>> i am back to the original question i had. That is why it takes so much
> >>>> longer on 4 procs than on 1 proc.
> >>>
> >>> So you have a 1.5 load imbalance for MatMult(), which probably cascades
> >>> to give the 133! load imbalance for VecDot(). You probably have either:
> >>>
> >>>   1) VERY bad laod imbalance
> >>>
> >>>   2) a screwed up network
> >>>
> >>>   3) bad contention on the network (loaded cluster)
> >>>
> >>> Can you help us narrow this down?
> >>>
> >>>
> >>>    Matt
> >>>
> >>>> I profiled the KSPSolve(...) as stage 2:
> >>>>
> >>>> For 1 proc i have:
> >>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >>>>
> >>>> VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00
> >>>> 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> >>>> VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00
> >>>> 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> >>>> VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00
> >>>> 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> >>>> MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00
> >>>> 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> >>>> MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00
> >>>> 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> >>>> KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00
> >>>> 0.0e+00 1.2e+04  7100  0  0 84  97100  0  0100    45
> >>>> PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00
> >>>> 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
> >>>>
> >>>>
> >>>> for 4 procs i have :
> >>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >>>>
> >>>> VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00
> >>>> 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> >>>> VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00
> >>>> 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> >>>> VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00
> >>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>> VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00
> >>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> >>>> VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00
> >>>> 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> >>>> VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00
> >>>> 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> >>>> MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00
> >>>> 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> >>>> MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00
> >>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> >>>> MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00
> >>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> >>>> MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00
> >>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>> MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00
> >>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>> KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00
> >>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>> KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00
> >>>> 0.0e+00 2.8e+04 84100  0  0 34 100100  0  0100     1
> >>>> PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00
> >>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> >>>> PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00
> >>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>> PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00
> >>>> 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> >>>> ----------------------------------------------------------------------
> >>>>- -- -----------------------------------------------
> >>>>
> >>>> Now if i understand it right, all these calls summarize all calls
> >>>> between the pop and push commands. That would mean that the majority
> >>>> of the time is spend in the MatMult and in within that the
> >>>> VecScatterBegin and VecScatterEnd commands (if i understand it right).
> >>>>
> >>>> My problem size is really small. So i was wondering if the problem
> >>>> lies in that (namely that the major time is simply spend communicating
> >>>> between processors, or whether there is still something wrong with how
> >>>> i wrote the code?)
> >>>>
> >>>>
> >>>> thanks
> >>>> mat
> >>>>
> >>>> On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> >>>>> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> >>>>>> Actually the errors occur on my calls to a PETSc functions after
> >>>>>> calling PETSCInitialize.
> >>>>>
> >>>>> Yes, it is the error I pointed out in the last message.
> >>>>>
> >>>>>    Matt
> >>>>>
> >>>>>> mat


From bsmith at mcs.anl.gov  Tue Aug 15 14:56:34 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 15 Aug 2006 14:56:34 -0500 (CDT)
Subject: profiling PETSc code
In-Reply-To: <200608151239.32375.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu> <200608151034.03619.mafunk@nmsu.edu>
 <Pine.OSX.4.64.0608151245060.401@barrys-computer.local> <200608151239.32375.mafunk@nmsu.edu>
Message-ID: <Pine.OSX.4.64.0608151455390.401@barrys-computer.local>


   Please send the entire -info output as an attachment to me. (Not
in the email) I'll study it in more detail.

    Barry

On Tue, 15 Aug 2006, Matt Funk wrote:

> On Tuesday 15 August 2006 11:52, Barry Smith wrote:
>>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00 0.0e+00
>>
>>                                           ^^^^^
>>                                         balance
>>
>>    Hmmm, I would guess that the matrix entries are not so well balanced?
>> One process takes 1.4 times as long for the triangular solves as the other
>> so either one matrix has many more entries or one processor is slower then
>> the other.
>>
>>     Barry
>
> Well it would seem that way at first, but i don't know how that could be since
> i allocate an exactly equal amount of points on both processor (see previous
> email).
> Further i used the -mat_view_info option. Here is what it gives me:
>
> ...
> Matrix Object:
>  type=mpiaij, rows=119808, cols=119808
> [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> ...
>
> ...
> Matrix Object:
>  type=seqaij, rows=59904, cols=59904
>  total: nonzeros=407400, allocated nonzeros=407400
>    not using I-node routines
> Matrix Object:
>  type=seqaij, rows=59904, cols=59904
>  total: nonzeros=407400, allocated nonzeros=407400
>    not using I-node routines
> ...
>
> So to me it look s well split up. Is there anything else that somebody can
> think of. The machine i am running on is all same processors.
>
> By the way, i am not sure if the '[1] PetscCommDuplicateUsing internal PETSc
> communicator 91 168' is something i need to worry about??
>
> mat
>
>
>
>
>
>
>
>
>>
>> On Tue, 15 Aug 2006, Matt Funk wrote:
>>> Hi Matt,
>>>
>>> sorry for the delay since the last email, but there were some other
>>> things i needed to do.
>>>
>>> Anyway, I hope that maybe I can get some more help from you guys with
>>> respect to the loadimbalance problem i have. Here is the situtation:
>>> I run my code on 2 procs. I profile my KSPSolve call and here is what i
>>> get:
>>>
>>> ...
>>>
>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
>>>
>>> VecDot             20000 1.0 1.9575e+01 2.0 2.39e+08 2.0 0.0e+00 0.0e+00
>>> 2.0e+04  2  8  0  0 56   7  8  0  0 56   245
>>> VecNorm            16000 1.0 4.0559e+01 3.2 1.53e+08 3.2 0.0e+00 0.0e+00
>>> 1.6e+04  3  7  0  0 44  13  7  0  0 44    95
>>> VecCopy             4000 1.0 1.5148e+00 1.4 0.00e+00 0.0 0.0e+00 0.0e+00
>>> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
>>> VecSet             16000 1.0 3.1937e+00 1.8 0.00e+00 0.0 0.0e+00 0.0e+00
>>> 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
>>> VecAXPY            16000 1.0 8.2395e+00 1.4 3.22e+08 1.4 0.0e+00 0.0e+00
>>> 0.0e+00  1  7  0  0  0   3  7  0  0  0   465
>>> VecAYPX             8000 1.0 3.6370e+00 1.3 3.46e+08 1.3 0.0e+00 0.0e+00
>>> 0.0e+00  0  3  0  0  0   2  3  0  0  0   527
>>> VecScatterBegin    12000 1.0 5.7721e-01 1.1 0.00e+00 0.0 2.4e+04 2.1e+04
>>> 0.0e+00  0  0100100  0   0  0100100  0     0
>>> VecScatterEnd      12000 1.0 1.4059e+01 9.2 0.00e+00 0.0 0.0e+00 0.0e+00
>>> 0.0e+00  1  0  0  0  0   4  0  0  0  0     0
>>> MatMult            12000 1.0 5.0567e+01 1.0 1.82e+08 1.0 2.4e+04 2.1e+04
>>> 0.0e+00  5 32100100  0  25 32100100  0   361
>>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00 0.0e+00
>>> 0.0e+00 10 43  0  0  0  47 43  0  0  0   214
>>> MatLUFactorNum         1 1.0 1.9693e-02 1.0 5.79e+07 1.1 0.0e+00 0.0e+00
>>> 0.0e+00  0  0  0  0  0   0  0  0  0  0   110
>>> MatILUFactorSym        1 1.0 7.8840e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
>>> 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> MatGetOrdering         1 1.0 1.2250e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00
>>> 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> KSPSetup               1 1.0 1.1921e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00
>>> 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>> KSPSolve            4000 1.0 2.0419e+02 1.0 1.39e+08 1.0 2.4e+04 2.1e+04
>>> 3.6e+04 21100100100100 100100100100100   278
>>> PCSetUp                1 1.0 2.8828e-02 1.1 3.99e+07 1.1 0.0e+00 0.0e+00
>>> 3.0e+00  0  0  0  0  0   0  0  0  0  0    75
>>> PCSetUpOnBlocks     4000 1.0 3.5605e-02 1.1 3.35e+07 1.1 0.0e+00 0.0e+00
>>> 3.0e+00  0  0  0  0  0   0  0  0  0  0    61
>>> PCApply            16000 1.0 1.1661e+02 1.4 1.46e+08 1.4 0.0e+00 0.0e+00
>>> 0.0e+00 10 43  0  0  0  49 43  0  0  0   207
>>> -------------------------------------------------------------------------
>>> -----------------------------------------------
>>>
>>> ...
>>>
>>>
>>> Some things to note are the following:
>>> I allocate my vector as:
>>> VecCreateMPI(PETSC_COMM_WORLD, //communicator
>>> 	       a_totallocal_numPoints[a_thisproc], //local points on this proc
>>> 	       a_totalglobal_numPoints, //total number of global points
>>> 	       &m_globalRHSVector); //the vector to be created
>>>
>>> where the vector a_totallocal_numPoints is :
>>> a_totallocal_numPoints: 59904 59904
>>>
>>> The matrix is allocated as:
>>>  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, //communicator
>>> 			   a_totallocal_numPoints[a_thisproc], //total number of local rows
>>> (that is rows residing on this proc
>>> 			   a_totallocal_numPoints[a_thisproc], //total number of columns
>>> corresponding to local part of parallel vector
>>> 			   a_totalglobal_numPoints, //number of global rows
>>> 			   a_totalglobal_numPoints, //number of global columns
>>> 			   PETSC_NULL,
>>> 			   a_NumberOfNZPointsInDiagonalMatrix,
>>> 			   PETSC_NULL,
>>> 			   a_NumberOfNZPointsInOffDiagonalMatrix,
>>> 			   &m_globalMatrix);
>>>
>>> With the info option i checked and there is no extra mallocs at all.
>>> My problems setup is symmetric so it seems that everything is set up so
>>> that it should be essentially perfectly balanced. However, the numbers
>>> given above certainly do not reflect that.
>>>
>>> However, the in all other parts of my code (except the PETSc call), i get
>>> the expected, almost perfect loadbalance.
>>>
>>> Is there anything that i am overlooking? Any help is greatly appreciated.
>>>
>>> thanks
>>> mat
>>>
>>> On Wednesday 02 August 2006 16:21, Matt Funk wrote:
>>>> Hi Matt,
>>>>
>>>> It could be a bad load imbalance because i don't let PETSc decide. I
>>>> need to fix that anyway, so i think i'll try that first and then let you
>>>> know. Thanks though for the quick response and helping me to interpret
>>>> those numbers ...
>>>>
>>>>
>>>> mat
>>>>
>>>> On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
>>>>> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
>>>>>> Hi Matt,
>>>>>>
>>>>>> thanks for all the help so far. The -info option is really very
>>>>>> helpful. So i think i straightened the actual errors out. However, now
>>>>>> i am back to the original question i had. That is why it takes so much
>>>>>> longer on 4 procs than on 1 proc.
>>>>>
>>>>> So you have a 1.5 load imbalance for MatMult(), which probably cascades
>>>>> to give the 133! load imbalance for VecDot(). You probably have either:
>>>>>
>>>>>   1) VERY bad laod imbalance
>>>>>
>>>>>   2) a screwed up network
>>>>>
>>>>>   3) bad contention on the network (loaded cluster)
>>>>>
>>>>> Can you help us narrow this down?
>>>>>
>>>>>
>>>>>    Matt
>>>>>
>>>>>> I profiled the KSPSolve(...) as stage 2:
>>>>>>
>>>>>> For 1 proc i have:
>>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
>>>>>>
>>>>>> VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
>>>>>> VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00
>>>>>> 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
>>>>>> VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
>>>>>> MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
>>>>>> MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
>>>>>> KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00
>>>>>> 0.0e+00 1.2e+04  7100  0  0 84  97100  0  0100    45
>>>>>> PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
>>>>>>
>>>>>>
>>>>>> for 4 procs i have :
>>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
>>>>>>
>>>>>> VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00
>>>>>> 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
>>>>>> VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00
>>>>>> 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
>>>>>> VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>>>> VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00
>>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
>>>>>> VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
>>>>>> VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
>>>>>> MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00
>>>>>> 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
>>>>>> MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
>>>>>> MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00
>>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
>>>>>> MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00
>>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>>>> MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00
>>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>>>> KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00
>>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>>>> KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00
>>>>>> 0.0e+00 2.8e+04 84100  0  0 34 100100  0  0100     1
>>>>>> PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00
>>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
>>>>>> PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00
>>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
>>>>>> PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00
>>>>>> 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
>>>>>> ----------------------------------------------------------------------
>>>>>> - -- -----------------------------------------------
>>>>>>
>>>>>> Now if i understand it right, all these calls summarize all calls
>>>>>> between the pop and push commands. That would mean that the majority
>>>>>> of the time is spend in the MatMult and in within that the
>>>>>> VecScatterBegin and VecScatterEnd commands (if i understand it right).
>>>>>>
>>>>>> My problem size is really small. So i was wondering if the problem
>>>>>> lies in that (namely that the major time is simply spend communicating
>>>>>> between processors, or whether there is still something wrong with how
>>>>>> i wrote the code?)
>>>>>>
>>>>>>
>>>>>> thanks
>>>>>> mat
>>>>>>
>>>>>> On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
>>>>>>> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
>>>>>>>> Actually the errors occur on my calls to a PETSc functions after
>>>>>>>> calling PETSCInitialize.
>>>>>>>
>>>>>>> Yes, it is the error I pointed out in the last message.
>>>>>>>
>>>>>>>    Matt
>>>>>>>
>>>>>>>> mat
>
>


From mafunk at nmsu.edu  Tue Aug 15 15:35:52 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 15 Aug 2006 14:35:52 -0600
Subject: profiling PETSc code
In-Reply-To: <Pine.OSX.4.64.0608151455390.401@barrys-computer.local>
References: <200608011545.26224.mafunk@nmsu.edu> <200608151239.32375.mafunk@nmsu.edu> <Pine.OSX.4.64.0608151455390.401@barrys-computer.local>
Message-ID: <200608151435.55054.mafunk@nmsu.edu>

Do you want me to use the debug version or the optimized version of PETSc?

mat

On Tuesday 15 August 2006 13:56, Barry Smith wrote:
>    Please send the entire -info output as an attachment to me. (Not
> in the email) I'll study it in more detail.
>
>     Barry
>
> On Tue, 15 Aug 2006, Matt Funk wrote:
> > On Tuesday 15 August 2006 11:52, Barry Smith wrote:
> >>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00
> >>> 0.0e+00
> >>
> >>                                           ^^^^^
> >>                                         balance
> >>
> >>    Hmmm, I would guess that the matrix entries are not so well balanced?
> >> One process takes 1.4 times as long for the triangular solves as the
> >> other so either one matrix has many more entries or one processor is
> >> slower then the other.
> >>
> >>     Barry
> >
> > Well it would seem that way at first, but i don't know how that could be
> > since i allocate an exactly equal amount of points on both processor (see
> > previous email).
> > Further i used the -mat_view_info option. Here is what it gives me:
> >
> > ...
> > Matrix Object:
> >  type=mpiaij, rows=119808, cols=119808
> > [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> > [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> > ...
> >
> > ...
> > Matrix Object:
> >  type=seqaij, rows=59904, cols=59904
> >  total: nonzeros=407400, allocated nonzeros=407400
> >    not using I-node routines
> > Matrix Object:
> >  type=seqaij, rows=59904, cols=59904
> >  total: nonzeros=407400, allocated nonzeros=407400
> >    not using I-node routines
> > ...
> >
> > So to me it look s well split up. Is there anything else that somebody
> > can think of. The machine i am running on is all same processors.
> >
> > By the way, i am not sure if the '[1] PetscCommDuplicateUsing internal
> > PETSc communicator 91 168' is something i need to worry about??
> >
> > mat
> >
> >> On Tue, 15 Aug 2006, Matt Funk wrote:
> >>> Hi Matt,
> >>>
> >>> sorry for the delay since the last email, but there were some other
> >>> things i needed to do.
> >>>
> >>> Anyway, I hope that maybe I can get some more help from you guys with
> >>> respect to the loadimbalance problem i have. Here is the situtation:
> >>> I run my code on 2 procs. I profile my KSPSolve call and here is what i
> >>> get:
> >>>
> >>> ...
> >>>
> >>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >>>
> >>> VecDot             20000 1.0 1.9575e+01 2.0 2.39e+08 2.0 0.0e+00
> >>> 0.0e+00 2.0e+04  2  8  0  0 56   7  8  0  0 56   245
> >>> VecNorm            16000 1.0 4.0559e+01 3.2 1.53e+08 3.2 0.0e+00
> >>> 0.0e+00 1.6e+04  3  7  0  0 44  13  7  0  0 44    95
> >>> VecCopy             4000 1.0 1.5148e+00 1.4 0.00e+00 0.0 0.0e+00
> >>> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> >>> VecSet             16000 1.0 3.1937e+00 1.8 0.00e+00 0.0 0.0e+00
> >>> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> >>> VecAXPY            16000 1.0 8.2395e+00 1.4 3.22e+08 1.4 0.0e+00
> >>> 0.0e+00 0.0e+00  1  7  0  0  0   3  7  0  0  0   465
> >>> VecAYPX             8000 1.0 3.6370e+00 1.3 3.46e+08 1.3 0.0e+00
> >>> 0.0e+00 0.0e+00  0  3  0  0  0   2  3  0  0  0   527
> >>> VecScatterBegin    12000 1.0 5.7721e-01 1.1 0.00e+00 0.0 2.4e+04
> >>> 2.1e+04 0.0e+00  0  0100100  0   0  0100100  0     0
> >>> VecScatterEnd      12000 1.0 1.4059e+01 9.2 0.00e+00 0.0 0.0e+00
> >>> 0.0e+00 0.0e+00  1  0  0  0  0   4  0  0  0  0     0
> >>> MatMult            12000 1.0 5.0567e+01 1.0 1.82e+08 1.0 2.4e+04
> >>> 2.1e+04 0.0e+00  5 32100100  0  25 32100100  0   361
> >>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00
> >>> 0.0e+00 0.0e+00 10 43  0  0  0  47 43  0  0  0   214
> >>> MatLUFactorNum         1 1.0 1.9693e-02 1.0 5.79e+07 1.1 0.0e+00
> >>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   110
> >>> MatILUFactorSym        1 1.0 7.8840e-03 1.1 0.00e+00 0.0 0.0e+00
> >>> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>> MatGetOrdering         1 1.0 1.2250e-03 1.1 0.00e+00 0.0 0.0e+00
> >>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>> KSPSetup               1 1.0 1.1921e-06 1.2 0.00e+00 0.0 0.0e+00
> >>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>> KSPSolve            4000 1.0 2.0419e+02 1.0 1.39e+08 1.0 2.4e+04
> >>> 2.1e+04 3.6e+04 21100100100100 100100100100100   278
> >>> PCSetUp                1 1.0 2.8828e-02 1.1 3.99e+07 1.1 0.0e+00
> >>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0    75
> >>> PCSetUpOnBlocks     4000 1.0 3.5605e-02 1.1 3.35e+07 1.1 0.0e+00
> >>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0    61
> >>> PCApply            16000 1.0 1.1661e+02 1.4 1.46e+08 1.4 0.0e+00
> >>> 0.0e+00 0.0e+00 10 43  0  0  0  49 43  0  0  0   207
> >>> -----------------------------------------------------------------------
> >>>-- -----------------------------------------------
> >>>
> >>> ...
> >>>
> >>>
> >>> Some things to note are the following:
> >>> I allocate my vector as:
> >>> VecCreateMPI(PETSC_COMM_WORLD, //communicator
> >>> 	       a_totallocal_numPoints[a_thisproc], //local points on this proc
> >>> 	       a_totalglobal_numPoints, //total number of global points
> >>> 	       &m_globalRHSVector); //the vector to be created
> >>>
> >>> where the vector a_totallocal_numPoints is :
> >>> a_totallocal_numPoints: 59904 59904
> >>>
> >>> The matrix is allocated as:
> >>>  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, //communicator
> >>> 			   a_totallocal_numPoints[a_thisproc], //total number of local rows
> >>> (that is rows residing on this proc
> >>> 			   a_totallocal_numPoints[a_thisproc], //total number of columns
> >>> corresponding to local part of parallel vector
> >>> 			   a_totalglobal_numPoints, //number of global rows
> >>> 			   a_totalglobal_numPoints, //number of global columns
> >>> 			   PETSC_NULL,
> >>> 			   a_NumberOfNZPointsInDiagonalMatrix,
> >>> 			   PETSC_NULL,
> >>> 			   a_NumberOfNZPointsInOffDiagonalMatrix,
> >>> 			   &m_globalMatrix);
> >>>
> >>> With the info option i checked and there is no extra mallocs at all.
> >>> My problems setup is symmetric so it seems that everything is set up so
> >>> that it should be essentially perfectly balanced. However, the numbers
> >>> given above certainly do not reflect that.
> >>>
> >>> However, the in all other parts of my code (except the PETSc call), i
> >>> get the expected, almost perfect loadbalance.
> >>>
> >>> Is there anything that i am overlooking? Any help is greatly
> >>> appreciated.
> >>>
> >>> thanks
> >>> mat
> >>>
> >>> On Wednesday 02 August 2006 16:21, Matt Funk wrote:
> >>>> Hi Matt,
> >>>>
> >>>> It could be a bad load imbalance because i don't let PETSc decide. I
> >>>> need to fix that anyway, so i think i'll try that first and then let
> >>>> you know. Thanks though for the quick response and helping me to
> >>>> interpret those numbers ...
> >>>>
> >>>>
> >>>> mat
> >>>>
> >>>> On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
> >>>>> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> >>>>>> Hi Matt,
> >>>>>>
> >>>>>> thanks for all the help so far. The -info option is really very
> >>>>>> helpful. So i think i straightened the actual errors out. However,
> >>>>>> now i am back to the original question i had. That is why it takes
> >>>>>> so much longer on 4 procs than on 1 proc.
> >>>>>
> >>>>> So you have a 1.5 load imbalance for MatMult(), which probably
> >>>>> cascades to give the 133! load imbalance for VecDot(). You probably
> >>>>> have either:
> >>>>>
> >>>>>   1) VERY bad laod imbalance
> >>>>>
> >>>>>   2) a screwed up network
> >>>>>
> >>>>>   3) bad contention on the network (loaded cluster)
> >>>>>
> >>>>> Can you help us narrow this down?
> >>>>>
> >>>>>
> >>>>>    Matt
> >>>>>
> >>>>>> I profiled the KSPSolve(...) as stage 2:
> >>>>>>
> >>>>>> For 1 proc i have:
> >>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >>>>>>
> >>>>>> VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> >>>>>> VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00
> >>>>>> 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> >>>>>> VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> >>>>>> MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> >>>>>> MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> >>>>>> KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00
> >>>>>> 0.0e+00 1.2e+04  7100  0  0 84  97100  0  0100    45
> >>>>>> PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
> >>>>>>
> >>>>>>
> >>>>>> for 4 procs i have :
> >>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> >>>>>>
> >>>>>> VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00
> >>>>>> 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> >>>>>> VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00
> >>>>>> 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> >>>>>> VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>>>> VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> >>>>>> VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> >>>>>> VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> >>>>>> MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00
> >>>>>> 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> >>>>>> MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> >>>>>> MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> >>>>>> MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00
> >>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>>>> MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00
> >>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>>>> KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00
> >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>>>> KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00
> >>>>>> 0.0e+00 2.8e+04 84100  0  0 34 100100  0  0100     1
> >>>>>> PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00
> >>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> >>>>>> PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00
> >>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> >>>>>> PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00
> >>>>>> 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> >>>>>> --------------------------------------------------------------------
> >>>>>>-- - -- -----------------------------------------------
> >>>>>>
> >>>>>> Now if i understand it right, all these calls summarize all calls
> >>>>>> between the pop and push commands. That would mean that the majority
> >>>>>> of the time is spend in the MatMult and in within that the
> >>>>>> VecScatterBegin and VecScatterEnd commands (if i understand it
> >>>>>> right).
> >>>>>>
> >>>>>> My problem size is really small. So i was wondering if the problem
> >>>>>> lies in that (namely that the major time is simply spend
> >>>>>> communicating between processors, or whether there is still
> >>>>>> something wrong with how i wrote the code?)
> >>>>>>
> >>>>>>
> >>>>>> thanks
> >>>>>> mat
> >>>>>>
> >>>>>> On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> >>>>>>> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> >>>>>>>> Actually the errors occur on my calls to a PETSc functions after
> >>>>>>>> calling PETSCInitialize.
> >>>>>>>
> >>>>>>> Yes, it is the error I pointed out in the last message.
> >>>>>>>
> >>>>>>>    Matt
> >>>>>>>
> >>>>>>>> mat


From knepley at gmail.com  Tue Aug 15 15:44:04 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 15 Aug 2006 15:44:04 -0500
Subject: profiling PETSc code
In-Reply-To: <200608151435.55054.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu>
	 <200608151239.32375.mafunk@nmsu.edu>
	 <Pine.OSX.4.64.0608151455390.401@barrys-computer.local>
	 <200608151435.55054.mafunk@nmsu.edu>
Message-ID: <a9f269830608151344rbfa13fbgdf499b17f641efbf@mail.gmail.com>

I don't think it matters initially since the problem is BIG imbalances.

  Matt

On 8/15/06, Matt Funk <mafunk at nmsu.edu> wrote:
> Do you want me to use the debug version or the optimized version of PETSc?
>
> mat
>
> On Tuesday 15 August 2006 13:56, Barry Smith wrote:
> >    Please send the entire -info output as an attachment to me. (Not
> > in the email) I'll study it in more detail.
> >
> >     Barry
> >
> > On Tue, 15 Aug 2006, Matt Funk wrote:
> > > On Tuesday 15 August 2006 11:52, Barry Smith wrote:
> > >>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00
> > >>> 0.0e+00
> > >>
> > >>                                           ^^^^^
> > >>                                         balance
> > >>
> > >>    Hmmm, I would guess that the matrix entries are not so well balanced?
> > >> One process takes 1.4 times as long for the triangular solves as the
> > >> other so either one matrix has many more entries or one processor is
> > >> slower then the other.
> > >>
> > >>     Barry
> > >
> > > Well it would seem that way at first, but i don't know how that could be
> > > since i allocate an exactly equal amount of points on both processor (see
> > > previous email).
> > > Further i used the -mat_view_info option. Here is what it gives me:
> > >
> > > ...
> > > Matrix Object:
> > >  type=mpiaij, rows=119808, cols=119808
> > > [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> > > [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> > > ...
> > >
> > > ...
> > > Matrix Object:
> > >  type=seqaij, rows=59904, cols=59904
> > >  total: nonzeros=407400, allocated nonzeros=407400
> > >    not using I-node routines
> > > Matrix Object:
> > >  type=seqaij, rows=59904, cols=59904
> > >  total: nonzeros=407400, allocated nonzeros=407400
> > >    not using I-node routines
> > > ...
> > >
> > > So to me it look s well split up. Is there anything else that somebody
> > > can think of. The machine i am running on is all same processors.
> > >
> > > By the way, i am not sure if the '[1] PetscCommDuplicateUsing internal
> > > PETSc communicator 91 168' is something i need to worry about??
> > >
> > > mat
> > >
> > >> On Tue, 15 Aug 2006, Matt Funk wrote:
> > >>> Hi Matt,
> > >>>
> > >>> sorry for the delay since the last email, but there were some other
> > >>> things i needed to do.
> > >>>
> > >>> Anyway, I hope that maybe I can get some more help from you guys with
> > >>> respect to the loadimbalance problem i have. Here is the situtation:
> > >>> I run my code on 2 procs. I profile my KSPSolve call and here is what i
> > >>> get:
> > >>>
> > >>> ...
> > >>>
> > >>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > >>>
> > >>> VecDot             20000 1.0 1.9575e+01 2.0 2.39e+08 2.0 0.0e+00
> > >>> 0.0e+00 2.0e+04  2  8  0  0 56   7  8  0  0 56   245
> > >>> VecNorm            16000 1.0 4.0559e+01 3.2 1.53e+08 3.2 0.0e+00
> > >>> 0.0e+00 1.6e+04  3  7  0  0 44  13  7  0  0 44    95
> > >>> VecCopy             4000 1.0 1.5148e+00 1.4 0.00e+00 0.0 0.0e+00
> > >>> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> > >>> VecSet             16000 1.0 3.1937e+00 1.8 0.00e+00 0.0 0.0e+00
> > >>> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> > >>> VecAXPY            16000 1.0 8.2395e+00 1.4 3.22e+08 1.4 0.0e+00
> > >>> 0.0e+00 0.0e+00  1  7  0  0  0   3  7  0  0  0   465
> > >>> VecAYPX             8000 1.0 3.6370e+00 1.3 3.46e+08 1.3 0.0e+00
> > >>> 0.0e+00 0.0e+00  0  3  0  0  0   2  3  0  0  0   527
> > >>> VecScatterBegin    12000 1.0 5.7721e-01 1.1 0.00e+00 0.0 2.4e+04
> > >>> 2.1e+04 0.0e+00  0  0100100  0   0  0100100  0     0
> > >>> VecScatterEnd      12000 1.0 1.4059e+01 9.2 0.00e+00 0.0 0.0e+00
> > >>> 0.0e+00 0.0e+00  1  0  0  0  0   4  0  0  0  0     0
> > >>> MatMult            12000 1.0 5.0567e+01 1.0 1.82e+08 1.0 2.4e+04
> > >>> 2.1e+04 0.0e+00  5 32100100  0  25 32100100  0   361
> > >>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00
> > >>> 0.0e+00 0.0e+00 10 43  0  0  0  47 43  0  0  0   214
> > >>> MatLUFactorNum         1 1.0 1.9693e-02 1.0 5.79e+07 1.1 0.0e+00
> > >>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   110
> > >>> MatILUFactorSym        1 1.0 7.8840e-03 1.1 0.00e+00 0.0 0.0e+00
> > >>> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > >>> MatGetOrdering         1 1.0 1.2250e-03 1.1 0.00e+00 0.0 0.0e+00
> > >>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > >>> KSPSetup               1 1.0 1.1921e-06 1.2 0.00e+00 0.0 0.0e+00
> > >>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > >>> KSPSolve            4000 1.0 2.0419e+02 1.0 1.39e+08 1.0 2.4e+04
> > >>> 2.1e+04 3.6e+04 21100100100100 100100100100100   278
> > >>> PCSetUp                1 1.0 2.8828e-02 1.1 3.99e+07 1.1 0.0e+00
> > >>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0    75
> > >>> PCSetUpOnBlocks     4000 1.0 3.5605e-02 1.1 3.35e+07 1.1 0.0e+00
> > >>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0    61
> > >>> PCApply            16000 1.0 1.1661e+02 1.4 1.46e+08 1.4 0.0e+00
> > >>> 0.0e+00 0.0e+00 10 43  0  0  0  49 43  0  0  0   207
> > >>> -----------------------------------------------------------------------
> > >>>-- -----------------------------------------------
> > >>>
> > >>> ...
> > >>>
> > >>>
> > >>> Some things to note are the following:
> > >>> I allocate my vector as:
> > >>> VecCreateMPI(PETSC_COMM_WORLD, //communicator
> > >>>          a_totallocal_numPoints[a_thisproc], //local points on this proc
> > >>>          a_totalglobal_numPoints, //total number of global points
> > >>>          &m_globalRHSVector); //the vector to be created
> > >>>
> > >>> where the vector a_totallocal_numPoints is :
> > >>> a_totallocal_numPoints: 59904 59904
> > >>>
> > >>> The matrix is allocated as:
> > >>>  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, //communicator
> > >>>                      a_totallocal_numPoints[a_thisproc], //total number of local rows
> > >>> (that is rows residing on this proc
> > >>>                      a_totallocal_numPoints[a_thisproc], //total number of columns
> > >>> corresponding to local part of parallel vector
> > >>>                      a_totalglobal_numPoints, //number of global rows
> > >>>                      a_totalglobal_numPoints, //number of global columns
> > >>>                      PETSC_NULL,
> > >>>                      a_NumberOfNZPointsInDiagonalMatrix,
> > >>>                      PETSC_NULL,
> > >>>                      a_NumberOfNZPointsInOffDiagonalMatrix,
> > >>>                      &m_globalMatrix);
> > >>>
> > >>> With the info option i checked and there is no extra mallocs at all.
> > >>> My problems setup is symmetric so it seems that everything is set up so
> > >>> that it should be essentially perfectly balanced. However, the numbers
> > >>> given above certainly do not reflect that.
> > >>>
> > >>> However, the in all other parts of my code (except the PETSc call), i
> > >>> get the expected, almost perfect loadbalance.
> > >>>
> > >>> Is there anything that i am overlooking? Any help is greatly
> > >>> appreciated.
> > >>>
> > >>> thanks
> > >>> mat
> > >>>
> > >>> On Wednesday 02 August 2006 16:21, Matt Funk wrote:
> > >>>> Hi Matt,
> > >>>>
> > >>>> It could be a bad load imbalance because i don't let PETSc decide. I
> > >>>> need to fix that anyway, so i think i'll try that first and then let
> > >>>> you know. Thanks though for the quick response and helping me to
> > >>>> interpret those numbers ...
> > >>>>
> > >>>>
> > >>>> mat
> > >>>>
> > >>>> On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
> > >>>>> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > >>>>>> Hi Matt,
> > >>>>>>
> > >>>>>> thanks for all the help so far. The -info option is really very
> > >>>>>> helpful. So i think i straightened the actual errors out. However,
> > >>>>>> now i am back to the original question i had. That is why it takes
> > >>>>>> so much longer on 4 procs than on 1 proc.
> > >>>>>
> > >>>>> So you have a 1.5 load imbalance for MatMult(), which probably
> > >>>>> cascades to give the 133! load imbalance for VecDot(). You probably
> > >>>>> have either:
> > >>>>>
> > >>>>>   1) VERY bad laod imbalance
> > >>>>>
> > >>>>>   2) a screwed up network
> > >>>>>
> > >>>>>   3) bad contention on the network (loaded cluster)
> > >>>>>
> > >>>>> Can you help us narrow this down?
> > >>>>>
> > >>>>>
> > >>>>>    Matt
> > >>>>>
> > >>>>>> I profiled the KSPSolve(...) as stage 2:
> > >>>>>>
> > >>>>>> For 1 proc i have:
> > >>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > >>>>>>
> > >>>>>> VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> > >>>>>> VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00
> > >>>>>> 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> > >>>>>> VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> > >>>>>> MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> > >>>>>> MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> > >>>>>> KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00
> > >>>>>> 0.0e+00 1.2e+04  7100  0  0 84  97100  0  0100    45
> > >>>>>> PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
> > >>>>>>
> > >>>>>>
> > >>>>>> for 4 procs i have :
> > >>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > >>>>>>
> > >>>>>> VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7 0.0e+00
> > >>>>>> 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> > >>>>>> VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00
> > >>>>>> 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> > >>>>>> VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > >>>>>> VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> > >>>>>> VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> > >>>>>> VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> > >>>>>> MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> > >>>>>> MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> > >>>>>> MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> > >>>>>> MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00
> > >>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > >>>>>> MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00
> > >>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > >>>>>> KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00
> > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > >>>>>> KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00
> > >>>>>> 0.0e+00 2.8e+04 84100  0  0 34 100100  0  0100     1
> > >>>>>> PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00
> > >>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> > >>>>>> PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00
> > >>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > >>>>>> PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00
> > >>>>>> 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> > >>>>>> --------------------------------------------------------------------
> > >>>>>>-- - -- -----------------------------------------------
> > >>>>>>
> > >>>>>> Now if i understand it right, all these calls summarize all calls
> > >>>>>> between the pop and push commands. That would mean that the majority
> > >>>>>> of the time is spend in the MatMult and in within that the
> > >>>>>> VecScatterBegin and VecScatterEnd commands (if i understand it
> > >>>>>> right).
> > >>>>>>
> > >>>>>> My problem size is really small. So i was wondering if the problem
> > >>>>>> lies in that (namely that the major time is simply spend
> > >>>>>> communicating between processors, or whether there is still
> > >>>>>> something wrong with how i wrote the code?)
> > >>>>>>
> > >>>>>>
> > >>>>>> thanks
> > >>>>>> mat
> > >>>>>>
> > >>>>>> On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> > >>>>>>> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > >>>>>>>> Actually the errors occur on my calls to a PETSc functions after
> > >>>>>>>> calling PETSCInitialize.
> > >>>>>>>
> > >>>>>>> Yes, it is the error I pointed out in the last message.
> > >>>>>>>
> > >>>>>>>    Matt
> > >>>>>>>
> > >>>>>>>> mat
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From mafunk at nmsu.edu  Tue Aug 15 16:51:52 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 15 Aug 2006 15:51:52 -0600
Subject: profiling PETSc code
In-Reply-To: <a9f269830608151344rbfa13fbgdf499b17f641efbf@mail.gmail.com>
References: <200608011545.26224.mafunk@nmsu.edu> <200608151435.55054.mafunk@nmsu.edu> <a9f269830608151344rbfa13fbgdf499b17f641efbf@mail.gmail.com>
Message-ID: <200608151551.53794.mafunk@nmsu.edu>

Is there a limit to how big an attachment can be?
The file is 1.3Mb big. I tried to send it twice and none of the emails went 
through. I also send it directly to Barry and Matthews email. I hope that got 
though?

mat


On Tuesday 15 August 2006 14:44, Matthew Knepley wrote:
> I don't think it matters initially since the problem is BIG imbalances.
>
>   Matt
>
> On 8/15/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > Do you want me to use the debug version or the optimized version of
> > PETSc?
> >
> > mat
> >
> > On Tuesday 15 August 2006 13:56, Barry Smith wrote:
> > >    Please send the entire -info output as an attachment to me. (Not
> > > in the email) I'll study it in more detail.
> > >
> > >     Barry
> > >
> > > On Tue, 15 Aug 2006, Matt Funk wrote:
> > > > On Tuesday 15 August 2006 11:52, Barry Smith wrote:
> > > >>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00
> > > >>> 0.0e+00
> > > >>
> > > >>                                           ^^^^^
> > > >>                                         balance
> > > >>
> > > >>    Hmmm, I would guess that the matrix entries are not so well
> > > >> balanced? One process takes 1.4 times as long for the triangular
> > > >> solves as the other so either one matrix has many more entries or
> > > >> one processor is slower then the other.
> > > >>
> > > >>     Barry
> > > >
> > > > Well it would seem that way at first, but i don't know how that could
> > > > be since i allocate an exactly equal amount of points on both
> > > > processor (see previous email).
> > > > Further i used the -mat_view_info option. Here is what it gives me:
> > > >
> > > > ...
> > > > Matrix Object:
> > > >  type=mpiaij, rows=119808, cols=119808
> > > > [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> > > > [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> > > > ...
> > > >
> > > > ...
> > > > Matrix Object:
> > > >  type=seqaij, rows=59904, cols=59904
> > > >  total: nonzeros=407400, allocated nonzeros=407400
> > > >    not using I-node routines
> > > > Matrix Object:
> > > >  type=seqaij, rows=59904, cols=59904
> > > >  total: nonzeros=407400, allocated nonzeros=407400
> > > >    not using I-node routines
> > > > ...
> > > >
> > > > So to me it look s well split up. Is there anything else that
> > > > somebody can think of. The machine i am running on is all same
> > > > processors.
> > > >
> > > > By the way, i am not sure if the '[1] PetscCommDuplicateUsing
> > > > internal PETSc communicator 91 168' is something i need to worry
> > > > about??
> > > >
> > > > mat
> > > >
> > > >> On Tue, 15 Aug 2006, Matt Funk wrote:
> > > >>> Hi Matt,
> > > >>>
> > > >>> sorry for the delay since the last email, but there were some other
> > > >>> things i needed to do.
> > > >>>
> > > >>> Anyway, I hope that maybe I can get some more help from you guys
> > > >>> with respect to the loadimbalance problem i have. Here is the
> > > >>> situtation: I run my code on 2 procs. I profile my KSPSolve call
> > > >>> and here is what i get:
> > > >>>
> > > >>> ...
> > > >>>
> > > >>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > > >>>
> > > >>> VecDot             20000 1.0 1.9575e+01 2.0 2.39e+08 2.0 0.0e+00
> > > >>> 0.0e+00 2.0e+04  2  8  0  0 56   7  8  0  0 56   245
> > > >>> VecNorm            16000 1.0 4.0559e+01 3.2 1.53e+08 3.2 0.0e+00
> > > >>> 0.0e+00 1.6e+04  3  7  0  0 44  13  7  0  0 44    95
> > > >>> VecCopy             4000 1.0 1.5148e+00 1.4 0.00e+00 0.0 0.0e+00
> > > >>> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> > > >>> VecSet             16000 1.0 3.1937e+00 1.8 0.00e+00 0.0 0.0e+00
> > > >>> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> > > >>> VecAXPY            16000 1.0 8.2395e+00 1.4 3.22e+08 1.4 0.0e+00
> > > >>> 0.0e+00 0.0e+00  1  7  0  0  0   3  7  0  0  0   465
> > > >>> VecAYPX             8000 1.0 3.6370e+00 1.3 3.46e+08 1.3 0.0e+00
> > > >>> 0.0e+00 0.0e+00  0  3  0  0  0   2  3  0  0  0   527
> > > >>> VecScatterBegin    12000 1.0 5.7721e-01 1.1 0.00e+00 0.0 2.4e+04
> > > >>> 2.1e+04 0.0e+00  0  0100100  0   0  0100100  0     0
> > > >>> VecScatterEnd      12000 1.0 1.4059e+01 9.2 0.00e+00 0.0 0.0e+00
> > > >>> 0.0e+00 0.0e+00  1  0  0  0  0   4  0  0  0  0     0
> > > >>> MatMult            12000 1.0 5.0567e+01 1.0 1.82e+08 1.0 2.4e+04
> > > >>> 2.1e+04 0.0e+00  5 32100100  0  25 32100100  0   361
> > > >>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00
> > > >>> 0.0e+00 0.0e+00 10 43  0  0  0  47 43  0  0  0   214
> > > >>> MatLUFactorNum         1 1.0 1.9693e-02 1.0 5.79e+07 1.1 0.0e+00
> > > >>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   110
> > > >>> MatILUFactorSym        1 1.0 7.8840e-03 1.1 0.00e+00 0.0 0.0e+00
> > > >>> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > >>> MatGetOrdering         1 1.0 1.2250e-03 1.1 0.00e+00 0.0 0.0e+00
> > > >>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > >>> KSPSetup               1 1.0 1.1921e-06 1.2 0.00e+00 0.0 0.0e+00
> > > >>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > >>> KSPSolve            4000 1.0 2.0419e+02 1.0 1.39e+08 1.0 2.4e+04
> > > >>> 2.1e+04 3.6e+04 21100100100100 100100100100100   278
> > > >>> PCSetUp                1 1.0 2.8828e-02 1.1 3.99e+07 1.1 0.0e+00
> > > >>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0    75
> > > >>> PCSetUpOnBlocks     4000 1.0 3.5605e-02 1.1 3.35e+07 1.1 0.0e+00
> > > >>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0    61
> > > >>> PCApply            16000 1.0 1.1661e+02 1.4 1.46e+08 1.4 0.0e+00
> > > >>> 0.0e+00 0.0e+00 10 43  0  0  0  49 43  0  0  0   207
> > > >>> -------------------------------------------------------------------
> > > >>>---- -- -----------------------------------------------
> > > >>>
> > > >>> ...
> > > >>>
> > > >>>
> > > >>> Some things to note are the following:
> > > >>> I allocate my vector as:
> > > >>> VecCreateMPI(PETSC_COMM_WORLD, //communicator
> > > >>>          a_totallocal_numPoints[a_thisproc], //local points on this
> > > >>> proc a_totalglobal_numPoints, //total number of global points
> > > >>> &m_globalRHSVector); //the vector to be created
> > > >>>
> > > >>> where the vector a_totallocal_numPoints is :
> > > >>> a_totallocal_numPoints: 59904 59904
> > > >>>
> > > >>> The matrix is allocated as:
> > > >>>  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, //communicator
> > > >>>                      a_totallocal_numPoints[a_thisproc], //total
> > > >>> number of local rows (that is rows residing on this proc
> > > >>>                      a_totallocal_numPoints[a_thisproc], //total
> > > >>> number of columns corresponding to local part of parallel vector
> > > >>>                      a_totalglobal_numPoints, //number of global
> > > >>> rows a_totalglobal_numPoints, //number of global columns
> > > >>> PETSC_NULL,
> > > >>>                      a_NumberOfNZPointsInDiagonalMatrix,
> > > >>>                      PETSC_NULL,
> > > >>>                      a_NumberOfNZPointsInOffDiagonalMatrix,
> > > >>>                      &m_globalMatrix);
> > > >>>
> > > >>> With the info option i checked and there is no extra mallocs at
> > > >>> all. My problems setup is symmetric so it seems that everything is
> > > >>> set up so that it should be essentially perfectly balanced.
> > > >>> However, the numbers given above certainly do not reflect that.
> > > >>>
> > > >>> However, the in all other parts of my code (except the PETSc call),
> > > >>> i get the expected, almost perfect loadbalance.
> > > >>>
> > > >>> Is there anything that i am overlooking? Any help is greatly
> > > >>> appreciated.
> > > >>>
> > > >>> thanks
> > > >>> mat
> > > >>>
> > > >>> On Wednesday 02 August 2006 16:21, Matt Funk wrote:
> > > >>>> Hi Matt,
> > > >>>>
> > > >>>> It could be a bad load imbalance because i don't let PETSc decide.
> > > >>>> I need to fix that anyway, so i think i'll try that first and then
> > > >>>> let you know. Thanks though for the quick response and helping me
> > > >>>> to interpret those numbers ...
> > > >>>>
> > > >>>>
> > > >>>> mat
> > > >>>>
> > > >>>> On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
> > > >>>>> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > >>>>>> Hi Matt,
> > > >>>>>>
> > > >>>>>> thanks for all the help so far. The -info option is really very
> > > >>>>>> helpful. So i think i straightened the actual errors out.
> > > >>>>>> However, now i am back to the original question i had. That is
> > > >>>>>> why it takes so much longer on 4 procs than on 1 proc.
> > > >>>>>
> > > >>>>> So you have a 1.5 load imbalance for MatMult(), which probably
> > > >>>>> cascades to give the 133! load imbalance for VecDot(). You
> > > >>>>> probably have either:
> > > >>>>>
> > > >>>>>   1) VERY bad laod imbalance
> > > >>>>>
> > > >>>>>   2) a screwed up network
> > > >>>>>
> > > >>>>>   3) bad contention on the network (loaded cluster)
> > > >>>>>
> > > >>>>> Can you help us narrow this down?
> > > >>>>>
> > > >>>>>
> > > >>>>>    Matt
> > > >>>>>
> > > >>>>>> I profiled the KSPSolve(...) as stage 2:
> > > >>>>>>
> > > >>>>>> For 1 proc i have:
> > > >>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > > >>>>>>
> > > >>>>>> VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> > > >>>>>> VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00
> > > >>>>>> 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> > > >>>>>> VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> > > >>>>>> MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> > > >>>>>> MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> > > >>>>>> KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00
> > > >>>>>> 0.0e+00 1.2e+04  7100  0  0 84  97100  0  0100    45
> > > >>>>>> PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> for 4 procs i have :
> > > >>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > > >>>>>>
> > > >>>>>> VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7
> > > >>>>>> 0.0e+00 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> > > >>>>>> VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00
> > > >>>>>> 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> > > >>>>>> VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > >>>>>> VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> > > >>>>>> VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> > > >>>>>> VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> > > >>>>>> MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> > > >>>>>> MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> > > >>>>>> MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> > > >>>>>> MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00
> > > >>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > >>>>>> MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00
> > > >>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > >>>>>> KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00
> > > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > >>>>>> KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00
> > > >>>>>> 0.0e+00 2.8e+04 84100  0  0 34 100100  0  0100     1
> > > >>>>>> PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00
> > > >>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> > > >>>>>> PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00
> > > >>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > >>>>>> PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00
> > > >>>>>> 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> > > >>>>>> ----------------------------------------------------------------
> > > >>>>>>---- -- - -- -----------------------------------------------
> > > >>>>>>
> > > >>>>>> Now if i understand it right, all these calls summarize all
> > > >>>>>> calls between the pop and push commands. That would mean that
> > > >>>>>> the majority of the time is spend in the MatMult and in within
> > > >>>>>> that the VecScatterBegin and VecScatterEnd commands (if i
> > > >>>>>> understand it right).
> > > >>>>>>
> > > >>>>>> My problem size is really small. So i was wondering if the
> > > >>>>>> problem lies in that (namely that the major time is simply spend
> > > >>>>>> communicating between processors, or whether there is still
> > > >>>>>> something wrong with how i wrote the code?)
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> thanks
> > > >>>>>> mat
> > > >>>>>>
> > > >>>>>> On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> > > >>>>>>> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > >>>>>>>> Actually the errors occur on my calls to a PETSc functions
> > > >>>>>>>> after calling PETSCInitialize.
> > > >>>>>>>
> > > >>>>>>> Yes, it is the error I pointed out in the last message.
> > > >>>>>>>
> > > >>>>>>>    Matt
> > > >>>>>>>
> > > >>>>>>>> mat


From knepley at gmail.com  Tue Aug 15 16:57:27 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Tue, 15 Aug 2006 16:57:27 -0500
Subject: profiling PETSc code
In-Reply-To: <200608151551.53794.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu>
	 <200608151435.55054.mafunk@nmsu.edu>
	 <a9f269830608151344rbfa13fbgdf499b17f641efbf@mail.gmail.com>
	 <200608151551.53794.mafunk@nmsu.edu>
Message-ID: <a9f269830608151457g1e4b6b05hee4a81767c6e9762@mail.gmail.com>

Yes, I got it. You are correct, the matrix partitions are exactly the same
size. I guess you have a bad network, since not only are the ILU times
unbalanced, but vector operations as well.

  Matt

On 8/15/06, Matt Funk <mafunk at nmsu.edu> wrote:
> Is there a limit to how big an attachment can be?
> The file is 1.3Mb big. I tried to send it twice and none of the emails went
> through. I also send it directly to Barry and Matthews email. I hope that got
> though?
>
> mat
>
>
> On Tuesday 15 August 2006 14:44, Matthew Knepley wrote:
> > I don't think it matters initially since the problem is BIG imbalances.
> >
> >   Matt
> >
> > On 8/15/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > Do you want me to use the debug version or the optimized version of
> > > PETSc?
> > >
> > > mat
> > >
> > > On Tuesday 15 August 2006 13:56, Barry Smith wrote:
> > > >    Please send the entire -info output as an attachment to me. (Not
> > > > in the email) I'll study it in more detail.
> > > >
> > > >     Barry
> > > >
> > > > On Tue, 15 Aug 2006, Matt Funk wrote:
> > > > > On Tuesday 15 August 2006 11:52, Barry Smith wrote:
> > > > >>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00
> > > > >>> 0.0e+00
> > > > >>
> > > > >>                                           ^^^^^
> > > > >>                                         balance
> > > > >>
> > > > >>    Hmmm, I would guess that the matrix entries are not so well
> > > > >> balanced? One process takes 1.4 times as long for the triangular
> > > > >> solves as the other so either one matrix has many more entries or
> > > > >> one processor is slower then the other.
> > > > >>
> > > > >>     Barry
> > > > >
> > > > > Well it would seem that way at first, but i don't know how that could
> > > > > be since i allocate an exactly equal amount of points on both
> > > > > processor (see previous email).
> > > > > Further i used the -mat_view_info option. Here is what it gives me:
> > > > >
> > > > > ...
> > > > > Matrix Object:
> > > > >  type=mpiaij, rows=119808, cols=119808
> > > > > [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> > > > > [1] PetscCommDuplicateUsing internal PETSc communicator 91 168
> > > > > ...
> > > > >
> > > > > ...
> > > > > Matrix Object:
> > > > >  type=seqaij, rows=59904, cols=59904
> > > > >  total: nonzeros=407400, allocated nonzeros=407400
> > > > >    not using I-node routines
> > > > > Matrix Object:
> > > > >  type=seqaij, rows=59904, cols=59904
> > > > >  total: nonzeros=407400, allocated nonzeros=407400
> > > > >    not using I-node routines
> > > > > ...
> > > > >
> > > > > So to me it look s well split up. Is there anything else that
> > > > > somebody can think of. The machine i am running on is all same
> > > > > processors.
> > > > >
> > > > > By the way, i am not sure if the '[1] PetscCommDuplicateUsing
> > > > > internal PETSc communicator 91 168' is something i need to worry
> > > > > about??
> > > > >
> > > > > mat
> > > > >
> > > > >> On Tue, 15 Aug 2006, Matt Funk wrote:
> > > > >>> Hi Matt,
> > > > >>>
> > > > >>> sorry for the delay since the last email, but there were some other
> > > > >>> things i needed to do.
> > > > >>>
> > > > >>> Anyway, I hope that maybe I can get some more help from you guys
> > > > >>> with respect to the loadimbalance problem i have. Here is the
> > > > >>> situtation: I run my code on 2 procs. I profile my KSPSolve call
> > > > >>> and here is what i get:
> > > > >>>
> > > > >>> ...
> > > > >>>
> > > > >>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > > > >>>
> > > > >>> VecDot             20000 1.0 1.9575e+01 2.0 2.39e+08 2.0 0.0e+00
> > > > >>> 0.0e+00 2.0e+04  2  8  0  0 56   7  8  0  0 56   245
> > > > >>> VecNorm            16000 1.0 4.0559e+01 3.2 1.53e+08 3.2 0.0e+00
> > > > >>> 0.0e+00 1.6e+04  3  7  0  0 44  13  7  0  0 44    95
> > > > >>> VecCopy             4000 1.0 1.5148e+00 1.4 0.00e+00 0.0 0.0e+00
> > > > >>> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> > > > >>> VecSet             16000 1.0 3.1937e+00 1.8 0.00e+00 0.0 0.0e+00
> > > > >>> 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
> > > > >>> VecAXPY            16000 1.0 8.2395e+00 1.4 3.22e+08 1.4 0.0e+00
> > > > >>> 0.0e+00 0.0e+00  1  7  0  0  0   3  7  0  0  0   465
> > > > >>> VecAYPX             8000 1.0 3.6370e+00 1.3 3.46e+08 1.3 0.0e+00
> > > > >>> 0.0e+00 0.0e+00  0  3  0  0  0   2  3  0  0  0   527
> > > > >>> VecScatterBegin    12000 1.0 5.7721e-01 1.1 0.00e+00 0.0 2.4e+04
> > > > >>> 2.1e+04 0.0e+00  0  0100100  0   0  0100100  0     0
> > > > >>> VecScatterEnd      12000 1.0 1.4059e+01 9.2 0.00e+00 0.0 0.0e+00
> > > > >>> 0.0e+00 0.0e+00  1  0  0  0  0   4  0  0  0  0     0
> > > > >>> MatMult            12000 1.0 5.0567e+01 1.0 1.82e+08 1.0 2.4e+04
> > > > >>> 2.1e+04 0.0e+00  5 32100100  0  25 32100100  0   361
> > > > >>> MatSolve           16000 1.0 1.1285e+02 1.4 1.50e+08 1.4 0.0e+00
> > > > >>> 0.0e+00 0.0e+00 10 43  0  0  0  47 43  0  0  0   214
> > > > >>> MatLUFactorNum         1 1.0 1.9693e-02 1.0 5.79e+07 1.1 0.0e+00
> > > > >>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0   110
> > > > >>> MatILUFactorSym        1 1.0 7.8840e-03 1.1 0.00e+00 0.0 0.0e+00
> > > > >>> 0.0e+00 1.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > >>> MatGetOrdering         1 1.0 1.2250e-03 1.1 0.00e+00 0.0 0.0e+00
> > > > >>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > >>> KSPSetup               1 1.0 1.1921e-06 1.2 0.00e+00 0.0 0.0e+00
> > > > >>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > >>> KSPSolve            4000 1.0 2.0419e+02 1.0 1.39e+08 1.0 2.4e+04
> > > > >>> 2.1e+04 3.6e+04 21100100100100 100100100100100   278
> > > > >>> PCSetUp                1 1.0 2.8828e-02 1.1 3.99e+07 1.1 0.0e+00
> > > > >>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0    75
> > > > >>> PCSetUpOnBlocks     4000 1.0 3.5605e-02 1.1 3.35e+07 1.1 0.0e+00
> > > > >>> 0.0e+00 3.0e+00  0  0  0  0  0   0  0  0  0  0    61
> > > > >>> PCApply            16000 1.0 1.1661e+02 1.4 1.46e+08 1.4 0.0e+00
> > > > >>> 0.0e+00 0.0e+00 10 43  0  0  0  49 43  0  0  0   207
> > > > >>> -------------------------------------------------------------------
> > > > >>>---- -- -----------------------------------------------
> > > > >>>
> > > > >>> ...
> > > > >>>
> > > > >>>
> > > > >>> Some things to note are the following:
> > > > >>> I allocate my vector as:
> > > > >>> VecCreateMPI(PETSC_COMM_WORLD, //communicator
> > > > >>>          a_totallocal_numPoints[a_thisproc], //local points on this
> > > > >>> proc a_totalglobal_numPoints, //total number of global points
> > > > >>> &m_globalRHSVector); //the vector to be created
> > > > >>>
> > > > >>> where the vector a_totallocal_numPoints is :
> > > > >>> a_totallocal_numPoints: 59904 59904
> > > > >>>
> > > > >>> The matrix is allocated as:
> > > > >>>  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, //communicator
> > > > >>>                      a_totallocal_numPoints[a_thisproc], //total
> > > > >>> number of local rows (that is rows residing on this proc
> > > > >>>                      a_totallocal_numPoints[a_thisproc], //total
> > > > >>> number of columns corresponding to local part of parallel vector
> > > > >>>                      a_totalglobal_numPoints, //number of global
> > > > >>> rows a_totalglobal_numPoints, //number of global columns
> > > > >>> PETSC_NULL,
> > > > >>>                      a_NumberOfNZPointsInDiagonalMatrix,
> > > > >>>                      PETSC_NULL,
> > > > >>>                      a_NumberOfNZPointsInOffDiagonalMatrix,
> > > > >>>                      &m_globalMatrix);
> > > > >>>
> > > > >>> With the info option i checked and there is no extra mallocs at
> > > > >>> all. My problems setup is symmetric so it seems that everything is
> > > > >>> set up so that it should be essentially perfectly balanced.
> > > > >>> However, the numbers given above certainly do not reflect that.
> > > > >>>
> > > > >>> However, the in all other parts of my code (except the PETSc call),
> > > > >>> i get the expected, almost perfect loadbalance.
> > > > >>>
> > > > >>> Is there anything that i am overlooking? Any help is greatly
> > > > >>> appreciated.
> > > > >>>
> > > > >>> thanks
> > > > >>> mat
> > > > >>>
> > > > >>> On Wednesday 02 August 2006 16:21, Matt Funk wrote:
> > > > >>>> Hi Matt,
> > > > >>>>
> > > > >>>> It could be a bad load imbalance because i don't let PETSc decide.
> > > > >>>> I need to fix that anyway, so i think i'll try that first and then
> > > > >>>> let you know. Thanks though for the quick response and helping me
> > > > >>>> to interpret those numbers ...
> > > > >>>>
> > > > >>>>
> > > > >>>> mat
> > > > >>>>
> > > > >>>> On Wednesday 02 August 2006 15:50, Matthew Knepley wrote:
> > > > >>>>> On 8/2/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > > >>>>>> Hi Matt,
> > > > >>>>>>
> > > > >>>>>> thanks for all the help so far. The -info option is really very
> > > > >>>>>> helpful. So i think i straightened the actual errors out.
> > > > >>>>>> However, now i am back to the original question i had. That is
> > > > >>>>>> why it takes so much longer on 4 procs than on 1 proc.
> > > > >>>>>
> > > > >>>>> So you have a 1.5 load imbalance for MatMult(), which probably
> > > > >>>>> cascades to give the 133! load imbalance for VecDot(). You
> > > > >>>>> probably have either:
> > > > >>>>>
> > > > >>>>>   1) VERY bad laod imbalance
> > > > >>>>>
> > > > >>>>>   2) a screwed up network
> > > > >>>>>
> > > > >>>>>   3) bad contention on the network (loaded cluster)
> > > > >>>>>
> > > > >>>>> Can you help us narrow this down?
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>    Matt
> > > > >>>>>
> > > > >>>>>> I profiled the KSPSolve(...) as stage 2:
> > > > >>>>>>
> > > > >>>>>> For 1 proc i have:
> > > > >>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > > > >>>>>>
> > > > >>>>>> VecDot              4000 1.0 4.9158e-02 1.0 4.74e+08 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   2 18  0  0  0   474
> > > > >>>>>> VecNorm             8000 1.0 2.1798e-01 1.0 2.14e+08 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 4.0e+03  1 36  0  0 28   7 36  0  0 33   214
> > > > >>>>>> VecAYPX             4000 1.0 1.3449e-01 1.0 1.73e+08 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   5 18  0  0  0   173
> > > > >>>>>> MatMult             4000 1.0 3.6004e-01 1.0 3.24e+07 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  1  9  0  0  0  12  9  0  0  0    32
> > > > >>>>>> MatSolve            8000 1.0 1.0620e+00 1.0 2.19e+07 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  36 18  0  0  0    22
> > > > >>>>>> KSPSolve            4000 1.0 2.8338e+00 1.0 4.52e+07 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 1.2e+04  7100  0  0 84  97100  0  0100    45
> > > > >>>>>> PCApply             8000 1.0 1.1133e+00 1.0 2.09e+07 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  3 18  0  0  0  38 18  0  0  0    21
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> for 4 procs i have :
> > > > >>>>>> --- Event Stage 2: Stage 2 of ChomboPetscInterface
> > > > >>>>>>
> > > > >>>>>> VecDot              4000 1.0 3.5884e+01133.7 2.17e+07133.7
> > > > >>>>>> 0.0e+00 0.0e+00 4.0e+03  8 18  0  0  5   9 18  0  0 14     1
> > > > >>>>>> VecNorm             8000 1.0 3.4986e-01 1.3 4.43e+07 1.3 0.0e+00
> > > > >>>>>> 0.0e+00 8.0e+03  0 36  0  0 10   0 36  0  0 29   133
> > > > >>>>>> VecSet              8000 1.0 3.5024e-02 1.4 0.00e+00 0.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > >>>>>> VecAYPX             4000 1.0 5.6790e-02 1.3 1.28e+08 1.3 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0   410
> > > > >>>>>> VecScatterBegin     4000 1.0 6.0042e+01 1.4 0.00e+00 0.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00 38  0  0  0  0  45  0  0  0  0     0
> > > > >>>>>> VecScatterEnd       4000 1.0 5.9364e+01 1.4 0.00e+00 0.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00 37  0  0  0  0  44  0  0  0  0     0
> > > > >>>>>> MatMult             4000 1.0 1.1959e+02 1.4 3.46e+04 1.4 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00 75  9  0  0  0  89  9  0  0  0     0
> > > > >>>>>> MatSolve            8000 1.0 2.8150e-01 1.0 2.16e+07 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  0 18  0  0  0   0 18  0  0  0    83
> > > > >>>>>> MatLUFactorNum         1 1.0 1.3685e-04 1.1 5.64e+06 1.1 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0    21
> > > > >>>>>> MatILUFactorSym        1 1.0 2.3389e-04 1.2 0.00e+00 0.0 0.0e+00
> > > > >>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > >>>>>> MatGetOrdering         1 1.0 9.6083e-05 1.2 0.00e+00 0.0 0.0e+00
> > > > >>>>>> 0.0e+00 2.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > >>>>>> KSPSetup               1 1.0 2.1458e-06 2.2 0.00e+00 0.0 0.0e+00
> > > > >>>>>> 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > >>>>>> KSPSolve            4000 1.0 1.2200e+02 1.0 2.63e+05 1.0 0.0e+00
> > > > >>>>>> 0.0e+00 2.8e+04 84100  0  0 34 100100  0  0100     1
> > > > >>>>>> PCSetUp                1 1.0 5.0187e-04 1.2 1.68e+06 1.2 0.0e+00
> > > > >>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     6
> > > > >>>>>> PCSetUpOnBlocks     4000 1.0 1.2104e-02 2.2 1.34e+05 2.2 0.0e+00
> > > > >>>>>> 0.0e+00 4.0e+00  0  0  0  0  0   0  0  0  0  0     0
> > > > >>>>>> PCApply             8000 1.0 8.4254e-01 1.2 8.27e+06 1.2 0.0e+00
> > > > >>>>>> 0.0e+00 8.0e+03  1 18  0  0 10   1 18  0  0 29    28
> > > > >>>>>> ----------------------------------------------------------------
> > > > >>>>>>---- -- - -- -----------------------------------------------
> > > > >>>>>>
> > > > >>>>>> Now if i understand it right, all these calls summarize all
> > > > >>>>>> calls between the pop and push commands. That would mean that
> > > > >>>>>> the majority of the time is spend in the MatMult and in within
> > > > >>>>>> that the VecScatterBegin and VecScatterEnd commands (if i
> > > > >>>>>> understand it right).
> > > > >>>>>>
> > > > >>>>>> My problem size is really small. So i was wondering if the
> > > > >>>>>> problem lies in that (namely that the major time is simply spend
> > > > >>>>>> communicating between processors, or whether there is still
> > > > >>>>>> something wrong with how i wrote the code?)
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> thanks
> > > > >>>>>> mat
> > > > >>>>>>
> > > > >>>>>> On Tuesday 01 August 2006 18:28, Matthew Knepley wrote:
> > > > >>>>>>> On 8/1/06, Matt Funk <mafunk at nmsu.edu> wrote:
> > > > >>>>>>>> Actually the errors occur on my calls to a PETSc functions
> > > > >>>>>>>> after calling PETSCInitialize.
> > > > >>>>>>>
> > > > >>>>>>> Yes, it is the error I pointed out in the last message.
> > > > >>>>>>>
> > > > >>>>>>>    Matt
> > > > >>>>>>>
> > > > >>>>>>>> mat
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From mwojc at p.lodz.pl  Tue Aug 15 20:10:47 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Wed, 16 Aug 2006 01:10:47 -0000
Subject: PETSc from python
In-Reply-To: <a9f269830608101218y6cbadf1m41c005fae0fc4d72@mail.gmail.com>
References: <op.td2epsp7hfz6u8@localhost> <a9f269830608100537g6f7a6ad6n2ee2f76c33068a54@mail.gmail.com> <op.td22mkv2hfz6u8@localhost> <a9f269830608101218y6cbadf1m41c005fae0fc4d72@mail.gmail.com>
Message-ID: <op.tecqv9nvhfz6u8@localhost>

>> > There are nice Python bindings from
>> >
>> >   http://lineal.developer.nicta.com.au/

I inspected a bit lineal but it also seems to be a work in progress...  
There is no official release and the project is currently hibernated (as i  
was told at their mailing list). The approach presented there is  
interesting, but far from the python high level programming spirit (for  
now). Instead, C constructs are used almost directly in python  
interpreter. This has one advantage that the PETSc examples can be easy  
translated to python.

For now I found that it is quite good idea to use PETSc python bindings  
alongside with the parallelized python interpreter "bwpython"  
(http://www.cimec.org.ar/python/python.html) and python mpi bindings  
"mpi4py" (http://www.cimec.org.ar/python/mpi4py.html). I'm able now to run  
interactive sessions with PETSc and I have also access to all MPI  
functions which are not accessible from your PETSc bindings (AFAIK).

Greetings

-- 
Marek Wojciechowski


From knepley at gmail.com  Wed Aug 16 06:07:42 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 16 Aug 2006 06:07:42 -0500
Subject: PETSc from python
In-Reply-To: <op.tecqv9nvhfz6u8@localhost>
References: <op.td2epsp7hfz6u8@localhost>
	 <a9f269830608100537g6f7a6ad6n2ee2f76c33068a54@mail.gmail.com>
	 <op.td22mkv2hfz6u8@localhost>
	 <a9f269830608101218y6cbadf1m41c005fae0fc4d72@mail.gmail.com>
	 <op.tecqv9nvhfz6u8@localhost>
Message-ID: <a9f269830608160407m347d15day863e0240aae71523@mail.gmail.com>

On 8/15/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
> >> > There are nice Python bindings from
> >> >
> >> >   http://lineal.developer.nicta.com.au/
>
> I inspected a bit lineal but it also seems to be a work in progress...
> There is no official release and the project is currently hibernated (as i
> was told at their mailing list). The approach presented there is
> interesting, but far from the python high level programming spirit (for
> now). Instead, C constructs are used almost directly in python
> interpreter. This has one advantage that the PETSc examples can be easy
> translated to python.
>
> For now I found that it is quite good idea to use PETSc python bindings
> alongside with the parallelized python interpreter "bwpython"
> (http://www.cimec.org.ar/python/python.html) and python mpi bindings
> "mpi4py" (http://www.cimec.org.ar/python/mpi4py.html). I'm able now to run
> interactive sessions with PETSc and I have also access to all MPI
> functions which are not accessible from your PETSc bindings (AFAIK).

This is very interesting. I also know Lisandro who wrote bwpython. I just had
a user have the opposite experience with these packages, but I guess that
is why we have multiple packages. I am very happy you got this going. If we
can help out with anything else, just mail (and it won't take so long
next time).

   Thanks,

      Matt

> Greetings
>
> --
> Marek Wojciechowski
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From balay at mcs.anl.gov  Wed Aug 16 09:08:50 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 16 Aug 2006 09:08:50 -0500 (CDT)
Subject: profiling PETSc code
In-Reply-To: <200608151551.53794.mafunk@nmsu.edu>
References: <200608011545.26224.mafunk@nmsu.edu> <200608151435.55054.mafunk@nmsu.edu>
 <a9f269830608151344rbfa13fbgdf499b17f641efbf@mail.gmail.com>
 <200608151551.53794.mafunk@nmsu.edu>
Message-ID: <Pine.LNX.4.64.0608160906410.19430@asterix>

Yes - we limit the e-mail sizes on the mailing list - as we don't want
to flood all list participents with multi-megabyte emails.

Issues that require such interaction should be done at
petsc-mait at mcs.anl.gov not petsc-users at mcs.anl.gov.

Satish

On Tue, 15 Aug 2006, Matt Funk wrote:

> Is there a limit to how big an attachment can be?
> The file is 1.3Mb big. I tried to send it twice and none of the emails went 
> through. I also send it directly to Barry and Matthews email. I hope that got 
> though?

> > > On Tuesday 15 August 2006 13:56, Barry Smith wrote:
> > > >    Please send the entire -info output as an attachment to me. (Not
> > > > in the email) I'll study it in more detail.


From geenen at gmail.com  Wed Aug 16 09:54:38 2006
From: geenen at gmail.com (Thomas Geenen)
Date: Wed, 16 Aug 2006 16:54:38 +0200
Subject: leave my rows alone
Message-ID: <200608161654.38378.geenen@gmail.com>

dear petsc users,

is there a way to prevent Petsc during the assembly phase from redistributing 
matrix rows over cpu's ?? i like the way the rows are assigned to the cpu's 
during the setvalues phase.  
apparently petsc assigns the first nrows to cpu0 the second nrows to cpu1 etc.
I could of course renumber my matrix but I would rather convince petsc that it 
should keep the distribution of the matrix rows.

tia
Thomas 


From knepley at gmail.com  Wed Aug 16 11:21:10 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 16 Aug 2006 11:21:10 -0500
Subject: leave my rows alone
In-Reply-To: <200608161654.38378.geenen@gmail.com>
References: <200608161654.38378.geenen@gmail.com>
Message-ID: <a9f269830608160921h74428a20wd7ef84db9dd57f43@mail.gmail.com>

On 8/16/06, Thomas Geenen <geenen at gmail.com> wrote:
> dear petsc users,
>
> is there a way to prevent Petsc during the assembly phase from redistributing
> matrix rows over cpu's ?? i like the way the rows are assigned to the cpu's
> during the setvalues phase.

Actually,  the layout of a matrix is fully determined after MatSetSizes(), or
equivalently MatCreate***(). We do not redistribute at assembly.

setValues() will take values for any row, and send it to the correct
process. The
matrix layouts we support all have contiguous row on each proc. You can set the
sizes on creation.

  Does this answer your question?

  Thanks,

     Matt

> apparently petsc assigns the first nrows to cpu0 the second nrows to cpu1 etc.
> I could of course renumber my matrix but I would rather convince petsc that it
> should keep the distribution of the matrix rows.
>
> tia
> Thomas
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From balay at mcs.anl.gov  Wed Aug 16 11:25:15 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 16 Aug 2006 11:25:15 -0500 (CDT)
Subject: leave my rows alone
In-Reply-To: <200608161654.38378.geenen@gmail.com>
References: <200608161654.38378.geenen@gmail.com>
Message-ID: <Pine.LNX.4.64.0608161120050.19430@asterix>

On Wed, 16 Aug 2006, Thomas Geenen wrote:

> dear petsc users,
> 
> is there a way to prevent Petsc during the assembly phase from redistributing 
> matrix rows over cpu's ??

The row distribution is done at matrix creation time - and you can set
the row distribution with MatSetSizes() [or MatCreateMPIAIJ() etc..]
by using the correct distribution value - instead of PETSC_DECIDE

> i like the way the rows are assigned to the cpu's during the
> setvalues phase.

I don't understand this statement. The row assignment doesn't change

> apparently petsc assigns the first nrows to cpu0 the second nrows to cpu1 etc.

yes. this fact can't be changed.

> I could of course renumber my matrix but I would rather convince
> petsc that it should keep the distribution of the matrix rows.

If you have some other global numbering scheme which is inconsitant
with the matrix row numbering scheme - then you can use 'AO' object
and associated routines to convert between mappings.

Note: 'row distribution' is different from 'row numbering'.

Satish


From geenen at gmail.com  Wed Aug 16 11:37:13 2006
From: geenen at gmail.com (Thomas Geenen)
Date: Wed, 16 Aug 2006 18:37:13 +0200
Subject: leave my rows alone
In-Reply-To: <a9f269830608160921h74428a20wd7ef84db9dd57f43@mail.gmail.com>
References: <200608161654.38378.geenen@gmail.com> <a9f269830608160921h74428a20wd7ef84db9dd57f43@mail.gmail.com>
Message-ID: <200608161837.13717.geenen@gmail.com>

On Wednesday 16 August 2006 18:21, Matthew Knepley wrote:
> On 8/16/06, Thomas Geenen <geenen at gmail.com> wrote:
> > dear petsc users,
> >
> > is there a way to prevent Petsc during the assembly phase from
> > redistributing matrix rows over cpu's ?? i like the way the rows are
> > assigned to the cpu's during the setvalues phase.
>
> Actually,  the layout of a matrix is fully determined after MatSetSizes(),
> or equivalently MatCreate***(). We do not redistribute at assembly.
>
> setValues() will take values for any row, and send it to the correct-
> process. The
send it to the correct process sounds a lot like redistributing but that's 
probably a matter of semantics 
> matrix layouts we support all have contiguous row on each proc. You can set
> the sizes on creation.
pity
>
>   Does this answer your question?
yep 
thanks
>
>   Thanks,
>
>      Matt
>
> > apparently petsc assigns the first nrows to cpu0 the second nrows to cpu1
> > etc. I could of course renumber my matrix but I would rather convince
> > petsc that it should keep the distribution of the matrix rows.
> >
> > tia
> > Thomas


From balay at mcs.anl.gov  Wed Aug 16 11:41:28 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 16 Aug 2006 11:41:28 -0500 (CDT)
Subject: leave my rows alone
In-Reply-To: <200608161837.13717.geenen@gmail.com>
References: <200608161654.38378.geenen@gmail.com>
 <a9f269830608160921h74428a20wd7ef84db9dd57f43@mail.gmail.com>
 <200608161837.13717.geenen@gmail.com>
Message-ID: <Pine.LNX.4.64.0608161138260.19430@asterix>

On Wed, 16 Aug 2006, Thomas Geenen wrote:

> On Wednesday 16 August 2006 18:21, Matthew Knepley wrote:
> > On 8/16/06, Thomas Geenen <geenen at gmail.com> wrote:
> > > dear petsc users,
> > >
> > > is there a way to prevent Petsc during the assembly phase from
> > > redistributing matrix rows over cpu's ?? i like the way the rows are
> > > assigned to the cpu's during the setvalues phase.
> >
> > Actually,  the layout of a matrix is fully determined after MatSetSizes(),
> > or equivalently MatCreate***(). We do not redistribute at assembly.
> >
> > setValues() will take values for any row, and send it to the correct-
> > process. The

> send it to the correct process sounds a lot like redistributing but
> that's probably a matter of semantics

No its not redistribution.  When you create the matrix - the ownership
of a given row is determined. [it doesn't change]

If row 10 belongs to proc 2 [determined with MatSetSizes()] , but you
invoke MatSetValues(row=10) on proc 5, clearly this value has to be
communicated to proc2. This happens in MatAssembly***().

Satish


From mafunk at nmsu.edu  Wed Aug 16 11:53:42 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Wed, 16 Aug 2006 10:53:42 -0600
Subject: leave my rows alone
In-Reply-To: <200608161837.13717.geenen@gmail.com>
References: <200608161654.38378.geenen@gmail.com> <a9f269830608160921h74428a20wd7ef84db9dd57f43@mail.gmail.com> <200608161837.13717.geenen@gmail.com>
Message-ID: <200608161053.44439.mafunk@nmsu.edu>

Hi Thomas.

I am not sure if the following is what you are looking for, but i don't have 
PETSc 'redistribute' anything. That is, i tell PETSc exactly how the matrix 
should be distributed across the procs via the following:

 m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, 
???????????????????????? ? a_totallocal_numPoints[a_thisproc], 
???????????????????????? ? a_totallocal_numPoints[a_thisproc],
???????????????????????? ? a_totalglobal_numPoints, 
???????????????????????? ? a_totalglobal_numPoints,
???????????????????????? ? PETSC_NULL,
???????????????????????? ? a_NumberOfNZPointsInDiagonalMatrix,
???????????????????????? ? PETSC_NULL,
???????????????????????? ? a_NumberOfNZPointsInOffDiagonalMatrix,
???????????????????????? ? &m_globalMatrix);

The argument descriptions are found at 
'http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatCreateMPIAIJ.html'

So anyway, PETSc does not touch this matrix in the sense of redistributing 
anything. It is just as i want it to be. Hope this helps ...

mat


On Wednesday 16 August 2006 10:37, Thomas Geenen wrote:
> On Wednesday 16 August 2006 18:21, Matthew Knepley wrote:
> > On 8/16/06, Thomas Geenen <geenen at gmail.com> wrote:
> > > dear petsc users,
> > >
> > > is there a way to prevent Petsc during the assembly phase from
> > > redistributing matrix rows over cpu's ?? i like the way the rows are
> > > assigned to the cpu's during the setvalues phase.
> >
> > Actually,  the layout of a matrix is fully determined after
> > MatSetSizes(), or equivalently MatCreate***(). We do not redistribute at
> > assembly.
> >
> > setValues() will take values for any row, and send it to the correct-
> > process. The
>
> send it to the correct process sounds a lot like redistributing but that's
> probably a matter of semantics
>
> > matrix layouts we support all have contiguous row on each proc. You can
> > set the sizes on creation.
>
> pity
>
> >   Does this answer your question?
>
> yep
> thanks
>
> >   Thanks,
> >
> >      Matt
> >
> > > apparently petsc assigns the first nrows to cpu0 the second nrows to
> > > cpu1 etc. I could of course renumber my matrix but I would rather
> > > convince petsc that it should keep the distribution of the matrix rows.
> > >
> > > tia
> > > Thomas


From balay at mcs.anl.gov  Wed Aug 16 12:26:49 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 16 Aug 2006 12:26:49 -0500 (CDT)
Subject: leave my rows alone
In-Reply-To: <200608161053.44439.mafunk@nmsu.edu>
References: <200608161654.38378.geenen@gmail.com>
 <a9f269830608160921h74428a20wd7ef84db9dd57f43@mail.gmail.com>
 <200608161837.13717.geenen@gmail.com> <200608161053.44439.mafunk@nmsu.edu>
Message-ID: <Pine.LNX.4.64.0608161225280.19430@asterix>

Perhaps the issue is not using MatGetRowOnership() [but some other
scheme] to get the row indices that are used in MatSetValues()

Satish

On Wed, 16 Aug 2006, Matt Funk wrote:

> Hi Thomas.
> 
> I am not sure if the following is what you are looking for, but i don't have 
> PETSc 'redistribute' anything. That is, i tell PETSc exactly how the matrix 
> should be distributed across the procs via the following:
> 
>  m_ierr = MatCreateMPIAIJ(PETSC_COMM_WORLD, 
> ???????????????????????? ? a_totallocal_numPoints[a_thisproc], 
> ???????????????????????? ? a_totallocal_numPoints[a_thisproc],
> ???????????????????????? ? a_totalglobal_numPoints, 
> ???????????????????????? ? a_totalglobal_numPoints,
> ???????????????????????? ? PETSC_NULL,
> ???????????????????????? ? a_NumberOfNZPointsInDiagonalMatrix,
> ???????????????????????? ? PETSC_NULL,
> ???????????????????????? ? a_NumberOfNZPointsInOffDiagonalMatrix,
> ???????????????????????? ? &m_globalMatrix);
> 
> The argument descriptions are found at 
> 'http://www-unix.mcs.anl.gov/petsc/petsc-as/snapshots/petsc-current/docs/manualpages/Mat/MatCreateMPIAIJ.html'
> 
> So anyway, PETSc does not touch this matrix in the sense of redistributing 
> anything. It is just as i want it to be. Hope this helps ...
> 
> mat
> 
> 
> On Wednesday 16 August 2006 10:37, Thomas Geenen wrote:
> > On Wednesday 16 August 2006 18:21, Matthew Knepley wrote:
> > > On 8/16/06, Thomas Geenen <geenen at gmail.com> wrote:
> > > > dear petsc users,
> > > >
> > > > is there a way to prevent Petsc during the assembly phase from
> > > > redistributing matrix rows over cpu's ?? i like the way the rows are
> > > > assigned to the cpu's during the setvalues phase.
> > >
> > > Actually,  the layout of a matrix is fully determined after
> > > MatSetSizes(), or equivalently MatCreate***(). We do not redistribute at
> > > assembly.
> > >
> > > setValues() will take values for any row, and send it to the correct-
> > > process. The
> >
> > send it to the correct process sounds a lot like redistributing but that's
> > probably a matter of semantics
> >
> > > matrix layouts we support all have contiguous row on each proc. You can
> > > set the sizes on creation.
> >
> > pity
> >
> > >   Does this answer your question?
> >
> > yep
> > thanks
> >
> > >   Thanks,
> > >
> > >      Matt
> > >
> > > > apparently petsc assigns the first nrows to cpu0 the second nrows to
> > > > cpu1 etc. I could of course renumber my matrix but I would rather
> > > > convince petsc that it should keep the distribution of the matrix rows.
> > > >
> > > > tia
> > > > Thomas
> 
> 

From geenen at gmail.com  Wed Aug 16 14:58:44 2006
From: geenen at gmail.com (Thomas Geenen)
Date: Wed, 16 Aug 2006 21:58:44 +0200
Subject: leave my rows alone
In-Reply-To: <Pine.LNX.4.64.0608161138260.19430@asterix>
References: <200608161654.38378.geenen@gmail.com> <200608161837.13717.geenen@gmail.com> <Pine.LNX.4.64.0608161138260.19430@asterix>
Message-ID: <200608162158.44210.geenen@gmail.com>

On Wednesday 16 August 2006 18:41, Satish Balay wrote:
> On Wed, 16 Aug 2006, Thomas Geenen wrote:
> > On Wednesday 16 August 2006 18:21, Matthew Knepley wrote:
> > > On 8/16/06, Thomas Geenen <geenen at gmail.com> wrote:
> > > > dear petsc users,
> > > >
> > > > is there a way to prevent Petsc during the assembly phase from
> > > > redistributing matrix rows over cpu's ?? i like the way the rows are
> > > > assigned to the cpu's during the setvalues phase.
> > >
> > > Actually,  the layout of a matrix is fully determined after
> > > MatSetSizes(), or equivalently MatCreate***(). We do not redistribute
> > > at assembly.
> > >
> > > setValues() will take values for any row, and send it to the correct-
> > > process. The
> >
> > send it to the correct process sounds a lot like redistributing but
> > that's probably a matter of semantics
>
> No its not redistribution.  When you create the matrix - the ownership
> of a given row is determined. [it doesn't change]
>
> If row 10 belongs to proc 2 [determined with MatSetSizes()] , but you
> invoke MatSetValues(row=10) on proc 5, clearly this value has to be
> communicated to proc2. This happens in MatAssembly***().

in matcreate i tell petsc that cpu0 owns 10 rows in matsetvalue i tell him 
which rows (maybe 1,9,23,46 etc) but petsc automatically assumes that cpu0 
owns the first 10 rows
matsetsizes also just tells petsc the number of rows. 0-9 for cpu0

so if i want to keep row 23 on cpu0 i have to do some sort of renumbering 
making row23 <10

>
> Satish


From balay at mcs.anl.gov  Wed Aug 16 15:14:53 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 16 Aug 2006 15:14:53 -0500 (CDT)
Subject: leave my rows alone
In-Reply-To: <200608162158.44210.geenen@gmail.com>
References: <200608161654.38378.geenen@gmail.com> <200608161837.13717.geenen@gmail.com>
 <Pine.LNX.4.64.0608161138260.19430@asterix> <200608162158.44210.geenen@gmail.com>
Message-ID: <Pine.LNX.4.64.0608161508250.22265@asterix>

On Wed, 16 Aug 2006, Thomas Geenen wrote:

> in matcreate i tell petsc that cpu0 owns 10 rows 

yes

> in matsetvalue i tell him which rows (maybe 1,9,23,46 etc) but petsc
> automatically assumes that cpu0 owns the first 10 rows matsetsizes
> also just tells petsc the number of rows. 0-9 for cpu0

MatSetValues() provides no such functionality [of specifying rows
owned by local proc] . You are misinterpreting arguments to
MatSetValues().

In this new model [which PETSc doesn't provide] - what hapens if you
call MatSetValues(row=0) on both proc 0 & proc1?  Does this row get
owned by both procesors?

And from what numbering scheme do you get these numbers 1,9,23,46 etc?

Satish

> 
> so if i want to keep row 23 on cpu0 i have to do some sort of renumbering 
> making row23 <10


From geenen at gmail.com  Wed Aug 16 15:24:31 2006
From: geenen at gmail.com (Thomas Geenen)
Date: Wed, 16 Aug 2006 22:24:31 +0200
Subject: leave my rows alone
In-Reply-To: <Pine.LNX.4.64.0608161508250.22265@asterix>
References: <200608161654.38378.geenen@gmail.com> <200608162158.44210.geenen@gmail.com> <Pine.LNX.4.64.0608161508250.22265@asterix>
Message-ID: <200608162224.31369.geenen@gmail.com>

On Wednesday 16 August 2006 22:14, Satish Balay wrote:
> On Wed, 16 Aug 2006, Thomas Geenen wrote:
> > in matcreate i tell petsc that cpu0 owns 10 rows
>
> yes
>
> > in matsetvalue i tell him which rows (maybe 1,9,23,46 etc) but petsc
> > automatically assumes that cpu0 owns the first 10 rows matsetsizes
> > also just tells petsc the number of rows. 0-9 for cpu0
>
> MatSetValues() provides no such functionality [of specifying rows
> owned by local proc] . You are misinterpreting arguments to
> MatSetValues().
i know but i hoped there would be some way of doing this
i like to be in charge of these kind of things :)
>
> In this new model [which PETSc doesn't provide] - what hapens if you
> call MatSetValues(row=0) on both proc 0 & proc1?  Does this row get
> owned by both procesors?
why would you do that???
maybe if you have a very slow network?
>
> And from what numbering scheme do you get these numbers 1,9,23,46 etc?
my nice little FEM app that i build an interface to petsc for :)
>
> Satish
>
> > so if i want to keep row 23 on cpu0 i have to do some sort of renumbering
> > making row23 <10

thanks for all the help

Thomas


From balay at mcs.anl.gov  Wed Aug 16 15:31:03 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 16 Aug 2006 15:31:03 -0500 (CDT)
Subject: leave my rows alone
In-Reply-To: <200608162224.31369.geenen@gmail.com>
References: <200608161654.38378.geenen@gmail.com> <200608162158.44210.geenen@gmail.com>
 <Pine.LNX.4.64.0608161508250.22265@asterix> <200608162224.31369.geenen@gmail.com>
Message-ID: <Pine.LNX.4.64.0608161526150.22265@asterix>

On Wed, 16 Aug 2006, Thomas Geenen wrote:

> > And from what numbering scheme do you get these numbers 1,9,23,46 etc?
> my nice little FEM app that i build an interface to petsc for :)

I presume this numbering is same irrespective of number of procs or
the way the grid is partitioned across processors? In that case it
would constitute a global numbering scheme - and you can use PETSc
'AO' object to map between numbering schemes.

Satish


From joel.schaerer at creatis.insa-lyon.fr  Wed Aug 16 11:46:49 2006
From: joel.schaerer at creatis.insa-lyon.fr (=?ISO-8859-1?Q?Jo=EBl?= Schaerer)
Date: Wed, 16 Aug 2006 12:46:49 -0400
Subject: Configuration problems on windows
Message-ID: 
 <1155746809.2364.7.camel@netnyuotp006545ots.unassigned.msnyuhealth.org>

Hello all,

I am trying to configure petsc for visual studio on a windows machine.
Here is the configure line I typed on cygwin:

./config/configure.py --with-cc='win32fe cl' --download-c-blas-lapack=1
--with-mpi=0 --with-x=0 --PETSC_DIR=$(pwd) --with-fortran=0

This has already worked before on other windows machines, but this time
I get the following error:

[snip]
C:\pkg\cygwin\bin\python2.3.exe: *** unable to remap C:\pkg\cygwin\bin
\cygssl-0.
9.7.dll to same address as parent(0x1220000) != 0x1230000
     37 [unknown (0x978)] python 2432 sync_with_child: child 896(0x900)
died bef
ore initialization with status code 0x1
    422 [unknown (0x978)] python 2432 sync_with_child: *** child state
child loa
ding dlls
Exception in thread Shell Command:
Traceback (most recent call last):
  File "/tmp/python.2708/usr/lib/python2.3/threading.py", line 436, in
__bootstr
ap
    self.run()
  File "/tmp/python.2708/usr/lib/python2.3/threading.py", line 416, in
run
    self.__target(*self.__args, **self.__kwargs)
  File
"/cygdrive/d/chenting/tools/petsc/petsc-2.3.1-p16/python/BuildSystem/scri
pt.py", line 190, in run
    (output, error, status) = Script.runShellCommand(command, log)
  File
"/cygdrive/d/chenting/tools/petsc/petsc-2.3.1-p16/python/BuildSystem/scri
pt.py", line 124, in runShellCommand
    (input, output, error, pipe) = Script.openPipe(command)
  File
"/cygdrive/d/chenting/tools/petsc/petsc-2.3.1-p16/python/BuildSystem/scri
pt.py", line 105, in openPipe
    pipe   = popen2.Popen3(command, 1)
  File "/tmp/python.2708/usr/lib/python2.3/popen2.py", line 39, in
__init__
    self.pid = os.fork()
OSError: [Errno 11] Resource temporarily unavailable

The complete configure.log file is available at the following adress:
http://fex.insa-lyon.fr/get?k=7xspvbpkylWdNgpkjPG

Any ideas?

Thanks a lot for your help,

Joel Schaerer


From balay at mcs.anl.gov  Wed Aug 16 18:19:59 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 16 Aug 2006 18:19:59 -0500 (CDT)
Subject: Configuration problems on windows
In-Reply-To: <1155746809.2364.7.camel@netnyuotp006545ots.unassigned.msnyuhealth.org>
References: <1155746809.2364.7.camel@netnyuotp006545ots.unassigned.msnyuhealth.org>
Message-ID: <Pine.LNX.4.64.0608161814410.22265@asterix>

> C:\pkg\cygwin\bin\python2.3.exe: *** unable to remap C:\pkg\cygwin\bin\cygssl-0.9.7.dll


There are probably 2 ways to recover from this cygwin error.

1. reinstall cygwin from scratch

2 - kill all cygwin processes [by rebooting]
  - run 'ash' shell from cygwin bin dir [this should be done either
    from 'start -> run' or from 'cmd' - but not 'cygwin bash shell'
  - run 'rebaseall' from the above 'ash' shell.

rebaseall is the easy thing to do - but it has its own issues
[according to cygwin folks] - but you can try and see if it works for
you.

Satish

On Wed, 16 Aug 2006, Jo?l Schaerer wrote:

> Hello all,
> 
> I am trying to configure petsc for visual studio on a windows machine.
> Here is the configure line I typed on cygwin:
> 
> ./config/configure.py --with-cc='win32fe cl' --download-c-blas-lapack=1
> --with-mpi=0 --with-x=0 --PETSC_DIR=$(pwd) --with-fortran=0
> 
> This has already worked before on other windows machines, but this time
> I get the following error:
> 
> [snip]
> C:\pkg\cygwin\bin\python2.3.exe: *** unable to remap C:\pkg\cygwin\bin
> \cygssl-0.
> 9.7.dll to same address as parent(0x1220000) != 0x1230000
>      37 [unknown (0x978)] python 2432 sync_with_child: child 896(0x900)
> died bef
> ore initialization with status code 0x1
>     422 [unknown (0x978)] python 2432 sync_with_child: *** child state
> child loa
> ding dlls
> Exception in thread Shell Command:
> Traceback (most recent call last):
>   File "/tmp/python.2708/usr/lib/python2.3/threading.py", line 436, in
> __bootstr
> ap
>     self.run()
>   File "/tmp/python.2708/usr/lib/python2.3/threading.py", line 416, in
> run
>     self.__target(*self.__args, **self.__kwargs)
>   File
> "/cygdrive/d/chenting/tools/petsc/petsc-2.3.1-p16/python/BuildSystem/scri
> pt.py", line 190, in run
>     (output, error, status) = Script.runShellCommand(command, log)
>   File
> "/cygdrive/d/chenting/tools/petsc/petsc-2.3.1-p16/python/BuildSystem/scri
> pt.py", line 124, in runShellCommand
>     (input, output, error, pipe) = Script.openPipe(command)
>   File
> "/cygdrive/d/chenting/tools/petsc/petsc-2.3.1-p16/python/BuildSystem/scri
> pt.py", line 105, in openPipe
>     pipe   = popen2.Popen3(command, 1)
>   File "/tmp/python.2708/usr/lib/python2.3/popen2.py", line 39, in
> __init__
>     self.pid = os.fork()
> OSError: [Errno 11] Resource temporarily unavailable
> 
> The complete configure.log file is available at the following adress:
> http://fex.insa-lyon.fr/get?k=7xspvbpkylWdNgpkjPG
> 
> Any ideas?
> 
> Thanks a lot for your help,
> 
> Joel Schaerer
> 
> 

From mwojc at p.lodz.pl  Thu Aug 17 14:42:48 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Thu, 17 Aug 2006 19:42:48 -0000
Subject: PETSc from python
In-Reply-To: <a9f269830608160407m347d15day863e0240aae71523@mail.gmail.com>
References: <op.td2epsp7hfz6u8@localhost> <a9f269830608100537g6f7a6ad6n2ee2f76c33068a54@mail.gmail.com> <op.td22mkv2hfz6u8@localhost> <a9f269830608101218y6cbadf1m41c005fae0fc4d72@mail.gmail.com> <op.tecqv9nvhfz6u8@localhost> <a9f269830608160407m347d15day863e0240aae71523@mail.gmail.com>
Message-ID: <op.tef01mlyhfz6u8@localhost>

On Wed, 16 Aug 2006 11:07:42 -0000, Matthew Knepley <knepley at gmail.com>  
wrote:

> This is very interesting. I also know Lisandro who wrote bwpython. I  
> just had a user have the opposite experience with these packages, but I  
> guess that
> is why we have multiple packages. I am very happy you got this going.

Yes. I got this going. And I'm quite satisfied. To prove it, that's an  
examplary session:


mwojc at evo ~ $ mpirun -np 2 `which bwpython` `which ipython` -nobanner

In [1]: from petscinit import *   # This imports for me PETSc extensions,  
initializes and creates stdviewer

In [2]: from mpi4py import MPI

In [3]:

In [3]: v=Vec()

In [4]: size = (MPI.rank + 1) * 2

In [5]: totsize = MPI.COMM_WORLD.Allreduce(size)

In [6]: print 'Process [%d]: size=%d' %(MPI.rank, size)
Process [0]: size=2
Process [1]: size=4

In [7]: print 'Process [%d]: totsize=%d' %(MPI.rank, totsize)
Process [0]: totsize=6
Process [1]: totsize=6

In [8]: v.setSizes(size, PETSC_DECIDE)

In [9]: v.setFromOptions()

In [10]: if MPI.rank == 0: v.setValues(xrange(totsize), [1]*totsize,  
INSERT_VALUES)
    ....:

In [11]: v.assemblyBegin()

In [12]: v.assemblyEnd()

In [13]: v.view(stdviewer)
Process [0]
0: 1
1: 1
Process [1]
2: 1
3: 1
4: 1
5: 1

In [14]:

In [14]: A=Mat()

In [15]: A.setSizes(size, size, PETSC_DECIDE, PETSC_DECIDE)

In [16]: A.setFromOptions()

In [17]: Range = A.getOwnershipRange()

In [18]: print Range
(0, 2)
(2, 6)

In [19]: rows=xrange(Range[0], Range[1])

In [20]: cols=xrange(totsize)

In [21]: import random

In [22]: values=[random.uniform(-1,1) for i in xrange(size*totsize)]

In [23]: A.setValues(rows, cols, values, INSERT_VALUES)

In [24]: A.assemblyBegin(MAT_FINAL_ASSEMBLY)

In [25]: A.assemblyEnd(MAT_FINAL_ASSEMBLY)

In [26]: A.view(stdviewer)
row 0: (0, 0.630031)  (1, 0.673476)  (2, -0.734869)  (3, 0.105727)  (4,  
0.538428)  (5, 0.12576)
row 1: (0, -0.857206)  (1, -0.0761736)  (2, -0.143492)  (3, -0.938166)   
(4, 0.41378)  (5, -0.210328)
row 2: (0, 0.50173)  (1, -0.214067)  (2, 0.59921)  (3, 0.848044)  (4,  
-0.819785)  (5, -0.436404)
row 3: (0, 0.30529)  (1, 0.968145)  (2, 0.377928)  (3, -0.656585)  (4,  
0.882831)  (5, 0.850657)
row 4: (0, -0.304465)  (1, 0.496273)  (2, -0.277161)  (3, -0.81206)  (4,  
0.63498)  (5, 0.58123)
row 5: (0, 0.538759)  (1, -0.654964)  (2, -0.256906)  (3, -0.335948)  (4,  
0.748973)  (5, 0.813876)

In [27]:

In [27]: b = v.duplicate() #right hand vector

In [28]: A.mult(v, b)

In [29]: x=b.duplicate() #unknowns

In [30]:

In [30]: solver = KSP()

In [31]: solver.setOperators(A,A,DIFFERENT_NONZERO_PATTERN)

In [32]: solver.setFromOptions()

In [33]: solver.solve(b,x)

In [34]: x.view(stdviewer)    # IS SOLUTION CORRECT? (SHOULD BE ONES?)
Process [0]
0: 1
1: 1
Process [1]
2: 1
3: 1
4: 1
5: 1

In [35]: #AND SO ON...


Isn't it nice?
But I have also a question. Suppose my matrix A is "dense" and I would  
like to get local arrays:


In [36]: A.setType("dense")

In [37]: A.setValues(rows, cols, values, INSERT_VALUES)

In [38]: A.assemblyBegin(MAT_FINAL_ASSEMBLY)

In [39]: A.assemblyEnd(MAT_FINAL_ASSEMBLY)

In [40]: A.view(stdviewer)
6.3003e-01 6.7348e-01 -7.3487e-01 1.0573e-01 5.3843e-01 1.2576e-01
-8.5721e-01 -7.6174e-02 -1.4349e-01 -9.3817e-01 4.1378e-01 -2.1033e-01
5.0173e-01 -2.1407e-01 5.9921e-01 8.4804e-01 -8.1978e-01 -4.3640e-01
3.0529e-01 9.6815e-01 3.7793e-01 -6.5659e-01 8.8283e-01 8.5066e-01
-3.0446e-01 4.9627e-01 -2.7716e-01 -8.1206e-01 6.3498e-01 5.8123e-01
5.3876e-01 -6.5496e-01 -2.5691e-01 -3.3595e-01 7.4897e-01 8.1388e-01

In [41]: print A.getArray()
array([0.63003134676816508, -0.85720600630413402, 0.67347596051594882,  
-0.076173622638120664], 'd')
array([0.5017303230541359, 0.30529049321568058, -0.3044646591900968,  
0.53875941791034943, -0.21406665507357259, 0.96814544907801547,  
0.49627269833290644, -0.65496405056370244, 0.59920997972823065,  
0.3779276741558788, -0.2771608364332232, -0.25690564212998068,  
0.8480435858237616, -0.65658527037718994, -0.81206017169215183,  
-0.33594803008233565], 'd')


Why the obtained arrays does not represent what is stored on each process.  
I thought there should be 2 first whole rows in the first array and the  
rest in the second or I'm missing something.


I also observed that PETSc objects are not picklable. Is there any good  
reason for that?


Greetings

-- 
Marek Wojciechowski


From mafunk at nmsu.edu  Thu Aug 17 15:25:28 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Thu, 17 Aug 2006 14:25:28 -0600
Subject: PETSc communicator
In-Reply-To: <op.tef01mlyhfz6u8@localhost>
References: <op.td2epsp7hfz6u8@localhost> <a9f269830608160407m347d15day863e0240aae71523@mail.gmail.com> <op.tef01mlyhfz6u8@localhost>
Message-ID: <200608171425.30756.mafunk@nmsu.edu>

Hi,

i was wondering what the message:
'PetscCommDuplicate Using internal PETSc communicator 92 170'
means exactly. I still have issues with PETSc when running 1 vs 2 procs w.r.t. 
the loadbalance.
However, when run on 2 vs 4 the balance seems to be almost perfect. 
Then the option of a screwed up network was suggested to me, but since the 4vs 
2 proc case is ok, it seems not necessarily to be the case.

Maybe somebody can tell me what it means?

thanks
mat


From jiaxun_hou at yahoo.com.cn  Thu Aug 17 22:52:00 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Fri, 18 Aug 2006 11:52:00 +0800 (CST)
Subject: function for eigendecomposition
Message-ID: <20060818035200.72126.qmail@web15801.mail.cnb.yahoo.com>

Hello,
   
  Can you tell me which function can compute the eigenvectors for me? I have only found the function "KSPComputeEigenvalues" in the document, but it is not suitable for me. I want to get both the eigenvalues and eigenvectors. And I am looking for a efficient function to do the eigendecomposition for a symmetric matrix. Any help will be appreciated.
   
  Regards,
  Jiaxun

 		
---------------------------------
 Mp3???-???????   
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060818/8ee21ea7/attachment.htm>

From knepley at gmail.com  Thu Aug 17 23:19:25 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 17 Aug 2006 23:19:25 -0500
Subject: PETSc communicator
In-Reply-To: <200608171425.30756.mafunk@nmsu.edu>
References: <op.td2epsp7hfz6u8@localhost>
	 <a9f269830608160407m347d15day863e0240aae71523@mail.gmail.com>
	 <op.tef01mlyhfz6u8@localhost> <200608171425.30756.mafunk@nmsu.edu>
Message-ID: <a9f269830608172119g9de1cd4v8c8eb259afdc73c7@mail.gmail.com>

We stash some data in the communicator. This is what is happening here.

   Matt

On 8/17/06, Matt Funk <mafunk at nmsu.edu> wrote:
> Hi,
>
> i was wondering what the message:
> 'PetscCommDuplicate Using internal PETSc communicator 92 170'
> means exactly. I still have issues with PETSc when running 1 vs 2 procs w.r.t.
> the loadbalance.
> However, when run on 2 vs 4 the balance seems to be almost perfect.
> Then the option of a screwed up network was suggested to me, but since the 4vs
> 2 proc case is ok, it seems not necessarily to be the case.
>
> Maybe somebody can tell me what it means?
>
> thanks
> mat
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir
Alec Guiness


From yaronkretchmer at gmail.com  Thu Aug 17 23:25:49 2006
From: yaronkretchmer at gmail.com (Yaron Kretchmer)
Date: Thu, 17 Aug 2006 21:25:49 -0700
Subject: function for eigendecomposition
In-Reply-To: <20060818035200.72126.qmail@web15801.mail.cnb.yahoo.com>
References: <20060818035200.72126.qmail@web15801.mail.cnb.yahoo.com>
Message-ID: <c71e3b870608172125j2cf640b0tee5ac95baee3e63c@mail.gmail.com>

maybe this will help http://www.grycap.upv.es/slepc/


On 8/17/06, jiaxun hou <jiaxun_hou at yahoo.com.cn> wrote:
>
>  Hello,
>
> Can you tell me which function can compute the eigenvectors for me? I have
> only found the function "KSPComputeEigenvalues" in the document, but it is
> not suitable for me. I want to get both the eigenvalues and eigenvectors.
> And I am looking for a efficient function to do the eigendecomposition for a
> symmetric matrix. Any help will be appreciated.
>
> Regards,
> Jiaxun
>
> ------------------------------
> Mp3???-??????? <http://music.yahoo.com.cn/?source=mail_mailbox_footer>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060817/efa3d8ad/attachment.htm>

From mwojc at p.lodz.pl  Mon Aug 21 06:11:21 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Mon, 21 Aug 2006 11:11:21 -0000
Subject: Memory preallocation in python
Message-ID: <op.temr07jmhfz6u8@localhost>

Hi,

i have a small problem in python. In case of matrix assembling -info says  
me something like:

[0] MatSetUpPreallocationWarning not preallocating matrix storage
[0] MatAssemblyEnd_SeqAIJMatrix size: 6006 X 6006; storage space: 12084  
unneeded,108036 used
[0] MatAssemblyEnd_SeqAIJNumber of mallocs during MatSetValues() is 6006
[0] MatAssemblyEnd_SeqAIJMaximum nonzeros in any row is 18
[0] Mat_CheckInodeFound 2002 nodes of 6006. Limit used: 5. Using Inode  
routines

This obviously means poor behavior because of dynamical memory allocation.
Unfortunately, I have no idea how to preallocate memory for Mat objects in  
python.
Any suggestions?

Greetings
-- 
Marek Wojciechowski


From knepley at gmail.com  Mon Aug 21 04:16:08 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 21 Aug 2006 04:16:08 -0500
Subject: Memory preallocation in python
In-Reply-To: <op.temr07jmhfz6u8@localhost>
References: <op.temr07jmhfz6u8@localhost>
Message-ID: <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com>

It should be a wrapper for Mat***Preallocation(). Do these exist?

   Matt

On 8/21/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
>
> Hi,
>
> i have a small problem in python. In case of matrix assembling -info says
> me something like:
>
> [0] MatSetUpPreallocationWarning not preallocating matrix storage
> [0] MatAssemblyEnd_SeqAIJMatrix size: 6006 X 6006; storage space: 12084
> unneeded,108036 used
> [0] MatAssemblyEnd_SeqAIJNumber of mallocs during MatSetValues() is 6006
> [0] MatAssemblyEnd_SeqAIJMaximum nonzeros in any row is 18
> [0] Mat_CheckInodeFound 2002 nodes of 6006. Limit used: 5. Using Inode
> routines
>
> This obviously means poor behavior because of dynamical memory allocation.
> Unfortunately, I have no idea how to preallocate memory for Mat objects in
> python.
> Any suggestions?
>
> Greetings
> --
> Marek Wojciechowski
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060821/4512b34d/attachment.htm>

From mwojc at p.lodz.pl  Mon Aug 21 07:22:40 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Mon, 21 Aug 2006 12:22:40 -0000
Subject: Memory preallocation in python
In-Reply-To: <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com>
References: <op.temr07jmhfz6u8@localhost> <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com>
Message-ID: <op.temvb2wdhfz6u8@localhost>


On Mon, 21 Aug 2006 09:16:08 -0000, Matthew Knepley <knepley at gmail.com>  
wrote:

> It should be a wrapper for Mat***Preallocation(). Do these exist?
>
>    Matt
>

I didn't find a method Mat.Preallocation() nor anything similar for Mat...  
Maybe I should search in another place?


-- 
Marek Wojciechowski


From jiaxun_hou at yahoo.com.cn  Mon Aug 21 04:43:13 2006
From: jiaxun_hou at yahoo.com.cn (jiaxun hou)
Date: Mon, 21 Aug 2006 17:43:13 +0800 (CST)
Subject: =?gb2312?q?=BB=D8=B8=B4=A3=BA=20Re:=20function=20for=20eigendecomposition?=
In-Reply-To: <c71e3b870608172125j2cf640b0tee5ac95baee3e63c@mail.gmail.com>
Message-ID: <20060821094313.44251.qmail@web15805.mail.cnb.yahoo.com>

Thank you very much. It is really what I want.
   
  Regards,
  Jiaxun

Yaron Kretchmer <yaronkretchmer at gmail.com> ???
    maybe this will help http://www.grycap.upv.es/slepc/
  

  On 8/17/06, jiaxun hou <jiaxun_hou at yahoo.com.cn> wrote:       Hello,
   
  Can you tell me which function can compute the eigenvectors for me? I have only found the function "KSPComputeEigenvalues" in the document, but it is not suitable for me. I want to get both the eigenvalues and eigenvectors. And I am looking for a efficient function to do the eigendecomposition for a symmetric matrix. Any help will be appreciated. 
   
  Regards,
  Jiaxun
    
---------------------------------
  Mp3???-???????   
  

 __________________________________________________
???????????????
http://cn.mail.yahoo.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060821/a71bc48f/attachment.htm>

From knepley at gmail.com  Mon Aug 21 04:43:42 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 21 Aug 2006 04:43:42 -0500
Subject: Memory preallocation in python
In-Reply-To: <op.temvb2wdhfz6u8@localhost>
References: <op.temr07jmhfz6u8@localhost>
	 <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com>
	 <op.temvb2wdhfz6u8@localhost>
Message-ID: <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com>

Sorry,

  1) It would have to be for a subclass like MatSeqAIJ

  2) It is setPreallocation()

  Thanks,

     Matt

On 8/21/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
>
>
> On Mon, 21 Aug 2006 09:16:08 -0000, Matthew Knepley <knepley at gmail.com>
> wrote:
>
> > It should be a wrapper for Mat***Preallocation(). Do these exist?
> >
> >    Matt
> >
>
> I didn't find a method Mat.Preallocation() nor anything similar for Mat...
> Maybe I should search in another place?
>
>
>
> --
> Marek Wojciechowski
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060821/4e7f00d9/attachment.htm>

From mwojc at p.lodz.pl  Mon Aug 21 07:58:08 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Mon, 21 Aug 2006 12:58:08 -0000
Subject: Memory preallocation in python
In-Reply-To: <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com>
References: <op.temr07jmhfz6u8@localhost> <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com> <op.temvb2wdhfz6u8@localhost> <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com>
Message-ID: <op.temwy6rnhfz6u8@localhost>


>   1) It would have to be for a subclass like MatSeqAIJ
>
>   2) It is setPreallocation()
>
>   Thanks,
>
>      Matt
>

Honestly, I can find neither subclass MatSeqAIJ nor the method  
setPreallocation(). I create matrix as follows:

import PETSc.Mat
K = PETSc.Mat.Mat()   ## as far as i know this is the only way to create  
matrix in python
K.setSizes(size, size, Size, Size)
K.setFromOptions()
K.setType("seqaij")   ## i choose matrix type here, I don't know another  
way...

Where is the moment to preallocate?


-- 
Marek Wojciechowski


From knepley at gmail.com  Mon Aug 21 08:29:19 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 21 Aug 2006 08:29:19 -0500
Subject: Memory preallocation in python
In-Reply-To: <op.temwy6rnhfz6u8@localhost>
References: <op.temr07jmhfz6u8@localhost>
	 <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com>
	 <op.temvb2wdhfz6u8@localhost>
	 <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com>
	 <op.temwy6rnhfz6u8@localhost>
Message-ID: <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com>

On 8/21/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
>
>
> >   1) It would have to be for a subclass like MatSeqAIJ
> >
> >   2) It is setPreallocation()
> >
> >   Thanks,
> >
> >      Matt
> >
>
> Honestly, I can find neither subclass MatSeqAIJ nor the method
> setPreallocation(). I create matrix as follows:
>
> import PETSc.Mat
> K = PETSc.Mat.Mat()   ## as far as i know this is the only way to create
> matrix in python
> K.setSizes(size, size, Size, Size)
> K.setFromOptions()
> K.setType("seqaij")   ## i choose matrix type here, I don't know another
> way...
>
> Where is the moment to preallocate?


It may be that they never wrapped the preallocation methods. It seems
strange, but
possible. I guess I would mail Lisandro and see.

  Matt

--
> Marek Wojciechowski
>
-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060821/ed2afb13/attachment.htm>

From mwojc at p.lodz.pl  Mon Aug 21 11:33:19 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Mon, 21 Aug 2006 16:33:19 -0000
Subject: Memory preallocation in python
In-Reply-To: <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com>
References: <op.temr07jmhfz6u8@localhost> <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com> <op.temvb2wdhfz6u8@localhost> <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com> <op.temwy6rnhfz6u8@localhost> <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com>
Message-ID: <op.tem6xtgihfz6u8@localhost>


> It may be that they never wrapped the preallocation methods. It seems
> strange, but
> possible. I guess I would mail Lisandro and see.
>

Just to clarify: I'm using the wrappers downloaded from  
ftp://ftp.mcs.anl.gov/pub/petsc/PETScPython.tar.gz.
Are these something to do with petsc4py package by Lisandro?

-- 
Marek Wojciechowski


From knepley at gmail.com  Mon Aug 21 08:56:40 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 21 Aug 2006 08:56:40 -0500
Subject: Memory preallocation in python
In-Reply-To: <op.tem6xtgihfz6u8@localhost>
References: <op.temr07jmhfz6u8@localhost>
	 <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com>
	 <op.temvb2wdhfz6u8@localhost>
	 <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com>
	 <op.temwy6rnhfz6u8@localhost>
	 <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com>
	 <op.tem6xtgihfz6u8@localhost>
Message-ID: <a9f269830608210656v48e75708if6f8a3e4a4c27515@mail.gmail.com>

On 8/21/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
>
>
> > It may be that they never wrapped the preallocation methods. It seems
> > strange, but
> > possible. I guess I would mail Lisandro and see.
> >
>
> Just to clarify: I'm using the wrappers downloaded from
> ftp://ftp.mcs.anl.gov/pub/petsc/PETScPython.tar.gz.
> Are these something to do with petsc4py package by Lisandro?


I got confused. There are several different wrappers. Those are ones that
I produced, but found too hard to support. I am know telling people who
want more functionality to try either the petsc4py or the LINEAL wrappers
since they have the time and money to do a better job I think.

    Matt

--
> Marek Wojciechowski
>
-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060821/cd3f2d64/attachment.htm>

From mwojc at p.lodz.pl  Mon Aug 21 12:11:38 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Mon, 21 Aug 2006 17:11:38 -0000
Subject: Memory preallocation in python
In-Reply-To: <a9f269830608210656v48e75708if6f8a3e4a4c27515@mail.gmail.com>
References: <op.temr07jmhfz6u8@localhost> <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com> <op.temvb2wdhfz6u8@localhost> <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com> <op.temwy6rnhfz6u8@localhost> <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com> <op.tem6xtgihfz6u8@localhost> <a9f269830608210656v48e75708if6f8a3e4a4c27515@mail.gmail.com>
Message-ID: <op.tem8powwhfz6u8@localhost>

On Mon, 21 Aug 2006 13:56:40 -0000, Matthew Knepley <knepley at gmail.com>  
wrote:
>
> I got confused. There are several different wrappers. Those are ones that
> I produced, but found too hard to support. I am know telling people who
> want more functionality to try either the petsc4py or the LINEAL wrappers
> since they have the time and money to do a better job I think.
>

Well, does it mean that your wrappers are not developed any more?

One more question then:
In case of petsc4py, I tried to compile it but with no success because of  
the lack
of include file petschead.h in the petsc distribution (2.3.1-p16). I  
guess, it was removed for
some reason. Maybe you could tell me where are now the definitions from  
this file.

-- 
Marek Wojciechowski


From knepley at gmail.com  Mon Aug 21 09:39:50 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 21 Aug 2006 09:39:50 -0500
Subject: Memory preallocation in python
In-Reply-To: <op.tem8powwhfz6u8@localhost>
References: <op.temr07jmhfz6u8@localhost>
	 <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com>
	 <op.temvb2wdhfz6u8@localhost>
	 <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com>
	 <op.temwy6rnhfz6u8@localhost>
	 <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com>
	 <op.tem6xtgihfz6u8@localhost>
	 <a9f269830608210656v48e75708if6f8a3e4a4c27515@mail.gmail.com>
	 <op.tem8powwhfz6u8@localhost>
Message-ID: <a9f269830608210739i352f2502haf0b92ce3feefc78@mail.gmail.com>

You just need petsc.h now. The structs are defined in
include/private/petscimpl.h

   Matt

On 8/21/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
>
> On Mon, 21 Aug 2006 13:56:40 -0000, Matthew Knepley <knepley at gmail.com>
> wrote:
> >
> > I got confused. There are several different wrappers. Those are ones
> that
> > I produced, but found too hard to support. I am know telling people who
> > want more functionality to try either the petsc4py or the LINEAL
> wrappers
> > since they have the time and money to do a better job I think.
> >
>
> Well, does it mean that your wrappers are not developed any more?
>
> One more question then:
> In case of petsc4py, I tried to compile it but with no success because of
> the lack
> of include file petschead.h in the petsc distribution (2.3.1-p16). I
> guess, it was removed for
> some reason. Maybe you could tell me where are now the definitions from
> this file.
>
> --
> Marek Wojciechowski
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060821/8b6c2cdd/attachment.htm>

From mwojc at p.lodz.pl  Mon Aug 21 13:10:30 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Mon, 21 Aug 2006 18:10:30 -0000
Subject: Memory preallocation in python
In-Reply-To: <a9f269830608210739i352f2502haf0b92ce3feefc78@mail.gmail.com>
References: <op.temr07jmhfz6u8@localhost> <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com> <op.temvb2wdhfz6u8@localhost> <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com> <op.temwy6rnhfz6u8@localhost> <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com> <op.tem6xtgihfz6u8@localhost> <a9f269830608210656v48e75708if6f8a3e4a4c27515@mail.gmail.com> <op.tem8powwhfz6u8@localhost> <a9f269830608210739i352f2502haf0b92ce3feefc78@mail.gmail.com>
Message-ID: <op.tenbfsxshfz6u8@localhost>

On Mon, 21 Aug 2006 14:39:50 -0000, Matthew Knepley <knepley at gmail.com>  
wrote:

This not helps, there are still undeclared variables like:
CARRAY_FLAGS
PETSC_COOKIE
UPDATEIFCOPY
PETSC_FILE_RDONLY
PETSC_FILE_WRONLY
PETSC_FILE_CREATE
KSP_CONVERGED_QCG_NEG_CURVE
KSP_CONVERGED_QCG_CONSTRAINED
I'm affraid petsc4py wrappers are broken for newer versions of PETSc...


> You just need petsc.h now. The structs are defined in
> include/private/petscimpl.h

>>
>> One more question then:
>> In case of petsc4py, I tried to compile it but with no success because  
>> of
>> the lack
>> of include file petschead.h in the petsc distribution (2.3.1-p16). I
>> guess, it was removed for
>> some reason. Maybe you could tell me where are now the definitions from
>> this file.


-- 
Marek Wojciechowski


From xiwang at dragon.rutgers.edu  Mon Aug 21 14:29:49 2006
From: xiwang at dragon.rutgers.edu (Xiaoxu Wang)
Date: Mon, 21 Aug 2006 15:29:49 -0400
Subject: configuration question
Message-ID: <44EA09AD.70503@dragon.rutgers.edu>

sorry, the question I want to ask is

I got the following error when configuring Petsc,
'Configure' object has no attribute 'diff'  File 
"./config/configure.py", line 166, in petsc_configure
when it runs to (out,err,status) = 
Configure.executeShellCommand(getattr(self, 'diff')+' -w diff1 diff2'), 
line 56 in programs.py.


Hi,
     When I comfiguring Petsc under Windows and Cygwin, it always 
crashes when it is checking for diff. The path is correct. It could be 
the difference between '\' and '\\' in python causing the problem. How 
to solve this problem? Or can I skip checking for 'diff'? Thank you for 
suggestions.

Xiaoxu


From balay at mcs.anl.gov  Mon Aug 21 14:34:36 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Mon, 21 Aug 2006 14:34:36 -0500 (CDT)
Subject: configuration question
In-Reply-To: <44EA09AD.70503@dragon.rutgers.edu>
References: <44EA09AD.70503@dragon.rutgers.edu>
Message-ID: <Pine.LNX.4.64.0608211432270.18910@asterix>

Can you verify if you have /usr/bin/diff.exe installed with your
cygwin installation?

Can you verify if you are using python from cygwin?

If you encounter problems - send the corresponding configure.log to
petsc-maint at mcs.anl.gov

[note: configure.log is too big to be posted on a mailing-list - so
use the above e-mail]

Satish


On Mon, 21 Aug 2006, Xiaoxu Wang wrote:

> sorry, the question I want to ask is
> 
> I got the following error when configuring Petsc,
> 'Configure' object has no attribute 'diff'  File "./config/configure.py", line
> 166, in petsc_configure
> when it runs to (out,err,status) = Configure.executeShellCommand(getattr(self,
> 'diff')+' -w diff1 diff2'), line 56 in programs.py.
> 
> 
> 
> Hi,
>     When I comfiguring Petsc under Windows and Cygwin, it always crashes when
> it is checking for diff. The path is correct. It could be the difference
> between '\' and '\\' in python causing the problem. How to solve this problem?
> Or can I skip checking for 'diff'? Thank you for suggestions.
> 
> Xiaoxu
> 
> 


From xiwang at dragon.rutgers.edu  Mon Aug 21 14:00:11 2006
From: xiwang at dragon.rutgers.edu (Xiaoxu Wang)
Date: Mon, 21 Aug 2006 15:00:11 -0400
Subject: configuration question
Message-ID: <44EA02BB.5000101@dragon.rutgers.edu>

Hi,
   
    When I comfiguring Petsc under Windows and Cygwin, it always crashes 
when it is checking for diff. The path is correct. It could be the 
difference between '\' and '\\' in python causing the problem. How to 
solve this problem? Or can I skip checking for 'diff'? Thank you for 
suggestions.

Xiaoxu


From alabute at stanford.edu  Mon Aug 21 16:42:35 2006
From: alabute at stanford.edu (Alex)
Date: Mon, 21 Aug 2006 14:42:35 -0700
Subject: How do I add two matrices together?
Message-ID: <1156196556.5327.12.camel@localhost.localdomain>

I can't seem to find a function that will allow me to add or subtract
matrices. I would like to do this with matrices that have already been
assembled.

thanks,
-- 
Alexander Ten Eyck
Laboratory for Virtual Experiments in Mechanics
Stanford University


From knepley at gmail.com  Mon Aug 21 16:46:52 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 21 Aug 2006 16:46:52 -0500
Subject: How do I add two matrices together?
In-Reply-To: <1156196556.5327.12.camel@localhost.localdomain>
References: <1156196556.5327.12.camel@localhost.localdomain>
Message-ID: <a9f269830608211446h144b9109o6888d47c161a37fc@mail.gmail.com>

MatAXPY

   Matt

On 8/21/06, Alex <alabute at stanford.edu> wrote:
>
> I can't seem to find a function that will allow me to add or subtract
> matrices. I would like to do this with matrices that have already been
> assembled.
>
> thanks,
> --
> Alexander Ten Eyck
> Laboratory for Virtual Experiments in Mechanics
> Stanford University
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060821/c84b472c/attachment.htm>

From bsmith at mcs.anl.gov  Tue Aug 22 14:34:18 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Tue, 22 Aug 2006 14:34:18 -0500 (CDT)
Subject: PETSc communicator
In-Reply-To: <200608171425.30756.mafunk@nmsu.edu>
References: <op.td2epsp7hfz6u8@localhost> <a9f269830608160407m347d15day863e0240aae71523@mail.gmail.com>
 <op.tef01mlyhfz6u8@localhost> <200608171425.30756.mafunk@nmsu.edu>
Message-ID: <Pine.LNX.4.58.0608221431500.1268@smash.mcs.anl.gov>


  Mat,

    This will not effect load balance or anything like that.

When you pass a communicator like  MPI_COMM_WORLD to PETSc
we don't actually use that communicator (because you might
be using it and there may be tag collisions etc). So instead
we store our own communicator inside the MPI_COMM_WORLD as an
attribute, this message is just telling us we are accessing
the inner communicator.

   Barry


On Thu, 17 Aug 2006, Matt Funk wrote:

> Hi,
>
> i was wondering what the message:
> 'PetscCommDuplicate Using internal PETSc communicator 92 170'
> means exactly. I still have issues with PETSc when running 1 vs 2 procs w.r.t.
> the loadbalance.
> However, when run on 2 vs 4 the balance seems to be almost perfect.
> Then the option of a screwed up network was suggested to me, but since the 4vs
> 2 proc case is ok, it seems not necessarily to be the case.
>
> Maybe somebody can tell me what it means?
>
> thanks
> mat
>
>


From mafunk at nmsu.edu  Tue Aug 22 15:46:04 2006
From: mafunk at nmsu.edu (Matt Funk)
Date: Tue, 22 Aug 2006 14:46:04 -0600
Subject: PETSc communicator
In-Reply-To: <Pine.LNX.4.58.0608221431500.1268@smash.mcs.anl.gov>
References: <op.td2epsp7hfz6u8@localhost> <200608171425.30756.mafunk@nmsu.edu> <Pine.LNX.4.58.0608221431500.1268@smash.mcs.anl.gov>
Message-ID: <200608221446.07726.mafunk@nmsu.edu>

Hi Barry,

thanks for the clarification. I am running my code on a different (much 
slower) machine right now, and from the initial results it seems so far that 
Matt Knepley's suspicion of having a bad network could be correct. But i need 
to do a couple more runs. 

thanks
mat

On Tuesday 22 August 2006 13:34, Barry Smith wrote:
>   Mat,
>
>     This will not effect load balance or anything like that.
>
> When you pass a communicator like  MPI_COMM_WORLD to PETSc
> we don't actually use that communicator (because you might
> be using it and there may be tag collisions etc). So instead
> we store our own communicator inside the MPI_COMM_WORLD as an
> attribute, this message is just telling us we are accessing
> the inner communicator.
>
>    Barry
>
> On Thu, 17 Aug 2006, Matt Funk wrote:
> > Hi,
> >
> > i was wondering what the message:
> > 'PetscCommDuplicate Using internal PETSc communicator 92 170'
> > means exactly. I still have issues with PETSc when running 1 vs 2 procs
> > w.r.t. the loadbalance.
> > However, when run on 2 vs 4 the balance seems to be almost perfect.
> > Then the option of a screwed up network was suggested to me, but since
> > the 4vs 2 proc case is ok, it seems not necessarily to be the case.
> >
> > Maybe somebody can tell me what it means?
> >
> > thanks
> > mat


From mwojc at p.lodz.pl  Wed Aug 23 14:32:52 2006
From: mwojc at p.lodz.pl (Marek Wojciechowski)
Date: Wed, 23 Aug 2006 19:32:52 -0000
Subject: Memory preallocation in python
In-Reply-To: <op.tenbfsxshfz6u8@localhost>
References: <op.temr07jmhfz6u8@localhost> <a9f269830608210216o4ad6377bx4d38ce02c3d3a3bb@mail.gmail.com> <op.temvb2wdhfz6u8@localhost> <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com> <op.temwy6rnhfz6u8@localhost> <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com> <op.tem6xtgihfz6u8@localhost> <a9f269830608210656v48e75708if6f8a3e4a4c27515@mail.gmail.com> <op.tem8powwhfz6u8@localhost> <a9f269830608210739i352f2502haf0b92ce3feefc78@mail.gmail.com> <op.tenbfsxshfz6u8@localhost>
Message-ID: <op.teq4k21ghfz6u8@localhost>

Good news, Lisandro Dalcin just released new version of his petsc4py which  
compiles nicely with petsc-2.3.1-p16.

On Mon, 21 Aug 2006 18:10:30 -0000, Marek Wojciechowski <mwojc at p.lodz.pl>  
wrote:

> On Mon, 21 Aug 2006 14:39:50 -0000, Matthew Knepley <knepley at gmail.com>  
> wrote:
>
> This not helps, there are still undeclared variables like:
> CARRAY_FLAGS
> PETSC_COOKIE
> UPDATEIFCOPY
> PETSC_FILE_RDONLY
> PETSC_FILE_WRONLY
> PETSC_FILE_CREATE
> KSP_CONVERGED_QCG_NEG_CURVE
> KSP_CONVERGED_QCG_CONSTRAINED
> I'm affraid petsc4py wrappers are broken for newer versions of PETSc...
>
>
>> You just need petsc.h now. The structs are defined in
>> include/private/petscimpl.h
>
>>>
>>> One more question then:
>>> In case of petsc4py, I tried to compile it but with no success because  
>>> of
>>> the lack
>>> of include file petschead.h in the petsc distribution (2.3.1-p16). I
>>> guess, it was removed for
>>> some reason. Maybe you could tell me where are now the definitions from
>>> this file.
>
>
>
>


-- 
Marek Wojciechowski


From lee433 at purdue.edu  Wed Aug 23 12:18:36 2006
From: lee433 at purdue.edu (Changyeol Lee)
Date: Wed, 23 Aug 2006 13:18:36 -0400
Subject: Problem in mpdboot for multi-processing
Message-ID: <1156353516.44ec8dec0dd34@webmail.purdue.edu>

Hi, everyone!

I assembled a 4-node cluster consisting of 4 Intel processors.  I used Fedora
Core 4, PETSc 2.3.1  and MPICH2-1.0.3. 

There is no problem in installation of PETSc and MPICH2. I also made authorized
keys of SSH for connection between nodes without asking password. I confirmed
that SSH show no asking of password between nodes. 

Also, mpdboot for itself is possible like below

$ mpdboot -n 1 -f mpd.hosts
$

However, mpdboot for multi-processors is not possible by showing the error below.
Hostname of node1 is node1.cluster.net and hostname of node2 is node2.cluster.net.

$ mpdboot -n 2 -f mpd.hosts
mpdboot_node1.cluster.net (handle_mpd_output 368): failed to connect to mpd on
node2.cluster.net
$

I believe that the things such as /etc/hosts do not have problems. 

Although it seems a problem of MPICH2, I would like to hear something if you
have similar experience like me.  Let me know if you have any idea.

Thank you so much!

Changyeol

------------------------------------------------------
Have a gneiss day!

Changyeol Lee
Graduate student (Geophysics)
Earth and Atmospheric Sciences
Purdue University
cell: 765)418-8498
phone: 765)495-1294 


From balay at mcs.anl.gov  Wed Aug 23 12:40:11 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Wed, 23 Aug 2006 12:40:11 -0500 (CDT)
Subject: Problem in mpdboot for multi-processing
In-Reply-To: <1156353516.44ec8dec0dd34@webmail.purdue.edu>
References: <1156353516.44ec8dec0dd34@webmail.purdue.edu>
Message-ID: <Pine.LNX.4.64.0608231239300.4474@asterix>

This query is best sent to mpich2-maint at mcs.anl.gov

Satish

On Wed, 23 Aug 2006, Changyeol Lee wrote:

> Hi, everyone!
> 
> I assembled a 4-node cluster consisting of 4 Intel processors.  I used Fedora
> Core 4, PETSc 2.3.1  and MPICH2-1.0.3. 
> 
> There is no problem in installation of PETSc and MPICH2. I also made authorized
> keys of SSH for connection between nodes without asking password. I confirmed
> that SSH show no asking of password between nodes. 
> 
> Also, mpdboot for itself is possible like below
> 
> $ mpdboot -n 1 -f mpd.hosts
> $
> 
> However, mpdboot for multi-processors is not possible by showing the error below.
> Hostname of node1 is node1.cluster.net and hostname of node2 is node2.cluster.net.
> 
> $ mpdboot -n 2 -f mpd.hosts
> mpdboot_node1.cluster.net (handle_mpd_output 368): failed to connect to mpd on
> node2.cluster.net
> $
> 
> I believe that the things such as /etc/hosts do not have problems. 
> 
> Although it seems a problem of MPICH2, I would like to hear something if you
> have similar experience like me.  Let me know if you have any idea.
> 
> Thank you so much!
> 
> Changyeol
> 
> ------------------------------------------------------
> Have a gneiss day!
> 
> Changyeol Lee
> Graduate student (Geophysics)
> Earth and Atmospheric Sciences
> Purdue University
> cell: 765)418-8498
> phone: 765)495-1294 
> 
> 
> 
> 


From knepley at gmail.com  Wed Aug 23 16:35:21 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Wed, 23 Aug 2006 16:35:21 -0500
Subject: Memory preallocation in python
In-Reply-To: <op.teq4k21ghfz6u8@localhost>
References: <op.temr07jmhfz6u8@localhost>
	 <a9f269830608210243l5a2ab308u9040464d687db1f@mail.gmail.com>
	 <op.temwy6rnhfz6u8@localhost>
	 <a9f269830608210629l6e54a6c3y533833bd3a0cd0eb@mail.gmail.com>
	 <op.tem6xtgihfz6u8@localhost>
	 <a9f269830608210656v48e75708if6f8a3e4a4c27515@mail.gmail.com>
	 <op.tem8powwhfz6u8@localhost>
	 <a9f269830608210739i352f2502haf0b92ce3feefc78@mail.gmail.com>
	 <op.tenbfsxshfz6u8@localhost> <op.teq4k21ghfz6u8@localhost>
Message-ID: <a9f269830608231435t4a60f301hf71c489c22bb12ee@mail.gmail.com>

Excellent. I am putting this up on our website.

  Thanks for your patience,

     Matt

On 8/23/06, Marek Wojciechowski <mwojc at p.lodz.pl> wrote:
>
> Good news, Lisandro Dalcin just released new version of his petsc4py which
> compiles nicely with petsc-2.3.1-p16.
>
> On Mon, 21 Aug 2006 18:10:30 -0000, Marek Wojciechowski <mwojc at p.lodz.pl>
> wrote:
>
> > On Mon, 21 Aug 2006 14:39:50 -0000, Matthew Knepley <knepley at gmail.com>
> > wrote:
> >
> > This not helps, there are still undeclared variables like:
> > CARRAY_FLAGS
> > PETSC_COOKIE
> > UPDATEIFCOPY
> > PETSC_FILE_RDONLY
> > PETSC_FILE_WRONLY
> > PETSC_FILE_CREATE
> > KSP_CONVERGED_QCG_NEG_CURVE
> > KSP_CONVERGED_QCG_CONSTRAINED
> > I'm affraid petsc4py wrappers are broken for newer versions of PETSc...
> >
> >
> >> You just need petsc.h now. The structs are defined in
> >> include/private/petscimpl.h
> >
> >>>
> >>> One more question then:
> >>> In case of petsc4py, I tried to compile it but with no success because
> >>> of
> >>> the lack
> >>> of include file petschead.h in the petsc distribution (2.3.1-p16). I
> >>> guess, it was removed for
> >>> some reason. Maybe you could tell me where are now the definitions
> from
> >>> this file.
> >
> >
> >
> >
>
>
>
> --
> Marek Wojciechowski
>
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060823/42663b5d/attachment.htm>

From julvar at tamu.edu  Thu Aug 24 10:34:47 2006
From: julvar at tamu.edu (Julian)
Date: Thu, 24 Aug 2006 10:34:47 -0500
Subject: Intel Dual core machines
Message-ID: <006301c6c792$d4cb1730$24b75ba5@aero.ad.tamu.edu>

Hello,

So far, I have been using PETSc on a single processor windows machine. Now,
I am planning on using it on a Intel Dual core machine. Before I start
running the installation scripts, I wanted to confirm if I can use both the
processors on this new machine just like how you would use multiple
processors on a supercomputer.
If yes, is there anything special that I need to do when installing PETSc? 
I'm guessing I would have to install some MPI software... Which one do you
recommend for windows machines (I saw more than one windows MPI software on
the PETSc website) ?

Thanks,
Julian.


From randy at geosystem.us  Thu Aug 24 10:41:42 2006
From: randy at geosystem.us (Randall Mackie)
Date: Thu, 24 Aug 2006 08:41:42 -0700
Subject: Intel Dual core machines
In-Reply-To: <006301c6c792$d4cb1730$24b75ba5@aero.ad.tamu.edu>
References: <006301c6c792$d4cb1730$24b75ba5@aero.ad.tamu.edu>
Message-ID: <44EDC8B6.10004@geosystem.us>

Hi Julian,

No problem running on dual cpu machines. You just need to have MPI
set up correctly.

On our cluster, we use ROCKS, which does everything more or less
automagically.

http://www.rocksclusters.org/wordpress/

Randy


Julian wrote:
> Hello,
> 
> So far, I have been using PETSc on a single processor windows machine. Now,
> I am planning on using it on a Intel Dual core machine. Before I start
> running the installation scripts, I wanted to confirm if I can use both the
> processors on this new machine just like how you would use multiple
> processors on a supercomputer.
> If yes, is there anything special that I need to do when installing PETSc? 
> I'm guessing I would have to install some MPI software... Which one do you
> recommend for windows machines (I saw more than one windows MPI software on
> the PETSc website) ?
> 
> Thanks,
> Julian.
> 

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034


From balay at mcs.anl.gov  Thu Aug 24 10:54:14 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Thu, 24 Aug 2006 10:54:14 -0500 (CDT)
Subject: Intel Dual core machines
In-Reply-To: <006301c6c792$d4cb1730$24b75ba5@aero.ad.tamu.edu>
References: <006301c6c792$d4cb1730$24b75ba5@aero.ad.tamu.edu>
Message-ID: <Pine.LNX.4.64.0608241047300.21715@asterix>

If you plan to use windows recommend mpich1 as this is what PETSc is
usually tested with [as far as installation is concerned].

http://www-unix.mcs.anl.gov/mpi/mpich1/mpich-nt/

Configure will automatically look for it - and use it.

The scalability depends upon the OS, MPI impl and MemoryBandwidh
numbers for this hardware. Don't know enough about the OS & MPI part -
but the MemoryBandwidh part is easy to check based on the hardware you
have. [The new core duo chips appear to have high memory bandwidth
numbers - so I think it should scale well]

But you should be concerned about this only for performance measurents
- but not during development. [You can install MPI on a single cpu
machine and use PETSc on it - for development]

Satish

On Thu, 24 Aug 2006, Julian wrote:

> Hello,
> 
> So far, I have been using PETSc on a single processor windows machine. Now,
> I am planning on using it on a Intel Dual core machine. Before I start
> running the installation scripts, I wanted to confirm if I can use both the
> processors on this new machine just like how you would use multiple
> processors on a supercomputer.
> If yes, is there anything special that I need to do when installing PETSc? 
> I'm guessing I would have to install some MPI software... Which one do you
> recommend for windows machines (I saw more than one windows MPI software on
> the PETSc website) ?
> 
> Thanks,
> Julian.
> 
> 


From randy at geosystem.us  Thu Aug 24 10:16:47 2006
From: randy at geosystem.us (Randall Mackie)
Date: Thu, 24 Aug 2006 08:16:47 -0700
Subject: OT - cluster rental
Message-ID: <44EDC2DF.2050004@geosystem.us>

I have been looking for a commercial cluster rental, with only very little success.
The only one I've found so far is Tsunmaic Technologies:

I've looked at several of the University super-computer centers, but they don't
appear to be easy to rent cluster time from if you're not an academic.

We have our own smallish (70 node) cluster, but we have some jobs where larger (256 node)
clusters would be nice.

If anyone knows of any commercial clusters that can be rented, I'd appreciate the information.

Thanks,

Randy Mackie

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034


From julvar at tamu.edu  Thu Aug 24 11:10:47 2006
From: julvar at tamu.edu (Julian)
Date: Thu, 24 Aug 2006 11:10:47 -0500
Subject: Direct Linear Solvers
Message-ID: <006401c6c797$dc392700$24b75ba5@aero.ad.tamu.edu>

Hello,

I looked at the linear solvers summary page and I could find three direct
solvers that do not use any external packages. But I could find an example
for only one of them (LU). The link to the cholesky solver does not have any
example file. I am new to PETSc and I don't fully understand how the
different solvers are invoked. Do I just change  PCSetType(pc,PCLU); to  
PCSetType(pc,PCCHOLESKY); ?

And the "XXt and Xyt" solver does not have any link. How do I use that
solver?

Also, do the direct solvers do its own internal renumbering to reduce the
matrix bandwidth? Or do we have to take care of that outside of PETSc?

Thanks,
Julian.


From julvar at tamu.edu  Thu Aug 24 11:22:50 2006
From: julvar at tamu.edu (Julian)
Date: Thu, 24 Aug 2006 11:22:50 -0500
Subject: Intel Dual core machines
In-Reply-To: <44EDC8B6.10004@geosystem.us>
Message-ID: <006501c6c799$8b673b80$24b75ba5@aero.ad.tamu.edu>

Thanks for the reply.

Do I really need to use ROCKS if I'm just gonna use a single dual core
machine ? Is that considered a cluster ?
Also, what MPI software does ROCKS install ? I saw some mention of openMPI
1.1 

Julian.


> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov 
> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Randall Mackie
> Sent: Thursday, August 24, 2006 10:42 AM
> To: petsc-users at mcs.anl.gov
> Subject: Re: Intel Dual core machines
> 
> Hi Julian,
> 
> No problem running on dual cpu machines. You just need to 
> have MPI set up correctly.
> 
> On our cluster, we use ROCKS, which does everything more or 
> less automagically.
> 
> http://www.rocksclusters.org/wordpress/
> 
> Randy
> 
> 
> Julian wrote:
> > Hello,
> > 
> > So far, I have been using PETSc on a single processor 
> windows machine. 
> > Now, I am planning on using it on a Intel Dual core 
> machine. Before I 
> > start running the installation scripts, I wanted to confirm 
> if I can 
> > use both the processors on this new machine just like how you would 
> > use multiple processors on a supercomputer.
> > If yes, is there anything special that I need to do when 
> installing PETSc? 
> > I'm guessing I would have to install some MPI software... 
> Which one do 
> > you recommend for windows machines (I saw more than one windows MPI 
> > software on the PETSc website) ?
> > 
> > Thanks,
> > Julian.
> > 
> 
> --
> Randall Mackie
> GSY-USA, Inc.
> PMB# 643
> 2261 Market St.,
> San Francisco, CA 94114-1600
> Tel (415) 469-8649
> Fax (415) 469-5044
> 
> California Registered Geophysicist
> License No. GP 1034
> 
> 


From julvar at tamu.edu  Thu Aug 24 11:28:24 2006
From: julvar at tamu.edu (Julian)
Date: Thu, 24 Aug 2006 11:28:24 -0500
Subject: Intel Dual core machines
In-Reply-To: <Pine.LNX.4.64.0608241047300.21715@asterix>
Message-ID: <006601c6c79a$525d3910$24b75ba5@aero.ad.tamu.edu>

 thanks, I will try this out. 
I have been using petsc on a single cpu machine... And I want to see how
much faster it is on a dual core.


> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov 
> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Satish Balay
> Sent: Thursday, August 24, 2006 10:54 AM
> To: petsc-users at mcs.anl.gov
> Subject: Re: Intel Dual core machines
> 
> If you plan to use windows recommend mpich1 as this is what 
> PETSc is usually tested with [as far as installation is concerned].
> 
> http://www-unix.mcs.anl.gov/mpi/mpich1/mpich-nt/
> 
> Configure will automatically look for it - and use it.
> 
> The scalability depends upon the OS, MPI impl and 
> MemoryBandwidh numbers for this hardware. Don't know enough 
> about the OS & MPI part - but the MemoryBandwidh part is easy 
> to check based on the hardware you have. [The new core duo 
> chips appear to have high memory bandwidth numbers - so I 
> think it should scale well]
> 
> But you should be concerned about this only for performance measurents
> - but not during development. [You can install MPI on a 
> single cpu machine and use PETSc on it - for development]
> 
> Satish
> 
> On Thu, 24 Aug 2006, Julian wrote:
> 
> > Hello,
> > 
> > So far, I have been using PETSc on a single processor 
> windows machine. 
> > Now, I am planning on using it on a Intel Dual core 
> machine. Before I 
> > start running the installation scripts, I wanted to confirm 
> if I can 
> > use both the processors on this new machine just like how you would 
> > use multiple processors on a supercomputer.
> > If yes, is there anything special that I need to do when 
> installing PETSc? 
> > I'm guessing I would have to install some MPI software... 
> Which one do 
> > you recommend for windows machines (I saw more than one windows MPI 
> > software on the PETSc website) ?
> > 
> > Thanks,
> > Julian.
> > 
> > 
> 


From randy at geosystem.us  Thu Aug 24 11:52:21 2006
From: randy at geosystem.us (Randall Mackie)
Date: Thu, 24 Aug 2006 09:52:21 -0700
Subject: Intel Dual core machines
In-Reply-To: <006501c6c799$8b673b80$24b75ba5@aero.ad.tamu.edu>
References: <006501c6c799$8b673b80$24b75ba5@aero.ad.tamu.edu>
Message-ID: <44EDD945.6070705@geosystem.us>

Julian,

No, you do not need Rocks (and in fact it is a Linux-based system).
You mentioned the need to install MPI (which you do need for
PETSc), and I was just pointing out that there are clustering
software solutions that make a lot of getting the clusters set
up much simpler than doing it all yourself.

Rocks installs various flavors of MPICH, or MPI, or LAM....

Randy

Julian wrote:
> Thanks for the reply.
> 
> Do I really need to use ROCKS if I'm just gonna use a single dual core
> machine ? Is that considered a cluster ?
> Also, what MPI software does ROCKS install ? I saw some mention of openMPI
> 1.1 
> 
> Julian.
> 
> 
>> -----Original Message-----
>> From: owner-petsc-users at mcs.anl.gov 
>> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Randall Mackie
>> Sent: Thursday, August 24, 2006 10:42 AM
>> To: petsc-users at mcs.anl.gov
>> Subject: Re: Intel Dual core machines
>>
>> Hi Julian,
>>
>> No problem running on dual cpu machines. You just need to 
>> have MPI set up correctly.
>>
>> On our cluster, we use ROCKS, which does everything more or 
>> less automagically.
>>
>> http://www.rocksclusters.org/wordpress/
>>
>> Randy
>>
>>
>> Julian wrote:
>>> Hello,
>>>
>>> So far, I have been using PETSc on a single processor 
>> windows machine. 
>>> Now, I am planning on using it on a Intel Dual core 
>> machine. Before I 
>>> start running the installation scripts, I wanted to confirm 
>> if I can 
>>> use both the processors on this new machine just like how you would 
>>> use multiple processors on a supercomputer.
>>> If yes, is there anything special that I need to do when 
>> installing PETSc? 
>>> I'm guessing I would have to install some MPI software... 
>> Which one do 
>>> you recommend for windows machines (I saw more than one windows MPI 
>>> software on the PETSc website) ?
>>>
>>> Thanks,
>>> Julian.
>>>
>> --
>> Randall Mackie
>> GSY-USA, Inc.
>> PMB# 643
>> 2261 Market St.,
>> San Francisco, CA 94114-1600
>> Tel (415) 469-8649
>> Fax (415) 469-5044
>>
>> California Registered Geophysicist
>> License No. GP 1034
>>
>>
> 

-- 
Randall Mackie
GSY-USA, Inc.
PMB# 643
2261 Market St.,
San Francisco, CA 94114-1600
Tel (415) 469-8649
Fax (415) 469-5044

California Registered Geophysicist
License No. GP 1034


From julvar at tamu.edu  Thu Aug 24 15:40:06 2006
From: julvar at tamu.edu (Julian)
Date: Thu, 24 Aug 2006 15:40:06 -0500
Subject: Intel Dual core machines
In-Reply-To: <Pine.LNX.4.64.0608241047300.21715@asterix>
Message-ID: <007c01c6c7bd$7bb147c0$24b75ba5@aero.ad.tamu.edu>

Hi,

The link to mpich1 says this:

MPICH.NT is no longer being developed.  Please use  MPICH2. MPICH.NT and
MPICH2 can co-exist on the same machine so it is not necessary to uninstall
MPICH to install MPICH2. But applications must be re-compiled with the
MPICH2 header files and libraries. 

So, is it ok if I use mpich2? 

Thanks,
Julian.

> -----Original Message-----
> From: owner-petsc-users at mcs.anl.gov 
> [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Satish Balay
> Sent: Thursday, August 24, 2006 10:54 AM
> To: petsc-users at mcs.anl.gov
> Subject: Re: Intel Dual core machines
> 
> If you plan to use windows recommend mpich1 as this is what 
> PETSc is usually tested with [as far as installation is concerned].
> 
> http://www-unix.mcs.anl.gov/mpi/mpich1/mpich-nt/
> 
> Configure will automatically look for it - and use it.
> 
> The scalability depends upon the OS, MPI impl and 
> MemoryBandwidh numbers for this hardware. Don't know enough 
> about the OS & MPI part - but the MemoryBandwidh part is easy 
> to check based on the hardware you have. [The new core duo 
> chips appear to have high memory bandwidth numbers - so I 
> think it should scale well]
> 
> But you should be concerned about this only for performance measurents
> - but not during development. [You can install MPI on a 
> single cpu machine and use PETSc on it - for development]
> 
> Satish
> 
> On Thu, 24 Aug 2006, Julian wrote:
> 
> > Hello,
> > 
> > So far, I have been using PETSc on a single processor 
> windows machine. 
> > Now, I am planning on using it on a Intel Dual core 
> machine. Before I 
> > start running the installation scripts, I wanted to confirm 
> if I can 
> > use both the processors on this new machine just like how you would 
> > use multiple processors on a supercomputer.
> > If yes, is there anything special that I need to do when 
> installing PETSc? 
> > I'm guessing I would have to install some MPI software... 
> Which one do 
> > you recommend for windows machines (I saw more than one windows MPI 
> > software on the PETSc website) ?
> > 
> > Thanks,
> > Julian.
> > 
> > 
> 


From balay at mcs.anl.gov  Thu Aug 24 15:57:33 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Thu, 24 Aug 2006 15:57:33 -0500 (CDT)
Subject: Intel Dual core machines
In-Reply-To: <007c01c6c7bd$7bb147c0$24b75ba5@aero.ad.tamu.edu>
References: <007c01c6c7bd$7bb147c0$24b75ba5@aero.ad.tamu.edu>
Message-ID: <Pine.LNX.4.64.0608241556210.21715@asterix>

mpich2 is also not being developed anymore [on windows]. Either should
work. 

[but I'm more familer with mpich1 - if you encounter issues]

Satish

On Thu, 24 Aug 2006, Julian wrote:

> Hi,
> 
> The link to mpich1 says this:
> 
> MPICH.NT is no longer being developed.  Please use  MPICH2. MPICH.NT and
> MPICH2 can co-exist on the same machine so it is not necessary to uninstall
> MPICH to install MPICH2. But applications must be re-compiled with the
> MPICH2 header files and libraries. 
> 
> So, is it ok if I use mpich2? 
> 
> Thanks,
> Julian.
> 
> > -----Original Message-----
> > From: owner-petsc-users at mcs.anl.gov 
> > [mailto:owner-petsc-users at mcs.anl.gov] On Behalf Of Satish Balay
> > Sent: Thursday, August 24, 2006 10:54 AM
> > To: petsc-users at mcs.anl.gov
> > Subject: Re: Intel Dual core machines
> > 
> > If you plan to use windows recommend mpich1 as this is what 
> > PETSc is usually tested with [as far as installation is concerned].
> > 
> > http://www-unix.mcs.anl.gov/mpi/mpich1/mpich-nt/
> > 
> > Configure will automatically look for it - and use it.
> > 
> > The scalability depends upon the OS, MPI impl and 
> > MemoryBandwidh numbers for this hardware. Don't know enough 
> > about the OS & MPI part - but the MemoryBandwidh part is easy 
> > to check based on the hardware you have. [The new core duo 
> > chips appear to have high memory bandwidth numbers - so I 
> > think it should scale well]
> > 
> > But you should be concerned about this only for performance measurents
> > - but not during development. [You can install MPI on a 
> > single cpu machine and use PETSc on it - for development]
> > 
> > Satish
> > 
> > On Thu, 24 Aug 2006, Julian wrote:
> > 
> > > Hello,
> > > 
> > > So far, I have been using PETSc on a single processor 
> > windows machine. 
> > > Now, I am planning on using it on a Intel Dual core 
> > machine. Before I 
> > > start running the installation scripts, I wanted to confirm 
> > if I can 
> > > use both the processors on this new machine just like how you would 
> > > use multiple processors on a supercomputer.
> > > If yes, is there anything special that I need to do when 
> > installing PETSc? 
> > > I'm guessing I would have to install some MPI software... 
> > Which one do 
> > > you recommend for windows machines (I saw more than one windows MPI 
> > > software on the PETSc website) ?
> > > 
> > > Thanks,
> > > Julian.
> > > 
> > > 
> > 
> 
> 


From balay at mcs.anl.gov  Thu Aug 24 16:59:37 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Thu, 24 Aug 2006 16:59:37 -0500 (CDT)
Subject: Direct Linear Solvers
In-Reply-To: <006401c6c797$dc392700$24b75ba5@aero.ad.tamu.edu>
References: <006401c6c797$dc392700$24b75ba5@aero.ad.tamu.edu>
Message-ID: <Pine.LNX.4.64.0608241656200.21715@asterix>

If you need parallel direct solvers - then you can try MUMPS or
Spooles or SuperLU_Dist external packages with PETSc.

I don't think they work on windows - so you might have to use linux
[with a f90 compiler for MUMPS]

The default LU in PETSc is sequential. 

Satish

On Thu, 24 Aug 2006, Julian wrote:

> Hello,
> 
> I looked at the linear solvers summary page and I could find three direct
> solvers that do not use any external packages. But I could find an example
> for only one of them (LU). The link to the cholesky solver does not have any
> example file. I am new to PETSc and I don't fully understand how the
> different solvers are invoked. Do I just change  PCSetType(pc,PCLU); to  
> PCSetType(pc,PCCHOLESKY); ?
> 
> And the "XXt and Xyt" solver does not have any link. How do I use that
> solver?
> 
> Also, do the direct solvers do its own internal renumbering to reduce the
> matrix bandwidth? Or do we have to take care of that outside of PETSc?
> 
> Thanks,
> Julian.
> 
> 
> 
> 


From hzhang at mcs.anl.gov  Sun Aug 27 22:18:34 2006
From: hzhang at mcs.anl.gov (Hong Zhang)
Date: Sun, 27 Aug 2006 22:18:34 -0500 (CDT)
Subject: Direct Linear Solvers
In-Reply-To: <006401c6c797$dc392700$24b75ba5@aero.ad.tamu.edu>
References: <006401c6c797$dc392700$24b75ba5@aero.ad.tamu.edu>
Message-ID: <Pine.LNX.4.58.0608272206493.10601@terra.mcs.anl.gov>

Julian,

> I looked at the linear solvers summary page and I could find three direct
> solvers that do not use any external packages. But I could find an example
> for only one of them (LU). The link to the cholesky solver does not have any
> example file. I am new to PETSc and I don't fully understand how the
> different solvers are invoked. Do I just change  PCSetType(pc,PCLU); to
> PCSetType(pc,PCCHOLESKY); ?
Yes.

You can run petsc KSP example with runtime option, e.g.
~petsc//src/ksp/ksp/examples/tutorials/ex5.c:

./ex5 -ksp_type preonly -pc_type cholesky

or examples on low level call of MatCholeskyFactor()
/src/mat/examples/tests/ex74.c (not recommended for user)

>
> And the "XXt and Xyt" solver does not have any link. How do I use that
> solver?

These are the interface to the Tufo-Fischer parallel direct solver.
I don't knwo much about them. Someone from petsc team may tell
you more about "XXt and Xyt".
>
> Also, do the direct solvers do its own internal renumbering to reduce the
> matrix bandwidth? Or do we have to take care of that outside of PETSc?
>
Do you mean matrix reordering? We do support a set of orderings.
Run a petsc mat or ksp example with the opiton
-help |grep -i ordering,
then you'll see the orderings provided.

Hong


From henke at math.tu-clausthal.de  Mon Aug 28 06:42:50 2006
From: henke at math.tu-clausthal.de (Christian Henke)
Date: Mon, 28 Aug 2006 13:42:50 +0200
Subject: Using of MatGetSubMatrices
Message-ID: <200608281342.50630.henke@math.tu-clausthal.de>

Hi all,

I am trying with petsc2.2.0 to get access to my sparse blockdiagonalmatrix  
matrix which contains for example 4 blocks m1 ... m4. If I use 2 Processors 
then m1 and m2 are owned by p1 and m3, m4 owned by p2. Now I want to read m1, 
m2 from p2 and m3, m4 from p1. First I have used MatGetValues, but then I 
took the message: Only local values currently supported. 
My next try was the function MatGetSubMatrices:

    IS             isrow, iscol;
    Mat            *M;

    ISCreateStride(PETSC_COMM_SELF,m,(int)i*m,1,&isrow);
    ISCreateStride(PETSC_COMM_SELF,m,(int)j*m,1,&iscol);

    const int ierr
      = MatGetSubMatrices(matrix,1,&isrow,&iscol,MAT_INITIAL_MATRIX,&M);
    
    ...

    ISDestroy(isrow);
    ISDestroy(iscol);

where i,j the blockindices and m is the blocksize. It works well for a few 
functioncalls, but then it stops with the log_trace-message:

[0] 0.055568 Event begin: MatGetSubMatrice.

What is my error? Is MatGetSubMatrices the wrong function for my problem or 
are the above lines wrong?

Regards Christian


From mappol at gmail.com  Mon Aug 28 10:44:23 2006
From: mappol at gmail.com (Patrick Lechner)
Date: Mon, 28 Aug 2006 16:44:23 +0100
Subject: DPETSC_USE_FORTRAN_KERNELS-warning
Message-ID: <215294220608280844r41fb9d3ai3006b322a057b9fd@mail.gmail.com>

Dear all,

I currently have the following problem and would be very grateful for any
useful advice:

I have written a Fortran code that uses PETSc for the solution of various
linear systems with complex entries (both in the stiffness matrix and in the
load vector). When I use the PETSc-Log to check the times for my runs, I get
the following warning:

      ##########################################################

#
#
      #                          WARNING!!!
                     #
      #
                             #
      #   The code for various complex numbers numerical
#
      #   kernels uses C++, which generally is not well
     #
      #   optimized.  For performance that is about 4-5 times
    #
      #   faster, specify the flag -DPETSC_USE_FORTRAN_KERNELS  #
      #   in base_variables and recompile the PETSc libraries.
#

#
#
      ##########################################################


My problem now is, that I can't find "base_variables" in my latest
PETSc-version (2.3.1-p15)...
Do I just add the flag to my cpp-flags in bmake/$PETSC_ARCH/petscconf? Or
should I do this modification somewhere else?

Thanks a lot for any help with this!
Best wishes,
Patrick


=================================

Patrick Lechner
Numerical Analysist / Numerical Modeller
Flat 1
159 Hardgate
Aberdeen, AB11 6XQ

Phone: 07815 927333
E-mail: patrick at lechner.com
Homepage: http://www.patrick.lechner.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060828/8d360eb3/attachment.htm>

From balay at mcs.anl.gov  Mon Aug 28 10:55:46 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Mon, 28 Aug 2006 10:55:46 -0500 (CDT)
Subject: DPETSC_USE_FORTRAN_KERNELS-warning
In-Reply-To: <215294220608280844r41fb9d3ai3006b322a057b9fd@mail.gmail.com>
References: <215294220608280844r41fb9d3ai3006b322a057b9fd@mail.gmail.com>
Message-ID: <Pine.LNX.4.64.0608281050490.4274@asterix>

This message is outofdate. I'll fix it in petsc-dev.

The way to enable this feature is to rerun configure with the
additional option

--with-fortran-kernels=generic

You can use additonal option PETSC_ARCH with configure so that a new
set of configuraton [with new set of libraries are created] - this way
the old set is also useable. You can then check if the above option
improves performance or not [and then stick with the higher perfoming
version]

For eg: if your current PETSC_ARCH is linux-gnu - you can do:

./bmake/linux-gnu/configure --with-fortran-kernels=generic PETSC_ARCH=linux-gnu-ftn-kernels
make PETSC_ARCH=linux-gnu-ftn-kernels all
make PETSC_ARCH=linux-gnu-ftn-kernels test

Satish


On Mon, 28 Aug 2006, Patrick Lechner wrote:

> Dear all,
> 
> I currently have the following problem and would be very grateful for any
> useful advice:
> 
> I have written a Fortran code that uses PETSc for the solution of various
> linear systems with complex entries (both in the stiffness matrix and in the
> load vector). When I use the PETSc-Log to check the times for my runs, I get
> the following warning:
> 
>      ##########################################################
> 
> #
> #
>      #                          WARNING!!!
>                     #
>      #
>                             #
>      #   The code for various complex numbers numerical
> #
>      #   kernels uses C++, which generally is not well
>     #
>      #   optimized.  For performance that is about 4-5 times
>    #
>      #   faster, specify the flag -DPETSC_USE_FORTRAN_KERNELS  #
>      #   in base_variables and recompile the PETSc libraries.
> #
> 
> #
> #
>      ##########################################################
> 
> 
> My problem now is, that I can't find "base_variables" in my latest
> PETSc-version (2.3.1-p15)...
> Do I just add the flag to my cpp-flags in bmake/$PETSC_ARCH/petscconf? Or
> should I do this modification somewhere else?
> 
> Thanks a lot for any help with this!
> Best wishes,
> Patrick
> 
> 
> 
> 
> =================================
> 
> Patrick Lechner
> Numerical Analysist / Numerical Modeller
> Flat 1
> 159 Hardgate
> Aberdeen, AB11 6XQ
> 
> Phone: 07815 927333
> E-mail: patrick at lechner.com
> Homepage: http://www.patrick.lechner.com
> 


From knepley at gmail.com  Mon Aug 28 10:56:51 2006
From: knepley at gmail.com (Matthew Knepley)
Date: Mon, 28 Aug 2006 10:56:51 -0500
Subject: DPETSC_USE_FORTRAN_KERNELS-warning
In-Reply-To: <215294220608280844r41fb9d3ai3006b322a057b9fd@mail.gmail.com>
References: <215294220608280844r41fb9d3ai3006b322a057b9fd@mail.gmail.com>
Message-ID: <a9f269830608280856r1799ff55ge3f2376ad7a43281@mail.gmail.com>

  We apologize for the out-of-date documentation. There is now a configure
option

  --with-fortran-kernels=generic

which you can see with --help. Reconfiguring with this option will turn on
the Fortran kernels.

  Matt

On 8/28/06, Patrick Lechner <mappol at gmail.com> wrote:
>
> Dear all,
>
> I currently have the following problem and would be very grateful for any
> useful advice:
>
> I have written a Fortran code that uses PETSc for the solution of various
> linear systems with complex entries (both in the stiffness matrix and in the
> load vector). When I use the PETSc-Log to check the times for my runs, I get
> the following warning:
>
>       ##########################################################
>
> #
> #
>       #                          WARNING!!!
>                        #
>       #
>                                #
>       #   The code for various complex numbers numerical
>    #
>       #   kernels uses C++, which generally is not well
>        #
>       #   optimized.  For performance that is about 4-5 times
>       #
>       #   faster, specify the flag -DPETSC_USE_FORTRAN_KERNELS  #
>       #   in base_variables and recompile the PETSc libraries.
>    #
>
> #
> #
>       ##########################################################
>
>
> My problem now is, that I can't find "base_variables" in my latest
> PETSc-version ( 2.3.1-p15)...
> Do I just add the flag to my cpp-flags in bmake/$PETSC_ARCH/petscconf? Or
> should I do this modification somewhere else?
>
> Thanks a lot for any help with this!
> Best wishes,
> Patrick
>
>
>
>
> =================================
>
> Patrick Lechner
> Numerical Analysist / Numerical Modeller
> Flat 1
> 159 Hardgate
> Aberdeen, AB11 6XQ
>
> Phone: 07815 927333
> E-mail: patrick at lechner.com
> Homepage: http://www.patrick.lechner.com
>


-- 
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guiness
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20060828/1e58e09c/attachment.htm>

From balay at mcs.anl.gov  Mon Aug 28 10:59:31 2006
From: balay at mcs.anl.gov (Satish Balay)
Date: Mon, 28 Aug 2006 10:59:31 -0500 (CDT)
Subject: DPETSC_USE_FORTRAN_KERNELS-warning
In-Reply-To: <Pine.LNX.4.64.0608281050490.4274@asterix>
References: <215294220608280844r41fb9d3ai3006b322a057b9fd@mail.gmail.com>
 <Pine.LNX.4.64.0608281050490.4274@asterix>
Message-ID: <Pine.LNX.4.64.0608281059120.4274@asterix>

On Mon, 28 Aug 2006, Satish Balay wrote:

> This message is outofdate. I'll fix it in petsc-dev.

looks like this is already cleanedup in petsc-dev.

Satish


From bsmith at mcs.anl.gov  Mon Aug 28 11:41:39 2006
From: bsmith at mcs.anl.gov (Barry Smith)
Date: Mon, 28 Aug 2006 11:41:39 -0500 (CDT)
Subject: Using of MatGetSubMatrices
In-Reply-To: <200608281342.50630.henke@math.tu-clausthal.de>
References: <200608281342.50630.henke@math.tu-clausthal.de>
Message-ID: <Pine.LNX.4.58.0608281139110.10865@harley.mcs.anl.gov>


   All processes that share the matrix must call MatGetSubMatrices()
the same number of times. If a process doesn't need a matrix it should
pass in zero length IS's. If you are always calling it with all processes
then you can run with -start_in_debugger and when it is hanging hit
control C in the debugger and type where to see where/why it is hanging.

  Barry

On Mon, 28 Aug 2006, Christian Henke wrote:

> Hi all,
>
> I am trying with petsc2.2.0 to get access to my sparse blockdiagonalmatrix
> matrix which contains for example 4 blocks m1 ... m4. If I use 2 Processors
> then m1 and m2 are owned by p1 and m3, m4 owned by p2. Now I want to read m1,
> m2 from p2 and m3, m4 from p1. First I have used MatGetValues, but then I
> took the message: Only local values currently supported.
> My next try was the function MatGetSubMatrices:
>
>     IS             isrow, iscol;
>     Mat            *M;
>
>     ISCreateStride(PETSC_COMM_SELF,m,(int)i*m,1,&isrow);
>     ISCreateStride(PETSC_COMM_SELF,m,(int)j*m,1,&iscol);
>
>     const int ierr
>       = MatGetSubMatrices(matrix,1,&isrow,&iscol,MAT_INITIAL_MATRIX,&M);
>
>     ...
>
>     ISDestroy(isrow);
>     ISDestroy(iscol);
>
> where i,j the blockindices and m is the blocksize. It works well for a few
> functioncalls, but then it stops with the log_trace-message:
>
> [0] 0.055568 Event begin: MatGetSubMatrice.
>
> What is my error? Is MatGetSubMatrices the wrong function for my problem or
> are the above lines wrong?
>
> Regards Christian
>
>


From Stephen.R.Ball at awe.co.uk  Tue Aug 29 06:36:50 2006
From: Stephen.R.Ball at awe.co.uk (Stephen R Ball)
Date: Tue, 29 Aug 2006 12:36:50 +0100
Subject: Spooles Cholesky problem
Message-ID: <68TCe6009159@awe.co.uk>


Hi

When using the Spooles Cholesky solver I am getting the error:

0:[0]PETSC ERROR: MatSetValues_SeqSBAIJ() line 792 in
src/mat/impls/sbaij/seq/sbaij.c
0:[0]PETSC ERROR: !
0:[0]PETSC ERROR: Lower triangular value cannot be set for sbaij format.
Ignoring these values, run with -mat_ignore_lower_triangular or call
MatSetOption(mat,MAT_IGNORE_LOWER_TRIANGULAR)!
0:[0]PETSC ERROR: MatSetValues() line 702 in src/mat/interface/matrix.c

When I attempt to call MatSetOption(mat,MAT_IGNORE_LOWER_TRIANGULAR,ierr)
via the Fortran interface, I get the compilation error that entity
mat_ignore_lower_triangular
has undefined type. Can you tell me if this option is supported via the
Fortran interface?

Note that using runtime option  -mat_ignore_lower_triangular works, but I
would prefer to
set this option in my code using the MatSetOption() routine.

Regards

Stephen
--
_______________________________________________________________________________

The information in this email and in any attachment(s) is commercial in confidence. If you are not the named addressee(s) or if you receive this email in error then any distribution, copying or use of this communication or the information in it is strictly prohibited.  Please notify us immediately by email at admin.internet(at)awe.co.uk, and then delete this message from your computer.  While attachments are virus checked, AWE plc does not accept any liability in respect of any virus which is not detected.

AWE Plc
Registered in England and Wales
Registration No 02763902
AWE, Aldermaston, Reading, RG7 4PR